From: "J. Bruce Fields" <bfields@fieldses.org>
To: Dave Chinner <david@fromorbit.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>,
    Casey Schaufler <casey@schaufler-ca.com>,
    Andy Lutomirski <luto@amacapital.net>,
    Seth Forshee <seth.forshee@canonical.com>,
    Alexander Viro <viro@zeniv.linux.org.uk>,
    Linux FS Devel <linux-fsdevel@vger.kernel.org>,
    LSM List <linux-security-module@vger.kernel.org>,
    SELinux-NSA <selinux@tycho.nsa.gov>,
    Serge Hallyn <serge.hallyn@canonical.com>,
    "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 0/7] Initial support for user namespace owned mounts
Date: Wed, 22 Jul 2015 10:09:23 -0400
Message-ID: <20150722140923.GD22718@fieldses.org>
In-Reply-To: <20150722075640.GE7943@dastard>

On Wed, Jul 22, 2015 at 05:56:40PM +1000, Dave Chinner wrote:
> On Tue, Jul 21, 2015 at 01:37:21PM -0400, J. Bruce Fields wrote:
> > On Fri, Jul 17, 2015 at 12:47:35PM +1000, Dave Chinner wrote:
> > > On Thu, Jul 16, 2015 at 07:42:03PM -0500, Eric W. Biederman wrote:
> > > > Dave Chinner <david@fromorbit.com> writes:
> > > > > The key difference is that desktops only do this when you physically
> > > > > plug in a device. With unprivileged mounts, a hostile attacker
> > > > > doesn't need physical access to the machine to exploit lurking
> > > > > kernel filesystem bugs. i.e. they can just use loopback mounts, and
> > > > > they can keep mounting corrupted images until they find something
> > > > > that works.
> > > >
> > > > Yep. That magnifies the problem quite a bit.
> > > >
> > > > > User namespaces are supposed to provide trust separation. The
> > > > > kernel filesystems simply aren't hardened against unprivileged
> > > > > attacks from below - there is a trust relationship between root and
> > > > > the filesystem in that they are the only things that can write to
> > > > > the disk. Mounts from within a userns destroys this relationship as
> > > > > the userns root, by definition, is not a trusted actor.
> > > >
> > > > I talked to Ted Tso a while back and ext4 is at least in principle
> > > > already hardened against that kind of attack. I am not certain I
> > > > believe it, but if it is true I think it is fantastic.
> > >
> > > No, it's not. No filesystem is, because to harden against such
> > > attacks requires complete verification of all metadata when it is
> > > read from disk, before it is used, or some method or ensuring the
> > > block was not tampered with. CRCs are not sufficient, because they
> > > can be tampered with, too.
> > >
> > > The only way a filesystem would be able to trust what it reads from
> > > disk has not been tampered with in a system with untrusted mounts is
> > > if it has some kind of cryptographically secure signature in the
> > > metadata and the attacker is unable to access the key for that
> > > signature.
> >
> > Preventing tampering is a little different from protecting the kernel
> > from attack, isn't it? I thought the latter was what people were asking
> > about.
>
> People might be asking for the latter, but the only attack vector
> that can be made against filesystems from below is via tampering
> with the on-disk structure.
>
> An untrusted user in an untrusted container can construct arbitrary
> untrusted filesystem structures and get them parsed by a context
> running as $DIETY that assumes the structure is from a trusted
> source. What can possibly go wrong?
>
> IOWs, To protect the kernel against attack from untrusted filesystem
> images, we either have to be able to guarantee the image can not be
> modified by untrusted parties (i.e. needs to be created with
> signed tools, contain only signed filesystem metadata and
> signed/encrypted data),

I don't think that works--who exactly would be the "trusted party"?

It can't be this kernel or this hardware--users expect to be able to
mount filesystems created by older kernels, on other machines, running
other distributions (even other operating systems).

It can't be the user--then any user could compromise the kernel by
signing a bad filesystem.

Authenticating the creator of the filesystem might be useful for other
reasons, but it sounds to me like at best only very weak protection
against corrupted filesystems.

As a similar example, browser makers are stuck both implementing SSL and
hardening their code against malicious content. Those address separate
problems.

> or we have to sandbox the filesystem parsing
> code completely (i.e. fuse).
>
> > So, for example, a screwed up on-disk directory structure shouldn't
> > result in creating a cycle in the dcache and then deadlocking.
>
> Therein lies the problem: how do you detect such structural defects
> without doing a full structure validation?

You can prevent cycles in a graph if you can prevent adding an edge
which would be part of a cycle. For the dcache, it's d_splice_alias
that does that (using d_ancestor). (And I believe the main motivation
for that was NFS, where you don't need a filesystem cycle, just a
server-side race that can briefly make it look like there's one--an
example of the changing filesystem problem that you point out below.)

> e.g. cyclic links may
> only manifest when completely unrelated pieces of metadata are linked
> together in a specific way.
>
> Further, the problem is not restricted to validation at mount time -
> if the user can write to the filesystem image file, then they can
> modify it after it has been mounted, too. That means the attacker
> may be someone who has broken into a container, not necessarily the
> user you trusted with unprivileged mounts. That means every cold
> metadata read needs to be treated with suspicion, not just at mount
> time.

Yes. Agreed that this is difficult. (I can't actually give an example
of an existing problem of this sort, but I'd be surprised if they don't
exist.)

--b.
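
[Editorial illustration, not part of the original message.] The point
above about preventing cycles by refusing any edge that would close one
can be modelled on a plain tree with parent pointers. This is only a toy
sketch in Python, not kernel code: the Node class and the is_ancestor
and attach helpers are hypothetical names invented for the example, and
it does not reproduce d_splice_alias or d_ancestor. It assumes each node
has at most one parent, so checking the ancestor chain is enough.

```python
class LoopError(Exception):
    """Raised when attaching a node would create a cycle."""


class Node:
    """Hypothetical stand-in for a directory entry with one parent."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent


def is_ancestor(candidate, node):
    """Return True if `candidate` is a proper ancestor of `node`."""
    cur = node.parent
    while cur is not None:
        if cur is candidate:
            return True
        cur = cur.parent
    return False


def attach(child, new_parent):
    """Re-parent `child` under `new_parent`, refusing edges that form a cycle.

    The edge child -> new_parent closes a cycle exactly when `child` is
    `new_parent` itself or already one of its ancestors, so only the
    ancestor chain of `new_parent` has to be walked, not the whole tree.
    """
    if child is new_parent or is_ancestor(child, new_parent):
        raise LoopError(f"attaching {child.name!r} under "
                        f"{new_parent.name!r} would create a loop")
    child.parent = new_parent
```

The check costs one walk up the ancestor chain per attach, which is why
it does not need a full validation of the structure; it says nothing
about other kinds of corrupted metadata.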