From: Dave Chinner <david@fromorbit.com>
To: Alan Cox <gnomes@lxorguk.ukuu.org.uk>
Cc: TongZhang <ztong@vt.edu>, darrick.wong@oracle.com,
	linux-xfs@vger.kernel.org, LKML <linux-kernel@vger.kernel.org>,
	linux-security-module@vger.kernel.org,
	Wenbo Shen <shenwenbosmile@gmail.com>
Subject: Re: Leaking Path in XFS's ioctl interface(missing LSM check)
Date: Mon, 1 Oct 2018 10:25:21 +1000
Message-ID: <20181001002521.GM31060@dastard>
In-Reply-To: <20180930151652.6975610c@alans-desktop>

On Sun, Sep 30, 2018 at 03:16:52PM +0100, Alan Cox wrote:
> > > CAP_SYS_ADMIN is also a bit weird because low level access usually
> > > implies you can bypass access controls so you should also check
> > > CAP_SYS_DAC ?
> >
> > Do you mean CAP_DAC_READ_SEARCH as per the newer handle syscalls?
> > But that only allows bypassing directory search operations, so maybe
> > you mean CAP_DAC_OVERRIDE?
>
> It depends what the ioctl allows you to do. If it allows me to bypass
> DAC and manipulate the file system to move objects around then it's a
> serious issue.

These interfaces have always been allowed to do that. You can't do
transparent online background defragmentation without bypassing DAC
and moving objects around. You can't scrub metadata and data without
bypassing DAC. You can't do dedupe without bypassing /some level/ of
DAC to get access to the filesystem used space map and the raw block
device to hash the data. But the really important access control for
dedupe - avoiding deduping data across files at different security
levels - isn't controlled at all.

> The underlying problem is if CAP_SYS_ADMIN is able to move objects around
> then I can move modules around.

Yup, anything with direct access to block devices can do that. Many
filesystem and storage utilities are given direct access to the
block device, because that's what they need to work. e.g.
in DM land, the control ioctls (ctl_ioctl()) are protected by:

	/* only root can play with this */
	if (!capable(CAP_SYS_ADMIN))
		return -EACCES;

Think about it - if DM control ioctls only require CAP_SYS_ADMIN,
then if you have that cap you can use DM to remap any block in a
block device to any other block. You don't need the filesystem to
move stuff around, it can be moved around without the filesystem
knowing anything about it.

> We already have a problem with
> CAP_DAC_OVERRIDE giving you CAP_SYS_RAWIO (ie totally owning the machine)
> unless the modules are signed, if xfs allows ADMIN as well then
> CAP_SYS_ADMIN is much easier to obtain and you'd get total system
> ownership from it.

Always been the case, and it's not isolated to XFS.

	$ git grep CAP_SYS_ADMIN fs/ |wc -l
	139
	$ git grep CAP_SYS_ADMIN block/ |wc -l
	16
	$ git grep CAP_SYS_ADMIN drivers/block/ drivers/scsi |wc -l
	88

The "CAP_SYS_ADMIN for ioctls" trust model in the storage stack
extends both above and below the filesystem. If you don't trust
CAP_SYS_ADMIN, then you are basically saying that you cannot trust
your storage management and maintenance utilities at any level.

> Not good.
>
> > Regardless, this horse bolted long before those syscalls were
> > introduced. The time to address this issue was when XFS was merged
> > into linux all those years ago, back when the apps that run in
> > highly secure restricted environments that use these interfaces were
> > being ported to linux. We can't change this now without breaking
> > userspace....
>
> That's what people said about setuid shell scripts.

Completely different. setuid shell scripts got abused as a hack for
the lazy to avoid setting up permissions properly and hence were
easily exploited.

The storage stack is completely dependent on a simplistic layered
trust model in which root (CAP_SYS_ADMIN) is god. The storage trust
model falls completely apart if we don't have a trusted root user to
administer all layers of the storage stack.
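To make the capability gate discussed above concrete, here is a minimal userspace sketch (not kernel code) that asks the same question the kernel's capable(CAP_SYS_ADMIN) answers: does the calling process hold CAP_SYS_ADMIN in its effective set? It assumes a Linux /proc and that CAP_SYS_ADMIN is bit 21 of the CapEff mask in /proc/self/status. The helper names (cap_in_mask, parse_cap_eff, current_has_sys_admin) are hypothetical and exist only for this illustration.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Assumption: CAP_SYS_ADMIN is capability number 21 (<linux/capability.h>). */
#define SKETCH_CAP_SYS_ADMIN 21

/* Test a single capability bit in a 64-bit capability mask. */
static bool cap_in_mask(uint64_t mask, int cap)
{
	return cap >= 0 && cap < 64 && ((mask >> cap) & 1u);
}

/*
 * Find the "CapEff:" line in /proc/<pid>/status style text and parse its
 * hexadecimal value. Returns true and fills *mask on success.
 */
static bool parse_cap_eff(const char *status, uint64_t *mask)
{
	const char *p = status;

	while (p && *p) {
		if (strncmp(p, "CapEff:", 7) == 0) {
			unsigned long long v;

			if (sscanf(p + 7, "%llx", &v) != 1)
				return false;
			*mask = v;
			return true;
		}
		/* Advance to the start of the next line, if any. */
		p = strchr(p, '\n');
		if (p)
			p++;
	}
	return false;
}

/*
 * Returns 1 if the current process has CAP_SYS_ADMIN in its effective
 * set, 0 if it does not, and -1 if /proc could not be read or parsed.
 */
static int current_has_sys_admin(void)
{
	char buf[8192];
	uint64_t mask;
	size_t n;
	FILE *f = fopen("/proc/self/status", "r");

	if (!f)
		return -1;
	n = fread(buf, 1, sizeof(buf) - 1, f);
	fclose(f);
	buf[n] = '\0';

	if (!parse_cap_eff(buf, &mask))
		return -1;
	return cap_in_mask(mask, SKETCH_CAP_SYS_ADMIN) ? 1 : 0;
}
```

A storage utility could call current_has_sys_admin() up front and bail out with a clear message instead of waiting for the ioctl to fail. Note that this only mirrors the effective-set test; it says nothing about which user namespace the capability is held in, nor about any LSM policy layered on top.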

This isn't the first time I've raised this issue - I raised it back
when the user namespace stuff was railroaded into the kernel, and
was essentially ignored by the userns people. As a result, we end up
with all the storage management ioctls restricted to the initns
where we have trusted CAP_SYS_ADMIN users.

I've also raised it more recently in the unprivileged mount
discussions (so untrusted root in containers can mount filesystems)
- no solution to the underlying trust model deficiencies was found
in those discussions, either. Instead, filesystems that can be
mounted by untrusted users (e.g. FUSE) have a special flag in their
fstype definition to say this is allowed.

Systems restricted by LSMs to the point where CAP_SYS_ADMIN is not
trusted have exactly the same issues. i.e. there's nobody trusted by
the kernel to administer the storage stack, and nobody has defined a
workable security model that can prevent untrusted users from
violating the existing storage trust model....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
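The per-filesystem opt-in for unprivileged mounts mentioned above can be illustrated with a toy model. The message does not name the flag; this sketch assumes it is FS_USERNS_MOUNT (value 8 in include/linux/fs.h), set in a filesystem type's fs_flags. The structs and the sketch_may_mount() helper are hypothetical simplifications, not the kernel's actual mount permission path.

```c
#include <errno.h>
#include <stdbool.h>

/* Assumption: mirrors the value of FS_USERNS_MOUNT in include/linux/fs.h. */
#define SKETCH_FS_USERNS_MOUNT 8

/* Hypothetical stand-in for the relevant part of struct file_system_type. */
struct sketch_fstype {
	const char *name;
	int fs_flags;
};

/* Hypothetical description of who is asking to mount. */
struct sketch_caller {
	bool admin_in_init_ns;    /* CAP_SYS_ADMIN in the initial user namespace */
	bool admin_in_own_userns; /* CAP_SYS_ADMIN only inside a user namespace */
};

/*
 * Toy decision: trusted root may mount anything; namespace-local root
 * may mount only filesystem types that opted in via the flag.
 * Returns 0 when allowed and -EPERM otherwise.
 */
static int sketch_may_mount(const struct sketch_fstype *type,
			    const struct sketch_caller *caller)
{
	if (caller->admin_in_init_ns)
		return 0;
	if ((type->fs_flags & SKETCH_FS_USERNS_MOUNT) &&
	    caller->admin_in_own_userns)
		return 0;
	return -EPERM;
}
```

In this model a disk filesystem type with fs_flags of 0 is refused for container root, while a type carrying the flag is allowed. That reflects the point of the message: the flag is a per-filesystem exception, not a fix for the underlying trust model.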