linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Gao feng <gaofeng-BthXqXjhjHXQFUHtdCDX3A@public.gmane.org>
To: "Eric W. Biederman" <ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
Cc: Linux FS Devel
	<linux-fsdevel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	Linux Containers
	<containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org>,
	"linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org"
	<linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	Andy Lutomirski <luto-kltTT9wpgjJwATOyAt5JVQ@public.gmane.org>
Subject: Re: [REVIEW][PATCH 1/2] userns: Better restrictions on when proc and sysfs can be mounted
Date: Fri, 15 Nov 2013 14:14:58 +0800	[thread overview]
Message-ID: <5285BBE2.7010001@cn.fujitsu.com> (raw)
In-Reply-To: <87txfexo25.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>

On 11/15/2013 12:54 PM, Eric W. Biederman wrote:
> Gao feng <gaofeng-BthXqXjhjHXQFUHtdCDX3A@public.gmane.org> writes:
> 
>> On 11/15/2013 12:54 AM, Andy Lutomirski wrote:
>>> On Thu, Nov 14, 2013 at 3:10 AM, Gao feng <gaofeng-BthXqXjhjHXQFUHtdCDX3A@public.gmane.org> wrote:
>>>> On 11/13/2013 03:26 PM, Gao feng wrote:
>>>>> On 11/09/2013 01:42 PM, Eric W. Biederman wrote:
>>>>>> Right now I would rather not have the empty directory exception than
>>>>>> remove this code.
>>>>>>
>>>>>> The test is a little trickier to write than it might otherwise be
>>>>>> because /proc and /sys tend to be slightly imperfect filesystems.
>>>>>>
>>>>>> I think the only way to really test that is to call readdir on the
>>>>>> directory itself :(  I don't like that thought.
>>>>>>
>>>>>> I don't know what I was thinking when I wrote that test but I definitely
>>>>>> goofed up.  Grr!
>>>>>>
>>>>>> I can certainly filter out any directory with nlink > 2.  That would be
>>>>>> an easy partial step forward.
>>>>>>
>>>>>> The real question though is how do I detect directories it is safe to
>>>>>> mount on where there will not be files in them.  I can't call iterate
>>>>>> with the namespace_lock held so things are a bit tricky.
>>>>>>
>>>>>
>>>>> I know this problem is not easy to be resolved. why not let the user
>>>>> make the decision?  maybe we can introduce a new mount option MS_LOCK,
>>>>> if user wants to use mount to hide something, he should use mount with
>>>>> option MS_LOCK. so the unpriviged user can't umount this filesystem and
>>>>> fail to mount the filesystem if one of it's child mount is mounted with
>>>>> MS_LOCK option otherwise he use MS_REC too.
>>>>>
>>>>
>>>> Something like this.
>>>>
>>>> From 437f33ea366623c7a9d557b2e84cae424876a44f Mon Sep 17 00:00:00 2001
>>>> From: Gao feng <gaofeng-BthXqXjhjHXQFUHtdCDX3A@public.gmane.org>
>>>> Date: Wed, 13 Nov 2013 16:06:46 +0800
>>>> Subject: [PATCH] userns: introduce new mount option MS_LOCK
>>>>
>>>> After commit 5ff9d8a65ce80efb509ce4e8051394e9ed2cd942
>>>> vfs: Lock in place mounts from more privileged users,
>>>> in userns, the mounts of child mntns which copied from
>>>> parent mntns is locked and user has no rights to umount/move
>>>> them, it's too strict.
>>>>
>>>> The core purpose of above commit is trying to prevent
>>>> unprivileged user from accessing files hidden by mount.
>>>> This patch introduces a new mount option MS_LOCK, this
>>>> gives user the capable to mount filesystem as the type
>>>> of lock if he wants to use mount to hide something.
>>>>
>>>
>>> This is bad -- if something was secure in old kernels, it needs to
>>> stay secure.  If you had MS_NOT_A_LOCK, that would be okay, but it
>>> might not solve your problem.
>>>
>>
>> what you mean old kernels here? I saw patch "vfs: Lock in place mounts from more privileged users"
>> is merged into upstream in linux 3.12-rc1, this is not very old. I think there
>> are not many userspace processes rely on this feature.
> 
> Sort of true.  Most people aren't that silly.  This feature was added to
> defend against a theoretical attack that you can use with mount
> namespaces.
> 
> In particular the scenario we are concerned with is:
> 
> Suppose the file system looks like:
> 
> Suppose there are two filesystems a and b that look like:
> 
> a:/usr/
> a:/usr/my_very_secret_file
> a:/dev/
> a:/etc/
> a:/lib/
> 
> b:/bin/
> b:/etc/
> b:/games/
> b:/include/
> b:/lib/
> b:/lib32/
> b:/local/
> b:/sbin/
> b:/share/
> b:/src/
> 
> And filesystem b is mounted on a:/usr hiding a:/usr/my_very_secret_file
> 
> So the filesystem looks like:
> 
> /usr/
> /usr/bin/
> /usr/etc/
> /usr/games/
> /usr/include/
> /usr/lib/
> /usr/lib32/
> /usr/local/
> /usr/sbin/
> /usr/share/
> /usr/src/
> /dev/
> /etc/
> /lib/
> 
> Without locking mounts into place an unprivileged user can clone the
> mount namespace and do "umount /usr" and read /usr/my_very_secret_file.
> 
> Most systems don't hide sensitive things with mounts but it is very
> possible and guarding against is fairly cheap and easy.  And while a
> little annoying it should not be a large impediment to unprivileged user
> of the user namespace because pivot root still works.
> 
> This thread started talking about bugs in fs_fully_visible.  And those
> bugs are fixable and I aim to get to them shortly.  At the very least
> I can lie and test for nlink <= 2 which fixes the regression in mounting
> proc.
> 
> Then I can write the fun version that takes references and drops locks
> so it can call the internal version of readdir to see if a directory is
> actually empty.
> 
> But the principle remains the same we really don't want to reveal
> anything that is hidden under a mount on purpose or by mistake.  Just
> because then we don't have to think about those things from a security
> point of view making everyone's life easier.
> 

Ok,I agree with you that we should make container security by default.

What's your idea that introduces option MS_NOT_A_LOCK just like Andy's
advisement?

In libvirt, host creates dev and devpts directories for container,then
mount devpts, tmpfs on them and create device nodes inside these dirs
for container. and then in container, these filesystems are moved to
container's /dev/ /dev/pts directory. We really have no need to lock
these mounts. they are just created for container.

Thanks

  parent reply	other threads:[~2013-11-15  6:14 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-08-27 21:44 [REVIEW][PATCH 1/2] userns: Better restrictions on when proc and sysfs can be mounted Eric W. Biederman
     [not found] ` <878uzmhkqg.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
2013-08-27 21:46   ` [REVIEW][PATCH 2/2] sysfs: Restrict mounting sysfs Eric W. Biederman
     [not found]     ` <874naahkng.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
2013-08-28 19:00       ` Greg Kroah-Hartman
2013-09-23 10:33       ` James Hogan
     [not found]         ` <524018EA.9070202-1AXoQHu6uovQT0dZR+AlfA@public.gmane.org>
2013-09-23 21:41           ` [PATCH] sysfs: Allow mounting without CONFIG_NET Eric W. Biederman
     [not found]             ` <87ioxrrzb6.fsf_-_-HxuHnoDHeQZYhcs0q7wBk77fW72O3V7zAL8bYrjMMd8@public.gmane.org>
2013-09-24 11:25               ` James Hogan
2013-08-27 21:47   ` [REVIEW][PATCH 1/2] userns: Better restrictions on when proc and sysfs can be mounted Andy Lutomirski
     [not found]     ` <CALCETrWPDzuoaJp2ko5jAbwYUBqSdPjvO5uGo-gZVsS4Wm1PKQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2013-08-27 21:57       ` Eric W. Biederman
2013-09-01  4:45         ` Eric W. Biederman
     [not found]           ` <87eh99noa0.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
2013-09-03 17:40             ` Andy Lutomirski
2013-11-02  6:06   ` Gao feng
     [not found]     ` <52749663.2000701-BthXqXjhjHXQFUHtdCDX3A@public.gmane.org>
2013-11-04  7:00       ` Janne Karhunen
     [not found]         ` <CAE=NcrY+CzX+H4XQTdGj7CSZ98a5T=bNgT6=jGZzcjyaHb-ttw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2013-11-09  5:22           ` Eric W. Biederman
2013-11-08  2:33       ` Gao feng
     [not found]         ` <527C4D88.10907-BthXqXjhjHXQFUHtdCDX3A@public.gmane.org>
2013-11-09  5:42           ` Eric W. Biederman
     [not found]             ` <87k3gigmgj.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
2013-11-13  7:26               ` Gao feng
     [not found]                 ` <5283299B.8080702-BthXqXjhjHXQFUHtdCDX3A@public.gmane.org>
2013-11-14 11:10                   ` Gao feng
     [not found]                     ` <5284AF90.7060506-BthXqXjhjHXQFUHtdCDX3A@public.gmane.org>
2013-11-14 16:54                       ` Andy Lutomirski
2013-11-15  1:16                         ` Gao feng
     [not found]                           ` <528575EC.2030309-BthXqXjhjHXQFUHtdCDX3A@public.gmane.org>
2013-11-15  4:54                             ` Eric W. Biederman
     [not found]                               ` <87txfexo25.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
2013-11-15  6:14                                 ` Gao feng [this message]
     [not found]                                   ` <5285BBE2.7010001-BthXqXjhjHXQFUHtdCDX3A@public.gmane.org>
2013-11-15  8:37                                     ` Eric W. Biederman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5285BBE2.7010001@cn.fujitsu.com \
    --to=gaofeng-bthxqxjhjhxqfuhtdcdx3a@public.gmane.org \
    --cc=containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org \
    --cc=ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org \
    --cc=linux-fsdevel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    --cc=linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    --cc=luto-kltTT9wpgjJwATOyAt5JVQ@public.gmane.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).