linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* unprivileged mounts git tree
@ 2008-05-07 12:05 Miklos Szeredi
  2008-08-07 22:27 ` Serge E. Hallyn
  0 siblings, 1 reply; 39+ messages in thread
From: Miklos Szeredi @ 2008-05-07 12:05 UTC (permalink / raw)
  To: akpm; +Cc: hch, viro, linux-fsdevel, linux-kernel

Here's a git tree of the unprivileged mounts patchset:

  git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs.git unprivileged-mounts

Could this be added to -mm (and dropped if it's in the way of
something) for some testing and added visibility until it's reviewed
by Christoph/Al?

I'm not reposting the whole patchset, since it's essentially the same
as the last submission, only updated to the latest git.  But if
somebody wants it I can post them.

Thanks,
Miklos


 Documentation/filesystems/fuse.txt |   88 ++++++++-
 Documentation/filesystems/proc.txt |   40 ++++
 fs/filesystems.c                   |   60 ++++++
 fs/fuse/inode.c                    |   21 ++
 fs/internal.h                      |    3 +-
 fs/namespace.c                     |  366 +++++++++++++++++++++++++++---------
 fs/pnode.c                         |   22 ++-
 fs/pnode.h                         |    2 +
 fs/super.c                         |   26 ---
 include/linux/fs.h                 |    7 +
 include/linux/mount.h              |    4 +
 kernel/sysctl.c                    |   16 ++
 12 files changed, 527 insertions(+), 128 deletions(-)

Miklos Szeredi (10):
      unprivileged mounts: add user mounts to the kernel
      unprivileged mounts: allow unprivileged umount
      unprivileged mounts: propagate error values from clone_mnt
      unprivileged mounts: account user mounts
      unprivileged mounts: allow unprivileged bind mounts
      unprivileged mounts: allow unprivileged mounts
      unprivileged mounts: add sysctl tunable for "safe" property
      unprivileged mounts: make fuse safe
      unprivileged mounts: propagation: inherit owner from parent
      unprivileged mounts: add "no submounts" flag


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: unprivileged mounts git tree
  2008-05-07 12:05 unprivileged mounts git tree Miklos Szeredi
@ 2008-08-07 22:27 ` Serge E. Hallyn
  2008-08-08  0:07   ` Eric W. Biederman
  0 siblings, 1 reply; 39+ messages in thread
From: Serge E. Hallyn @ 2008-08-07 22:27 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: akpm, hch, viro, linux-fsdevel, linux-kernel, Eric W. Biederman

Quoting Miklos Szeredi (miklos@szeredi.hu):
> Here's a git tree of the unprivileged mounts patchset:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs.git unprivileged-mounts
> 
> Could this be added to -mm (and dropped if it's in the way of
> something) for some testing and added visibility until it's reviewed
> by Christoph/Al?
> 
> I'm not reposting the whole patchset, since it's essentially the same
> as the last submission, only updated to the latest git.  But if
> somebody wants it I can post them.
> 
> Thanks,
> Miklos
> 
> 
>  Documentation/filesystems/fuse.txt |   88 ++++++++-
>  Documentation/filesystems/proc.txt |   40 ++++
>  fs/filesystems.c                   |   60 ++++++
>  fs/fuse/inode.c                    |   21 ++
>  fs/internal.h                      |    3 +-
>  fs/namespace.c                     |  366 +++++++++++++++++++++++++++---------
>  fs/pnode.c                         |   22 ++-
>  fs/pnode.h                         |    2 +
>  fs/super.c                         |   26 ---
>  include/linux/fs.h                 |    7 +
>  include/linux/mount.h              |    4 +
>  kernel/sysctl.c                    |   16 ++
>  12 files changed, 527 insertions(+), 128 deletions(-)
> 
> Miklos Szeredi (10):
>       unprivileged mounts: add user mounts to the kernel
>       unprivileged mounts: allow unprivileged umount
>       unprivileged mounts: propagate error values from clone_mnt
>       unprivileged mounts: account user mounts
>       unprivileged mounts: allow unprivileged bind mounts
>       unprivileged mounts: allow unprivileged mounts
>       unprivileged mounts: add sysctl tunable for "safe" property
>       unprivileged mounts: make fuse safe
>       unprivileged mounts: propagation: inherit owner from parent
>       unprivileged mounts: add "no submounts" flag

Hi Miklos,

so on the bright side I pulled this tree today and it compiled and
passed ltp with no problems.

But then I played around a bit and found I could do the following:

(hmm, i'm trying to remember the exact order :)

as root:
	mmount --bind -o user=500 /home/hallyn/etc/ /home/hallyn/etc/
	mount --bind /mnt /mnt
	mount --make-rshared /mnt
	mount --bind /dev /mnt/dev

as hallyn:
	mmount --bind /mnt /home/hallyn/etc/mnt
	/usr/src/mmount-0.3/mmount --bind mnt/dev mnt/src

Now /mnt/src contained /dev.

Is this what we want?  Do we want to tell the admin it's his fault for
not somehow forcing a slave relationship between /mnt and
/home/hallyn/etc/mnt?  Except I don't think he can do that preemptively,
it has to be done after hallyn does the mmount.

So does that mean that if non-root user X does:

	mount a b

where b is user=X but a is not, then if a is shared we should force it
to be mounted as slave at b?

-serge

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: unprivileged mounts git tree
  2008-08-07 22:27 ` Serge E. Hallyn
@ 2008-08-08  0:07   ` Eric W. Biederman
  2008-08-08  0:25     ` Serge E. Hallyn
  0 siblings, 1 reply; 39+ messages in thread
From: Eric W. Biederman @ 2008-08-08  0:07 UTC (permalink / raw)
  To: Serge E. Hallyn
  Cc: Miklos Szeredi, akpm, hch, viro, linux-fsdevel, linux-kernel,
	Eric W. Biederman

"Serge E. Hallyn" <serue@us.ibm.com> writes:

> Quoting Miklos Szeredi (miklos@szeredi.hu):
>> Here's a git tree of the unprivileged mounts patchset:
>> 
>> git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs.git
> unprivileged-mounts
>> 
>> Could this be added to -mm (and dropped if it's in the way of
>> something) for some testing and added visibility until it's reviewed
>> by Christoph/Al?
>> 
>> I'm not reposting the whole patchset, since it's essentially the same
>> as the last submission, only updated to the latest git.  But if
>> somebody wants it I can post them.
>> 
>> Thanks,
>> Miklos
>> 
>> 
>>  Documentation/filesystems/fuse.txt |   88 ++++++++-
>>  Documentation/filesystems/proc.txt |   40 ++++
>>  fs/filesystems.c                   |   60 ++++++
>>  fs/fuse/inode.c                    |   21 ++
>>  fs/internal.h                      |    3 +-
>> fs/namespace.c | 366 +++++++++++++++++++++++++++---------
>>  fs/pnode.c                         |   22 ++-
>>  fs/pnode.h                         |    2 +
>>  fs/super.c                         |   26 ---
>>  include/linux/fs.h                 |    7 +
>>  include/linux/mount.h              |    4 +
>>  kernel/sysctl.c                    |   16 ++
>>  12 files changed, 527 insertions(+), 128 deletions(-)
>> 
>> Miklos Szeredi (10):
>>       unprivileged mounts: add user mounts to the kernel
>>       unprivileged mounts: allow unprivileged umount
>>       unprivileged mounts: propagate error values from clone_mnt
>>       unprivileged mounts: account user mounts
>>       unprivileged mounts: allow unprivileged bind mounts
>>       unprivileged mounts: allow unprivileged mounts
>>       unprivileged mounts: add sysctl tunable for "safe" property
>>       unprivileged mounts: make fuse safe
>>       unprivileged mounts: propagation: inherit owner from parent
>>       unprivileged mounts: add "no submounts" flag
>
> Hi Miklos,
>
> so on the bright side I pulled this tree today and it compiled and
> passed ltp with no problems.
>
> But then I played around a bit and found I could do the following:
>
> (hmm, i'm trying to remember the exact order :)
>
> as root:
> 	mmount --bind -o user=500 /home/hallyn/etc/ /home/hallyn/etc/
> 	mount --bind /mnt /mnt
> 	mount --make-rshared /mnt
> 	mount --bind /dev /mnt/dev
>
> as hallyn:
> 	mmount --bind /mnt /home/hallyn/etc/mnt
> 	/usr/src/mmount-0.3/mmount --bind mnt/dev mnt/src

You are using relative directory names here which makes it confusing.
I'm assuming you in /home/hallyn/etc ?

>
> Now /mnt/src contained /dev.
>
> Is this what we want?

I don't think so.

I think the simplest answer is to not allow mounting of shared
subtrees controlled by a different user.

Serge I think you are right downgrading the mount from shared to slave
looks like the sane thing to do if the mount owners match.

> Do we want to tell the admin it's his fault for
> not somehow forcing a slave relationship between /mnt and
> /home/hallyn/etc/mnt?  Except I don't think he can do that preemptively,
> it has to be done after hallyn does the mmount.
>
> So does that mean that if non-root user X does:
>
> 	mount a b
>
> where b is user=X but a is not, then if a is shared we should force it
> to be mounted as slave at b?
>
> -serge

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: unprivileged mounts git tree
  2008-08-08  0:07   ` Eric W. Biederman
@ 2008-08-08  0:25     ` Serge E. Hallyn
  2008-08-25 11:01       ` Miklos Szeredi
  0 siblings, 1 reply; 39+ messages in thread
From: Serge E. Hallyn @ 2008-08-08  0:25 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Miklos Szeredi, akpm, hch, viro, linux-fsdevel, linux-kernel

Quoting Eric W. Biederman (ebiederm@xmission.com):
> "Serge E. Hallyn" <serue@us.ibm.com> writes:
> 
> > Quoting Miklos Szeredi (miklos@szeredi.hu):
> >> Here's a git tree of the unprivileged mounts patchset:
> >> 
> >> git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs.git
> > unprivileged-mounts
> >> 
> >> Could this be added to -mm (and dropped if it's in the way of
> >> something) for some testing and added visibility until it's reviewed
> >> by Christoph/Al?
> >> 
> >> I'm not reposting the whole patchset, since it's essentially the same
> >> as the last submission, only updated to the latest git.  But if
> >> somebody wants it I can post them.
> >> 
> >> Thanks,
> >> Miklos
> >> 
> >> 
> >>  Documentation/filesystems/fuse.txt |   88 ++++++++-
> >>  Documentation/filesystems/proc.txt |   40 ++++
> >>  fs/filesystems.c                   |   60 ++++++
> >>  fs/fuse/inode.c                    |   21 ++
> >>  fs/internal.h                      |    3 +-
> >> fs/namespace.c | 366 +++++++++++++++++++++++++++---------
> >>  fs/pnode.c                         |   22 ++-
> >>  fs/pnode.h                         |    2 +
> >>  fs/super.c                         |   26 ---
> >>  include/linux/fs.h                 |    7 +
> >>  include/linux/mount.h              |    4 +
> >>  kernel/sysctl.c                    |   16 ++
> >>  12 files changed, 527 insertions(+), 128 deletions(-)
> >> 
> >> Miklos Szeredi (10):
> >>       unprivileged mounts: add user mounts to the kernel
> >>       unprivileged mounts: allow unprivileged umount
> >>       unprivileged mounts: propagate error values from clone_mnt
> >>       unprivileged mounts: account user mounts
> >>       unprivileged mounts: allow unprivileged bind mounts
> >>       unprivileged mounts: allow unprivileged mounts
> >>       unprivileged mounts: add sysctl tunable for "safe" property
> >>       unprivileged mounts: make fuse safe
> >>       unprivileged mounts: propagation: inherit owner from parent
> >>       unprivileged mounts: add "no submounts" flag
> >
> > Hi Miklos,
> >
> > so on the bright side I pulled this tree today and it compiled and
> > passed ltp with no problems.
> >
> > But then I played around a bit and found I could do the following:
> >
> > (hmm, i'm trying to remember the exact order :)
> >
> > as root:
> > 	mmount --bind -o user=500 /home/hallyn/etc/ /home/hallyn/etc/
> > 	mount --bind /mnt /mnt
> > 	mount --make-rshared /mnt
> > 	mount --bind /dev /mnt/dev
> >
> > as hallyn:
> > 	mmount --bind /mnt /home/hallyn/etc/mnt
> > 	/usr/src/mmount-0.3/mmount --bind mnt/dev mnt/src
> 
> You are using relative directory names here which makes it confusing.
> I'm assuming you in /home/hallyn/etc ?

Sorry, yeah.

> > Now /mnt/src contained /dev.
> >
> > Is this what we want?
> 
> I don't think so.
> 
> I think the simplest answer is to not allow mounting of shared
> subtrees controlled by a different user.
> 
> Serge I think you are right downgrading the mount from shared to slave
> looks like the sane thing to do if the mount owners match.

I assume you mean "if the mount owners don't match"?

Miklos, what do you think?

The next question then becomes, how can we prove to ourselves that that
closes the last security hole with unprivileged mounts?  So long as
we treat each mount event as a piece of information and look at it as an
information flow problem, maybe we can actually come up with a good
description of the logic that is implemented and show that there is no
way a user can "leak" info...  (where a leak is a mount event, a
violation of intended DAC on open(file) or mkdir, etc)

-serge

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: unprivileged mounts git tree
  2008-08-08  0:25     ` Serge E. Hallyn
@ 2008-08-25 11:01       ` Miklos Szeredi
  2008-08-27 15:36         ` Serge E. Hallyn
  0 siblings, 1 reply; 39+ messages in thread
From: Miklos Szeredi @ 2008-08-25 11:01 UTC (permalink / raw)
  To: serue; +Cc: ebiederm, miklos, akpm, hch, viro, linux-fsdevel, linux-kernel

On Thu, 7 Aug 2008, Serge E. Hallyn wrote:
> Quoting Eric W. Biederman (ebiederm@xmission.com):
> > "Serge E. Hallyn" <serue@us.ibm.com> writes:
> > > so on the bright side I pulled this tree today and it compiled and
> > > passed ltp with no problems.
> > >
> > > But then I played around a bit and found I could do the following:
> > >
> > > (hmm, i'm trying to remember the exact order :)
> > >
> > > as root:
> > > 	mmount --bind -o user=500 /home/hallyn/etc/ /home/hallyn/etc/
> > > 	mount --bind /mnt /mnt
> > > 	mount --make-rshared /mnt
> > > 	mount --bind /dev /mnt/dev
> > >
> > > as hallyn:
> > > 	mmount --bind /mnt /home/hallyn/etc/mnt
> > > 	/usr/src/mmount-0.3/mmount --bind mnt/dev mnt/src
> > 
> > You are using relative directory names here which makes it confusing.
> > I'm assuming you in /home/hallyn/etc ?
> 
> Sorry, yeah.
> 
> > > Now /mnt/src contained /dev.
> > >
> > > Is this what we want?
> > 
> > I don't think so.
> > 
> > I think the simplest answer is to not allow mounting of shared
> > subtrees controlled by a different user.
> > 
> > Serge I think you are right downgrading the mount from shared to slave
> > looks like the sane thing to do if the mount owners match.
> 
> I assume you mean "if the mount owners don't match"?
> 
> Miklos, what do you think?

Sorry about the late reply: I was on a long summer vacation...

Serge, thanks for spotting this: it looks indeed a nasty hole!  I also
agree about the solution.

> The next question then becomes, how can we prove to ourselves that that
> closes the last security hole with unprivileged mounts?  So long as
> we treat each mount event as a piece of information and look at it as an
> information flow problem, maybe we can actually come up with a good
> description of the logic that is implemented and show that there is no
> way a user can "leak" info...  (where a leak is a mount event, a
> violation of intended DAC on open(file) or mkdir, etc)

"Information flow problem" doesn't mean much to me (I'm actually an
electric engineer, who ended up doing programming for living ;)

But yeah, we should think this over very carefully.  Especially
interaction with mount propagation, which has very complicated and
sometimes rather counter-intuitive semantics.

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: unprivileged mounts git tree
  2008-08-25 11:01       ` Miklos Szeredi
@ 2008-08-27 15:36         ` Serge E. Hallyn
  2008-08-27 15:55           ` Miklos Szeredi
  0 siblings, 1 reply; 39+ messages in thread
From: Serge E. Hallyn @ 2008-08-27 15:36 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: ebiederm, akpm, hch, viro, linux-fsdevel, linux-kernel

Quoting Miklos Szeredi (miklos@szeredi.hu):
> On Thu, 7 Aug 2008, Serge E. Hallyn wrote:
> > Quoting Eric W. Biederman (ebiederm@xmission.com):
> > > "Serge E. Hallyn" <serue@us.ibm.com> writes:
> > > > so on the bright side I pulled this tree today and it compiled and
> > > > passed ltp with no problems.
> > > >
> > > > But then I played around a bit and found I could do the following:
> > > >
> > > > (hmm, i'm trying to remember the exact order :)
> > > >
> > > > as root:
> > > > 	mmount --bind -o user=500 /home/hallyn/etc/ /home/hallyn/etc/
> > > > 	mount --bind /mnt /mnt
> > > > 	mount --make-rshared /mnt
> > > > 	mount --bind /dev /mnt/dev
> > > >
> > > > as hallyn:
> > > > 	mmount --bind /mnt /home/hallyn/etc/mnt
> > > > 	/usr/src/mmount-0.3/mmount --bind mnt/dev mnt/src
> > > 
> > > You are using relative directory names here which makes it confusing.
> > > I'm assuming you in /home/hallyn/etc ?
> > 
> > Sorry, yeah.
> > 
> > > > Now /mnt/src contained /dev.
> > > >
> > > > Is this what we want?
> > > 
> > > I don't think so.
> > > 
> > > I think the simplest answer is to not allow mounting of shared
> > > subtrees controlled by a different user.
> > > 
> > > Serge I think you are right downgrading the mount from shared to slave
> > > looks like the sane thing to do if the mount owners match.
> > 
> > I assume you mean "if the mount owners don't match"?
> > 
> > Miklos, what do you think?
> 
> Sorry about the late reply: I was on a long summer vacation...
> 
> Serge, thanks for spotting this: it looks indeed a nasty hole!  I also
> agree about the solution.

Are you implementing it, or did you want me to?

> > The next question then becomes, how can we prove to ourselves that that
> > closes the last security hole with unprivileged mounts?  So long as
> > we treat each mount event as a piece of information and look at it as an
> > information flow problem, maybe we can actually come up with a good
> > description of the logic that is implemented and show that there is no
> > way a user can "leak" info...  (where a leak is a mount event, a
> > violation of intended DAC on open(file) or mkdir, etc)
> 
> "Information flow problem" doesn't mean much to me (I'm actually an
> electric engineer, who ended up doing programming for living ;)
> 
> But yeah, we should think this over very carefully.  Especially
> interaction with mount propagation, which has very complicated and
> sometimes rather counter-intuitive semantics.

I know we discussed before about whether a propagated mount from a
non-user mount to a user mount should end up being owned by the user
or not.  I don't recall (and am not checking the code at the moment
as your tree is sitting elsewhere) whether we mark the propagated
tree with the right nosuid and nodev flags, or whether we call it
a user mount or not.

-serge

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: unprivileged mounts git tree
  2008-08-27 15:36         ` Serge E. Hallyn
@ 2008-08-27 15:55           ` Miklos Szeredi
  2008-08-27 18:46             ` Serge E. Hallyn
  0 siblings, 1 reply; 39+ messages in thread
From: Miklos Szeredi @ 2008-08-27 15:55 UTC (permalink / raw)
  To: serue; +Cc: miklos, ebiederm, akpm, hch, viro, linux-fsdevel, linux-kernel

On Wed, 27 Aug 2008, Serge E. Hallyn wrote:
> Quoting Miklos Szeredi (miklos@szeredi.hu):
> > Serge, thanks for spotting this: it looks indeed a nasty hole!  I also
> > agree about the solution.
> 
> Are you implementing it, or did you want me to?

I'll implement it.

> > But yeah, we should think this over very carefully.  Especially
> > interaction with mount propagation, which has very complicated and
> > sometimes rather counter-intuitive semantics.
> 
> I know we discussed before about whether a propagated mount from a
> non-user mount to a user mount should end up being owned by the user
> or not.  I don't recall (and am not checking the code at the moment
> as your tree is sitting elsewhere) whether we mark the propagated
> tree with the right nosuid and nodev flags, or whether we call it
> a user mount or not.

If the destination is a user mount, then

 - the propagated mount(s) will be owned by the same user as the destination
 - the propagated mount(s) will inherit 'nosuid' from the destination

I remember also thinking about 'nodev' and why it doesn't need similar
treatment to 'nosuid'.  The reasoning was that 'nodev' is safe as long
as permissions are enforced, namespace shuffling cannot make it
insecure.  Does that sound correct?

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: unprivileged mounts git tree
  2008-08-27 15:55           ` Miklos Szeredi
@ 2008-08-27 18:46             ` Serge E. Hallyn
  2008-09-03 18:45               ` Miklos Szeredi
  0 siblings, 1 reply; 39+ messages in thread
From: Serge E. Hallyn @ 2008-08-27 18:46 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: ebiederm, akpm, hch, viro, linux-fsdevel, linux-kernel

Quoting Miklos Szeredi (miklos@szeredi.hu):
> On Wed, 27 Aug 2008, Serge E. Hallyn wrote:
> > Quoting Miklos Szeredi (miklos@szeredi.hu):
> > > Serge, thanks for spotting this: it looks indeed a nasty hole!  I also
> > > agree about the solution.
> > 
> > Are you implementing it, or did you want me to?
> 
> I'll implement it.

Ok, thanks.  I look forward to playing around with it when you publish
the resulting git tree  :)

> > > But yeah, we should think this over very carefully.  Especially
> > > interaction with mount propagation, which has very complicated and
> > > sometimes rather counter-intuitive semantics.
> > 
> > I know we discussed before about whether a propagated mount from a
> > non-user mount to a user mount should end up being owned by the user
> > or not.  I don't recall (and am not checking the code at the moment
> > as your tree is sitting elsewhere) whether we mark the propagated
> > tree with the right nosuid and nodev flags, or whether we call it
> > a user mount or not.
> 
> If the destination is a user mount, then
> 
>  - the propagated mount(s) will be owned by the same user as the destination
>  - the propagated mount(s) will inherit 'nosuid' from the destination
> 
> I remember also thinking about 'nodev' and why it doesn't need similar
> treatment to 'nosuid'.  The reasoning was that 'nodev' is safe as long
> as permissions are enforced, namespace shuffling cannot make it
> insecure.  Does that sound correct?

Yes that sounds correct, thanks for the refresher.

-serge

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: unprivileged mounts git tree
  2008-08-27 18:46             ` Serge E. Hallyn
@ 2008-09-03 18:45               ` Miklos Szeredi
  2008-09-03 21:54                 ` Serge E. Hallyn
  2008-09-03 22:02                 ` Serge E. Hallyn
  0 siblings, 2 replies; 39+ messages in thread
From: Miklos Szeredi @ 2008-09-03 18:45 UTC (permalink / raw)
  To: serue; +Cc: miklos, ebiederm, akpm, hch, viro, linux-fsdevel, linux-kernel

On Wed, 27 Aug 2008, Serge E. Hallyn wrote:
> Ok, thanks.  I look forward to playing around with it when you publish
> the resulting git tree  :)

A couple of centuries later...

...here's the updated git tree:

git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs.git unprivileged-mounts

Changes since the previous version:

 - update to apply against latest git
 - downgrade shared mounts to slave for unprivileged binds (if owners differ)
 - don't allow unprivileged recursive binds

Serge, thanks again for testing and reviewing these patches!

Miklos

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: unprivileged mounts git tree
  2008-09-03 18:45               ` Miklos Szeredi
@ 2008-09-03 21:54                 ` Serge E. Hallyn
  2008-09-03 22:02                 ` Serge E. Hallyn
  1 sibling, 0 replies; 39+ messages in thread
From: Serge E. Hallyn @ 2008-09-03 21:54 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: ebiederm, akpm, hch, viro, linux-fsdevel, linux-kernel

Quoting Miklos Szeredi (miklos@szeredi.hu):
> On Wed, 27 Aug 2008, Serge E. Hallyn wrote:
> > Ok, thanks.  I look forward to playing around with it when you publish
> > the resulting git tree  :)
> 
> A couple of centuries later...
> 
> ...here's the updated git tree:
> 
> git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs.git unprivileged-mounts
> 
> Changes since the previous version:
> 
>  - update to apply against latest git
>  - downgrade shared mounts to slave for unprivileged binds (if owners differ)
>  - don't allow unprivileged recursive binds
> 
> Serge, thanks again for testing and reviewing these patches!

Well I see where a shared mount *should* be turned into a slave mount
when bind-mounted as a user mount, but it doesn't seem to be happening.
In particular, after doing a user mount of /mnt onto
/home/hallyn/etc/mnt, /proc/self/mountinfo ends in:

22 13 3:1 /mnt /mnt rw shared:1 - ext3 /dev/root
rw,errors=continue,user_xattr,acl,data=ordered
23 13 3:1 /mnt /root/mnt rw shared:1 - ext3 /dev/root
rw,errors=continue,user_xattr,acl,data=ordered
24 13 3:1 /mnt /home/hallyn/etc/mnt rw,user=500 shared:1 - ext3
/dev/root rw,errors=continue,user_xattr,acl,data=ordered

I assume this means that /mnt and /home/hallyn/etc/mnt are peers
in peergroup 1?

And indeed if hallyn does mount --bind /usr /home/hallyn/etc/mnt/usr,
then /mnt/usr shows the contents of /usr.

I see that in do_loopback() you are adding CL_SLAVE|CL_MAKE_SHARED to
flags so I don't get what is going on.  Still looking through the code.

-serge

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: unprivileged mounts git tree
  2008-09-03 18:45               ` Miklos Szeredi
  2008-09-03 21:54                 ` Serge E. Hallyn
@ 2008-09-03 22:02                 ` Serge E. Hallyn
  2008-09-03 22:25                   ` Miklos Szeredi
  1 sibling, 1 reply; 39+ messages in thread
From: Serge E. Hallyn @ 2008-09-03 22:02 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: ebiederm, akpm, hch, viro, linux-fsdevel, linux-kernel

Quoting Miklos Szeredi (miklos@szeredi.hu):
> On Wed, 27 Aug 2008, Serge E. Hallyn wrote:
> > Ok, thanks.  I look forward to playing around with it when you publish
> > the resulting git tree  :)
> 
> A couple of centuries later...
> 
> ...here's the updated git tree:
> 
> git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs.git unprivileged-mounts
> 
> Changes since the previous version:
> 
>  - update to apply against latest git
>  - downgrade shared mounts to slave for unprivileged binds (if owners differ)
>  - don't allow unprivileged recursive binds
> 
> Serge, thanks again for testing and reviewing these patches!

Ooh.

You predicate the turning of shared mount to a slave mount on
!capable(CAP_SYS_ADMIN).  But in fact it's the mount by a privileged
user, turning the mount into a user mount, which you want to convert.
So my series of steps was:

	as root:
		(1) mount --bind /mnt /mnt
		(2) mount --make-rshared /mnt
		(3) /usr/src/mmount-0.3/mmount --bind -o user=hallyn /mnt \
			/home/hallyn/etc/mnt
	as hallyn:
		(4) mount --bind /usr /home/hallyn/etc/mnt/usr

You are turning mounts from shared->slave at step 4, but in fact we need
to do it at step 3, where we do have CAP_SYS_ADMIN.

-serge

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: unprivileged mounts git tree
  2008-09-03 22:02                 ` Serge E. Hallyn
@ 2008-09-03 22:25                   ` Miklos Szeredi
  2008-09-03 22:43                     ` Serge E. Hallyn
  0 siblings, 1 reply; 39+ messages in thread
From: Miklos Szeredi @ 2008-09-03 22:25 UTC (permalink / raw)
  To: serue; +Cc: miklos, ebiederm, akpm, hch, viro, linux-fsdevel, linux-kernel

On Wed, 3 Sep 2008, Serge E. Hallyn wrote:
> Ooh.
> 
> You predicate the turning of shared mount to a slave mount on
> !capable(CAP_SYS_ADMIN).  But in fact it's the mount by a privileged
> user, turning the mount into a user mount, which you want to convert.
> So my series of steps was:
> 
> 	as root:
> 		(1) mount --bind /mnt /mnt
> 		(2) mount --make-rshared /mnt
> 		(3) /usr/src/mmount-0.3/mmount --bind -o user=hallyn /mnt \
> 			/home/hallyn/etc/mnt
> 	as hallyn:
> 		(4) mount --bind /usr /home/hallyn/etc/mnt/usr
> 
> You are turning mounts from shared->slave at step 4, but in fact we need
> to do it at step 3, where we do have CAP_SYS_ADMIN.

Well, that's arguable: I think root should be able to shoot itself in
the foot by doing step 3.  Generally we don't restrict what root can
do.  OTOH I agree that current behavior is ugly in that it provides
different semantics for privileged/non-privileged callers.

Perhaps it would be cleaner to simply not allow step 4, instead of
playing tricks with changing the propagation type.

Miklos

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: unprivileged mounts git tree
  2008-09-03 22:25                   ` Miklos Szeredi
@ 2008-09-03 22:43                     ` Serge E. Hallyn
  2008-09-04  6:42                       ` Miklos Szeredi
  0 siblings, 1 reply; 39+ messages in thread
From: Serge E. Hallyn @ 2008-09-03 22:43 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: ebiederm, akpm, hch, viro, linux-fsdevel, linux-kernel

Quoting Miklos Szeredi (miklos@szeredi.hu):
> On Wed, 3 Sep 2008, Serge E. Hallyn wrote:
> > Ooh.
> > 
> > You predicate the turning of shared mount to a slave mount on
> > !capable(CAP_SYS_ADMIN).  But in fact it's the mount by a privileged
> > user, turning the mount into a user mount, which you want to convert.
> > So my series of steps was:
> > 
> > 	as root:
> > 		(1) mount --bind /mnt /mnt
> > 		(2) mount --make-rshared /mnt
> > 		(3) /usr/src/mmount-0.3/mmount --bind -o user=hallyn /mnt \
> > 			/home/hallyn/etc/mnt
> > 	as hallyn:
> > 		(4) mount --bind /usr /home/hallyn/etc/mnt/usr
> > 
> > You are turning mounts from shared->slave at step 4, but in fact we need
> > to do it at step 3, where we do have CAP_SYS_ADMIN.
> 
> Well, that's arguable: I think root should be able to shoot itself in
> the foot by doing step 3.

Maybe I'm not thinking right, but long-term is there any reason why we
should require privilege in order to do step 3, so long as the user has
read access to the source and write access to the destination?

I don't think there is.  Other than this glitch.  That's a powerful
reason to fix the glitch.

The other argument is that, frankly, I think most people are still
either unaware of, or confused by, mounts propagation.  Letting root
shoot himself in the foot is reasonable only to a point.

> Generally we don't restrict what root can
> do.  OTOH I agree that current behavior is ugly in that it provides
> different semantics for privileged/non-privileged callers.
> 
> Perhaps it would be cleaner to simply not allow step 4, instead of
> playing tricks with changing the propagation type.

If the user or admin can simply (I haven't tested)

	mmount --bind --make-rslave -o user=hallyn /mnt \
		/home/hallyn/etc/mnt

then returning -EPERM if --make-rslave was not provided is reasonable
IMO.

-serge

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: unprivileged mounts git tree
  2008-09-03 22:43                     ` Serge E. Hallyn
@ 2008-09-04  6:42                       ` Miklos Szeredi
  2008-09-04 13:28                         ` Serge E. Hallyn
  0 siblings, 1 reply; 39+ messages in thread
From: Miklos Szeredi @ 2008-09-04  6:42 UTC (permalink / raw)
  To: serue; +Cc: miklos, ebiederm, akpm, hch, viro, linux-fsdevel, linux-kernel

On Wed, 3 Sep 2008, Serge E. Hallyn wrote:
> Quoting Miklos Szeredi (miklos@szeredi.hu):
> > On Wed, 3 Sep 2008, Serge E. Hallyn wrote:
> > > Ooh.
> > > 
> > > You predicate the turning of shared mount to a slave mount on
> > > !capable(CAP_SYS_ADMIN).  But in fact it's the mount by a privileged
> > > user, turning the mount into a user mount, which you want to convert.
> > > So my series of steps was:
> > > 
> > > 	as root:
> > > 		(1) mount --bind /mnt /mnt
> > > 		(2) mount --make-rshared /mnt
> > > 		(3) /usr/src/mmount-0.3/mmount --bind -o user=hallyn /mnt \
> > > 			/home/hallyn/etc/mnt
> > > 	as hallyn:
> > > 		(4) mount --bind /usr /home/hallyn/etc/mnt/usr
> > > 
> > > You are turning mounts from shared->slave at step 4, but in fact we need
> > > to do it at step 3, where we do have CAP_SYS_ADMIN.
> > 
> > Well, that's arguable: I think root should be able to shoot itself in
> > the foot by doing step 3.
> 
> Maybe I'm not thinking right, but long-term is there any reason why we
> should require privilege in order to do step 3, so long as the user has
> read access to the source and write access to the destination?
> 
> I don't think there is.  Other than this glitch.  That's a powerful
> reason to fix the glitch.

Agreed, without privileges it's unacceptable to allow step 3 as is.

> The other argument is that, frankly, I think most people are still
> either unaware of, or confused by, mounts propagation.  Letting root
> shoot himself in the foot is reasonable only to a point.

Hmm, I think there are infinite ways in which root can mess up mount
propagation, and this is not even the worst.  I'm not trying to
belittle this bug: done unprivileged it's unacceptable.  But with
privileges, I really don't know if we should change the propagation
semantics for this corner case, they are complicated enough already.

> > Generally we don't restrict what root can
> > do.  OTOH I agree that current behavior is ugly in that it provides
> > different semantics for privileged/non-privileged callers.
> > 
> > Perhaps it would be cleaner to simply not allow step 4, instead of
> > playing tricks with changing the propagation type.
> 
> If the user or admin can simply (I haven't tested)
> 
> 	mmount --bind --make-rslave -o user=hallyn /mnt \
> 		/home/hallyn/etc/mnt
> 
> then returning -EPERM if --make-rslave was not provided is reasonable
> IMO.

Right, that sounds perfect.  the only problem is, bind mount currently
ignores the propagation flags, for no good reason I can see.

That's a separate patch though.  I'll look into it.

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: unprivileged mounts git tree
  2008-09-04  6:42                       ` Miklos Szeredi
@ 2008-09-04 13:28                         ` Serge E. Hallyn
  2008-09-04 14:06                           ` Miklos Szeredi
  0 siblings, 1 reply; 39+ messages in thread
From: Serge E. Hallyn @ 2008-09-04 13:28 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: ebiederm, akpm, hch, viro, linux-fsdevel, linux-kernel

Quoting Miklos Szeredi (miklos@szeredi.hu):
> On Wed, 3 Sep 2008, Serge E. Hallyn wrote:
> > Quoting Miklos Szeredi (miklos@szeredi.hu):
> > > On Wed, 3 Sep 2008, Serge E. Hallyn wrote:
> > > > Ooh.
> > > > 
> > > > You predicate the turning of shared mount to a slave mount on
> > > > !capable(CAP_SYS_ADMIN).  But in fact it's the mount by a privileged
> > > > user, turning the mount into a user mount, which you want to convert.
> > > > So my series of steps was:
> > > > 
> > > > 	as root:
> > > > 		(1) mount --bind /mnt /mnt
> > > > 		(2) mount --make-rshared /mnt
> > > > 		(3) /usr/src/mmount-0.3/mmount --bind -o user=hallyn /mnt \
> > > > 			/home/hallyn/etc/mnt
> > > > 	as hallyn:
> > > > 		(4) mount --bind /usr /home/hallyn/etc/mnt/usr
> > > > 
> > > > You are turning mounts from shared->slave at step 4, but in fact we need
> > > > to do it at step 3, where we do have CAP_SYS_ADMIN.
> > > 
> > > Well, that's arguable: I think root should be able to shoot itself in
> > > the foot by doing step 3.
> > 
> > Maybe I'm not thinking right, but long-term is there any reason why we
> > should require privilege in order to do step 3, so long as the user has
> > read access to the source and write access to the destination?
> > 
> > I don't think there is.  Other than this glitch.  That's a powerful
> > reason to fix the glitch.
> 
> Agreed, without privileges it's unacceptable to allow step 3 as is.
> 
> > The other argument is that, frankly, I think most people are still
> > either unaware of, or confused by, mounts propagation.  Letting root
> > shoot himself in the foot is reasonable only to a point.
> 
> Hmm, I think there are infinite ways in which root can mess up mount
> propagation, and this is not even the worst.  I'm not trying to
> belittle this bug: done unprivileged it's unacceptable.  But with
> privileges, I really don't know if we should change the propagation
> semantics for this corner case, they are complicated enough already.
> 
> > > Generally we don't restrict what root can
> > > do.  OTOH I agree that current behavior is ugly in that it provides
> > > different semantics for privileged/non-privileged callers.
> > > 
> > > Perhaps it would be cleaner to simply not allow step 4, instead of
> > > playing tricks with changing the propagation type.
> > 
> > If the user or admin can simply (I haven't tested)
> > 
> > 	mmount --bind --make-rslave -o user=hallyn /mnt \
> > 		/home/hallyn/etc/mnt
> > 
> > then returning -EPERM if --make-rslave was not provided is reasonable
> > IMO.
> 
> Right, that sounds perfect.  the only problem is, bind mount currently
> ignores the propagation flags, for no good reason I can see.
> 
> That's a separate patch though.  I'll look into it.
> 
> Thanks,
> Miklos

Cool, thanks, Miklos :)

Are you going to revert the change forcing CL_SLAVE for
!capable(CAP_SYS_ADMIN)?  I don't think we want that - I think that
*within* a set of user mounts, propagation should be safe, right?

Will you be able to do this soon?  If not, should we just do the part
returning -EPERM when turning a shared mount into a user mount? 
Because I think that would then be ready for testing in -mm, and would
love to see it tested.

Were you going to push a patch to mount to do the user mounts, or
put sample code in Documentation, git log, or under samples/?

thanks,
-serge

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: unprivileged mounts git tree
  2008-09-04 13:28                         ` Serge E. Hallyn
@ 2008-09-04 14:06                           ` Miklos Szeredi
  2008-09-04 15:40                             ` Miklos Szeredi
  0 siblings, 1 reply; 39+ messages in thread
From: Miklos Szeredi @ 2008-09-04 14:06 UTC (permalink / raw)
  To: serue; +Cc: miklos, ebiederm, akpm, hch, viro, linux-fsdevel, linux-kernel

On Thu, 4 Sep 2008, Serge E. Hallyn wrote:
> Are you going to revert the change forcing CL_SLAVE for
> !capable(CAP_SYS_ADMIN)?  I don't think we want that - I think that
> *within* a set of user mounts, propagation should be safe, right?
> 
> Will you be able to do this soon?  If not, should we just do the part
> returning -EPERM when turning a shared mount into a user mount? 

OK, let's do that first and the tricky part (propagation vs. user
mounts) later.  Will push after I've tested it.

> Because I think that would then be ready for testing in -mm, and would
> love to see it tested.
> 
> Were you going to push a patch to mount to do the user mounts, or
> put sample code in Documentation, git log, or under samples/?

I've got a patch against util-linux-ng, but Karel doesn't want to take
it (understandibly) until the kernel patches have made it into
mainline.

Here it is, if you want to play with it.

Thanks,
Miklos
----

Subject: util-linux-ng: unprivileged mounts support

From: Miklos Szeredi <mszeredi@suse.cz>

This is an experimental patch for supporing unprivileged mounts and
umounts.  The following features are added:

1) If mount/umount are suid, first try without privileges.

This is done by forking, dropping privileges in child, and redirecting
stderr to /dev/null.  If this succeeds, then parent exits with zero
exit code.  Otherwise parent continues normally (with privileges).
This isn't perfect, because the wrong error message will be printed if
mount/umount failed not because of insufficient privileges, but some
other error (e.g. mountpoint busy).

2) Support user mounts in kernel.

If /etc/mtab is a symlink (to /proc/mounts if it's been set up
correctly), then change fsuid for the duration of the mount syscall
and use the MS_SETOWNER flag.  Old kernels will simply ignore this,
and everything will work as before.  Kernels with the unprivileged
mounts patches will set the owner of the mount, and the relevant line
in /proc/PID/mounts will contain a "user=XYZ" option, making this
backward compatible with /etc/mtab.

3) Root can force a specific user for a mount with "-ouser=XYZ".  This
has worked previously as well, but that was probably just accidental
(the option was copied verbatim to /etc/mtab).

4) Add support for "submnt" and "nosubmnt" options.  These can be used
to allow or prohibit unprivileged submounting of a user mount.

Example:

root# mount --bind -ouser=xyz /home/xyz /home/xyz
xyz> mount --bind ~/foo ~/bar
xyz> umount ~/bar

Changes since last version:

 - rename options:  'nomnt' -> 'nosubmnt', 'mnt' -> 'submnt'
 - change error message for EMFILE
 - fix bug in handling 'user' option in /etc/fstab
 - default to 'nosubmnt' for fstab based user mounts, for backward
   compatibility

Todo:

 - proper error reporting for unprivileged mounts and umounts
 - ./configure magic for non-linux systems

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
---
 configure.ac            |    5 +
 mount/Makefile.am       |    4 +
 mount/fsprobe.h         |    1 
 mount/fstab.c           |    2 
 mount/fstab.h           |    1 
 mount/mount.c           |  174 ++++++++++++++++++++++++++++++++++++++++++------
 mount/mount_constants.h |    6 +
 mount/umount.c          |   39 +++++++++-
 8 files changed, 209 insertions(+), 23 deletions(-)

Index: util-linux-ng/configure.ac
===================================================================
--- util-linux-ng.orig/configure.ac	2008-09-04 15:51:56.000000000 +0200
+++ util-linux-ng/configure.ac	2008-09-04 15:52:06.000000000 +0200
@@ -139,6 +139,11 @@ fi
 UTIL_CHECK_LIB(util, openpty)
 UTIL_CHECK_LIB(termcap, tgetnum)
 
+UTIL_CHECK_LIB(cap, cap_get_proc)
+if test $have_cap = no; then
+   AC_MSG_ERROR([libcap is required for mount]);
+fi
+
 AC_ARG_WITH([fsprobe],
   [AS_HELP_STRING([--with-fsprobe], [library to guess filesystems (blkid|volume_id), default is blkid])],
   [], [with_fsprobe=blkid]
Index: util-linux-ng/mount/Makefile.am
===================================================================
--- util-linux-ng.orig/mount/Makefile.am	2008-09-04 15:51:46.000000000 +0200
+++ util-linux-ng/mount/Makefile.am	2008-09-04 15:52:06.000000000 +0200
@@ -69,6 +69,10 @@ mount_LDADD += $(SELINUX_LIBS)
 mount_static_LDADD = $(SELINUX_LIBS_STATIC)
 endif
 
+if HAVE_CAP
+mount_LDADD += -lcap
+endif
+
 if HAVE_VOLUME_ID
 utils_common += fsprobe_volumeid.c
 swapon_SOURCES += ../lib/linux_version.c ../lib/blkdev.c
Index: util-linux-ng/mount/fsprobe.h
===================================================================
--- util-linux-ng.orig/mount/fsprobe.h	2008-09-04 15:51:46.000000000 +0200
+++ util-linux-ng/mount/fsprobe.h	2008-09-04 15:52:06.000000000 +0200
@@ -33,6 +33,7 @@ struct mountargs {
 	const char *type;
 	int flags;
 	void *data;
+	uid_t uid;
 };
 
 extern int fsprobe_known_fstype_in_procfs(const char *type);
Index: util-linux-ng/mount/fstab.c
===================================================================
--- util-linux-ng.orig/mount/fstab.c	2008-09-04 15:51:46.000000000 +0200
+++ util-linux-ng/mount/fstab.c	2008-09-04 15:52:06.000000000 +0200
@@ -52,7 +52,7 @@ mtab_does_not_exist(void) {
 	return var_mtab_does_not_exist;
 }
 
-static int
+int
 mtab_is_a_symlink(void) {
 	get_mtab_info();
 	return var_mtab_is_a_symlink;
Index: util-linux-ng/mount/fstab.h
===================================================================
--- util-linux-ng.orig/mount/fstab.h	2008-09-04 15:51:46.000000000 +0200
+++ util-linux-ng/mount/fstab.h	2008-09-04 15:52:06.000000000 +0200
@@ -2,6 +2,7 @@
 #define MOUNT_FSTAB_H
 
 #include "mount_mntent.h"
+int mtab_is_a_symlink(void);
 int mtab_is_writable(void);
 int mtab_does_not_exist(void);
 void reset_mtab_info(void);
Index: util-linux-ng/mount/mount.c
===================================================================
--- util-linux-ng.orig/mount/mount.c	2008-09-04 15:51:56.000000000 +0200
+++ util-linux-ng/mount/mount.c	2008-09-04 15:52:06.000000000 +0200
@@ -19,6 +19,8 @@
 #include <sys/stat.h>
 #include <sys/wait.h>
 #include <sys/mount.h>
+#include <sys/fsuid.h>
+#include <sys/capability.h>
 
 #include <mntent.h>
 
@@ -101,7 +103,7 @@ struct opt_map {
 #define MS_USER		0x20000000
 #define MS_OWNER	0x10000000
 #define MS_GROUP	0x08000000
-#define MS_COMMENT	0x02000000
+#define MS_COMMENT	0x04000000
 #define MS_LOOP		0x00010000
 
 /* Options that we keep the mount system call from seeing.  */
@@ -113,10 +115,10 @@ struct opt_map {
 #define MS_PROPAGATION  (MS_SHARED|MS_SLAVE|MS_UNBINDABLE|MS_PRIVATE)
 
 /* Options that we make ordinary users have by default.  */
-#define MS_SECURE	(MS_NOEXEC|MS_NOSUID|MS_NODEV)
+#define MS_SECURE	(MS_NOEXEC|MS_NOSUID|MS_NODEV|MS_NOSUBMNT)
 
 /* Options that we make owner-mounted devices have by default */
-#define MS_OWNERSECURE	(MS_NOSUID|MS_NODEV)
+#define MS_OWNERSECURE	(MS_NOSUID|MS_NODEV|MS_NOSUBMNT)
 
 static const struct opt_map opt_map[] = {
   { "defaults",	0, 0, 0		},	/* default options */
@@ -176,6 +178,8 @@ static const struct opt_map opt_map[] =
 					  to mtime/ctime */
 #endif
   { "nofail",	0, 0, MS_COMMENT},	/* Do not fail if ENOENT on dev */
+  { "submnt",	0, 1, MS_NOSUBMNT},	/* permit unprivileged submounts */
+  { "nosubmnt",0, 0, MS_NOSUBMNT},	/* no unprivileged submounts */
   { NULL,	0, 0, 0		}
 };
 
@@ -359,7 +363,7 @@ append_context(const char *optname, char
  * For the options uid= and gid= replace user or group name by its value.
  */
 static inline void
-parse_opt(char *opt, int *mask, char **extra_opts) {
+parse_opt(char *opt, int *mask, char **extra_opts, char **user) {
 	const struct opt_map *om;
 
 	for (om = opt_map; om->opt != NULL; om++)
@@ -404,6 +408,11 @@ parse_opt(char *opt, int *mask, char **e
 			return;
 		}
 	}
+	if (strncmp(opt, "user=", 5) == 0) {
+		free(*user);
+		*user = xstrdup(opt + 5);
+		return;
+	}
 
 #ifdef HAVE_LIBSELINUX
 	if (strncmp(opt, "context=", 8) == 0 && *(opt+8)) {
@@ -425,7 +434,7 @@ parse_opt(char *opt, int *mask, char **e
 /* Take -o options list and compute 4th and 5th args to mount(2).  flags
    gets the standard options (indicated by bits) and extra_opts all the rest */
 static void
-parse_opts (const char *options, int *flags, char **extra_opts) {
+parse_opts (const char *options, int *flags, char **extra_opts, char **user) {
 	*flags = 0;
 	*extra_opts = NULL;
 
@@ -448,7 +457,7 @@ parse_opts (const char *options, int *fl
 			/* end of option item or last item */
 			if (*p == '\0' || *(p+1) == '\0') {
 				if (!parse_string_opt(opt))
-					parse_opt(opt, flags, extra_opts);
+					parse_opt(opt, flags, extra_opts, user);
 				opt = NULL;
 			}
 		}
@@ -519,6 +528,7 @@ create_mtab (void) {
 	struct my_mntent mnt;
 	int flags;
 	mntFILE *mfp;
+	char *user = NULL;
 
 	lock_mtab();
 
@@ -532,11 +542,11 @@ create_mtab (void) {
 	/* Find the root entry by looking it up in fstab */
 	if ((fstab = getfs_by_dir ("/")) || (fstab = getfs_by_dir ("root"))) {
 		char *extra_opts;
-		parse_opts (fstab->m.mnt_opts, &flags, &extra_opts);
+		parse_opts (fstab->m.mnt_opts, &flags, &extra_opts, &user);
 		mnt.mnt_dir = "/";
 		mnt.mnt_fsname = fsprobe_get_devname(fstab->m.mnt_fsname);
 		mnt.mnt_type = fstab->m.mnt_type;
-		mnt.mnt_opts = fix_opts_string (flags, extra_opts, NULL);
+		mnt.mnt_opts = fix_opts_string (flags, extra_opts, user);
 		mnt.mnt_freq = mnt.mnt_passno = 0;
 		free(extra_opts);
 
@@ -560,6 +570,39 @@ create_mtab (void) {
 	reset_mtab_info();
 }
 
+static int set_fsuid(uid_t uid, uid_t *olduid)
+{
+	cap_t cap;
+	int res;
+
+	cap = cap_get_proc();
+	if (!cap) {
+		die(EX_FAIL, _("mount: failed to get capabilities: %s"),
+		    strerror(errno));
+	}
+
+	res = setfsuid(uid);
+	if (olduid)
+		*olduid = res;
+
+	if (setfsuid(uid) != uid) {
+		error(_("mount: failed to set user"));
+		errno = EPERM;
+		return -1;
+	}
+
+	res = cap_set_proc(cap);
+	if (res == -1) {
+		die(EX_FAIL, _("mount: failed to restore capabilities"),
+		    strerror(errno));
+	}
+
+	if (verbose > 2)
+		printf(_("mount: fsuid set to %i\n"), uid);
+
+	return 0;
+}
+
 /* count successful mount system calls */
 static int mountcount = 0;
 
@@ -571,16 +614,33 @@ static int mountcount = 0;
 static int
 do_mount_syscall (struct mountargs *args) {
 	int flags = args->flags;
+	uid_t olduid = 0;
+	int res;
 
 	if ((flags & MS_MGC_MSK) == 0)
 		flags |= MS_MGC_VAL;
 
+	if (args->flags & MS_SETUSER) {
+		if (set_fsuid(args->uid, &olduid) == -1)
+			return -1;
+	}
+
 	if (verbose > 2)
 		printf("mount: mount(2) syscall: source: \"%s\", target: \"%s\", "
 			"filesystemtype: \"%s\", mountflags: %d, data: %s\n",
 			args->spec, args->node, args->type, flags, (char *) args->data);
+	res = mount(args->spec, args->node, args->type, flags, args->data);
+
+	if (args->flags & MS_SETUSER) {
+		int errno_save = errno;
+
+		if (set_fsuid(olduid, NULL) == -1)
+			die(EX_FAIL, _("mount: failed to restore fsuid"));
+
+		errno = errno_save;
+	}
 
-	return mount (args->spec, args->node, args->type, flags, args->data);
+	return res;
 }
 
 /*
@@ -702,6 +762,31 @@ guess_fstype_by_devname(const char *devn
    return type;
 }
 
+static int get_user(const char *user, uid_t *uid)
+{
+	char *e;
+	long val;
+	int res;
+
+	if (!user[0])
+		return -1;
+
+	val = strtoul(user, &e, 10);
+	if (!e[0]) {
+		if (val < 0 || (long) (uid_t) val != val)
+			return -1;
+
+		*uid = val;
+	} else {
+		struct passwd *pw = getpwnam(user);
+		if (pw == NULL)
+			return -1;
+
+		*uid = pw->pw_uid;
+	}
+	return 0;
+}
+
 /*
  * guess_fstype_and_mount()
  *	Mount a single file system. Guess the type when unknown.
@@ -711,8 +796,22 @@ guess_fstype_by_devname(const char *devn
  */
 static int
 guess_fstype_and_mount(const char *spec, const char *node, const char **types,
-		       int flags, char *mount_opts, int *special, int *status) {
-   struct mountargs args = { spec, node, NULL, flags & ~MS_NOSYS, mount_opts };
+		       int flags, char *mount_opts, int *special, int *status,
+		       char *user) {
+   struct mountargs args = { spec, node, NULL, flags & ~MS_NOSYS, mount_opts};
+
+   /*
+    * Only set user for the mount if /etc/mtab is a symlink.  Otherwise
+    * user could add submounts without modifying mtab, and so cause
+    * confusion.
+    */
+   args.flags &= ~MS_SETUSER;
+   if (user != NULL && mtab_is_a_symlink()) {
+	   if (get_user(user, &args.uid) != 0)
+		   return 1;
+
+	   args.flags |= MS_SETUSER;
+   }
 
    if (*types && strcasecmp (*types, "auto") == 0)
       *types = NULL;
@@ -766,6 +865,9 @@ guess_fstype_and_mount(const char *spec,
 static void
 suid_check(const char *spec, const char *node, int *flags, char **user) {
   if (suid) {
+      if (*user != NULL)
+	      die (EX_USAGE, _("mount: only root can set user"));
+
       /*
        * MS_OWNER: Allow owners to mount when fstab contains
        * the owner option.  Note that this should never be used
@@ -1067,7 +1169,7 @@ try_mount_one (const char *spec0, const
   char *extra_opts;		/* written in mtab */
   char *mount_opts;		/* actually used on system call */
   const char *opts, *spec, *node, *types;
-  char *user = 0;
+  char *user = NULL;
   int loop = 0;
   const char *loopdev = 0, *loopfile = 0;
   struct stat statbuf;
@@ -1088,7 +1190,7 @@ try_mount_one (const char *spec0, const
   types = types1 = xstrdup(types0);
   opts = opts1 = xstrdup(opts0);
 
-  parse_opts (opts, &flags, &extra_opts);
+  parse_opts (opts, &flags, &extra_opts, &user);
   extra_opts1 = extra_opts;
 
   /* quietly succeed for fstab entries that don't get mounted automatically */
@@ -1139,7 +1241,7 @@ mount_retry:
 
   if (!fake) {
     mnt5_res = guess_fstype_and_mount (spec, node, &types, flags & ~MS_NOSYS,
-				       mount_opts, &special, &status);
+				       mount_opts, &special, &status, user);
 
     if (special) {
       block_signals (SIG_UNBLOCK);
@@ -1278,7 +1380,8 @@ mount_retry:
       break;
     }
     case EMFILE:
-      error (_("mount table full")); break;
+      error (_("maximum number of user mounts exceeded,\n"
+	       "        see: /proc/sys/fs/max_user_mounts\n")); break;
     case EIO:
       error (_("mount: %s: can't read superblock"), spec); break;
     case ENODEV:
@@ -2014,10 +2117,43 @@ main(int argc, char *argv[]) {
 	}
 
 	if (getuid () != geteuid ()) {
-		suid = 1;
-		if (types || options || readwrite || nomtab || mount_all ||
-		    fake || mounttype || (argc + specseen) != 1)
-			die (EX_USAGE, _("mount: only root can do that"));
+		int pid;
+
+		pid = fork();
+		if (pid == -1) {
+			die(EX_SYSERR, _("mount: cannot fork: %s"),
+			    strerror(errno));
+		} else if (pid == 0) {
+			/*
+			 * Child will continue as normal, but first it
+			 * drops privileges, and redirects stderr to
+			 * /dev/null, because we will retry any error
+			 * with the suid privileges.
+			 */
+
+			if (verbose > 1)
+				printf(_("mount: trying without privileges...\n"));
+
+			if(setgid(getgid()) == -1 ||
+			   setuid(getuid()) == -1)
+				exit(1);
+
+			dup2(open("/dev/null", O_WRONLY), 2);
+		} else {
+			int st;
+
+			wait(&st);
+			if (WIFEXITED(st) && WEXITSTATUS(st) == 0)
+				exit(0);
+
+			suid = 1;
+			if (types || options || readwrite || nomtab ||
+			    mount_all || fake || mounttype ||
+			    (argc + specseen) != 1)
+				die (EX_USAGE, _("mount: only root can do that"));
+			if (verbose > 1)
+				printf(_("mount: retrying with privileges...\n"));
+		}
 	}
 
 	atexit(unlock_mtab);
Index: util-linux-ng/mount/mount_constants.h
===================================================================
--- util-linux-ng.orig/mount/mount_constants.h	2008-09-04 15:51:46.000000000 +0200
+++ util-linux-ng/mount/mount_constants.h	2008-09-04 15:52:06.000000000 +0200
@@ -56,6 +56,12 @@
 #ifndef MS_SHARED
 #define MS_SHARED	(1<<20)	/* 1048576 Shared*/
 #endif
+#ifndef MS_SETUSER
+#define MS_SETUSER	(1<<24)
+#endif
+#ifndef MS_NOSUBMNT
+#define MS_NOSUBMNT	(1<<25)
+#endif
 /*
  * Magic mount flag number. Had to be or-ed to the flag values.
  */
Index: util-linux-ng/mount/umount.c
===================================================================
--- util-linux-ng.orig/mount/umount.c	2008-09-04 15:51:56.000000000 +0200
+++ util-linux-ng/mount/umount.c	2008-09-04 15:52:06.000000000 +0200
@@ -643,9 +643,42 @@ main (int argc, char *argv[]) {
 		}
 
 	if (getuid () != geteuid ()) {
-		suid = 1;
-		if (all || types || nomtab || force || remount)
-			die (2, _("umount: only root can do that"));
+		int pid;
+
+		pid = fork();
+		if (pid == -1) {
+			die(EX_SYSERR, _("umount: cannot fork: %s"),
+			    strerror(errno));
+		} else if (pid == 0) {
+			/*
+			 * Child will continue as normal, but first it
+			 * drops privileges, and redirects stderr to
+			 * /dev/null, because we will retry any error
+			 * with the suid privileges.
+			 */
+
+			if (verbose)
+				printf(_("umount: trying without privileges...\n"));
+
+			if(setgid(getgid()) == -1 ||
+			   setuid(getuid()) == -1)
+				exit(1);
+
+			dup2(open("/dev/null", O_WRONLY), 2);
+		} else {
+			int st;
+
+			wait(&st);
+			if (WIFEXITED(st) && WEXITSTATUS(st) == 0)
+				exit(0);
+
+			suid = 1;
+			if (all || types || nomtab || force || remount)
+				die (2, _("umount: only root can do that"));
+
+			if (verbose)
+				printf(_("umount: retrying with privileges...\n"));
+		}
 	}
 
 	argc -= optind;

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: unprivileged mounts git tree
  2008-09-04 14:06                           ` Miklos Szeredi
@ 2008-09-04 15:40                             ` Miklos Szeredi
  2008-09-04 16:17                               ` Serge E. Hallyn
  2008-09-05 15:31                               ` Serge E. Hallyn
  0 siblings, 2 replies; 39+ messages in thread
From: Miklos Szeredi @ 2008-09-04 15:40 UTC (permalink / raw)
  To: serue; +Cc: miklos, ebiederm, akpm, hch, viro, linux-fsdevel, linux-kernel

On Thu, 04 Sep 2008, Miklos Szeredi wrote:
> On Thu, 4 Sep 2008, Serge E. Hallyn wrote:
> > Are you going to revert the change forcing CL_SLAVE for
> > !capable(CAP_SYS_ADMIN)?  I don't think we want that - I think that
> > *within* a set of user mounts, propagation should be safe, right?
> > 
> > Will you be able to do this soon?  If not, should we just do the part
> > returning -EPERM when turning a shared mount into a user mount? 
> 
> OK, let's do that first and the tricky part (propagation vs. user
> mounts) later.  Will push after I've tested it.

Here it is:

git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs.git unprivileged-mounts

I don't know what's next, this patchset has been in and out of -mm for
as long as I can remember, but it hasn't generated much interest
outside the two of us :)

I do think this is an important feature though, even if not as sexy as
some other things.

Al?  Is there any chance of this making it to 2.6.28?

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: unprivileged mounts git tree
  2008-09-04 15:40                             ` Miklos Szeredi
@ 2008-09-04 16:17                               ` Serge E. Hallyn
  2008-09-04 17:42                                 ` Miklos Szeredi
  2008-09-05 15:31                               ` Serge E. Hallyn
  1 sibling, 1 reply; 39+ messages in thread
From: Serge E. Hallyn @ 2008-09-04 16:17 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: ebiederm, akpm, hch, viro, linux-fsdevel, linux-kernel

Quoting Miklos Szeredi (miklos@szeredi.hu):
> On Thu, 04 Sep 2008, Miklos Szeredi wrote:
> > On Thu, 4 Sep 2008, Serge E. Hallyn wrote:
> > > Are you going to revert the change forcing CL_SLAVE for
> > > !capable(CAP_SYS_ADMIN)?  I don't think we want that - I think that
> > > *within* a set of user mounts, propagation should be safe, right?
> > > 
> > > Will you be able to do this soon?  If not, should we just do the part
> > > returning -EPERM when turning a shared mount into a user mount? 
> > 
> > OK, let's do that first and the tricky part (propagation vs. user
> > mounts) later.  Will push after I've tested it.
> 
> Here it is:
> 
> git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs.git unprivileged-mounts

but you're still doing

	if (IS_MNT_SHARED(old_nd.path.mnt) && !capable(CAP_SYS_ADMIN))
		goto out;

shouldn't it be something like

	if (IS_MNT_SHARED(old_nd.path.mnt) && (old_nd.path.mnt & MNT_USER))
		goto out;

?

> I don't know what's next, this patchset has been in and out of -mm for
> as long as I can remember, but it hasn't generated much interest
> outside the two of us :)
> 
> I do think this is an important feature though, even if not as sexy as
> some other things.
> 
> Al?  Is there any chance of this making it to 2.6.28?
> 
> Thanks,
> Miklos

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: unprivileged mounts git tree
  2008-09-04 16:17                               ` Serge E. Hallyn
@ 2008-09-04 17:42                                 ` Miklos Szeredi
  2008-09-04 17:48                                   ` Serge E. Hallyn
  0 siblings, 1 reply; 39+ messages in thread
From: Miklos Szeredi @ 2008-09-04 17:42 UTC (permalink / raw)
  To: serue; +Cc: miklos, ebiederm, akpm, hch, viro, linux-fsdevel, linux-kernel

On Thu, 4 Sep 2008, Serge E. Hallyn wrote:
> Quoting Miklos Szeredi (miklos@szeredi.hu):
> > On Thu, 04 Sep 2008, Miklos Szeredi wrote:
> > > On Thu, 4 Sep 2008, Serge E. Hallyn wrote:
> > > > Are you going to revert the change forcing CL_SLAVE for
> > > > !capable(CAP_SYS_ADMIN)?  I don't think we want that - I think that
> > > > *within* a set of user mounts, propagation should be safe, right?
> > > > 
> > > > Will you be able to do this soon?  If not, should we just do the part
> > > > returning -EPERM when turning a shared mount into a user mount? 
> > > 
> > > OK, let's do that first and the tricky part (propagation vs. user
> > > mounts) later.  Will push after I've tested it.
> > 
> > Here it is:
> > 
> > git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs.git unprivileged-mounts
> 
> but you're still doing
> 
> 	if (IS_MNT_SHARED(old_nd.path.mnt) && !capable(CAP_SYS_ADMIN))
> 		goto out;
> 
> shouldn't it be something like
> 
> 	if (IS_MNT_SHARED(old_nd.path.mnt) && (old_nd.path.mnt & MNT_USER))
> 		goto out;
> 
> ?

Why would that be an error?  There's no real security gain to be had
from restricting a privileged user, but could cause a lot of
annoyance.  If we think this is dangerous, then protection should be
built into mount(8) with an option to override.  But not into the
kernel, IMO.

Miklos

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: unprivileged mounts git tree
  2008-09-04 17:42                                 ` Miklos Szeredi
@ 2008-09-04 17:48                                   ` Serge E. Hallyn
  2008-09-04 18:03                                     ` Miklos Szeredi
  0 siblings, 1 reply; 39+ messages in thread
From: Serge E. Hallyn @ 2008-09-04 17:48 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: serue, ebiederm, akpm, hch, viro, linux-fsdevel, linux-kernel

Quoting Miklos Szeredi (miklos@szeredi.hu):
> On Thu, 4 Sep 2008, Serge E. Hallyn wrote:
> > Quoting Miklos Szeredi (miklos@szeredi.hu):
> > > On Thu, 04 Sep 2008, Miklos Szeredi wrote:
> > > > On Thu, 4 Sep 2008, Serge E. Hallyn wrote:
> > > > > Are you going to revert the change forcing CL_SLAVE for
> > > > > !capable(CAP_SYS_ADMIN)?  I don't think we want that - I think that
> > > > > *within* a set of user mounts, propagation should be safe, right?
> > > > > 
> > > > > Will you be able to do this soon?  If not, should we just do the part
> > > > > returning -EPERM when turning a shared mount into a user mount? 
> > > > 
> > > > OK, let's do that first and the tricky part (propagation vs. user
> > > > mounts) later.  Will push after I've tested it.
> > > 
> > > Here it is:
> > > 
> > > git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs.git unprivileged-mounts
> > 
> > but you're still doing
> > 
> > 	if (IS_MNT_SHARED(old_nd.path.mnt) && !capable(CAP_SYS_ADMIN))
> > 		goto out;
> > 
> > shouldn't it be something like
> > 
> > 	if (IS_MNT_SHARED(old_nd.path.mnt) && (old_nd.path.mnt & MNT_USER))
> > 		goto out;
> > 
> > ?
> 
> Why would that be an error?  There's no real security gain to be had
> from restricting a privileged user, but could cause a lot of
> annoyance.  If we think this is dangerous, then protection should be
> built into mount(8) with an option to override.  But not into the
> kernel, IMO.

We disagree on that.  But can we agree that the check you added is wrong?
There is no reason why a user mount should not be able to do shared
mounts, is there?  So should the check above just go away then?

-serge

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: unprivileged mounts git tree
  2008-09-04 17:48                                   ` Serge E. Hallyn
@ 2008-09-04 18:03                                     ` Miklos Szeredi
  2008-09-04 18:49                                       ` Serge E. Hallyn
  0 siblings, 1 reply; 39+ messages in thread
From: Miklos Szeredi @ 2008-09-04 18:03 UTC (permalink / raw)
  To: serge
  Cc: miklos, serue, ebiederm, akpm, hch, viro, linux-fsdevel, linux-kernel

On Thu, 4 Sep 2008, Serge E. Hallyn wrote:
> Quoting Miklos Szeredi (miklos@szeredi.hu):
> > On Thu, 4 Sep 2008, Serge E. Hallyn wrote:
> > > but you're still doing
> > > 
> > > 	if (IS_MNT_SHARED(old_nd.path.mnt) && !capable(CAP_SYS_ADMIN))
> > > 		goto out;
> > > 
> > > shouldn't it be something like
> > > 
> > > 	if (IS_MNT_SHARED(old_nd.path.mnt) && (old_nd.path.mnt & MNT_USER))
> > > 		goto out;
> > > 
> > > ?
> > 
> > Why would that be an error?  There's no real security gain to be had
> > from restricting a privileged user, but could cause a lot of
> > annoyance.  If we think this is dangerous, then protection should be
> > built into mount(8) with an option to override.  But not into the
> > kernel, IMO.
> 
> We disagree on that.  But can we agree that the check you added is wrong?

No :)

> There is no reason why a user mount should not be able to do shared
> mounts, is there?

I don't know.  It's something to think about in the future, but not
essential.  We know that without the above check the user can do bad
things: propagate mounts back into the source, and we don't want that.

We could allow binding a shared mount if

  a) the owners of the source and destination match
  b) the destination is made a slave of the source

But the current patchset doesn't allow _any_ changes to propagation
without CAP_SYS_ADMIN, so why should bind be an exception?

And yes, this is something to think about, but I think it's a rather
uncommon corner case, and so the patchset very much makes sense
without having to deal with unprivileged mount propagation changes.

>  So should the check above just go away then?

No, we'd be back with the original problem.

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: unprivileged mounts git tree
  2008-09-04 18:03                                     ` Miklos Szeredi
@ 2008-09-04 18:49                                       ` Serge E. Hallyn
  2008-09-04 22:26                                         ` Miklos Szeredi
  0 siblings, 1 reply; 39+ messages in thread
From: Serge E. Hallyn @ 2008-09-04 18:49 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: ebiederm, akpm, hch, viro, linux-fsdevel, linux-kernel

Quoting Miklos Szeredi (miklos@szeredi.hu):
> On Thu, 4 Sep 2008, Serge E. Hallyn wrote:
> > Quoting Miklos Szeredi (miklos@szeredi.hu):
> > > On Thu, 4 Sep 2008, Serge E. Hallyn wrote:
> > > > but you're still doing
> > > > 
> > > > 	if (IS_MNT_SHARED(old_nd.path.mnt) && !capable(CAP_SYS_ADMIN))
> > > > 		goto out;
> > > > 
> > > > shouldn't it be something like
> > > > 
> > > > 	if (IS_MNT_SHARED(old_nd.path.mnt) && (old_nd.path.mnt & MNT_USER))
> > > > 		goto out;
> > > > 
> > > > ?
> > > 
> > > Why would that be an error?  There's no real security gain to be had
> > > from restricting a privileged user, but could cause a lot of
> > > annoyance.  If we think this is dangerous, then protection should be
> > > built into mount(8) with an option to override.  But not into the
> > > kernel, IMO.
> > 
> > We disagree on that.  But can we agree that the check you added is wrong?
> 
> No :)
> 
> > There is no reason why a user mount should not be able to do shared
> > mounts, is there?
> 
> I don't know.  It's something to think about in the future, but not
> essential.  We know that without the above check the user can do bad
> things: propagate mounts back into the source, and we don't want that.

When is propagating mounts back into the source bad?  Because you are
not preventing it.

You are preventing future propagation back into the user's own mounts,
but not into mounts not owned by the user.

It's not right.

> We could allow binding a shared mount if
> 
>   a) the owners of the source and destination match
>   b) the destination is made a slave of the source

Well that's what I've been saying...

> But the current patchset doesn't allow _any_ changes to propagation
> without CAP_SYS_ADMIN, so why should bind be an exception?

Because it's not a change in propagation among existing mounts, instead
it's defining propagation for the new user mounts.  And since user
mounts don't currently exist, we're in no position to talk about
exceptions to existing behavior.

> And yes, this is something to think about, but I think it's a rather
> uncommon corner case, and so the patchset very much makes sense
> without having to deal with unprivileged mount propagation changes.

I'm willing to accept that if we simply leave the patchset as it was
before, but your new check just adds inconsistencies for absolutely zero
security gain.

> >  So should the check above just go away then?
> 
> No, we'd be back with the original problem.

We still have the original problem.

When root does

	mount -bind /mnt /mnt
	mount --make-rshared /mnt
	mount --bind -o user=hallyn /mnt /home/hallyn/mnt

and hallyn does

	mount --bind /usr /home/hallyn/mnt/usr

then the kernel happily propagates the mount to /mnt/usr.
Now if hallyn does

	mount --bind /home/hallyn/mnt/usr /home/hallyn/mnt/usr2

THAT gives him a -EPERM.

To what end?

-serge

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: unprivileged mounts git tree
  2008-09-04 18:49                                       ` Serge E. Hallyn
@ 2008-09-04 22:26                                         ` Miklos Szeredi
  2008-09-04 23:32                                           ` Serge E. Hallyn
  0 siblings, 1 reply; 39+ messages in thread
From: Miklos Szeredi @ 2008-09-04 22:26 UTC (permalink / raw)
  To: serue; +Cc: miklos, ebiederm, akpm, hch, viro, linux-fsdevel, linux-kernel

On Thu, 4 Sep 2008, Serge E. Hallyn wrote:
> We still have the original problem.
> 
> When root does
> 
> 	mount -bind /mnt /mnt
> 	mount --make-rshared /mnt
> 	mount --bind -o user=hallyn /mnt /home/hallyn/mnt
> 
> and hallyn does
> 
> 	mount --bind /usr /home/hallyn/mnt/usr
> 
> then the kernel happily propagates the mount to /mnt/usr.

Obviously, and that's exactly what root _instructed_ in the last step.
If it's a security problem, root shouldn't do that.

Your original bug report correctly pointed out the real security
problem:

|  as root:
|  	mmount --bind -o user=500 /home/hallyn/etc/ /home/hallyn/etc/
|  	mount --bind /mnt /mnt
|  	mount --make-rshared /mnt
|  	mount --bind /dev /mnt/dev
| 
|  as hallyn:
|  	mmount --bind /mnt /home/hallyn/etc/mnt
|  	/usr/src/mmount-0.3/mmount --bind mnt/dev mnt/src

Here root does nothing "unsafe", yet the user can get propagation back
into /mnt, due to the fact that a bind mount makes the new mount part
of the old peer group.  This is the security hole that is fixed, and
AFAICS the only security hole related to propagation vs. user mounts.

(I'm going to be offline tomorrow and the weekend, but will hopefully
have email access next week).

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: unprivileged mounts git tree
  2008-09-04 22:26                                         ` Miklos Szeredi
@ 2008-09-04 23:32                                           ` Serge E. Hallyn
  0 siblings, 0 replies; 39+ messages in thread
From: Serge E. Hallyn @ 2008-09-04 23:32 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: ebiederm, akpm, hch, viro, linux-fsdevel, linux-kernel

Quoting Miklos Szeredi (miklos@szeredi.hu):
> On Thu, 4 Sep 2008, Serge E. Hallyn wrote:
> > We still have the original problem.
> > 
> > When root does
> > 
> > 	mount -bind /mnt /mnt
> > 	mount --make-rshared /mnt
> > 	mount --bind -o user=hallyn /mnt /home/hallyn/mnt
> > 
> > and hallyn does
> > 
> > 	mount --bind /usr /home/hallyn/mnt/usr
> > 
> > then the kernel happily propagates the mount to /mnt/usr.
> 
> Obviously, and that's exactly what root _instructed_ in the last step.
> If it's a security problem, root shouldn't do that.
> 
> Your original bug report correctly pointed out the real security
> problem:
> 
> |  as root:
> |  	mmount --bind -o user=500 /home/hallyn/etc/ /home/hallyn/etc/
> |  	mount --bind /mnt /mnt
> |  	mount --make-rshared /mnt
> |  	mount --bind /dev /mnt/dev
> | 
> |  as hallyn:
> |  	mmount --bind /mnt /home/hallyn/etc/mnt
> |  	/usr/src/mmount-0.3/mmount --bind mnt/dev mnt/src
> 
> Here root does nothing "unsafe", yet the user can get propagation back
> into /mnt, due to the fact that a bind mount makes the new mount part
> of the old peer group.  This is the security hole that is fixed, and
> AFAICS the only security hole related to propagation vs. user mounts.
> 
> (I'm going to be offline tomorrow and the weekend, but will hopefully
> have email access next week).
> 
> Thanks,
> Miklos

(&(*$&%, you're right, of course.

Ok, will play with it a bit more, but I think it'd be *great* to see
this show up in -mm again.

-serge

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: unprivileged mounts git tree
  2008-09-04 15:40                             ` Miklos Szeredi
  2008-09-04 16:17                               ` Serge E. Hallyn
@ 2008-09-05 15:31                               ` Serge E. Hallyn
  2008-09-09 13:34                                 ` Miklos Szeredi
  1 sibling, 1 reply; 39+ messages in thread
From: Serge E. Hallyn @ 2008-09-05 15:31 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: ebiederm, akpm, hch, viro, linux-fsdevel, linux-kernel

Quoting Miklos Szeredi (miklos@szeredi.hu):
> On Thu, 04 Sep 2008, Miklos Szeredi wrote:
> > On Thu, 4 Sep 2008, Serge E. Hallyn wrote:
> > > Are you going to revert the change forcing CL_SLAVE for
> > > !capable(CAP_SYS_ADMIN)?  I don't think we want that - I think that
> > > *within* a set of user mounts, propagation should be safe, right?
> > > 
> > > Will you be able to do this soon?  If not, should we just do the part
> > > returning -EPERM when turning a shared mount into a user mount? 
> > 
> > OK, let's do that first and the tricky part (propagation vs. user
> > mounts) later.  Will push after I've tested it.
> 
> Here it is:
> 
> git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs.git unprivileged-mounts
> 
> I don't know what's next, this patchset has been in and out of -mm for
> as long as I can remember, but it hasn't generated much interest
> outside the two of us :)
> 
> I do think this is an important feature though, even if not as sexy as
> some other things.
> 
> Al?  Is there any chance of this making it to 2.6.28?
> 
> Thanks,
> Miklos

Ok I should take the time to properly add these to ltp, but for now here
is the result of 15-minutes of playing around with shell scripts to do
some basic testing.

Run usermounts_root.sh as root first, then usermounts_user as a user.
Cleanup for the usermounts_root.sh side-effects is not done.

Miklos, do you have better-thought-out or more complete testcases?

-serge

=====================================================================
FILE usermounts_root.sh
=====================================================================
#!/bin/sh
MMOUNTDIR=/usr/src/mmount-0.3
MOUNT=${MMOUNTDIR}/mmount
UMOUNT=${MMOUNTIDR}/ummount

mkdir -p /mnt/shared /mnt/slave /mnt/private
mkdir /mnt/shared/d /mnt/slave/d /mnt/private/d

touch /mnt/shared/a
touch /mnt/slave/b
touch /mnt/private/c

mount --bind /mnt/shared /mnt/shared
mount --make-rshared /mnt/shared
mount --bind /mnt/shared /mnt/slave
mount --make-rslave /mnt/slave
=====================================================================


=====================================================================
FILE usermounts_user.sh
=====================================================================
#!/bin/sh
MMOUNTDIR=/usr/src/mmount-0.3
MOUNT=${MMOUNTDIR}/mmount
UMOUNT=${MMOUNTDIR}/ummount

mkdir t1

# user bind a root shared mount.  Should fail -EPERM.

$MOUNT --bind /mnt/shared t1
rc=$?
if [ $rc -eq 0 ]; then
	echo "FAIL: succeeded in user-binding a root-shared dir"
	exit 1
fi
echo "PASS: first test passed (refused to user-bind a root-shared dir"

# user bind a root shared mount, then bind into there.  Make sure
# that the two binds work, and the second is not propagated to the
# first

$MOUNT --bind /mnt/slave t1
$MOUNT --bind /mnt/private t1/d
if [ ! -f t1/d/c ]; then
	echo "failed mounting private under slave/d"
	exit 1
fi
if [ -f /mnt/slave/d/c ]; then
	echo "user mount of private under slave/d was propagated to /mnt/slave!"
	exit 1
fi
if [ -f /mnt/shared/d/c ]; then
	echo "user mount of private under slave/d was propagated to /mnt/shared!"
	exit 1
fi

ret=0
$UMOUNT t1/d || ret=1
$UMOUNT t1 || ret=1
if [ $ret -eq 1 ]; then
	echo "user umount refused in second test"
	exit 1
fi
if [ -f t1/a ]; then
	echo "user umount failed in second test"
	exit 1
fi

rmdir t1

echo "PASS: second test passed (user-mounts of root-slave and root-private dirs)"

# bind mount /etc/shadow.  First make sure that we cannot read
# /etc/shadow and can read the target.  Then make sure that we
# can no longer read the target after the bind mount.

cat /etc/shadow >> /dev/null
rc=$?
if [ $rc -eq 0 ]; then
	echo "test 3: Oh no, I'm able to read /etc/shadow!"
	exit 1
fi
echo ab > t1
cat t1 >> /dev/null
rc=$?
if [ $rc -ne 0 ]; then
	echo "test 3: Odd, couldn't read my own file."
	exit 1
fi
$MOUNT --bind /etc/shadow t1
cat t1 >> /dev/null
rc=$?
if [ $rc -eq 0 ]; then
	echo "test 3: bind mount of /etc/shadow gave me read access!"
	exit 1
fi
$UMOUNT t1
rm t1

echo "PASS: third test passed (user mount of file)"

echo "all tests succeeded"
exit 0
=====================================================================

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: unprivileged mounts git tree
  2008-09-05 15:31                               ` Serge E. Hallyn
@ 2008-09-09 13:34                                 ` Miklos Szeredi
  2008-09-11 10:37                                   ` Eric W. Biederman
  0 siblings, 1 reply; 39+ messages in thread
From: Miklos Szeredi @ 2008-09-09 13:34 UTC (permalink / raw)
  To: serue; +Cc: miklos, ebiederm, akpm, hch, viro, linux-fsdevel, linux-kernel

On Fri, 5 Sep 2008, Serge E. Hallyn wrote:
> Ok I should take the time to properly add these to ltp, but for now here
> is the result of 15-minutes of playing around with shell scripts to do
> some basic testing.
> 
> Run usermounts_root.sh as root first, then usermounts_user as a user.
> Cleanup for the usermounts_root.sh side-effects is not done.
> 
> Miklos, do you have better-thought-out or more complete testcases?

Thanks for the scripts, I don't have any better ones, I have only been
testing by hand.

As for -mm, sure it would be nice.  I'll do a resubmission of the
whole series with proper changelog, etc.  It would be really nice if
Al and/or Christoph could review if there are any major conceptual
problems left.

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: unprivileged mounts git tree
  2008-09-09 13:34                                 ` Miklos Szeredi
@ 2008-09-11 10:37                                   ` Eric W. Biederman
  2008-09-11 14:43                                     ` Miklos Szeredi
  0 siblings, 1 reply; 39+ messages in thread
From: Eric W. Biederman @ 2008-09-11 10:37 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: serue, akpm, hch, viro, linux-fsdevel, linux-kernel

Miklos Szeredi <miklos@szeredi.hu> writes:

> On Fri, 5 Sep 2008, Serge E. Hallyn wrote:
>> Ok I should take the time to properly add these to ltp, but for now here
>> is the result of 15-minutes of playing around with shell scripts to do
>> some basic testing.
>> 
>> Run usermounts_root.sh as root first, then usermounts_user as a user.
>> Cleanup for the usermounts_root.sh side-effects is not done.
>> 
>> Miklos, do you have better-thought-out or more complete testcases?
>
> Thanks for the scripts, I don't have any better ones, I have only been
> testing by hand.
>
> As for -mm, sure it would be nice.  I'll do a resubmission of the
> whole series with proper changelog, etc.  It would be really nice if
> Al and/or Christoph could review if there are any major conceptual
> problems left.

There is a weird corner case I'm trying to wrap my head around.
unlink and rmdir do not work on dentries that are mount points
in another mount namespace.

Which is at least needed for the moment so we don't leak mounts.

Once we have unprivileged mounts does that introduce a DOS attack?

Eric

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: unprivileged mounts git tree
  2008-09-11 10:37                                   ` Eric W. Biederman
@ 2008-09-11 14:43                                     ` Miklos Szeredi
  2008-09-11 15:20                                       ` Serge E. Hallyn
  2008-09-11 19:04                                       ` Eric W. Biederman
  0 siblings, 2 replies; 39+ messages in thread
From: Miklos Szeredi @ 2008-09-11 14:43 UTC (permalink / raw)
  To: ebiederm; +Cc: miklos, serue, akpm, hch, viro, linux-fsdevel, linux-kernel

On Thu, 11 Sep 2008, ebiederm@xmission.com (Eric W. Biederman)
> There is a weird corner case I'm trying to wrap my head around.
> unlink and rmdir do not work on dentries that are mount points
> in another mount namespace.
> 
> Which is at least needed for the moment so we don't leak mounts.
> 
> Once we have unprivileged mounts does that introduce a DOS attack?

Hmm, yes.  That's a tough one...

I think if the dentry has only user mounts, unlink should go ahead and
on success dissolve any mounts on the dentry.  Does that sound
workable?

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: unprivileged mounts git tree
  2008-09-11 14:43                                     ` Miklos Szeredi
@ 2008-09-11 15:20                                       ` Serge E. Hallyn
  2008-09-11 15:44                                         ` Miklos Szeredi
  2008-09-11 18:54                                         ` Eric W. Biederman
  2008-09-11 19:04                                       ` Eric W. Biederman
  1 sibling, 2 replies; 39+ messages in thread
From: Serge E. Hallyn @ 2008-09-11 15:20 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: ebiederm, akpm, hch, viro, linux-fsdevel, linux-kernel

Quoting Miklos Szeredi (miklos@szeredi.hu):
> On Thu, 11 Sep 2008, ebiederm@xmission.com (Eric W. Biederman)
> > There is a weird corner case I'm trying to wrap my head around.
> > unlink and rmdir do not work on dentries that are mount points
> > in another mount namespace.
> > 
> > Which is at least needed for the moment so we don't leak mounts.
> > 
> > Once we have unprivileged mounts does that introduce a DOS attack?
> 
> Hmm, yes.  That's a tough one...
> 
> I think if the dentry has only user mounts, unlink should go ahead and
> on success dissolve any mounts on the dentry.  Does that sound
> workable?
> 
> Thanks,
> Miklos

Is it really a problem?  The admin can always go ahead and kill the
user, which already takes care of any mounts in private namespaces,
which I think is Eric's primary concern.  IT also takes care of that
user's processes pinning files under the mounts.  So now the admin can
umount all the user's mounts in the init namespace (using a script
parsing /proc/self/mountinfo if need be), and delete the files.

Doesn't really seem like a problem.

Or am I missing Eric's real concern?

-serge

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: unprivileged mounts git tree
  2008-09-11 15:20                                       ` Serge E. Hallyn
@ 2008-09-11 15:44                                         ` Miklos Szeredi
  2008-09-11 18:54                                         ` Eric W. Biederman
  1 sibling, 0 replies; 39+ messages in thread
From: Miklos Szeredi @ 2008-09-11 15:44 UTC (permalink / raw)
  To: serue; +Cc: miklos, ebiederm, akpm, hch, viro, linux-fsdevel, linux-kernel

On Thu, 11 Sep 2008, Serge E. Hallyn wrote:
> Is it really a problem?  The admin can always go ahead and kill the
> user, which already takes care of any mounts in private namespaces,

It's not necessarily against the admin, it could deny service to
another user or a script running as root.

The nasty thing is: an unprivileged user can cause unlink/rmdir to
fail, even when otherwise it would have succeed.  It's not a huge
issue: unprivileged fuse mounts have the same effect, and nobody
complained yet.  But I think we have to deal with this in some way,
not just leaving it to the admin.

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: unprivileged mounts git tree
  2008-09-11 15:20                                       ` Serge E. Hallyn
  2008-09-11 15:44                                         ` Miklos Szeredi
@ 2008-09-11 18:54                                         ` Eric W. Biederman
  2008-09-12 22:08                                           ` Serge E. Hallyn
  1 sibling, 1 reply; 39+ messages in thread
From: Eric W. Biederman @ 2008-09-11 18:54 UTC (permalink / raw)
  To: Serge E. Hallyn
  Cc: Miklos Szeredi, akpm, hch, viro, linux-fsdevel, linux-kernel

"Serge E. Hallyn" <serue@us.ibm.com> writes:

> Quoting Miklos Szeredi (miklos@szeredi.hu):
>> On Thu, 11 Sep 2008, ebiederm@xmission.com (Eric W. Biederman)
>> > There is a weird corner case I'm trying to wrap my head around.
>> > unlink and rmdir do not work on dentries that are mount points
>> > in another mount namespace.
>> > 
>> > Which is at least needed for the moment so we don't leak mounts.
>> > 
>> > Once we have unprivileged mounts does that introduce a DOS attack?
>> 
>> Hmm, yes.  That's a tough one...
>> 
>> I think if the dentry has only user mounts, unlink should go ahead and
>> on success dissolve any mounts on the dentry.  Does that sound
>> workable?
>> 
>> Thanks,
>> Miklos
>
> Is it really a problem?  The admin can always go ahead and kill the
> user, which already takes care of any mounts in private namespaces,
> which I think is Eric's primary concern.  IT also takes care of that
> user's processes pinning files under the mounts.  So now the admin can
> umount all the user's mounts in the init namespace (using a script
> parsing /proc/self/mountinfo if need be), and delete the files.
>
> Doesn't really seem like a problem.
>
> Or am I missing Eric's real concern?

Assume /user is the base unprivileged mount point.

echo dummy > /tmp/1234
mount --bind /etc /user/etc
mount --bind /tmp/1234 /user/etc/passwd

Now you can't create /etc/passwd.new and rename it to /etc/passwd.
Stopping adduser from working.

As Miklos said this can apply to any file or any directory, so it can
be a DOS against any other user on the system.

It is also contrary to classic unix and linux semantics as open files
don't otherwise prevent unlink, rename or, rmdir from happening. So
applications are not going to be ready for it.

At a practical level I recently replace chroot with mount namespaces
to simplify handling of mounts and ouch!  When a process goes crazy
and doesn't exit when you expect and then you try and delete the
directory it is a pain.

Eric


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: unprivileged mounts git tree
  2008-09-11 14:43                                     ` Miklos Szeredi
  2008-09-11 15:20                                       ` Serge E. Hallyn
@ 2008-09-11 19:04                                       ` Eric W. Biederman
  2008-09-11 19:58                                         ` Eric W. Biederman
  1 sibling, 1 reply; 39+ messages in thread
From: Eric W. Biederman @ 2008-09-11 19:04 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: serue, akpm, hch, viro, linux-fsdevel, linux-kernel

Miklos Szeredi <miklos@szeredi.hu> writes:

> On Thu, 11 Sep 2008, ebiederm@xmission.com (Eric W. Biederman)
>> There is a weird corner case I'm trying to wrap my head around.
>> unlink and rmdir do not work on dentries that are mount points
>> in another mount namespace.
>> 
>> Which is at least needed for the moment so we don't leak mounts.
>> 
>> Once we have unprivileged mounts does that introduce a DOS attack?
>
> Hmm, yes.  That's a tough one...
>
> I think if the dentry has only user mounts, unlink should go ahead and
> on success dissolve any mounts on the dentry.  Does that sound
> workable?

I don't think only user mounts is the right filter.

We have support for lazy unmounts so it is possible to handle
that case.

Technically all we need to do is transform d_mounted from a counter
to a hlist_head and thread yet another list through struct vfs_mount
to track this.

I need to think about the semantics a little more before I have a good
feel of what makes sense.  In particular do we want a full
recursive lazy unmount or do we want to handle submounts in a different
way.

This also intersects in interesting ways with dcache pruning, and
automounting.

Eric

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: unprivileged mounts git tree
  2008-09-11 19:04                                       ` Eric W. Biederman
@ 2008-09-11 19:58                                         ` Eric W. Biederman
  0 siblings, 0 replies; 39+ messages in thread
From: Eric W. Biederman @ 2008-09-11 19:58 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: serue, akpm, hch, viro, linux-fsdevel, linux-kernel

ebiederm@xmission.com (Eric W. Biederman) writes:

> This also intersects in interesting ways with dcache pruning, and
> automounting.

In particular we already have mounts that expire.  So deleting the
files or directories of mount points (when allowed) should just be 
a case of expiring the mount point.

When it is ok to do that is a separate question.

Eric

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: unprivileged mounts git tree
  2008-09-11 18:54                                         ` Eric W. Biederman
@ 2008-09-12 22:08                                           ` Serge E. Hallyn
  2008-09-13  3:12                                             ` Eric W. Biederman
  0 siblings, 1 reply; 39+ messages in thread
From: Serge E. Hallyn @ 2008-09-12 22:08 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Miklos Szeredi, akpm, hch, viro, linux-fsdevel, linux-kernel

Quoting Eric W. Biederman (ebiederm@xmission.com):
> "Serge E. Hallyn" <serue@us.ibm.com> writes:
> 
> > Quoting Miklos Szeredi (miklos@szeredi.hu):
> >> On Thu, 11 Sep 2008, ebiederm@xmission.com (Eric W. Biederman)
> >> > There is a weird corner case I'm trying to wrap my head around.
> >> > unlink and rmdir do not work on dentries that are mount points
> >> > in another mount namespace.
> >> > 
> >> > Which is at least needed for the moment so we don't leak mounts.
> >> > 
> >> > Once we have unprivileged mounts does that introduce a DOS attack?
> >> 
> >> Hmm, yes.  That's a tough one...
> >> 
> >> I think if the dentry has only user mounts, unlink should go ahead and
> >> on success dissolve any mounts on the dentry.  Does that sound
> >> workable?
> >> 
> >> Thanks,
> >> Miklos
> >
> > Is it really a problem?  The admin can always go ahead and kill the
> > user, which already takes care of any mounts in private namespaces,
> > which I think is Eric's primary concern.  IT also takes care of that
> > user's processes pinning files under the mounts.  So now the admin can
> > umount all the user's mounts in the init namespace (using a script
> > parsing /proc/self/mountinfo if need be), and delete the files.
> >
> > Doesn't really seem like a problem.
> >
> > Or am I missing Eric's real concern?
> 
> Assume /user is the base unprivileged mount point.
> 
> echo dummy > /tmp/1234
> mount --bind /etc /user/etc
> mount --bind /tmp/1234 /user/etc/passwd

Ok, but this is all done as root.  Kind of a silly thing for root to
do :)

So in order for me as an unprivileged user to pin a dentry by mounting
over it, I have to have write permission to the dentry to begin with
as well as the dentry being under a user=hallyn mount.

> Now you can't create /etc/passwd.new and rename it to /etc/passwd.
> Stopping adduser from working.
> 
> As Miklos said this can apply to any file or any directory, so it can
> be a DOS against any other user on the system.

Except I need to own the mount as well as the dentry.  So after
root does

	mmount --bind -o user=hallyn /home/hallyn /home/hallyn
	mmount --bind -o user=hallyn /home/serge /home/serge

if user serge (uid 501) tries to

	mmount --bind /etc /home/hallyn/etc
	mmount --bind /etc /home/serge/etc

permission for the first will be denied because serge does not
have write perms to /home/hallyn/etc, and permission for the second
will be denied because only hallyn may mount under /home/serge.

If root properly did

	mmount --bind -o user=hallyn /home/hallyn /home/hallyn
	mmount --bind -o user=serge /home/serge /home/serge

and then hallyn does

	mmount --bind /etc /home/hallyn/etc

and serge does

	mmount --bind /home/hallyn/etc /home/serge/etc

then hallyn can still ummount /home/hallyn/etc.

And we've decided that users cannot (for now) do shared mounts.
So I'm still not sure where there is the potential for danger?

> It is also contrary to classic unix and linux semantics as open files
> don't otherwise prevent unlink, rename or, rmdir from happening. So
> applications are not going to be ready for it.
> 
> At a practical level I recently replace chroot with mount namespaces
> to simplify handling of mounts and ouch!  When a process goes crazy
> and doesn't exit when you expect and then you try and delete the
> directory it is a pain.
> 
> Eric

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: unprivileged mounts git tree
  2008-09-12 22:08                                           ` Serge E. Hallyn
@ 2008-09-13  3:12                                             ` Eric W. Biederman
  2008-09-14  1:56                                               ` Serge E. Hallyn
  0 siblings, 1 reply; 39+ messages in thread
From: Eric W. Biederman @ 2008-09-13  3:12 UTC (permalink / raw)
  To: Serge E. Hallyn
  Cc: Miklos Szeredi, akpm, hch, viro, linux-fsdevel, linux-kernel

"Serge E. Hallyn" <serue@us.ibm.com> writes:


> Ok, but this is all done as root.  Kind of a silly thing for root to
> do :)

There are less silly examples like setting up a chroot type
environment contained in a mount namespace and having a kernel oops
and then not being able to delete all of your files.

> So in order for me as an unprivileged user to pin a dentry by mounting
> over it, I have to have write permission to the dentry to begin with
> as well as the dentry being under a user=hallyn mount.

That second condition is interesting requiring write permission of the
dentry.  I thought we had obviated the need for that when we added
ownership to the mounts themselves.  In this case at least it shouldn't
it be write permission on the directory containing the dentry.

>> Now you can't create /etc/passwd.new and rename it to /etc/passwd.
>> Stopping adduser from working.
>> 
>> As Miklos said this can apply to any file or any directory, so it can
>> be a DOS against any other user on the system.
>
> Except I need to own the mount as well as the dentry.  So after
> root does
>
> 	mmount --bind -o user=hallyn /home/hallyn /home/hallyn
> 	mmount --bind -o user=hallyn /home/serge /home/serge
>
> if user serge (uid 501) tries to
>
> 	mmount --bind /etc /home/hallyn/etc
> 	mmount --bind /etc /home/serge/etc
>
> permission for the first will be denied because serge does not
> have write perms to /home/hallyn/etc, and permission for the second
> will be denied because only hallyn may mount under /home/serge.
>
> If root properly did
>
> 	mmount --bind -o user=hallyn /home/hallyn /home/hallyn
> 	mmount --bind -o user=serge /home/serge /home/serge
>
> and then hallyn does
>
> 	mmount --bind /etc /home/hallyn/etc
>
> and serge does
>
> 	mmount --bind /home/hallyn/etc /home/serge/etc
>
> then hallyn can still ummount /home/hallyn/etc.
>
> And we've decided that users cannot (for now) do shared mounts.
> So I'm still not sure where there is the potential for danger?

Ok.  Let's pick on something a little more interesting.

root does:
	mmount --bind -o user=hallyn /home/hallyn /home/hallyn
hallyn does:
	mount --bind /tmp /home/hallyn/tmp
        touch dummy
	mount --bind dummy /home/hallyn/tmp/some_shared_file_I_have_write_access_to.

Which allows me to transform write permissions into the ability to
deny someone else the ability to delete a file.

This seems to mess up things like revoke.

At a practical level it is a real annoyance, regardless of the security
implications.

As a point of comparison plan9 does not have that restriction.

Eric


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: unprivileged mounts git tree
  2008-09-13  3:12                                             ` Eric W. Biederman
@ 2008-09-14  1:56                                               ` Serge E. Hallyn
  2008-09-14  3:06                                                 ` Eric W. Biederman
  0 siblings, 1 reply; 39+ messages in thread
From: Serge E. Hallyn @ 2008-09-14  1:56 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Miklos Szeredi, akpm, hch, viro, linux-fsdevel, linux-kernel

Quoting Eric W. Biederman (ebiederm@xmission.com):
> "Serge E. Hallyn" <serue@us.ibm.com> writes:
> 
> 
> > Ok, but this is all done as root.  Kind of a silly thing for root to
> > do :)
> 
> There are less silly examples like setting up a chroot type
> environment contained in a mount namespace and having a kernel oops
> and then not being able to delete all of your files.

I wasn't saying that I believe I can win an argument by knocking down
one example :)

In fact I'm not trying to win an argument.  Because I'm quite sure you
and Miklos are right, and I just need to figure out what I'm missing.

> > So in order for me as an unprivileged user to pin a dentry by mounting
> > over it, I have to have write permission to the dentry to begin with
> > as well as the dentry being under a user=hallyn mount.
> 
> That second condition is interesting requiring write permission of the
> dentry.  I thought we had obviated the need for that when we added
> ownership to the mounts themselves.  In this case at least it shouldn't
> it be write permission on the directory containing the dentry.

Oh no, it seems I'm wrong, that's not a condition.  Just tested it.

> >> Now you can't create /etc/passwd.new and rename it to /etc/passwd.
> >> Stopping adduser from working.
> >> 
> >> As Miklos said this can apply to any file or any directory, so it can
> >> be a DOS against any other user on the system.
> >
> > Except I need to own the mount as well as the dentry.  So after
> > root does
> >
> > 	mmount --bind -o user=hallyn /home/hallyn /home/hallyn
> > 	mmount --bind -o user=hallyn /home/serge /home/serge
> >
> > if user serge (uid 501) tries to
> >
> > 	mmount --bind /etc /home/hallyn/etc
> > 	mmount --bind /etc /home/serge/etc
> >
> > permission for the first will be denied because serge does not
> > have write perms to /home/hallyn/etc, and permission for the second
> > will be denied because only hallyn may mount under /home/serge.
> >
> > If root properly did
> >
> > 	mmount --bind -o user=hallyn /home/hallyn /home/hallyn
> > 	mmount --bind -o user=serge /home/serge /home/serge
> >
> > and then hallyn does
> >
> > 	mmount --bind /etc /home/hallyn/etc
> >
> > and serge does
> >
> > 	mmount --bind /home/hallyn/etc /home/serge/etc
> >
> > then hallyn can still ummount /home/hallyn/etc.
> >
> > And we've decided that users cannot (for now) do shared mounts.
> > So I'm still not sure where there is the potential for danger?
> 
> Ok.  Let's pick on something a little more interesting.
> 
> root does:
> 	mmount --bind -o user=hallyn /home/hallyn /home/hallyn
> hallyn does:
> 	mount --bind /tmp /home/hallyn/tmp
>         touch dummy
> 	mount --bind dummy /home/hallyn/tmp/some_shared_file_I_have_write_access_to.
> 
> Which allows me to transform write permissions into the ability to
> deny someone else the ability to delete a file.

Yup, that's an interesting example.

Still an admin *can* work around that, if he can sufficiently parse
/proc/self/mountinfo to know to umount
/home/hallyn/tmp/some_shared_file_I_have_write_access_to.

Is that sufficient?  Probably not?

> This seems to mess up things like revoke.

Hey, do we have revoke now?  :)

> At a practical level it is a real annoyance, regardless of the security
> implications.
> 
> As a point of comparison plan9 does not have that restriction.

Why doesn't it have that restriction?  Does it always allow you to rm a
mounted-over file?

-serge

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: unprivileged mounts git tree
  2008-09-14  1:56                                               ` Serge E. Hallyn
@ 2008-09-14  3:06                                                 ` Eric W. Biederman
  2008-09-30 19:39                                                   ` Serge E. Hallyn
  0 siblings, 1 reply; 39+ messages in thread
From: Eric W. Biederman @ 2008-09-14  3:06 UTC (permalink / raw)
  To: Serge E. Hallyn
  Cc: Miklos Szeredi, akpm, hch, viro, linux-fsdevel, linux-kernel

"Serge E. Hallyn" <serue@us.ibm.com> writes:

>> > So in order for me as an unprivileged user to pin a dentry by mounting
>> > over it, I have to have write permission to the dentry to begin with
>> > as well as the dentry being under a user=hallyn mount.
>> 
>> That second condition is interesting requiring write permission of the
>> dentry.  I thought we had obviated the need for that when we added
>> ownership to the mounts themselves.  In this case at least it shouldn't
>> it be write permission on the directory containing the dentry.
>
> Oh no, it seems I'm wrong, that's not a condition.  Just tested it.

Ok.  I couldn't find Mikolos's code when I looked quickly.
What are the current rules?  It sounds like my original example could apply.

>> This seems to mess up things like revoke.
>
> Hey, do we have revoke now?  :)

Periodically someone works on revoke ;)

>> At a practical level it is a real annoyance, regardless of the security
>> implications.
>> 
>> As a point of comparison plan9 does not have that restriction.
>
> Why doesn't it have that restriction?  

I haven't heard the reason.

> Does it always allow you to rm a mounted-over file?

Yes.

The implementation of mounts in plan9 is a bit different.  In
plan9 mounts are looked up by mount namespace and qid (effectively
the inode number).  So the semantics are slightly different.
Requiring a new mount namespace if you want to see what a filesystem
looks like without a mount on top of a given file.

It also looks like plan9 is likely to have a unmountable and unusable mount
if you actually delete an file, from another mount namespace.

Linux actually has a very similar case where we can loose a mount if you rename
the directory that contains it to be somewhere higher in the mount hierarchy
than the location the mount was under.

All of that said there is a very good practical justification for it all.
In a filesystem where not all changes come through our local VFS the
VFS cannot prevent changes to a filesystem it can only cache and respond
to changes that do occur.

So things like sysfs_rename_dir and sysfs_move_dir routinely bypass
the VFS restriction against changes happening under mount points.  It
isn't a problem in practice because people don't mount on sysfs but the
problem is very much there today.

It looks to me like the right solution is very much to do the lazy
unmount we do for expired mounts, and purge everything that happens
and then we just don't have to worry about mounts causing a denial
of service attack and life gets simpler.

Rename is a bit trickier than unlink and rmdir in that we should allow
it on the underlying filesystem and only remove the directory if the
rename goes out of scope for the overlying mount.

Eric

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: unprivileged mounts git tree
  2008-09-14  3:06                                                 ` Eric W. Biederman
@ 2008-09-30 19:39                                                   ` Serge E. Hallyn
  2008-10-06 11:05                                                     ` Miklos Szeredi
  0 siblings, 1 reply; 39+ messages in thread
From: Serge E. Hallyn @ 2008-09-30 19:39 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Miklos Szeredi, akpm, hch, viro, linux-fsdevel, linux-kernel

Quoting Eric W. Biederman (ebiederm@xmission.com):
> "Serge E. Hallyn" <serue@us.ibm.com> writes:
> 
> >> > So in order for me as an unprivileged user to pin a dentry by mounting
> >> > over it, I have to have write permission to the dentry to begin with
> >> > as well as the dentry being under a user=hallyn mount.
> >> 
> >> That second condition is interesting requiring write permission of the
> >> dentry.  I thought we had obviated the need for that when we added
> >> ownership to the mounts themselves.  In this case at least it shouldn't
> >> it be write permission on the directory containing the dentry.
> >
> > Oh no, it seems I'm wrong, that's not a condition.  Just tested it.
> 
> Ok.  I couldn't find Mikolos's code when I looked quickly.
> What are the current rules?  It sounds like my original example could apply.

Rules according to permit_mount:

	if CAP_SYS_ADMIN, allowed
	if fs type is not marked safe for user mounts (and not bind), -EPERM
	if target is a link, -EPERM
	if target is labeled with new nosubmnt flag, -EPERM
	if target is not a user mount owned by current->uid, -EPERM
	if recursive bind mount, -EPERM (?)

> >> This seems to mess up things like revoke.
> >
> > Hey, do we have revoke now?  :)
> 
> Periodically someone works on revoke ;)
> 
> >> At a practical level it is a real annoyance, regardless of the security
> >> implications.
> >> 
> >> As a point of comparison plan9 does not have that restriction.
> >
> > Why doesn't it have that restriction?  
> 
> I haven't heard the reason.
> 
> > Does it always allow you to rm a mounted-over file?
> 
> Yes.
> 
> The implementation of mounts in plan9 is a bit different.  In
> plan9 mounts are looked up by mount namespace and qid (effectively
> the inode number).  So the semantics are slightly different.
> Requiring a new mount namespace if you want to see what a filesystem
> looks like without a mount on top of a given file.
> 
> It also looks like plan9 is likely to have a unmountable and unusable mount
> if you actually delete an file, from another mount namespace.
> 
> Linux actually has a very similar case where we can loose a mount if you rename
> the directory that contains it to be somewhere higher in the mount hierarchy
> than the location the mount was under.

Can you give an example?

> All of that said there is a very good practical justification for it all.
> In a filesystem where not all changes come through our local VFS the
> VFS cannot prevent changes to a filesystem it can only cache and respond
> to changes that do occur.
> 
> So things like sysfs_rename_dir and sysfs_move_dir routinely bypass
> the VFS restriction against changes happening under mount points.  It
> isn't a problem in practice because people don't mount on sysfs but the
> problem is very much there today.
> 
> It looks to me like the right solution is very much to do the lazy
> unmount we do for expired mounts, and purge everything that happens

You mean if we rm something, do the same thing we do for a lazy umount?

That sounds reasonble, if the implementation is doable.  And should
avoid any potential revoke issues for Pekka.

> and then we just don't have to worry about mounts causing a denial
> of service attack and life gets simpler.
> 
> Rename is a bit trickier than unlink and rmdir in that we should allow

The rename issue here being what you mentioned above about losing a
mount if we mv something to the wrong place?

> it on the underlying filesystem and only remove the directory if the
> rename goes out of scope for the overlying mount.
> 
> Eric

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: unprivileged mounts git tree
  2008-09-30 19:39                                                   ` Serge E. Hallyn
@ 2008-10-06 11:05                                                     ` Miklos Szeredi
  0 siblings, 0 replies; 39+ messages in thread
From: Miklos Szeredi @ 2008-10-06 11:05 UTC (permalink / raw)
  To: serue; +Cc: ebiederm, miklos, akpm, hch, viro, linux-fsdevel, linux-kernel

On Tue, 30 Sep 2008, Serge E. Hallyn wrote:
> Quoting Eric W. Biederman (ebiederm@xmission.com):

> > Ok.  I couldn't find Mikolos's code when I looked quickly.  What
> > are the current rules?  It sounds like my original example could
> > apply.
> 
> Rules according to permit_mount:
> 
> 	if CAP_SYS_ADMIN, allowed
> 	if fs type is not marked safe for user mounts (and not bind), -EPERM
> 	if target is a link, -EPERM
> 	if target is labeled with new nosubmnt flag, -EPERM
> 	if target is not a user mount owned by current->uid, -EPERM
> 	if recursive bind mount, -EPERM (?)

The rationale for the last one is that the user might not have access
to the submounts.  Recursive bind mounts can be implemented in
userspace if needed.

> > The implementation of mounts in plan9 is a bit different.  In
> > plan9 mounts are looked up by mount namespace and qid (effectively
> > the inode number).  So the semantics are slightly different.
> > Requiring a new mount namespace if you want to see what a
> > filesystem looks like without a mount on top of a given file.
> > 
> > It also looks like plan9 is likely to have a unmountable and
> > unusable mount if you actually delete an file, from another mount
> > namespace.
> > 
> > Linux actually has a very similar case where we can loose a mount
> > if you rename the directory that contains it to be somewhere
> > higher in the mount hierarchy than the location the mount was
> > under.
> 
> Can you give an example?

# mkdir -p /tmp/a/b/c
# mkdir /tmp/x
# mount --bind /tmp/a /tmp/x
# mount --bind /bin /tmp/x/b/c
# mv /tmp/a/b/c /tmp
mv: cannot move `/tmp/a/b/c' to `/tmp/c': Device or resource busy
# mv /tmp/a/b /tmp
# ls -al /tmp/x
total 64
drwxr-xr-x  2 root root  4096 Oct  6 12:55 .
drwxrwxrwt 95 root root 57344 Oct  6 12:55 ..
# umount /tmp/x
umount: /tmp/x: device is busy.

The mount under /tmp/x/b/c is now inaccessible and lost forever.  The
only way to get rid of it is to lazy umount /tmp/x itself.

What this means is that the current EBUSY restriction only prevents
the mountpoint disappearing in a subset of cases.

Miklos

^ permalink raw reply	[flat|nested] 39+ messages in thread

end of thread, other threads:[~2008-10-06 11:06 UTC | newest]

Thread overview: 39+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2008-05-07 12:05 unprivileged mounts git tree Miklos Szeredi
2008-08-07 22:27 ` Serge E. Hallyn
2008-08-08  0:07   ` Eric W. Biederman
2008-08-08  0:25     ` Serge E. Hallyn
2008-08-25 11:01       ` Miklos Szeredi
2008-08-27 15:36         ` Serge E. Hallyn
2008-08-27 15:55           ` Miklos Szeredi
2008-08-27 18:46             ` Serge E. Hallyn
2008-09-03 18:45               ` Miklos Szeredi
2008-09-03 21:54                 ` Serge E. Hallyn
2008-09-03 22:02                 ` Serge E. Hallyn
2008-09-03 22:25                   ` Miklos Szeredi
2008-09-03 22:43                     ` Serge E. Hallyn
2008-09-04  6:42                       ` Miklos Szeredi
2008-09-04 13:28                         ` Serge E. Hallyn
2008-09-04 14:06                           ` Miklos Szeredi
2008-09-04 15:40                             ` Miklos Szeredi
2008-09-04 16:17                               ` Serge E. Hallyn
2008-09-04 17:42                                 ` Miklos Szeredi
2008-09-04 17:48                                   ` Serge E. Hallyn
2008-09-04 18:03                                     ` Miklos Szeredi
2008-09-04 18:49                                       ` Serge E. Hallyn
2008-09-04 22:26                                         ` Miklos Szeredi
2008-09-04 23:32                                           ` Serge E. Hallyn
2008-09-05 15:31                               ` Serge E. Hallyn
2008-09-09 13:34                                 ` Miklos Szeredi
2008-09-11 10:37                                   ` Eric W. Biederman
2008-09-11 14:43                                     ` Miklos Szeredi
2008-09-11 15:20                                       ` Serge E. Hallyn
2008-09-11 15:44                                         ` Miklos Szeredi
2008-09-11 18:54                                         ` Eric W. Biederman
2008-09-12 22:08                                           ` Serge E. Hallyn
2008-09-13  3:12                                             ` Eric W. Biederman
2008-09-14  1:56                                               ` Serge E. Hallyn
2008-09-14  3:06                                                 ` Eric W. Biederman
2008-09-30 19:39                                                   ` Serge E. Hallyn
2008-10-06 11:05                                                     ` Miklos Szeredi
2008-09-11 19:04                                       ` Eric W. Biederman
2008-09-11 19:58                                         ` Eric W. Biederman

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).