* plan9 semantics on Linux - mount namespaces
@ 2018-02-13 22:12 Enrico Weigelt
       [not found] ` <0f058286-a432-379b-f559-f2fe713807ab-EcKl7qYKIbxeoWH0uzbU5w@public.gmane.org>
  2018-02-13 22:19 ` Enrico Weigelt
  0 siblings, 2 replies; 43+ messages in thread
From: Enrico Weigelt @ 2018-02-13 22:12 UTC (permalink / raw)
  To: linux-kernel

Hi folks,


I'm currently trying to implement plan9 semantics on Linux and
yet sorting out how to do the mount namespace handling.

On plan9, any unprivileged process can create its own namespace
and mount/bind at will, while on Linux this requires CAP_SYS_ADMIN.

What is the reason for not allowing arbitrary users to create their
own private mount namespace ? What could go wrong here ?

IMHO, we could allow mount/bind under the following conditions:

* the process is in a private mount namespace
* no suid-flag is honored (either force all mounts to nosuid or
   completely mask it out)
* only certain whitelisted filesystems allowed (eg. 9P and FUSE)

Maybe that all could be enabled by a new capability.
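
The three conditions above can be read as one predicate. The sketch below
is purely illustrative: mount_allowed and ALLOWED_FSTYPES are hypothetical
names, not an existing kernel interface, and the whitelist holds only the
two filesystems named in the list.

```python
# Hypothetical policy check mirroring the three proposed conditions.
# Nothing here is an existing kernel API; it only spells out the rule.

MS_NOSUID = 0x2  # value of MS_NOSUID in <sys/mount.h>

# Whitelist from the proposal (9P and FUSE), spelled as in /proc/filesystems.
ALLOWED_FSTYPES = frozenset({"9p", "fuse"})


def mount_allowed(fstype, flags, in_private_mount_ns):
    """Return (allowed, effective_flags) for an unprivileged mount request."""
    if not in_private_mount_ns:
        return False, flags
    if fstype not in ALLOWED_FSTYPES:
        return False, flags
    # Force nosuid instead of rejecting requests that did not ask for it.
    return True, flags | MS_NOSUID
```

Forcing the flag corresponds to the "force all mounts to nosuid" variant;
the "mask it out" variant would need a change in the exec path instead.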


any suggestions ?


--mtx

-- 
Enrico Weigelt, metux IT consult
Free software and Linux embedded engineering
info@metux.net -- +49-151-27565287

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: plan9 semantics on Linux - mount namespaces
  2018-02-13 22:12 plan9 semantics on Linux - mount namespaces Enrico Weigelt
       [not found] ` <0f058286-a432-379b-f559-f2fe713807ab-EcKl7qYKIbxeoWH0uzbU5w@public.gmane.org>
@ 2018-02-13 22:19 ` Enrico Weigelt
       [not found]   ` <5633d335-3926-d98f-d6d7-948b1e2a0b2c-EcKl7qYKIbxeoWH0uzbU5w@public.gmane.org>
  1 sibling, 1 reply; 43+ messages in thread
From: Enrico Weigelt @ 2018-02-13 22:19 UTC (permalink / raw)
  To: linux-kernel; +Cc: Linux Containers

On 13.02.2018 22:12, Enrico Weigelt wrote:

CC'ing containers@lists.linux-foundation.org

> Hi folks,
> 
> 
> I'm currently trying to implement plan9 semantics on Linux and
> yet sorting out how to do the mount namespace handling.
> 
> On plan9, any unprivileged process can create its own namespace
> and mount/bind at will, while on Linux this requires CAP_SYS_ADMIN.
> 
> What is the reason for not allowing arbitrary users to create their
> own private mount namespace ? What could go wrong here ?
> 
> IMHO, we could allow mount/bind under the following conditions:
> 
> * the process is in a private mount namespace
> * no suid-flag is honored (either force all mounts to nosuid or
>    completely mask it out)
> * only certain whitelisted filesystems allowed (eg. 9P and FUSE)
> 
> Maybe that all could be enabled by a new capability.
> 
> 
> any suggestions ?
> 
> 
> --mtx
> 


-- 
Enrico Weigelt, metux IT consult
Free software and Linux embedded engineering
info@metux.net -- +49-151-27565287

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: plan9 semantics on Linux - mount namespaces
@ 2018-02-13 22:27       ` Aleksa Sarai
  0 siblings, 0 replies; 43+ messages in thread
From: Aleksa Sarai @ 2018-02-13 22:27 UTC (permalink / raw)
  To: Enrico Weigelt; +Cc: linux-kernel, Linux Containers

On 2018-02-13, Enrico Weigelt <lkml@metux.net> wrote:
> On 13.02.2018 22:12, Enrico Weigelt wrote:
> > I'm currently trying to implement plan9 semantics on Linux and
> > yet sorting out how to do the mount namespace handling.
> > 
> > On plan9, any unprivileged process can create its own namespace
> > and mount/bind at will, while on Linux this requires CAP_SYS_ADMIN.
> > 
> > What is the reason for not allowing arbitrary users to create their
> > own private mount namespace ? What could go wrong here ?

You can do this by creating a new user namespace (CLONE_NEWUSER), which
then gives you the required permissions to create other namespaces
(CLONE_NEWNS). This is how "rootless containers" or unprivileged
containers operate.
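
A minimal sketch of that call sequence, using ctypes to reach unshare(2)
from Python. The function name is made up for illustration; the flag
values are the ones from <linux/sched.h>. Both flags can go into a single
call because the kernel creates the user namespace first and then grants
the caller privileges over the other new namespaces.

```python
import ctypes
import os

# Flag values from <linux/sched.h>.
CLONE_NEWNS = 0x00020000    # new mount namespace
CLONE_NEWUSER = 0x10000000  # new user namespace


def unshare_user_and_mount():
    """Move the calling process into a new user and a new mount namespace.

    Raises OSError if the kernel refuses the request.
    """
    libc = ctypes.CDLL(None, use_errno=True)
    if libc.unshare(CLONE_NEWUSER | CLONE_NEWNS) != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))
```

After a successful call the process holds a full capability set inside the
new user namespace, but its UID and GID stay unmapped until uid_map and
gid_map have been written.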

-- 
Aleksa Sarai
Senior Software Engineer (Containers)
SUSE Linux GmbH
<https://www.cyphar.com/>

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: plan9 semantics on Linux - mount namespaces
  2018-02-13 22:27       ` Aleksa Sarai
  (?)
  (?)
@ 2018-02-14  0:01       ` Enrico Weigelt
       [not found]         ` <39b08c53-3449-3164-c1b1-44ac587dd4ea-EcKl7qYKIbxeoWH0uzbU5w@public.gmane.org>
  -1 siblings, 1 reply; 43+ messages in thread
From: Enrico Weigelt @ 2018-02-14  0:01 UTC (permalink / raw)
  To: Aleksa Sarai; +Cc: linux-kernel, Linux Containers

On 13.02.2018 22:27, Aleksa Sarai wrote:

> You can do this by creating a new user namespace (CLONE_NEWUSER), which
> then gives you the required permissions to create other namespaces
> (CLONE_NEWNS). This is how "rootless containers" or unprivileged
> containers operate.

hmm, unshare -U doesn't work for me (even as root). But docker works,
so user namespaces should be working. Any idea what could be wrong ?


--mtx


-- 
Enrico Weigelt, metux IT consult
Free software and Linux embedded engineering
info@metux.net -- +49-151-27565287

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: plan9 semantics on Linux - mount namespaces
@ 2018-02-14  4:54             ` Aleksa Sarai
  0 siblings, 0 replies; 43+ messages in thread
From: Aleksa Sarai @ 2018-02-14  4:54 UTC (permalink / raw)
  To: Enrico Weigelt; +Cc: linux-kernel, Linux Containers

On 2018-02-14, Enrico Weigelt <lkml@metux.net> wrote:
> On 13.02.2018 22:27, Aleksa Sarai wrote:
> 
> > You can do this by creating a new user namespace (CLONE_NEWUSER), which
> > then gives you the required permissions to create other namespaces
> > (CLONE_NEWNS). This is how "rootless containers" or unprivileged
> > containers operate.
> 
> hmm, unshare -U doesn't work for me (even as root). But docker works,
> so user namespaces should be working. Any idea what could be wrong ?

It depends how old your kernel is and what distro you use. Arch Linux
disables user namespaces entirely, Debian requires that you set a sysctl
to enable unprivileged user namespaces, and RHEL requires you to set
both a sysctl and a kernel boot-flag. Also check how old your kernel is
(unprivileged user namespace support was added in 3.8).

Also Docker doesn't use user namespaces by default (you need to manually
enable it with --userns-remap, check the docs for more details). You
probably also want to be using "unshare -r" in your testing (as "unshare
-U" will leave you without mapped users).
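
The sysctl side of this can be checked by reading /proc/sys directly. The
sketch below is a rough aid, not a complete test: the two sysctl names are
assumptions in the sense that this thread does not spell them out
(user.max_user_namespaces is the mainline limit,
kernel.unprivileged_userns_clone exists only on kernels carrying the
Debian/Ubuntu patch), and the helper names are invented. The RHEL boot
flag is not covered.

```python
from pathlib import Path

# Sysctl files that can gate user namespace creation. The second one is
# only present on kernels with the Debian/Ubuntu patch (assumption, see text).
KNOBS = {
    "user.max_user_namespaces": "/proc/sys/user/max_user_namespaces",
    "kernel.unprivileged_userns_clone": "/proc/sys/kernel/unprivileged_userns_clone",
}


def read_knob(path):
    """Return the integer stored in a sysctl file, or None if unavailable."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None


def blockers(values):
    """List the knobs in `values` (name -> int or None) that are set to 0."""
    return [name for name, value in values.items() if value == 0]


def check_host():
    """Read every known knob on this machine and report the blocking ones."""
    return blockers({name: read_knob(path) for name, path in KNOBS.items()})
```

An empty result is necessary but not sufficient: the kernel also has to be
built with user namespace support (CONFIG_USER_NS).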

-- 
Aleksa Sarai
Senior Software Engineer (Containers)
SUSE Linux GmbH
<https://www.cyphar.com/>

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: plan9 semantics on Linux - mount namespaces
  2018-02-14  4:54             ` Aleksa Sarai
  (?)
  (?)
@ 2018-02-14 10:18             ` Enrico Weigelt
       [not found]               ` <9c097fd9-3035-d5be-a829-fc18e7734f18-EcKl7qYKIbxeoWH0uzbU5w@public.gmane.org>
  2018-02-14 10:24               ` Aleksa Sarai
  -1 siblings, 2 replies; 43+ messages in thread
From: Enrico Weigelt @ 2018-02-14 10:18 UTC (permalink / raw)
  To: Aleksa Sarai; +Cc: linux-kernel, Linux Containers

On 14.02.2018 04:54, Aleksa Sarai wrote:

> It depends how old your kernel is and what distro you use. Arch Linux
> disables user namespaces entirely, Debian requires that you set a sysctl
> to enable unprivileged user namespaces, and RHEL requires you to set
> both a sysctl and a kernel boot-flag. Also check how old your kernel is
> (unprivileged user namespace support was added in 3.8).

Just tried on a mainline kernel (4.15). Same problem:

root@alphabox:~ unshare -U -r
unshare: unshare(0x14000000): Invalid argument


root@alphabox:/proc/sys/user cat max_user_namespaces
5922
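
The hexadecimal argument in the error message can be decoded with the
CLONE_* values from <linux/sched.h>. A small sketch; decode_clone_flags is
a name made up here, and the table lists only the namespace flags.

```python
# Namespace-related clone/unshare flags from <linux/sched.h>.
CLONE_FLAGS = {
    0x00020000: "CLONE_NEWNS",
    0x02000000: "CLONE_NEWCGROUP",
    0x04000000: "CLONE_NEWUTS",
    0x08000000: "CLONE_NEWIPC",
    0x10000000: "CLONE_NEWUSER",
    0x20000000: "CLONE_NEWPID",
    0x40000000: "CLONE_NEWNET",
}


def decode_clone_flags(mask):
    """Return the names of the namespace flags set in `mask`, lowest bit first.

    Bits that are not in the table are appended as one hexadecimal string.
    """
    names = [name for bit, name in sorted(CLONE_FLAGS.items()) if mask & bit]
    leftover = mask & ~sum(CLONE_FLAGS)
    if leftover:
        names.append(hex(leftover))
    return names
```

For 0x14000000 this yields CLONE_NEWUTS and CLONE_NEWUSER. One documented
cause of EINVAL from unshare(2) is a kernel built without the CONFIG_*_NS
option that matches a requested flag.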


Am I missing something ?


--mtx

-- 
Enrico Weigelt, metux IT consult
Free software and Linux embedded engineering
info@metux.net -- +49-151-27565287

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: plan9 semantics on Linux - mount namespaces
  2018-02-14 10:18             ` Enrico Weigelt
       [not found]               ` <9c097fd9-3035-d5be-a829-fc18e7734f18-EcKl7qYKIbxeoWH0uzbU5w@public.gmane.org>
@ 2018-02-14 10:24               ` Aleksa Sarai
  2018-02-14 11:27                   ` Enrico Weigelt
  1 sibling, 1 reply; 43+ messages in thread
From: Aleksa Sarai @ 2018-02-14 10:24 UTC (permalink / raw)
  To: Enrico Weigelt; +Cc: linux-kernel, Linux Containers

On 2018-02-14, Enrico Weigelt <lkml@metux.net> wrote:
> On 14.02.2018 04:54, Aleksa Sarai wrote:
> 
> > It depends how old your kernel is and what distro you use. Arch Linux
> > disables user namespaces entirely, Debian requires that you set a sysctl
> > to enable unprivileged user namespaces, and RHEL requires you to set
> > both a sysctl and a kernel boot-flag. Also check how old your kernel is
> > (unprivileged user namespace support was added in 3.8).
>
> Just tried on a mainline kernel (4.15). Same problem:
> 
> root@alphabox:~ unshare -U -r
> unshare: unshare(0x14000000): Invalid argument
> root@alphabox:/proc/sys/user cat max_user_namespaces
> 5922

What distribution are you using and which release? Also, are you trying
to do this inside a Docker container or something similar (Docker has
seccomp filters that block CLONE_NEWUSER by default, for instance).

-- 
Aleksa Sarai
Senior Software Engineer (Containers)
SUSE Linux GmbH
<https://www.cyphar.com/>

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: plan9 semantics on Linux - mount namespaces
@ 2018-02-14 11:27                   ` Enrico Weigelt
  0 siblings, 0 replies; 43+ messages in thread
From: Enrico Weigelt @ 2018-02-14 11:27 UTC (permalink / raw)
  To: Aleksa Sarai; +Cc: linux-kernel, Linux Containers

On 14.02.2018 11:24, Aleksa Sarai wrote:

> What distribution are you using and which release? 

On a self-compiled system.

Forgot to enable namespaces in the kernel. Now it seems to work
as root, but not as an unprivileged user:


daemon@alphabox:~ unshare -r -U
unshare: can't open '/proc/self/setgroups': Permission denied
daemon@alphabox:~ unshare -f -r -U
unshare: can't open '/proc/self/setgroups': Permission denied


--mtx

-- 
Enrico Weigelt, metux IT consult
Free software and Linux embedded engineering
info@metux.net -- +49-151-27565287

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: plan9 semantics on Linux - mount namespaces
  2018-02-14 11:27                   ` Enrico Weigelt
  (?)
@ 2018-02-14 11:30                   ` Richard Weinberger
       [not found]                     ` <CAFLxGvzxLP_UTQbwEY99bQfyftWzZHwaOP+WrzJ8099EKtbVLg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  -1 siblings, 1 reply; 43+ messages in thread
From: Richard Weinberger @ 2018-02-14 11:30 UTC (permalink / raw)
  To: Enrico Weigelt; +Cc: Aleksa Sarai, Linux Containers, linux-kernel

On Wed, Feb 14, 2018 at 12:27 PM, Enrico Weigelt <lkml@metux.net> wrote:
> On 14.02.2018 11:24, Aleksa Sarai wrote:
>
>> What distribution are you using and which release?
>
>
> On a self-compiled system.
>
> Forgot to enable namespaces in the kernel. Now it seems to work
> as root, but not as an unprivileged user:
>
>
> daemon@alphabox:~ unshare -r -U
> unshare: can't open '/proc/self/setgroups': Permission denied
> daemon@alphabox:~ unshare -f -r -U
> unshare: can't open '/proc/self/setgroups': Permission denied
>

Please read http://man7.org/linux/man-pages/man7/user_namespaces.7.html
setgroups is a corner case and needs special care.
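
The special care is mostly about ordering. As user_namespaces(7)
describes it, an unprivileged process may write its gid_map only after
"deny" has been written to the setgroups file. The sketch below lists the
writes in that order; the function names are invented, and the mapping
(caller becomes root inside the namespace) is roughly what "unshare -r"
sets up. It does not explain the "Permission denied" on open() reported
above.

```python
def id_map_writes(outer_uid, outer_gid, pid="self"):
    """Return (path, data) pairs in the order they have to be written.

    Maps the given outer IDs to 0 inside the user namespace.
    """
    base = f"/proc/{pid}"
    return [
        # Has to come first: without it an unprivileged writer is not
        # allowed to fill in gid_map.
        (f"{base}/setgroups", "deny"),
        (f"{base}/gid_map", f"0 {outer_gid} 1"),
        (f"{base}/uid_map", f"0 {outer_uid} 1"),
    ]


def apply_id_map(outer_uid, outer_gid):
    """Write the mappings for the calling process, after it has unshared."""
    for path, data in id_map_writes(outer_uid, outer_gid):
        with open(path, "w") as handle:
            handle.write(data)
```

The outer IDs have to be looked up before the user namespace is created,
because afterwards getuid() and getgid() no longer return them.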

-- 
Thanks,
//richard

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: plan9 semantics on Linux - mount namespaces
@ 2018-02-14 12:38                         ` Enrico Weigelt
  0 siblings, 0 replies; 43+ messages in thread
From: Enrico Weigelt @ 2018-02-14 12:38 UTC (permalink / raw)
  To: Richard Weinberger; +Cc: Aleksa Sarai, Linux Containers, linux-kernel

On 14.02.2018 12:30, Richard Weinberger wrote:
> On Wed, Feb 14, 2018 at 12:27 PM, Enrico Weigelt <lkml@metux.net> wrote:
>> On 14.02.2018 11:24, Aleksa Sarai wrote:
>>
>>> What distribution are you using and which release?
>>
>>
>> On a self-compiled system.
>>
>> Forgot to enable namespaces in the kernel. Now it seems to work
>> as root, but not as an unprivileged user:
>>
>>
>> daemon@alphabox:~ unshare -r -U
>> unshare: can't open '/proc/self/setgroups': Permission denied
>> daemon@alphabox:~ unshare -f -r -U
>> unshare: can't open '/proc/self/setgroups': Permission denied
>>
> 
> Please read http://man7.org/linux/man-pages/man7/user_namespaces.7.html
> setgroups is a corner case and needs special care.

I'm still confused. Does the unshare program do something wrong here ?

Anyways, I doubt that user namespaces help solving my problem.

What I'd like to achieve is that processes can manipulate their private 
namespace at will and mount other filesystems (primarily 9p and fuse).

For that, I need to get rid of setuid (and per-file caps) for these
private namespaces.


--mtx

-- 
Enrico Weigelt, metux IT consult
Free software and Linux embedded engineering
info@metux.net -- +49-151-27565287

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: plan9 semantics on Linux - mount namespaces
  2018-02-14 12:38                         ` Enrico Weigelt
  (?)
@ 2018-02-14 12:53                         ` Richard Weinberger
  2018-02-14 14:03                             ` Enrico Weigelt
  -1 siblings, 1 reply; 43+ messages in thread
From: Richard Weinberger @ 2018-02-14 12:53 UTC (permalink / raw)
  To: Enrico Weigelt, Aleksa Sarai; +Cc: Linux Containers, linux-kernel

Enrico,

Am Mittwoch, 14. Februar 2018, 13:38:48 CET schrieb Enrico Weigelt:
> On 14.02.2018 12:30, Richard Weinberger wrote:
> > On Wed, Feb 14, 2018 at 12:27 PM, Enrico Weigelt <lkml@metux.net> wrote:
> >> On 14.02.2018 11:24, Aleksa Sarai wrote:
> >>> What distribution are you using and which release?
> >> 
> >> On a self-compiled system.
> >> 
> >> Forgot to enable namespaces in the kernel. Now it seems to work
> >> as root, but not as an unprivileged user:
> >> 
> >> 
> >> daemon@alphabox:~ unshare -r -U
> >> unshare: can't open '/proc/self/setgroups': Permission denied
> >> daemon@alphabox:~ unshare -f -r -U
> >> unshare: can't open '/proc/self/setgroups': Permission denied
> > 
> > Please read http://man7.org/linux/man-pages/man7/user_namespaces.7.html
> > setgroups is a corner case and needs special care.
> 
> I'm still confused. Does the unshare program do something wrong here ?

It does what you ask it for.
Also see the --setgroups switch.
AFAICT --setgroups=deny is the new default, then your command line should just 
work. Maybe your unshare tool is too old.

> Anyways, I doubt that user namespaces help solving my problem.
> 
> What I'd like to achieve is that processes can manipulate their private
> namespace at will and mount other filesystems (primarily 9p and fuse).
> 
> For that, I need to get rid of setuid (and per-file caps) for these
> private namespaces.

This is exactly why we have the user namespace.
In the user namespace you can create your own mount namespace and do (almost) 
whatever you want.
Please note that you cannot mount just any kind of filesystem; only
certain filesystem types may be mounted from inside a user namespace.
For FUSE, see https://lwn.net/Articles/684774/

Thanks,
//richard

-- 
sigma star gmbh - Eduard-Bodem-Gasse 6 - 6020 Innsbruck - Austria
ATU66964118 - FN 374287y

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: plan9 semantics on Linux - mount namespaces
  2018-02-14 12:53                         ` Richard Weinberger
@ 2018-02-14 14:03                             ` Enrico Weigelt
  0 siblings, 0 replies; 43+ messages in thread
From: Enrico Weigelt @ 2018-02-14 14:03 UTC (permalink / raw)
  To: Richard Weinberger, Aleksa Sarai
  Cc: Linux Containers, linux-kernel-u79uwXL29TY76Z2rM5mHXA

On 14.02.2018 13:53, Richard Weinberger wrote:

> It does what you ask it for.
> Also see the --setgroups switch.
> AFAICT --setgroups=deny is the new default, then your command line should just
> work. Maybe your unshare tool is too old.

Also doesn't help:

daemon@alphabox:~ unshare -U -r --setgroups=deny
unshare: can't open '/proc/self/setgroups': Permission denied

>> What I'd like to achieve is that processes can manipulate their private
>> namespace at will and mount other filesystems (primarily 9p and fuse).
>>
>> For that, I need to get rid of setuid (and per-file caps) for these
>> private namespaces.
>
> This is exactly why we have the user namespace.
> In the user namespace you can create your own mount namespace and do (almost)
> whatever you want.

What's the exact relation between user and mnt namespace ?
Why do I need an own user ns for private mnt ns ? (except for the suid
bit, which I wanna get rid of anyways).


--mtx

-- 
Enrico Weigelt, metux IT consult
Free software and Linux embedded engineering
info@metux.net -- +49-151-27565287

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: plan9 semantics on Linux - mount namespaces
  2018-02-14 14:03                             ` Enrico Weigelt
@ 2018-02-14 14:19                                 ` Richard Weinberger
  -1 siblings, 0 replies; 43+ messages in thread
From: Richard Weinberger @ 2018-02-14 14:19 UTC (permalink / raw)
  To: Enrico Weigelt; +Cc: Linux Containers, linux-kernel

On Wednesday, 14 February 2018, 15:03:55 CET, Enrico Weigelt wrote:
> On 14.02.2018 13:53, Richard Weinberger wrote:
> > It does what you ask it for.
> > Also see the --setgroups switch.
> > AFAICT --setgroups=deny is the new default, then your command line should just
> > work. Maybe your unshare tool is too old.
>
> Also doesn't help:
> 
> daemon@alphabox:~ unshare -U -r --setgroups=deny
> unshare: can't open '/proc/self/setgroups': Permission denied

Works here(tm).
Can you debug it? Maybe we miss something obvious.

> >> What I'd like to achieve is that processes can manipulate their private
> >> namespace at will and mount other filesystems (primarily 9p and fuse).
> >>
> >> For that, I need to get rid of setuid (and per-file caps) for these
> >> private namespaces.
> 
> > This is exactly why we have the user namespace.
> > In the user namespace you can create your own mount namespace and do
> > (almost) whatever you want.
> 
> What's the exact relation between user and mnt namespace ?
> Why do I need an own user ns for private mnt ns ? (except for the suid
> bit, which I wanna get rid of anyways).

mount related system calls are root-only. Therefore you need the user 
namespace to become a root in your own little world. :)
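
More precisely, mount(2) requires CAP_SYS_ADMIN, and a process that creates a
user namespace holds a full capability set inside it, including over a mount
namespace created together with it. A minimal sketch of that combination
follows, assuming a kernel that allows unprivileged user namespaces; the names
write_file and try_private_tmpfs are made up for illustration, and the work
happens in a forked child so the caller's own namespaces are left alone.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <sched.h>
#include <stdio.h>
#include <string.h>
#include <sys/mount.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Write a short string to a /proc file; 0 on success, -1 (errno set) on error. */
static int write_file(const char *path, const char *text)
{
	int fd = open(path, O_WRONLY);
	if (fd < 0)
		return -1;
	ssize_t n = write(fd, text, strlen(text));
	int saved = errno;
	close(fd);
	errno = saved;
	return n < 0 ? -1 : 0;
}

/* Hypothetical helper: in a forked child, create a user + mount namespace,
 * map the caller to root inside it and mount a tmpfs on `target`.
 * Returns 0 on success, the child's errno on failure, -1 if fork/wait failed. */
int try_private_tmpfs(const char *target)
{
	uid_t uid = geteuid();  /* IDs as seen in the parent namespace */
	gid_t gid = getegid();
	char map[64];
	int status;
	pid_t pid = fork();

	if (pid < 0)
		return -1;
	if (pid == 0) {
		if (unshare(CLONE_NEWUSER | CLONE_NEWNS) < 0)
			_exit(errno);
		/* setgroups must be denied before an unprivileged gid_map write. */
		if (write_file("/proc/self/setgroups", "deny") < 0)
			_exit(errno);
		snprintf(map, sizeof(map), "0 %u 1", (unsigned)uid);
		if (write_file("/proc/self/uid_map", map) < 0)
			_exit(errno);
		snprintf(map, sizeof(map), "0 %u 1", (unsigned)gid);
		if (write_file("/proc/self/gid_map", map) < 0)
			_exit(errno);
		/* CAP_SYS_ADMIN in the new user namespace now permits this mount. */
		if (mount("none", target, "tmpfs", MS_NOSUID | MS_NODEV, NULL) < 0)
			_exit(errno);
		_exit(0);
	}
	if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
		return -1;
	return WEXITSTATUS(status);
}
```

Calling try_private_tmpfs("/tmp") from an ordinary user should return 0 where
user namespaces are enabled, and an errno value such as EPERM where they are
restricted. The tmpfs is visible only inside the child's mount namespace and
disappears when the child exits.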

Thanks,
//richard

-- 
sigma star gmbh - Eduard-Bodem-Gasse 6 - 6020 Innsbruck - Austria
ATU66964118 - FN 374287y

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: plan9 semantics on Linux - mount namespaces
  2018-02-14 14:19                                 ` Richard Weinberger
  (?)
@ 2018-02-14 15:02                                 ` Enrico Weigelt
  -1 siblings, 0 replies; 43+ messages in thread
From: Enrico Weigelt @ 2018-02-14 15:02 UTC (permalink / raw)
  To: Richard Weinberger; +Cc: Linux Containers, linux-kernel

On 14.02.2018 15:19, Richard Weinberger wrote:

> Works here(tm).
> Can you debug it? Maybe we miss something obvious.

daemon@alphabox:~ strace unshare -U -r --setgroups=deny
execve("/bin/unshare", ["unshare", "-U", "-r", "--setgroups=deny"], 0x7ee51e0c /* 11 vars */) = 0
brk(NULL)                               = 0x58000
fcntl64(0, F_GETFD)                     = 0
fcntl64(1, F_GETFD)                     = 0
fcntl64(2, F_GETFD)                     = 0
access("/etc/suid-debug", F_OK)         = -1 ENOENT (No such file or directory)
uname({sysname="Linux", nodename="alphabox", ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x76f90000
access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
open("/lib/tls/v7l/neon/vfp/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat64("/lib/tls/v7l/neon/vfp", 0x7eae8710) = -1 ENOENT (No such file or directory)
open("/lib/tls/v7l/neon/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat64("/lib/tls/v7l/neon", 0x7eae8710) = -1 ENOENT (No such file or directory)
open("/lib/tls/v7l/vfp/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat64("/lib/tls/v7l/vfp", 0x7eae8710)  = -1 ENOENT (No such file or directory)
open("/lib/tls/v7l/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat64("/lib/tls/v7l", 0x7eae8710)      = -1 ENOENT (No such file or directory)
open("/lib/tls/neon/vfp/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat64("/lib/tls/neon/vfp", 0x7eae8710) = -1 ENOENT (No such file or directory)
open("/lib/tls/neon/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat64("/lib/tls/neon", 0x7eae8710)     = -1 ENOENT (No such file or directory)
open("/lib/tls/vfp/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat64("/lib/tls/vfp", 0x7eae8710)      = -1 ENOENT (No such file or directory)
open("/lib/tls/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat64("/lib/tls", 0x7eae8710)          = -1 ENOENT (No such file or directory)
open("/lib/v7l/neon/vfp/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat64("/lib/v7l/neon/vfp", 0x7eae8710) = -1 ENOENT (No such file or directory)
open("/lib/v7l/neon/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat64("/lib/v7l/neon", 0x7eae8710)     = -1 ENOENT (No such file or directory)
open("/lib/v7l/vfp/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat64("/lib/v7l/vfp", 0x7eae8710)      = -1 ENOENT (No such file or directory)
open("/lib/v7l/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat64("/lib/v7l", 0x7eae8710)          = -1 ENOENT (No such file or directory)
open("/lib/neon/vfp/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat64("/lib/neon/vfp", 0x7eae8710)     = -1 ENOENT (No such file or directory)
open("/lib/neon/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat64("/lib/neon", 0x7eae8710)         = -1 ENOENT (No such file or directory)
open("/lib/vfp/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat64("/lib/vfp", 0x7eae8710)          = -1 ENOENT (No such file or directory)
open("/lib/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0(\0\1\0\0\0Yi\1\0004\0\0\0"..., 512) = 512
fstat64(3, {st_mode=S_IFREG|0755, st_size=878136, ...}) = 0
mmap2(NULL, 947496, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x76e82000
mprotect(0x76f55000, 61440, PROT_NONE)  = 0
mmap2(0x76f64000, 12288, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0xd2000) = 0x76f64000
mmap2(0x76f67000, 9512, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x76f67000
close(3)                                = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x76f8f000
set_tls(0x76f8f4c0, 0x76f8fb98, 0x76f92050, 0x76f8f4c0, 0x76f92050) = 0
mprotect(0x76f64000, 8192, PROT_READ)   = 0
mprotect(0x76f91000, 4096, PROT_READ)   = 0
getuid32()                              = 1
stat64("/etc/busybox.conf", {st_mode=S_IFREG|0644, st_size=198, ...}) = 0
brk(NULL)                               = 0x58000
brk(0x79000)                            = 0x79000
open("/etc/busybox.conf", O_RDONLY|O_LARGEFILE) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=198, ...}) = 0
read(3, "[SUID]\n#lines starting with # ar"..., 1024) = 198
read(3, "", 1024)                       = 0
close(3)                                = 0
getgid32()                              = 1
setgid32(1)                             = 0
setuid32(1)                             = 0
geteuid32()                             = 1
getegid32()                             = 1
unshare(CLONE_NEWUTS|CLONE_NEWUSER)     = 0
open("/proc/self/setgroups", O_WRONLY|O_LARGEFILE) = 3
write(3, "deny", 4)                     = 4
close(3)                                = 0
open("/proc/self/uid_map", O_WRONLY|O_LARGEFILE) = 3
write(3, "1 0 1", 5)                    = -1 EPERM (Operation not permitted)
write(2, "unshare: write error: Operation "..., 46unshare: write error: Operation not permitted
) = 46
exit_group(1)                           = ?
+++ exited with 1 +++

Seems it fails to write the uid map.
Is the order of setgroups vs uid_map correct ?
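
For reading the failed write above: user_namespaces(7) states that a writer
without CAP_SETUID in the parent namespace may only write a single uid_map
line, and that line must map its own effective UID in the parent namespace;
anything else fails with EPERM. Writing "deny" to setgroups first is only a
precondition for gid_map. A simplified model of the uid_map rule, with a
hypothetical function name and not the kernel's actual code, could look like
this:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical check modelling the user_namespaces(7) rule for a writer that
 * lacks CAP_SETUID in the parent namespace. A map line has the fields
 * <ID inside ns> <ID outside ns> <length>; exactly one ID may be mapped, and
 * the outside (parent-namespace) ID must be the writer's effective UID. */
bool unprivileged_uid_map_ok(unsigned inside, unsigned outside,
                             unsigned length, unsigned writer_euid)
{
	(void)inside;  /* any ID inside the new namespace is acceptable */
	return length == 1 && outside == writer_euid;
}
```

With the values from the trace (effective UID 1, line "1 0 1"), the outside ID
is 0 rather than 1, so the model rejects it, which is consistent with the
EPERM; the line "0 1 1" would pass.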

>> What's the exact relation between user and mnt namespace ?
>> Why do I need an own user ns for private mnt ns ? (except for the suid
>> bit, which I wanna get rid of anyways).
> 
> mount related system calls are root-only. Therefore you need the user
> namespace to become a root in your own little world. :)

I'm looking for a way to do that w/o being root (or something similar).
Actually, I don't like to change the user namespace, as it would cause
a lot of trouble w/ the /dev/cap[hash|use] devices, which I'm using for
user switching (as said: I'm going to get rid of suid completely).

--mtx

-- 
Enrico Weigelt, metux IT consult
Free software and Linux embedded engineering
info@metux.net -- +49-151-27565287

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: plan9 semantics on Linux - mount namespaces
  2018-02-14 15:02                                 ` Enrico Weigelt
@ 2018-02-14 15:17                                       ` Richard Weinberger
  0 siblings, 0 replies; 43+ messages in thread
From: Richard Weinberger @ 2018-02-14 15:17 UTC (permalink / raw)
  To: Enrico Weigelt; +Cc: Linux Containers, linux-kernel

Enrico,

On Wednesday, 14 February 2018, 16:02:18 CET, Enrico Weigelt wrote:
> stat64("/etc/busybox.conf", {st_mode=S_IFREG|0644, st_size=198, ...}) = 0

busybox...

> brk(NULL)                               = 0x58000
> brk(0x79000)                            = 0x79000
> open("/etc/busybox.conf", O_RDONLY|O_LARGEFILE) = 3
> fstat64(3, {st_mode=S_IFREG|0644, st_size=198, ...}) = 0
> read(3, "[SUID]\n#lines starting with # ar"..., 1024) = 198
> read(3, "", 1024)                       = 0
> close(3)                                = 0
> getgid32()                              = 1
> setgid32(1)                             = 0
> setuid32(1)                             = 0
> geteuid32()                             = 1
> getegid32()                             = 1
> unshare(CLONE_NEWUTS|CLONE_NEWUSER)     = 0
> open("/proc/self/setgroups", O_WRONLY|O_LARGEFILE) = 3
> write(3, "deny", 4)                     = 4
> close(3)                                = 0
> open("/proc/self/uid_map", O_WRONLY|O_LARGEFILE) = 3
> write(3, "1 0 1", 5)                    = -1 EPERM (Operation not permitted)

This mapping looks broken.
Please report to busybox folks.

From taking a *very* quick look into busybox source, I suspect this should fix 
it:

diff --git a/util-linux/unshare.c b/util-linux/unshare.c
index 875e3f86e304..3f59cf4d27c2 100644
--- a/util-linux/unshare.c
+++ b/util-linux/unshare.c
@@ -350,9 +350,9 @@ int unshare_main(int argc UNUSED_PARAM, char **argv)
 		 * in that user namespace.
 		 */
 		xopen_xwrite_close(PATH_PROC_SETGROUPS, "deny");
-		sprintf(uidmap_buf, "%u 0 1", (unsigned)reuid);
+		sprintf(uidmap_buf, "0 %u 1", (unsigned)reuid);
 		xopen_xwrite_close(PATH_PROC_UIDMAP, uidmap_buf);
-		sprintf(uidmap_buf, "%u 0 1", (unsigned)regid);
+		sprintf(uidmap_buf, "0 %u 1", (unsigned)regid);
 		xopen_xwrite_close(PATH_PROC_GIDMAP, uidmap_buf);
 	} else
 	if (setgrp_str) {
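
For reference, each uid_map/gid_map line has the form
"<ID inside ns> <ID outside ns> <length>" according to user_namespaces(7),
which is why the two numbers have to swap places. A small sketch of building
such a line with a bounded buffer; format_id_map is a hypothetical helper and
not part of busybox:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not from busybox): build one uid_map/gid_map line.
 * Field order per user_namespaces(7): <ID inside ns> <ID outside ns> <length>.
 * Returns the number of characters written, or -1 if the buffer is too small. */
int format_id_map(char *buf, size_t size,
                  unsigned inside, unsigned outside, unsigned length)
{
	assert(buf != NULL && size > 0);
	int n = snprintf(buf, size, "%u %u %u", inside, outside, length);
	return (n < 0 || (size_t)n >= size) ? -1 : n;
}
```

For the UID 1 seen in the strace output, format_id_map(buf, sizeof buf, 0, 1, 1)
yields "0 1 1" (root inside, UID 1 outside), whereas the unpatched format
string produced "1 0 1".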

Thanks,
//richard

-- 
sigma star gmbh - Eduard-Bodem-Gasse 6 - 6020 Innsbruck - Austria
ATU66964118 - FN 374287y

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* Re: plan9 semantics on Linux - mount namespaces
  2018-02-14 15:17                                       ` Richard Weinberger
  (?)
@ 2018-02-14 17:21                                       ` Enrico Weigelt
  -1 siblings, 0 replies; 43+ messages in thread
From: Enrico Weigelt @ 2018-02-14 17:21 UTC (permalink / raw)
  To: Richard Weinberger; +Cc: Linux Containers, linux-kernel

On 14.02.2018 16:17, Richard Weinberger wrote:

> From taking a *very* quick look into busybox source, I suspect this should fix
> it:
> 
> diff --git a/util-linux/unshare.c b/util-linux/unshare.c
> index 875e3f86e304..3f59cf4d27c2 100644
> --- a/util-linux/unshare.c
> +++ b/util-linux/unshare.c
> @@ -350,9 +350,9 @@ int unshare_main(int argc UNUSED_PARAM, char **argv)
>   		 * in that user namespace.
>   		 */
>   		xopen_xwrite_close(PATH_PROC_SETGROUPS, "deny");
> -		sprintf(uidmap_buf, "%u 0 1", (unsigned)reuid);
> +		sprintf(uidmap_buf, "0 %u 1", (unsigned)reuid);
>   		xopen_xwrite_close(PATH_PROC_UIDMAP, uidmap_buf);
> -		sprintf(uidmap_buf, "%u 0 1", (unsigned)regid);
> +		sprintf(uidmap_buf, "0 %u 1", (unsigned)regid);
>   		xopen_xwrite_close(PATH_PROC_GIDMAP, uidmap_buf);
>   	} else
>   	if (setgrp_str) {
> 

hmm, now it works, but only when strace'ing it.
that's really strange.

But still I wonder whether user_ns really solves my problem, as I don't
want to create sandboxed users, but only private namespaces just like
on Plan9.


--mtx

-- 
Enrico Weigelt, metux IT consult
Free software and Linux embedded engineering
info@metux.net -- +49-151-27565287

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: plan9 semantics on Linux - mount namespaces
       [not found]                                         ` <e924b563-44c6-d678-a6cc-1181f4b820d5-EcKl7qYKIbxeoWH0uzbU5w@public.gmane.org>
@ 2018-02-14 17:50                                           ` Richard Weinberger
  2018-02-14 20:39                                             ` Aleksa Sarai
  1 sibling, 0 replies; 43+ messages in thread
From: Richard Weinberger @ 2018-02-14 17:50 UTC (permalink / raw)
  To: Enrico Weigelt; +Cc: Linux Containers, linux-kernel

On Wednesday, 14 February 2018, 18:21:12 CET, Enrico Weigelt wrote:
> On 14.02.2018 16:17, Richard Weinberger wrote:
> > From taking a *very* quick look into busybox source, I suspect this should fix
> > it:
> >
> > diff --git a/util-linux/unshare.c b/util-linux/unshare.c
> > index 875e3f86e304..3f59cf4d27c2 100644
> > --- a/util-linux/unshare.c
> > +++ b/util-linux/unshare.c
> > @@ -350,9 +350,9 @@ int unshare_main(int argc UNUSED_PARAM, char **argv)
> >  		 * in that user namespace.
> >  		 */
> >  		xopen_xwrite_close(PATH_PROC_SETGROUPS, "deny");
> > -		sprintf(uidmap_buf, "%u 0 1", (unsigned)reuid);
> > +		sprintf(uidmap_buf, "0 %u 1", (unsigned)reuid);
> >  		xopen_xwrite_close(PATH_PROC_UIDMAP, uidmap_buf);
> > -		sprintf(uidmap_buf, "%u 0 1", (unsigned)regid);
> > +		sprintf(uidmap_buf, "0 %u 1", (unsigned)regid);
> >  		xopen_xwrite_close(PATH_PROC_GIDMAP, uidmap_buf);
> >  	} else
> >  	if (setgrp_str) {
> 
> hmm, now it works, but only when strace'ing it.
> that's really strange.

On my box, with my patch applied, also busybox works now.
 
> But still I wonder whether user_ns really solves my problem, as I don't
> want to create sandboxed users, but only private namespaces just like
> on Plan9.

Well, I'd be surprised if that works out of the box.
Since you're posting on LKML I assumed you're hacking the kernel to support 
plan9-alike namespaces...

Thanks,
//richard

-- 
sigma star gmbh - Eduard-Bodem-Gasse 6 - 6020 Innsbruck - Austria
ATU66964118 - FN 374287y

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: plan9 semantics on Linux - mount namespaces
  2018-02-14 17:21                                       ` Enrico Weigelt
@ 2018-02-14 17:50                                         ` Richard Weinberger
  2018-02-14 18:01                                           ` Enrico Weigelt
  2018-02-14 18:01                                           ` Enrico Weigelt
       [not found]                                         ` <e924b563-44c6-d678-a6cc-1181f4b820d5-EcKl7qYKIbxeoWH0uzbU5w@public.gmane.org>
  1 sibling, 2 replies; 43+ messages in thread
From: Richard Weinberger @ 2018-02-14 17:50 UTC (permalink / raw)
  To: Enrico Weigelt; +Cc: Aleksa Sarai, Linux Containers, linux-kernel

Am Mittwoch, 14. Februar 2018, 18:21:12 CET schrieb Enrico Weigelt:
> On 14.02.2018 16:17, Richard Weinberger wrote:
> >  From taking a *very* quick look into busybox source, I suspect this
> >  should fix> 
> > it:
> > 
> > diff --git a/util-linux/unshare.c b/util-linux/unshare.c
> > index 875e3f86e304..3f59cf4d27c2 100644
> > --- a/util-linux/unshare.c
> > +++ b/util-linux/unshare.c
> > @@ -350,9 +350,9 @@ int unshare_main(int argc UNUSED_PARAM, char **argv)
> > 
> >   		 * in that user namespace.
> >   		 */
> >   		
> >   		xopen_xwrite_close(PATH_PROC_SETGROUPS, "deny");
> > 
> > -		sprintf(uidmap_buf, "%u 0 1", (unsigned)reuid);
> > +		sprintf(uidmap_buf, "0 %u 1", (unsigned)reuid);
> > 
> >   		xopen_xwrite_close(PATH_PROC_UIDMAP, uidmap_buf);
> > 
> > -		sprintf(uidmap_buf, "%u 0 1", (unsigned)regid);
> > +		sprintf(uidmap_buf, "0 %u 1", (unsigned)regid);
> > 
> >   		xopen_xwrite_close(PATH_PROC_GIDMAP, uidmap_buf);
> >   	
> >   	} else
> >   	if (setgrp_str) {
> 
> hmm, now it works, but only when strace'ing it.
> that's really strange.

On my box, with my patch applied, also busybox works now.
 
> But still I wonder whether user_ns really solves my problem, as I don't
> want to create sandboxed users, but only private namespaces just like
> on Plan9.

Well, I'd be surprised if that works out of the box.
Since you're posting on LKML I assumed you're hacking the kernel to support 
plan9-alike namespaces...

Thanks,
//richard

-- 
sigma star gmbh - Eduard-Bodem-Gasse 6 - 6020 Innsbruck - Austria
ATU66964118 - FN 374287y

* Re: plan9 semantics on Linux - mount namespaces
  2018-02-14 17:50                                         ` Richard Weinberger
  2018-02-14 18:01                                           ` Enrico Weigelt
@ 2018-02-14 18:01                                           ` Enrico Weigelt
       [not found]                                             ` <794929ce-0ecb-4c93-d51e-e94fcf749cfa-EcKl7qYKIbxeoWH0uzbU5w@public.gmane.org>
  1 sibling, 1 reply; 43+ messages in thread
From: Enrico Weigelt @ 2018-02-14 18:01 UTC (permalink / raw)
  To: Richard Weinberger; +Cc: Aleksa Sarai, Linux Containers, linux-kernel

On 14.02.2018 18:50, Richard Weinberger wrote:

>> hmm, now it works, but only when strace'ing it.
>> that's really strange.
> 
> On my box, with my patch applied, also busybox works now.

Hmm, without strace, too?
Which version are you using? I've got 1.27.2.

>> But still I wonder whether user_ns really solves my problem, as I don't
>> want to create sandboxed users, but only private namespaces just like
>> on Plan9.
> 
> Well, I'd be surprised if that works out of the box.
> Since you're posting on LKML I assumed you're hacking the kernel to support
> plan9-alike namespaces...

Yes, that's the plan :)


--mtx

-- 
Enrico Weigelt, metux IT consult
Free software and Linux embedded engineering
info@metux.net -- +49-151-27565287

* Re: plan9 semantics on Linux - mount namespaces
@ 2018-02-14 18:12                                                 ` Richard Weinberger
  0 siblings, 0 replies; 43+ messages in thread
From: Richard Weinberger @ 2018-02-14 18:12 UTC (permalink / raw)
  To: Enrico Weigelt; +Cc: Aleksa Sarai, Linux Containers, linux-kernel

On Wednesday, 14 February 2018, 19:01:52 CET, Enrico Weigelt wrote:
> On 14.02.2018 18:50, Richard Weinberger wrote:
> >> hmm, now it works, but only when strace'ing it.
> >> that's really strange.
> > 
> > On my box, with my patch applied, also busybox works now.
> 
> hmm, w/o strace, too ?

Sure.

> Which version are you using ? I've got 1.27.2

Both master and 1.12.x

BTW: Your issue is fixed/known. Just checked.

commit 1b510900e24459353922a1bc83c0b58bc8bafe1c
Author: Denys Vlasenko <vda.linux@googlemail.com>
Date:   Thu Nov 9 16:06:33 2017 +0100

    unshare: -r should map root to user, not the other way around
    
    Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>

Thanks,
//richard

-- 
sigma star gmbh - Eduard-Bodem-Gasse 6 - 6020 Innsbruck - Austria
ATU66964118 - FN 374287y

* Re: plan9 semantics on Linux - mount namespaces
  2018-02-14 18:12                                                 ` Richard Weinberger
  (?)
@ 2018-02-14 18:32                                                 ` Enrico Weigelt
  -1 siblings, 0 replies; 43+ messages in thread
From: Enrico Weigelt @ 2018-02-14 18:32 UTC (permalink / raw)
  To: Richard Weinberger; +Cc: Aleksa Sarai, Linux Containers, linux-kernel

On 14.02.2018 19:12, Richard Weinberger wrote:

> BTW: Your issue is fixed/known. Just checked.

Aha, in 1.28 ... I'll have to upgrade.


--mtx


-- 
Enrico Weigelt, metux IT consult
Free software and Linux embedded engineering
info@metux.net -- +49-151-27565287

* Re: plan9 semantics on Linux - mount namespaces
@ 2018-02-14 20:39                                             ` Aleksa Sarai
  0 siblings, 0 replies; 43+ messages in thread
From: Aleksa Sarai @ 2018-02-14 20:39 UTC (permalink / raw)
  To: Enrico Weigelt; +Cc: Richard Weinberger, Linux Containers, linux-kernel

On 2018-02-14, Enrico Weigelt <lkml@metux.net> wrote:
> But still I wonder whether user_ns really solves my problem, as I don't
> want to create sandboxed users, but only private namespaces just like
> on Plan9.

On Linux you need to have CAP_SYS_ADMIN (in the user_ns that owns your
current mnt_ns) in order to mount anything, and to create any namespaces
(in your current user_ns). So, in order to use the functionality of
mnt_ns (the ability to create mounts only a subset of processes can
see) as an unprivileged user, you need to use user_ns.
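
As a rough illustration of the above (a sketch only, not taken from any
existing tool): an unprivileged process can unshare a user namespace and a
mount namespace together, and then holds CAP_SYS_ADMIN over the new mount
namespace, which is enough for bind mounts that only it and its children
see. The function name private_bind is hypothetical, and the sketch assumes
the kernel permits unprivileged user namespaces.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <sys/mount.h>

/* Hypothetical helper: create a private mount namespace as an unprivileged
 * user and bind src onto dst inside it.
 * Returns 0 on success, -1 on failure (errno is set by the failing call). */
int private_bind(const char *src, const char *dst)
{
	/* The new user_ns owns the new mnt_ns, so the caller gains
	 * CAP_SYS_ADMIN with respect to that mount namespace. */
	if (unshare(CLONE_NEWUSER | CLONE_NEWNS) < 0)
		return -1;

	/* Mark every mount private so nothing propagates to other namespaces. */
	if (mount("none", "/", NULL, MS_REC | MS_PRIVATE, NULL) < 0)
		return -1;

	/* A bind mount needs no filesystem type, hence no FS_USERNS_MOUNT check. */
	if (mount(src, dst, NULL, MS_BIND, NULL) < 0)
		return -1;

	return 0;
}
```

A caller would typically exec a shell or the target program afterwards, so
that the child inherits the modified view of the filesystem.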

(Note there is an additional restriction, namely that a mnt_ns that was
set up in the non-root user_ns cannot mount any filesystems that do not
have the FS_USERNS_MOUNT option set. This is also for security, as
exposing the kernel filesystem parser to arbitrary data by unprivileged
users wasn't deemed to be a safe thing to do. The unprivileged FUSE work
that Richard linked to will likely be useful for pushing FS_USERNS_MOUNT
into more filesystems -- like 9p.)

-- 
Aleksa Sarai
Senior Software Engineer (Containers)
SUSE Linux GmbH
<https://www.cyphar.com/>


* Re: plan9 semantics on Linux - mount namespaces
@ 2018-02-16 18:26       ` Eric W. Biederman
  0 siblings, 0 replies; 43+ messages in thread
From: Eric W. Biederman @ 2018-02-16 18:26 UTC (permalink / raw)
  To: Enrico Weigelt; +Cc: linux-kernel, Linux Containers

Enrico Weigelt <lkml@metux.net> writes:

> On 13.02.2018 22:12, Enrico Weigelt wrote:
>
> CC @containers@lists.linux-foundation.org
>
>> Hi folks,
>>
>>
>> I'm currently trying to implement plan9 semantics on Linux and
>> yet sorting out how to do the mount namespace handling.
>>
>> On plan9, any unprivileged process can create its own namespace
>> and mount/bind at will, while on Linux this requires CAP_SYS_ADMIN.
>>
>> What is the reason for not allowing arbitrary users to create their
>> own private mount namespace ? What could go wrong here ?

Setuid-root executables could be fooled.  An easy case is fooling
/bin/su into reading a different copy of /etc/shadow, which would
allow switching to arbitrary users.

>> IMHO, we could allow mount/bind under the following conditions:
>>
>> * the process is in a private mount namespace
>> * no suid-flag is honored (either force all mounts to nosuid or
>>    completely mask it out)
>> * only certain whitelisted filesystems allowed (eg. 9P and FUSE)
>>
>> Maybe that all could be enabled by a new capability.
>>
>>
>> any suggestions ?

User namespaces prevent the contained processes from having any
permissions outside of the user namespace, while still allowing the
full Unix permission model inside the user namespace.
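
To make that concrete, here is a small sketch of how a uid_map entry
relates IDs inside a user namespace to IDs in the parent. It is plain
userspace C for illustration, not kernel code; the struct and function
names are made up. With the entry "0 1000 1", root inside the namespace is
just uid 1000 outside, and every other inside ID has no mapping at all.

```c
#include <stdio.h>

/* One uid_map/gid_map entry: "<inside> <outside> <count>". */
struct id_map_entry {
	unsigned inside;   /* first ID as seen inside the namespace */
	unsigned outside;  /* first ID as seen in the parent namespace */
	unsigned count;    /* length of the mapped range */
};

/* Parse one map line; returns 0 on success, -1 on malformed input. */
int parse_id_map_line(const char *line, struct id_map_entry *e)
{
	int n = sscanf(line, "%u %u %u", &e->inside, &e->outside, &e->count);
	return n == 3 ? 0 : -1;
}

/* Translate an in-namespace ID to the parent namespace.
 * Returns the parent ID, or -1 if the ID is not covered by the entry. */
long to_parent_id(const struct id_map_entry *e, unsigned id)
{
	if (id < e->inside || id - e->inside >= e->count)
		return -1;
	return (long)e->outside + (long)(id - e->inside);
}
```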

I am in the final stages of getting the changes in the vfs and in fuse
to allow unprivileged users to mount that filesystem.  plan9fs would
also be a candidate for that kind of treatment if it had a maintainer.

Eric

Thread overview: 43+ messages
2018-02-13 22:12 plan9 semantics on Linux - mount namespaces Enrico Weigelt
     [not found] ` <0f058286-a432-379b-f559-f2fe713807ab-EcKl7qYKIbxeoWH0uzbU5w@public.gmane.org>
2018-02-13 22:19   ` Enrico Weigelt
2018-02-13 22:19 ` Enrico Weigelt
     [not found]   ` <5633d335-3926-d98f-d6d7-948b1e2a0b2c-EcKl7qYKIbxeoWH0uzbU5w@public.gmane.org>
2018-02-13 22:27     ` Aleksa Sarai
2018-02-13 22:27       ` Aleksa Sarai
2018-02-14  0:01       ` Enrico Weigelt
2018-02-14  0:01       ` Enrico Weigelt
     [not found]         ` <39b08c53-3449-3164-c1b1-44ac587dd4ea-EcKl7qYKIbxeoWH0uzbU5w@public.gmane.org>
2018-02-14  4:54           ` Aleksa Sarai
2018-02-14  4:54             ` Aleksa Sarai
2018-02-14 10:18             ` Enrico Weigelt
2018-02-14 10:18             ` Enrico Weigelt
     [not found]               ` <9c097fd9-3035-d5be-a829-fc18e7734f18-EcKl7qYKIbxeoWH0uzbU5w@public.gmane.org>
2018-02-14 10:24                 ` Aleksa Sarai
2018-02-14 10:24               ` Aleksa Sarai
2018-02-14 11:27                 ` Enrico Weigelt
2018-02-14 11:27                   ` Enrico Weigelt
2018-02-14 11:30                   ` Richard Weinberger
     [not found]                     ` <CAFLxGvzxLP_UTQbwEY99bQfyftWzZHwaOP+WrzJ8099EKtbVLg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2018-02-14 12:38                       ` Enrico Weigelt
2018-02-14 12:38                         ` Enrico Weigelt
2018-02-14 12:53                         ` Richard Weinberger
2018-02-14 14:03                           ` Enrico Weigelt
2018-02-14 14:03                             ` Enrico Weigelt
     [not found]                             ` <a2a6f189-008e-38f2-afcb-b9393d8d440a-EcKl7qYKIbxeoWH0uzbU5w@public.gmane.org>
2018-02-14 14:19                               ` Richard Weinberger
2018-02-14 14:19                                 ` Richard Weinberger
2018-02-14 15:02                                 ` Enrico Weigelt
2018-02-14 15:02                                 ` Enrico Weigelt
     [not found]                                   ` <4f620eb7-c00c-487b-2e06-8cc4c97af38c-EcKl7qYKIbxeoWH0uzbU5w@public.gmane.org>
2018-02-14 15:17                                     ` Richard Weinberger
2018-02-14 15:17                                       ` Richard Weinberger
2018-02-14 17:21                                       ` Enrico Weigelt
2018-02-14 17:21                                       ` Enrico Weigelt
2018-02-14 17:50                                         ` Richard Weinberger
2018-02-14 18:01                                           ` Enrico Weigelt
2018-02-14 18:01                                           ` Enrico Weigelt
     [not found]                                             ` <794929ce-0ecb-4c93-d51e-e94fcf749cfa-EcKl7qYKIbxeoWH0uzbU5w@public.gmane.org>
2018-02-14 18:12                                               ` Richard Weinberger
2018-02-14 18:12                                                 ` Richard Weinberger
2018-02-14 18:32                                                 ` Enrico Weigelt
2018-02-14 18:32                                                 ` Enrico Weigelt
     [not found]                                         ` <e924b563-44c6-d678-a6cc-1181f4b820d5-EcKl7qYKIbxeoWH0uzbU5w@public.gmane.org>
2018-02-14 17:50                                           ` Richard Weinberger
2018-02-14 20:39                                           ` Aleksa Sarai
2018-02-14 20:39                                             ` Aleksa Sarai
     [not found]                         ` <4864d279-9a3f-eaf4-c297-ea34be604e41-EcKl7qYKIbxeoWH0uzbU5w@public.gmane.org>
2018-02-14 12:53                           ` Richard Weinberger
     [not found]                   ` <24ddea73-5c84-e098-caae-8a4c14834cbd-EcKl7qYKIbxeoWH0uzbU5w@public.gmane.org>
2018-02-14 11:30                     ` Richard Weinberger
2018-02-16 18:26     ` Eric W. Biederman
2018-02-16 18:26       ` Eric W. Biederman
