* Bug: for mount namespaces inside a chroot, unshare works but nsenter doesn't
@ 2017-10-27 18:07 Ximin Luo
2017-11-03 13:33 ` Karel Zak
0 siblings, 1 reply; 6+ messages in thread
From: Ximin Luo @ 2017-10-27 18:07 UTC (permalink / raw)
To: util-linux
(Please keep me on CC, I'm not subscribed)
When unsharing persistent mount namespaces, unshare+nsenter does not seem to
work properly when run from inside a chroot session. However, unshare by itself
works.
As a workaround for the unshare+nsenter case, one can run `nsenter --mount=<ns>
chroot <real/path/to/chroot> command args`. The `--root` option to `nsenter`
sounds like it should work, but it does not - see below for details.
Is this a bug? I'm trying to write code to work regardless of whether it's run
inside a chroot, so it would be nice not to have to pass arguments to
`nsenter(1)` that are specific to chroots, like `chroot <real/path/to/chroot>`.
It's also a bit counterintuitive to have to re-enter the chroot again.
Also, these extra steps are not needed with `unshare(1)`, which works fine by
itself. It's solely re-entering the namespace that seems to be problematic.
I'm using util-linux 2.30.2-0.1 on Debian. I don't believe it's a problem
specific to Debian, because everything works when using `unshare(1)` by itself,
as stated.
(I haven't tried running this inside a chroot-inside-a-chroot.)
Details:
# Below is all run inside a "schroot" session, which is a Debian tool for making chroot use more convenient.
# I used the instructions here (https://wiki.debian.org/sbuild#Create_the_chroot) to create one.
## Preparation for the tests
# Enter the chroot
$ sudo schroot -c unstable-amd64-sbuild
# Set up a private-bind file to hold a handle to our new namespace, as documented in the man page of unshare(1)
(unstable-amd64-sbuild)root@localhost:/tmp# touch ns-mnt; mount --bind --make-private ns-mnt ns-mnt
# Set up our test script
(unstable-amd64-sbuild)root@localhost:/tmp# script='mount; ls /; ls -l /proc/$$/ns/mnt; mount -B /dev/null /etc/hosts; echo hosts:; cat /etc/hosts'
## Case 1: unshare(1) with no special options or commands, everything works as expected
(unstable-amd64-sbuild)root@localhost:/tmp# unshare --mount=ns-mnt sh -ec "$script"
unstable-amd64-sbuild on / type overlay (rw,relatime,lowerdir=/var/lib/schroot/union/underlay/<<SESSIONID>>,...)
proc on /proc type proc (rw,relatime)
sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime)
[.. etc. other mappings in my chroot ..]
unstable-amd64-sbuild on /tmp/ns-mnt type overlay (rw,relatime,lowerdir=/var/lib/schroot/union/underlay/<<SESSIONID>>,...)
bin boot build dev etc home lib lib64 media mnt opt proc root run sbin srv sys tmp usr var
lrwxrwxrwx 1 root root 0 Oct 27 17:35 /proc/31691/ns/mnt -> 'mnt:[4026532398]'
hosts:
[.. empty hosts (inside the namespace) ..]
# we are now back outside the namespace
# if we cat /etc/hosts (both inside and outside the chroot), we see the original
## now we try to re-enter the namespace.
## Case 2: nsenter(1) with no extra options or commands, doesn't work:
(unstable-amd64-sbuild)root@localhost:/tmp# nsenter --mount=ns-mnt sh -ec "$script"
[.. mappings for my host system, outside the chroot ..]
bin boot dev etc home initrd.img initrd.img.old lib lib32 lib64 libx32 lost+found media mnt opt proc root run sbin selinux srv sys tmp usr var vmlinuz vmlinuz.old
[.. aka the / on my host filesystem outside the chroot ..]
lrwxrwxrwx 1 root root 0 Oct 27 19:36 /proc/32434/ns/mnt -> 'mnt:[4026532398]'
[.. correct namespace ..]
hosts:
[.. empty hosts (inside the namespace) ..]
# if we cat /etc/hosts outside the namespace, it's non-empty inside the chroot but EMPTY outside the chroot.
# whoops, because we ran mount -B on the original non-chrooted / filesystem. findmnt says:
└─/etc/hosts udev[/null] devtmpfs rw,nosuid,relatime,size=8181852k,nr_inodes=2045463,mode=755
# we unmount it before proceeding
## Case 3: nsenter(1) with --root, partially works but not really:
(unstable-amd64-sbuild)root@localhost:/tmp# nsenter --root=/ --mount=ns-mnt sh -ec "$script"
[.. i.e. mount(1) gives empty output ..]
bin boot build dev etc home lib lib64 media mnt opt proc root run sbin srv sys tmp usr var
[.. at least the root is inside the chroot ..]
lrwxrwxrwx 1 root root 0 Oct 27 17:37 /proc/878/ns/mnt -> 'mnt:[4026532398]'
[.. correct namespace ..]
mount: /etc/hosts: wrong fs type, bad option, bad superblock on /dev/null, missing codepage or helper program, or other error.
[.. mount operations fail, but the namespace is correct ..]
[.. if you analyse this case a bit more, you find that /proc/$$/{mounts,mountinfo,mountstats} are all empty ..]
# exit code 32
# outside the namespace, /etc/hosts is still non-empty, both inside and outside the chroot
## Case 4: nsenter(1) with explicit chroot(1) call, everything works as expected, again:
(unstable-amd64-sbuild)root@localhost:/tmp# nsenter --mount=ns-mnt chroot /run/schroot/mount/<<SESSIONID>> sh -ec 'mount && ls /'
unstable-amd64-sbuild on / type overlay (rw,relatime,lowerdir=/var/lib/schroot/union/underlay/<<SESSIONID>>,...)
proc on /proc type proc (rw,relatime)
sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime)
[.. etc. other mappings in my chroot ..]
unstable-amd64-sbuild on /tmp/ns-mnt type overlay (rw,relatime,lowerdir=/var/lib/schroot/union/underlay/<<SESSIONID>>,...)
[.. great, we got our mounts back! ..]
bin boot build dev etc home lib lib64 media mnt opt proc root run sbin srv sys tmp usr var
lrwxrwxrwx 1 root root 0 Oct 27 17:39 /proc/2025/ns/mnt -> 'mnt:[4026532398]'
[.. correct namespace ..]
hosts:
[.. empty hosts, as desired ..]
# outside the namespace, /etc/hosts is still non-empty, both inside and outside the chroot
--
GPG: ed25519/56034877E1F87C35
GPG: rsa4096/1318EFAC5FBBDBCE
https://github.com/infinity0/pubkeys.git
* Re: Bug: for mount namespaces inside a chroot, unshare works but nsenter doesn't
2017-10-27 18:07 Bug: for mount namespaces inside a chroot, unshare works but nsenter doesn't Ximin Luo
@ 2017-11-03 13:33 ` Karel Zak
2017-11-09 22:54 ` Eric W. Biederman
0 siblings, 1 reply; 6+ messages in thread
From: Karel Zak @ 2017-11-03 13:33 UTC (permalink / raw)
To: Ximin Luo; +Cc: util-linux, Eric W. Biederman
On Fri, Oct 27, 2017 at 06:07:00PM +0000, Ximin Luo wrote:
> When unsharing persistent mount namespaces, unshare+nsenter does not seem to
> work properly when run from inside a chroot session. However, unshare by itself
> works.
It's not related to persistent namespaces, but to the way nsenter
uses chroot().
> As a workaround for the unshare+nsenter case, one can run `nsenter --mount=<ns>
> chroot <real/path/to/chroot> command args`. The `--root` option to `nsenter`
> sounds like it should work, but it does not - see below for details.
>
> Is this a bug?
It seems like an nsenter logic problem.
The nsenter command opens the root-dir and cwd file descriptors *before*
the setns() syscall, and then *after* the syscall it calls chroot(). The
final process is in the namespace, but not in the root directory.
open("/mnt/test/chroot/namespaces/mnt", O_RDONLY) = 3
open("/mnt/test/chroot", O_RDONLY) = 4
open("/mnt/test/chroot", O_RDONLY) = 5
setns(3, CLONE_NEWNS) = 0
close(3) = 0
fchdir(4) = 0
chroot(".") = 0
close(4) = 0
fchdir(5) = 0
close(5) = 0
execve("/bin/bash", ["-bash"], 0x7ffd2b5244d0 /* 31 vars */) = 0
The patch below fixes the issue. It just moves root-dir and cwd open
calls *after* the setns():
open("/mnt/test/chroot/namespaces/mnt", O_RDONLY) = 3
setns(3, CLONE_NEWNS) = 0
close(3) = 0
open("/mnt/test/chroot", O_RDONLY) = 3
open("/mnt/test/chroot", O_RDONLY) = 4
fchdir(4) = 0
chroot(".") = 0
close(4) = 0
fchdir(3) = 0
close(3) = 0
execve("/bin/bash", ["-bash"], 0x7fff1ff8eb60 /* 31 vars */) = 0
Unfortunately, I'm not sure if this is the right way in all cases.
Eric?
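To make the reordering concrete, here is a minimal C sketch of the syscall
sequence in the second trace. It is not the nsenter source: the function name
enter_mntns_then_chroot() and its two-path interface are made up for
illustration, and the separate cwd descriptor is dropped for brevity. The only
point it shows is that the root path is opened after setns(), so it is looked
up inside the target mount namespace.

```c
#define _GNU_SOURCE		/* for setns() and CLONE_NEWNS */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sched.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of util-linux: join the mount namespace
 * referenced by ns_path, then chroot into root_path.
 * Returns 0 on success, -1 on failure with errno set.
 */
int enter_mntns_then_chroot(const char *ns_path, const char *root_path)
{
	int ns_fd, root_fd, rc, saved_errno;

	assert(ns_path != NULL && root_path != NULL);

	ns_fd = open(ns_path, O_RDONLY);
	if (ns_fd < 0)
		return -1;

	rc = setns(ns_fd, CLONE_NEWNS);
	saved_errno = errno;		/* close() may overwrite errno */
	close(ns_fd);
	if (rc < 0) {
		errno = saved_errno;
		return -1;
	}

	/* Opened only now, so the path is resolved in the new namespace. */
	root_fd = open(root_path, O_RDONLY);
	if (root_fd < 0)
		return -1;

	rc = (fchdir(root_fd) == 0 && chroot(".") == 0) ? 0 : -1;
	saved_errno = errno;
	close(root_fd);
	errno = saved_errno;
	return rc;
}
```

A caller would normally follow a successful return with execve(). Both
setns() on a mount namespace and chroot() need privileges (typically root),
so an unprivileged run is expected to fail with -1.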
Examples:
*** I have a simple chroot directory:
ls -la /mnt/test/chroot
total 20
drwxr-xr-x 5 root root 4096 Nov 3 13:10 .
drwxr-xr-x. 8 root root 4096 Nov 2 15:36 ..
lrwxrwxrwx 1 root root 8 Nov 2 15:40 bin -> /usr/bin
lrwxrwxrwx 1 root root 8 Nov 2 15:40 lib -> /usr/lib
lrwxrwxrwx 1 root root 10 Nov 2 15:40 lib64 -> /usr/lib64
drwxr-xr-x 4 root root 4096 Nov 3 13:22 namespaces
dr-xr-xr-x 330 root root 0 Sep 26 22:17 proc
lrwxrwxrwx 1 root root 9 Nov 2 15:40 sbin -> /usr/sbin
drwxr-xr-x. 14 root root 4096 Aug 16 10:50 usr
where /usr is bind-mounted and /proc is mounted
# findmnt -oTARGET,SOURCE,FSTYPE,PROPAGATION --submounts /mnt/test/chroot
TARGET SOURCE FSTYPE PROPAGATION
/mnt/test/chroot /dev/sda4[/mnt/test/chroot] ext4 private
├─/mnt/test/chroot/usr /dev/sda4[/usr] ext4 shared
└─/mnt/test/chroot/proc proc proc private
let's enter the chroot and create a persistent mount namespace within it:
# chroot /mnt/test/chroot
# unshare --mount=namespaces/mnt
our mount table:
findmnt -oTARGET,SOURCE,FSTYPE,PROPAGATION
TARGET SOURCE FSTYPE PROPAGATION
/ /dev/sda4[/mnt/test/chroot] ext4 private
├─/usr /dev/sda4[/usr] ext4 private
└─/proc proc proc private
and our mount namespace:
# ls -la /proc/self/ns | grep mnt
lrwxrwxrwx 1 0 0 0 Nov 3 12:56 mnt -> mnt:[4026532457]
our pid:
# echo $$
14411
IMHO a good idea is to keep the shell alive in the chroot and use another
session to play with nsenter.
*** nsenter examples:
a) let's try it by PID, all works as expected:
# nsenter --target 14411 --mount --root --wd
# findmnt -oTARGET,SOURCE,FSTYPE,PROPAGATION
TARGET SOURCE FSTYPE PROPAGATION
/ /dev/sda4[/mnt/test/chroot] ext4 private
├─/usr /dev/sda4[/usr] ext4 private
└─/proc proc proc private
# ls -la /proc/self/ns | grep mnt
lrwxrwxrwx 1 0 0 0 Nov 3 13:02 mnt -> mnt:[4026532457]
Important note: in this case nsenter uses /proc/<target>/root for
chroot(), but the goal is to use a persistent namespace, where no <target>
is available.
b) let's try chroot() by path:
# nsenter --target 14411 --mount --root=/mnt/test/chroot --wd=/mnt/test/chroot
# findmnt -oTARGET,SOURCE,FSTYPE,PROPAGATION
failed, mount table is empty
c) let's try chroot by /proc paths:
# nsenter --target 14411 --mount --root=/mnt/test/chroot/proc/14411/root --wd=/mnt/test/chroot/proc/14411/cwd
# findmnt -oTARGET,SOURCE,FSTYPE,PROPAGATION
TARGET SOURCE FSTYPE PROPAGATION
/ /dev/sda4[/mnt/test/chroot] ext4 private
├─/usr /dev/sda4[/usr] ext4 private
└─/proc proc proc private
# ls -la /proc/self/ns | grep mnt
lrwxrwxrwx 1 0 0 0 Nov 3 13:09 mnt -> mnt:[4026532457]
it works!
Note that --target or --mount=<persistent> namespace does not change
anything here.
The nsenter with the patch:
# ./nsenter --mount=/mnt/test/chroot/namespaces/mnt --root=/mnt/test/chroot --wd=/mnt/test/chroot
# findmnt -oTARGET,SOURCE,FSTYPE,PROPAGATION
TARGET SOURCE FSTYPE PROPAGATION
/ /dev/sda4[/mnt/test/chroot] ext4 private
├─/usr /dev/sda4[/usr] ext4 private
└─/proc proc proc private
# ls -la /proc/self/ns | grep mnt
lrwxrwxrwx 1 0 0 0 Nov 3 13:11 mnt -> mnt:[4026532457]
all works as expected. The patch is below.
Karel
diff --git a/sys-utils/nsenter.c b/sys-utils/nsenter.c
index 9c452c1d1..464f9f98c 100644
--- a/sys-utils/nsenter.c
+++ b/sys-utils/nsenter.c
@@ -238,6 +238,7 @@ int main(int argc, char *argv[])
 	int do_fork = -1; /* unknown yet */
 	uid_t uid = 0;
 	gid_t gid = 0;
+	const char *rd_path = NULL, *wd_path = NULL;
 #ifdef HAVE_LIBSELINUX
 	bool selinux = 0;
 #endif
@@ -318,13 +319,13 @@ int main(int argc, char *argv[])
 			break;
 		case 'r':
 			if (optarg)
-				open_target_fd(&root_fd, "root", optarg);
+				rd_path = optarg;
 			else
 				do_rd = true;
 			break;
 		case 'w':
 			if (optarg)
-				open_target_fd(&wd_fd, "cwd", optarg);
+				wd_path = optarg;
 			else
 				do_wd = true;
 			break;
@@ -433,6 +434,11 @@ int main(int argc, char *argv[])
 		}
 	}
 
+	if (wd_path)
+		open_target_fd(&wd_fd, "cwd", wd_path);
+	if (rd_path)
+		open_target_fd(&root_fd, "root", rd_path);
+
 	/* Remember the current working directory if I'm not changing it */
 	if (root_fd >= 0 && wd_fd < 0) {
 		wd_fd = open(".", O_RDONLY);
--
Karel Zak <kzak@redhat.com>
http://karelzak.blogspot.com
* Re: Bug: for mount namespaces inside a chroot, unshare works but nsenter doesn't
2017-11-03 13:33 ` Karel Zak
@ 2017-11-09 22:54 ` Eric W. Biederman
2017-11-10 13:14 ` Karel Zak
0 siblings, 1 reply; 6+ messages in thread
From: Eric W. Biederman @ 2017-11-09 22:54 UTC (permalink / raw)
To: Karel Zak; +Cc: Ximin Luo, util-linux
Karel Zak <kzak@redhat.com> writes:
> On Fri, Oct 27, 2017 at 06:07:00PM +0000, Ximin Luo wrote:
>> When unsharing persistent mount namespaces, unshare+nsenter does not seem to
>> work properly when run from inside a chroot session. However, unshare by itself
>> works.
>
> It's not related to persistent namespaces, but to the way nsenter
> uses chroot().
At a practical level it is related to persistent namespaces as this
problem will come up nowhere else.
In the non-persistent case you can do:
nsenter --mount=/proc/<pid>/ns/mnt --root=/proc/<pid>/root
Which works because the root directory is in the mount namespace.
>> As a workaround for the unshare+nsenter case, one can run `nsenter --mount=<ns>
>> chroot <real/path/to/chroot> command args`. The `--root` option to `nsenter`
>> sounds like it should work, but it does not - see below for details.
>>
>> Is this a bug?
>
> It seems like nsenter logic problem.
>
> The nsenter command opens the root-dir and cwd file descriptors *before*
> the setns() syscall, and then *after* the syscall it calls chroot(). The
> final process is in the namespace, but not in the root directory.
Which is necessary for the opening of file descriptors to have a well
defined meaning.
> open("/mnt/test/chroot/namespaces/mnt", O_RDONLY) = 3
> open("/mnt/test/chroot", O_RDONLY) = 4
> open("/mnt/test/chroot", O_RDONLY) = 5
> setns(3, CLONE_NEWNS) = 0
> close(3) = 0
> fchdir(4) = 0
> chroot(".") = 0
> close(4) = 0
> fchdir(5) = 0
> close(5) = 0
> execve("/bin/bash", ["-bash"], 0x7ffd2b5244d0 /* 31 vars */) = 0
> The patch below fixes the issue. It just moves root-dir and cwd open
> calls *after* the setns():
>
> open("/mnt/test/chroot/namespaces/mnt", O_RDONLY) = 3
> setns(3, CLONE_NEWNS) = 0
> close(3) = 0
> open("/mnt/test/chroot", O_RDONLY) = 3
> open("/mnt/test/chroot", O_RDONLY) = 4
> fchdir(4) = 0
> chroot(".") = 0
> close(4) = 0
> fchdir(3) = 0
> close(3) = 0
> execve("/bin/bash", ["-bash"], 0x7fff1ff8eb60 /* 31 vars */) = 0
>
> Unfortunately, I'm not sure if this is the right way in all cases.
I believe this will break all except the case mentioned.
My personal recommendation is not to use chroot with persistent mount
namespaces. That just seems to keep unnecessary mounts around. Those
extra mounts will almost certainly be a problem later when you discover
you want to unmount one of those mounted filesystems you don't care
about but are chrooting over.
I think it would be quite reasonable to have an additional option to
open things in the new mount namespace, just before exec. I just don't
see how useful it would be.
A second possibility is to issue a warning if root is not a member
of the target mount namespace. That might even allow doing the right
thing automatically. It looks like the mnt_id is available from
/proc/<pid>/fdinfo/<fd#>. So it looks like it is possible with the
existing kernel interfaces (at least in theory).
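As a rough illustration of that idea, the C sketch below reads the mnt_id
field of an open descriptor from /proc/self/fdinfo/<fd> and checks whether
that ID appears as the first field of a line in mountinfo text (for example
the contents of /proc/self/mountinfo read after setns()). The function names
are hypothetical and nothing like this exists in nsenter; it also assumes a
kernel whose fdinfo exposes mnt_id.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Parse the "mnt_id:" field from the text of a /proc/<pid>/fdinfo/<fd>
 * file. Returns the mount ID, or -1 if the field is not present. */
int parse_mnt_id(const char *fdinfo_text)
{
	const char *line = fdinfo_text;
	int id;

	assert(fdinfo_text != NULL);
	while (line && *line) {
		if (sscanf(line, "mnt_id: %d", &id) == 1)
			return id;
		line = strchr(line, '\n');
		if (line)
			line++;		/* step past the newline */
	}
	return -1;
}

/* Look up the mount ID of one of our own open descriptors.
 * Returns -1 if fdinfo cannot be read or has no mnt_id field. */
int fd_mnt_id(int fd)
{
	char path[64], buf[1024];
	size_t n;
	FILE *f;

	snprintf(path, sizeof(path), "/proc/self/fdinfo/%d", fd);
	f = fopen(path, "r");
	if (!f)
		return -1;
	n = fread(buf, 1, sizeof(buf) - 1, f);
	fclose(f);
	buf[n] = '\0';
	return parse_mnt_id(buf);
}

/* Return 1 if mnt_id is the first field of some line in the given
 * mountinfo text, 0 otherwise. */
int mountinfo_has_id(const char *mountinfo_text, int mnt_id)
{
	const char *line = mountinfo_text;
	int id;

	assert(mountinfo_text != NULL);
	while (line && *line) {
		if (sscanf(line, "%d", &id) == 1 && id == mnt_id)
			return 1;
		line = strchr(line, '\n');
		if (line)
			line++;
	}
	return 0;
}
```

The intended use would be: open the root directory, remember fd_mnt_id() for
it, call setns(), read /proc/self/mountinfo, and warn if mountinfo_has_id()
returns 0 for the remembered ID.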
Ugh. It looks like you committed your change below to sys-utils by
accident.
Eric
> diff --git a/sys-utils/nsenter.c b/sys-utils/nsenter.c
> index 9c452c1d1..464f9f98c 100644
> --- a/sys-utils/nsenter.c
> +++ b/sys-utils/nsenter.c
> @@ -238,6 +238,7 @@ int main(int argc, char *argv[])
> int do_fork = -1; /* unknown yet */
> uid_t uid = 0;
> gid_t gid = 0;
> + const char *rd_path = NULL, *wd_path = NULL;
> #ifdef HAVE_LIBSELINUX
> bool selinux = 0;
> #endif
> @@ -318,13 +319,13 @@ int main(int argc, char *argv[])
> break;
> case 'r':
> if (optarg)
> - open_target_fd(&root_fd, "root", optarg);
> + rd_path = optarg;
> else
> do_rd = true;
> break;
> case 'w':
> if (optarg)
> - open_target_fd(&wd_fd, "cwd", optarg);
> + wd_path = optarg;
> else
> do_wd = true;
> break;
> @@ -433,6 +434,11 @@ int main(int argc, char *argv[])
> }
> }
>
> + if (wd_path)
> + open_target_fd(&wd_fd, "cwd", wd_path);
> + if (rd_path)
> + open_target_fd(&root_fd, "root", rd_path);
> +
> /* Remember the current working directory if I'm not changing it */
> if (root_fd >= 0 && wd_fd < 0) {
> wd_fd = open(".", O_RDONLY);
* Re: Bug: for mount namespaces inside a chroot, unshare works but nsenter doesn't
2017-11-09 22:54 ` Eric W. Biederman
@ 2017-11-10 13:14 ` Karel Zak
2017-11-10 14:22 ` Ximin Luo
0 siblings, 1 reply; 6+ messages in thread
From: Karel Zak @ 2017-11-10 13:14 UTC (permalink / raw)
To: Eric W. Biederman; +Cc: Ximin Luo, util-linux
On Thu, Nov 09, 2017 at 04:54:06PM -0600, Eric W. Biederman wrote:
> Karel Zak <kzak@redhat.com> writes:
>
> > Unfortunately, I'm not sure if this is the right way in all cases.
>
> I believe this will break all except the case mentioned.
I expected something like this...
> My personal recommendation is not to use chroot with persistent mount
> namespaces. That just seems to keep unnecessary mounts around. Those
> extra mounts will almost certainly be a problem later when you discover
> you want to unmount one of those mounted filesystems you don't care
> about but are chrooting over.
>
> I think it would be quite reasonable to have an additional option to
> open things in the new mount namespace, just before exec. I just don't
> see how useful it would be.
It would be a solution for this use case, but it would increase
complexity, and I'm not sure this use case is important enough.
Especially if all you need is to use the chroot command before nsenter.
I don't think nsenter has to be an all-in-one command. It's a very basic
tool.
> A second possibility is to issue a warning if root is not a member
> of the target mount namespace. That might even allow doing the right
> thing automatically. It looks like the mnt_id is available from
> /proc/<pid>/fdinfo/<fd#>. So it looks like it is possible with the
> existing kernel interfaces (at least in theory).
I'll think about it.
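For illustration, a minimal sketch of reading that mnt_id field from the
shell. The helper name mnt_id_from_fdinfo and the use of fd 3 are made up
for this example; the only existing interfaces relied on are the "mnt_id:"
line in /proc/<pid>/fdinfo/<fd#> and the mount ID in the first column of
/proc/<pid>/mountinfo. It only looks the ID up; it does not implement the
warning itself.

```shell
#!/bin/sh
# Hypothetical helper: print the mnt_id recorded in an fdinfo file,
# e.g. /proc/<pid>/fdinfo/<fd#>. Prints nothing if the field is absent.
mnt_id_from_fdinfo() {
    sed -n 's/^mnt_id:[[:space:]]*//p' "$1"
}

# Example: open the current root directory on fd 3 of this shell and
# look up which mount it belongs to.
exec 3</
fdinfo="/proc/$$/fdinfo/3"
if [ -r "$fdinfo" ]; then
    id=$(mnt_id_from_fdinfo "$fdinfo")
    echo "root fd is on mount ID: $id"
    # Column 1 of mountinfo is the mount ID, column 5 the mount point.
    # Assumption: no matching line may hint that the fd's mount is not
    # visible in this process's mount namespace.
    awk -v id="$id" '$1 == id { print "mount point:", $5 }' "/proc/$$/mountinfo"
fi
exec 3<&-
```

The same lookup could in principle be done against the target namespace's
mountinfo to decide whether a warning is warranted, but that part is not
shown here.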
> Ugh. It looks like you committed your change below to sys-utils by
> accident.
OMG...<censored>... fixed. Thanks!
Karel
--
Karel Zak <kzak@redhat.com>
http://karelzak.blogspot.com
* Re: Bug: for mount namespaces inside a chroot, unshare works but nsenter doesn't
2017-11-10 13:14 ` Karel Zak
@ 2017-11-10 14:22 ` Ximin Luo
2017-11-24 13:09 ` Ximin Luo
0 siblings, 1 reply; 6+ messages in thread
From: Ximin Luo @ 2017-11-10 14:22 UTC (permalink / raw)
To: Karel Zak, Eric W. Biederman; +Cc: util-linux
Karel Zak:
> [..]
>
>> My personal recommendation is not to use chroot with persistent mount
>> namespaces. That just seems to keep unnecessary mounts around. Those
>> extra mounts will almost certainly be a problem later when you discover
>> you want to unmount one of those mounted filesystems you don't care
>> about but are chrooting over.
>>
>> I think it would be quite reasonable to have an additional option to
>> open things in the new mount namespace, just before exec. I just don't
>> see how useful it would be.
>
> It would be a solution for this use-case, but it would increase
> complexity, and I'm not sure this use-case is important enough.
>
> Especially if all you need is to use the chroot command before nsenter.
> I don't think nsenter has to be an all-in-one command. It's a very basic
> tool.
>
My nsenter code may be run inside or outside a chroot. I have no control
over that in the general case; users decide whether they want to run it
inside a chroot or not.

The issue with using the chroot(1) command is that you must give it the
path to the chroot *from outside the chroot*. I don't know of a clean way
to figure this out from my code, which starts life running inside the
chroot and just wants to unshare part of the tree that it sees there.

An option to open root/wd in the new ns sounds like it would allow me (and
others) to write code that is chroot-independent. I'd very much appreciate
that.
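To make the problem concrete, here is a rough sketch of the wrapper that the
current workaround (`nsenter --mount=<ns> chroot <real/path/to/chroot>
command args`) forces on callers. The function name ns_run and the variables
CHROOT_REAL_PATH and NSENTER are hypothetical names invented for this
example. Note that it still relies on somebody supplying the outside path;
it does not discover it.

```shell
#!/bin/sh
# Run a command inside a persistent mount namespace.
# CHROOT_REAL_PATH (hypothetical): path to the chroot as seen from outside
#   it; leave unset when not running under a chroot.
# NSENTER (hypothetical): override for the nsenter binary, e.g. for dry runs.
ns_run() {
    ns=$1
    shift
    if [ -n "${CHROOT_REAL_PATH:-}" ]; then
        # Inside a chroot: re-enter the chroot after joining the namespace.
        "${NSENTER:-nsenter}" --mount="$ns" chroot "$CHROOT_REAL_PATH" "$@"
    else
        # Not in a chroot: plain nsenter is enough.
        "${NSENTER:-nsenter}" --mount="$ns" "$@"
    fi
}
```

Setting NSENTER=echo turns a call such as `ns_run ns-mnt ls /` into a dry run
that only prints the arguments nsenter would receive, which is handy for
checking the command line without root.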
X
--
GPG: ed25519/56034877E1F87C35
GPG: rsa4096/1318EFAC5FBBDBCE
https://github.com/infinity0/pubkeys.git
* Re: Bug: for mount namespaces inside a chroot, unshare works but nsenter doesn't
2017-11-10 14:22 ` Ximin Luo
@ 2017-11-24 13:09 ` Ximin Luo
0 siblings, 0 replies; 6+ messages in thread
From: Ximin Luo @ 2017-11-24 13:09 UTC (permalink / raw)
To: Karel Zak, Eric W. Biederman; +Cc: util-linux
Ximin Luo:
> Karel Zak:
>> [..]
>>
>>> My personal recommendation is not to use chroot with persistent mount
>>> namespaces. That just seems to keep unnecessary mounts around. Those
>>> extra mounts will almost certainly be a problem later when you discover
>>> you want to unmount one of those mounted filesystems you don't care
>>> about but are chrooting over.
>>>
>>> I think it would be quite reasonable to have an additional option to
>>> open things in the new mount namespace, just before exec. I just don't
>>> see how useful it would be.
>>
>> It would be a solution for this use-case, but it would increase
>> complexity, and I'm not sure this use-case is important enough.
>>
>> Especially if all you need is to use the chroot command before nsenter.
>> I don't think nsenter has to be an all-in-one command. It's a very basic
>> tool.
>>
>
> My nsenter code may be run inside or outside a chroot. I have no control
> over that in the general case; users decide whether they want to run it
> inside a chroot or not.
>
> The issue with using the chroot(1) command is that you must give it the
> path to the chroot *from outside the chroot*. I don't know of a clean way
> to figure this out from my code, which starts life running inside the
> chroot and just wants to unshare part of the tree that it sees there.
>
> An option to open root/wd in the new ns sounds like it would allow me (and
> others) to write code that is chroot-independent. I'd very much appreciate
> that.
>
I tried Karel's patch from earlier and it does not work the way I need it
to: with the patch, it is still required to pass the path to the chroot as
seen from the parent mount namespace's "real" root.

I guess the problem stems from the fact that the unshare process's root is
a child-namespace-specific copy of the root from the parent namespace. When
the process exits, the handle to this root is lost, and nsenter does not
have enough information to be able to pick it up again.

And opening the parent namespace's root (inside the chroot), if I
understood correctly, does not correspond to a valid file descriptor inside
the child namespace, and that's why this bug exists.

Perhaps the solution, then, is to offer a way for unshare(1) to save a
handle to the root inside the child namespace, so that nsenter(1) can later
be pointed to this root? i.e.
# all commands run inside the chroot
$ unshare --mount=ns-mnt --root=./root true
#                        ^^^^^^^^^^^^^
# this would effectively save (chroot)/proc/self/root to ./root so it's available after the process exits
$ nsenter --mount=ns-mnt --root=./root true
#                        ^^^^^^^^^^^^^
# nsenter can then chroot to this as normal
(The wd is less important to me, but I suppose something similar could be
done for that too.)

I am not sure whether the first step is possible with the current kernel,
however. But it seems to me that it *ought* to be possible to make chroots
and persistent mount namespaces play nicely together.
X
--
GPG: ed25519/56034877E1F87C35
GPG: rsa4096/1318EFAC5FBBDBCE
https://github.com/infinity0/pubkeys.git
Thread overview: 6+ messages
2017-10-27 18:07 Bug: for mount namespaces inside a chroot, unshare works but nsenter doesn't Ximin Luo
2017-11-03 13:33 ` Karel Zak
2017-11-09 22:54 ` Eric W. Biederman
2017-11-10 13:14 ` Karel Zak
2017-11-10 14:22 ` Ximin Luo
2017-11-24 13:09 ` Ximin Luo