linux-man.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* For review: rewritten pivot_root(2) manual page
@ 2019-09-23 12:04 Michael Kerrisk (man-pages)
  2019-09-23 13:42 ` Philipp Wendler
  0 siblings, 1 reply; 9+ messages in thread
From: Michael Kerrisk (man-pages) @ 2019-09-23 12:04 UTC (permalink / raw)
  To: Eric W. Biederman, Serge E. Hallyn, Christian Brauner,
	Aleksa Sarai, Reid Priedhorsky, Andy Lutomirski, Yang Bo,
	Jakub Wilk, Joseph Sible, Al Viro, Philipp Wendler, werner
  Cc: mtk.manpages, linux-man, lkml, Containers, Stéphane Graber

Hello all,

I'm looking for review input for the pivot_root(2) manual
page, which I have substantially rewritten.

The original page was written 19 years ago, and has seen
little revision since that time. It contains a number of
errors. Even at the time it was first released, the 
manual page already had some inaccuracies, since it was
written before the final release of the system call, whose
implementation was subsequently changed, but the manual
page was not updated to reflect those changes.

The revised page is more than 2.5 times the size of the
previous page, and now includes an example program.
As well as fixing a number of errors and adding many
missing details, the page also adds a description of the
pivot_root(".", ".") technique.

I would be happy to receive error corrections and notes
on missing details that should be added to the page.

The rendered page is shown below. The page source can
be found in the Git repo at
https://git.kernel.org/pub/scm/docs/man-pages/man-pages.git

One area of the page that I'm still not really happy with
is the "vague" wording in the second paragraph and the note
in the third paragraph about the system call possibly
changing. These pieces survive (in somewhat modified form)
from the original page, which was written before the
system call was released, and it seems there was some
question about whether the system call might still change
its behavior with respect to the root directory and current
working directory of other processes. However, after 19
years, nothing has changed, and surely it will not in the
future, since that would constitute an ABI breakage.
I'm considering to rewrite these pieces to exactly
describe what the system call does (which I already
do in the third paragraph) and remove the "may or may not"
pieces in the second paragraph. I'd welcome comments
on making that change.

The rendered page is shown below. The page source is at
https://git.kernel.org/pub/scm/docs/man-pages/man-pages.git/tree/man2/pivot_root.2
in the Git repo at
https://git.kernel.org/pub/scm/docs/man-pages/man-pages.git

Thanks,

Michael

NAME
       pivot_root - change the root filesystem

SYNOPSIS
       int pivot_root(const char *new_root, const char *put_old);

       Note: There is no glibc wrapper for this system call; see NOTES.

DESCRIPTION
       pivot_root() changes the root filesystem in the mount namespace of
       the calling process.  More precisely, it moves the root filesystem
       to  the directory put_old and makes new_root the new root filesys‐
       tem.  The calling process must have the  CAP_SYS_ADMIN  capability
       in the user namespace that owns the caller's mount namespace.

       pivot_root()  may  or may not change the current root and the cur‐
       rent working directory of any processes or threads  that  use  the
       old  root  directory  and which are in the same mount namespace as
       the caller of pivot_root().  The  caller  of  pivot_root()  should
       ensure  that  processes  with root or current working directory at
       the old root operate correctly in either case.   An  easy  way  to
       ensure  this is to change their root and current working directory
       to  new_root  before  invoking  pivot_root().   Note   also   that
       pivot_root()  may  or may not affect the calling process's current
       working directory.  It is therefore recommended to call chdir("/")
       immediately after pivot_root().

       The  paragraph  above  is  intentionally vague because at the time
       when pivot_root() was first implemented, it  was  unclear  whether
       its  affect  on  other process's root and current working directo‐
       ries—and the caller's current working  directory—might  change  in
       the  future.   However, the behavior has remained consistent since
       this system call was first implemented: pivot_root()  changes  the
       root  directory  and the current working directory of each process
       or thread in the same mount namespace to new_root if they point to
       the  old  root  directory.   (See also NOTES.)  On the other hand,
       pivot_root() does not change the caller's current  working  direc‐
       tory  (unless it is on the old root directory), and thus it should
       be followed by a chdir("/") call.

       The following restrictions apply:

       -  new_root and put_old must be directories.

       -  new_root and put_old must not be on the same filesystem as  the
          current root.  In particular, new_root can't be "/" (but can be
          a bind mounted directory on the current root filesystem).

       -  put_old must be at or underneath new_root; that  is,  adding  a
          nonnegative  number  of /.. to the string pointed to by put_old
          must yield the same directory as new_root.

       -  new_root must be a mount point.  (If  it  is  not  otherwise  a
          mount  point,  it  suffices  to  bind  mount new_root on top of
          itself.)

       -  The propagation type of the parent mount of  new_root  and  the
          parent  mount  of  the  current  root  directory  must  not  be
          MS_SHARED; similarly, if put_old is an  existing  mount  point,
          its propagation type must not be MS_SHARED.  These restrictions
          ensure  that  pivot_root()  never  propagates  any  changes  to
          another mount namespace.

       -  The current root directory must be a mount point.

RETURN VALUE
       On success, zero is returned.  On error, -1 is returned, and errno
       is set appropriately.

ERRORS
       pivot_root() may fail with any of  the  same  errors  as  stat(2).
       Additionally, it may fail with the following errors:

       EBUSY  new_root  or  put_old  is  on  the current root filesystem.
              (This error covers the pathological case where new_root  is
              "/".)

       EINVAL new_root is not a mount point.

       EINVAL put_old is not underneath new_root.

       EINVAL The current root directory is not a mount point (because of
              an earlier chroot(2)).

       EINVAL The current root is on the rootfs (initial ramfs)  filesys‐
              tem; see NOTES.

       EINVAL Either  the mount point at new_root, or the parent mount of
              that mount point, has propagation type MS_SHARED.

       EINVAL put_old is a mount  point  and  has  the  propagation  type
              MS_SHARED.

       ENOTDIR
              new_root or put_old is not a directory.

       EPERM  The  calling  process does not have the CAP_SYS_ADMIN capa‐
              bility.

VERSIONS
       pivot_root() was introduced in Linux 2.3.41.

CONFORMING TO
       pivot_root() is Linux-specific and hence is not portable.

NOTES
       Glibc does not provide a wrapper for this  system  call;  call  it
       using syscall(2).

       A  command-line  interface  for  this  system  call is provided by
       pivot_root(8).

       pivot_root() allows the caller to switch to a new root  filesystem
       while  at  the  same time placing the old root mount at a location
       under new_root from where it can subsequently be unmounted.   (The
       fact  that  it  moves  all processes that have a root directory or
       current working directory on the old root filesystem  to  the  new
       root  filesystem  frees the old root filesystem of users, allowing
       it to be unmounted more easily.)

       A typical use of pivot_root() is during system startup,  when  the
       system  mounts a temporary root filesystem (e.g., an initrd), then
       mounts the real root filesystem, and eventually turns  the  latter
       into  the  current  root  of all relevant processes or threads.  A
       modern use is to set up a root filesystem during the creation of a
       container.

       The fact that pivot_root() modifies process root and current work‐
       ing directories in the manner noted in DESCRIPTION is necessary in
       order  to  prevent kernel threads from keeping the old root direc‐
       tory busy with their root and current working directory,  even  if
       they never access the filesystem in any way.

       new_root  and  put_old  may be the same directory.  In particular,
       the following sequence allows a pivot-root operation without need‐
       ing to create and remove a temporary directory:

           chdir(new_root);
           mount("", ".", MS_SLAVE | MS_REC, NULL);
                   /* Or: MS_PRIVATE | MS_REC */
           pivot_root(".", ".");
           umount2(".", MNT_DETACH);

       This  sequence  succeeds  because the pivot_root() call stacks the
       old root mount point (old_root) on top of the new root mount point
       at  /.   At  that  point, the calling process's root directory and
       current working directory  refer  to  the  new  root  mount  point
       (new_root).   During  the  subsequent umount() call, resolution of
       "."  starts with new_root and then moves up  the  list  of  mounts
       stacked at /, with the result that old_root is unmounted.

       The  rootfs  (initial ramfs) cannot be pivot_root()ed.  The recom‐
       mended method of changing the root filesystem in this case  is  to
       delete  everything  in rootfs, overmount rootfs with the new root,
       attach stdin/stdout/stderr to the new /dev/console, and  exec  the
       new   init(1).   Helper  programs  for  this  process  exist;  see
       switch_root(8).

EXAMPLE
       The program below demonstrates the use of  pivot_root()  inside  a
       mount namespace that is created using clone(2).  After pivoting to
       the root directory named in the program's first command-line argu‐
       ment,  the  child  created  by  clone(2) then executes the program
       named in the remaining command-line arguments.

       We demonstrate the program by creating a directory that will serve
       as  the  new root filesystem and placing a copy of the (statically
       linked) busybox(1) executable in that directory.

           $ mkdir /tmp/rootfs
           $ ls -id /tmp/rootfs    # Show inode number of new root directory
           319459 /tmp/rootfs
           $ cp $(which busybox) /tmp/rootfs
           $ PS1='bbsh$ ' sudo ./pivot_root_demo /tmp/rootfs /busybox sh
           bbsh$ PATH=/
           bbsh$ busybox ln busybox ln
           bbsh$ ln busybox echo
           bbsh$ ln busybox ls
           bbsh$ ls
           busybox  echo     ln       ls
           bbsh$ ls -id /          # Compare with inode number above
           319459 /
           bbsh$ echo 'hello world'
           hello world

   Program source

       /* pivot_root_demo.c */

       #define _GNU_SOURCE
       #include <sched.h>
       #include <stdio.h>
       #include <stdlib.h>
       #include <unistd.h>
       #include <sys/wait.h>
       #include <sys/syscall.h>
       #include <sys/mount.h>
       #include <sys/stat.h>
       #include <limits.h>

       #define errExit(msg)    do { perror(msg); exit(EXIT_FAILURE); \
                               } while (0)

       static int
       pivot_root(const char *new_root, const char *put_old)
       {
           return syscall(SYS_pivot_root, new_root, put_old);
       }

       #define STACK_SIZE (1024 * 1024)

       static int              /* Startup function for cloned child */
       child(void *arg)
       {
           char **args = arg;
           char *new_root = args[0];
           const char *put_old = "/oldrootfs";
           char path[PATH_MAX];

           /* Ensure that 'new_root' and its parent mount don't have
              shared propagation (which would cause pivot_root() to
              return an error), and prevent propagation of mount
              events to the initial mount namespace */

           if (mount(NULL, "/", NULL, MS_REC | MS_PRIVATE, NULL) == 1)
               errExit("mount-MS_PRIVATE");

           /* Ensure that 'new_root' is a mount point */

           if (mount(new_root, new_root, NULL, MS_BIND, NULL) == -1)
               errExit("mount-MS_BIND");

           /* Create directory to which old root will be pivoted */

           snprintf(path, sizeof(path), "%s/%s", new_root, put_old);
           if (mkdir(path, 0777) == -1)
               errExit("mkdir");

           /* And pivot the root filesystem */

           if (pivot_root(new_root, path) == -1)
               errExit("pivot_root");

           /* Switch the current working working directory to "/" */

           if (chdir("/") == -1)
               errExit("chdir");

           /* Unmount old root and remove mount point */

           if (umount2(put_old, MNT_DETACH) == -1)
               perror("umount2");
           if (rmdir(put_old) == -1)
               perror("rmdir");

           /* Execute the command specified in argv[1]... */

           execv(args[1], &args[1]);
           errExit("execv");
       }

       int
       main(int argc, char *argv[])
       {
           /* Create a child process in a new mount namespace */

           char *stack = malloc(STACK_SIZE);
           if (stack == NULL)
               errExit("malloc");

           if (clone(child, stack + STACK_SIZE,
                       CLONE_NEWNS | SIGCHLD, &argv[1]) == -1)
               errExit("clone");

           /* Parent falls through to here; wait for child */

           if (wait(NULL) == -1)
               errExit("wait");

           exit(EXIT_SUCCESS);
       }

SEE ALSO
       chdir(2), chroot(2), mount(2),  stat(2),  initrd(4),  mount_names‐
       paces(7), pivot_root(8), switch_root(8)


-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: For review: rewritten pivot_root(2) manual page
  2019-09-23 12:04 For review: rewritten pivot_root(2) manual page Michael Kerrisk (man-pages)
@ 2019-09-23 13:42 ` Philipp Wendler
  2019-10-09  7:41   ` Michael Kerrisk (man-pages)
  0 siblings, 1 reply; 9+ messages in thread
From: Philipp Wendler @ 2019-09-23 13:42 UTC (permalink / raw)
  To: Michael Kerrisk (man-pages),
	Eric W. Biederman, Serge E. Hallyn, Christian Brauner,
	Aleksa Sarai, Reid Priedhorsky, Andy Lutomirski, Yang Bo,
	Jakub Wilk, Joseph Sible, Al Viro, werner
  Cc: linux-man, lkml, Containers, Stéphane Graber

Hello Michael,

Am 23.09.19 um 14:04 schrieb Michael Kerrisk (man-pages):

> I'm considering to rewrite these pieces to exactly
> describe what the system call does (which I already
> do in the third paragraph) and remove the "may or may not"
> pieces in the second paragraph. I'd welcome comments
> on making that change.

I think that it would make the man page significantly easier to
understand if if the vague wording and the meta discussion about it are
removed.

> DESCRIPTION
[...]>        pivot_root()  changes  the
>        root  directory  and the current working directory of each process
>        or thread in the same mount namespace to new_root if they point to
>        the  old  root  directory.   (See also NOTES.)  On the other hand,
>        pivot_root() does not change the caller's current  working  direc‐
>        tory  (unless it is on the old root directory), and thus it should
>        be followed by a chdir("/") call.

There is a contradiction here with the NOTES (cf. below).

>        The following restrictions apply:
> 
>        -  new_root and put_old must be directories.
> 
>        -  new_root and put_old must not be on the same filesystem as  the
>           current root.  In particular, new_root can't be "/" (but can be
>           a bind mounted directory on the current root filesystem).

Wouldn't "must not be on the same mountpoint" or something similar be
more clear, at least for new_root? The note in parentheses indicates
that new_root can actually be on the same filesystem as the current
note. However, ...

>        -  put_old must be at or underneath new_root; that  is,  adding  a
>           nonnegative  number  of /.. to the string pointed to by put_old
>           must yield the same directory as new_root.
> 
>        -  new_root must be a mount point.  (If  it  is  not  otherwise  a
>           mount  point,  it  suffices  to  bind  mount new_root on top of
>           itself.)

... this item actually makes the above item almost redundant regarding
new_root (except for the "/") case. So one could replace this item with
something like this:

- new_root must be a mount point different from "/". (If it is not
  otherwise a mount point, it suffices  to bind mount new_root on top
  of itself.)

The above item would then only mention put_old (and maybe use clarified
wording on whether actually a different file system is necessary for
put_old or whether a different mount point is enough).

> NOTES
[...]
>        pivot_root() allows the caller to switch to a new root  filesystem
>        while  at  the  same time placing the old root mount at a location
>        under new_root from where it can subsequently be unmounted.   (The
>        fact  that  it  moves  all processes that have a root directory or
>        current working directory on the old root filesystem  to  the  new
>        root  filesystem  frees the old root filesystem of users, allowing
>        it to be unmounted more easily.)

Here is the contradiction:
The DESCRIPTION says that root and current working dir are only changed
"if they point to the old root directory". Here in the NOTES it says
that any root or working directories on the old root file system (i.e.,
even if somewhere below the root) are changed.

Which is correct?

If it indeed affects all processes with root and/or current working
directory below the old root, the text here does not clearly state what
the new root/current working directory of theses processes is.
E.g., if a process is at /foo and we pivot to /bar, will the process be
moved to /bar (i.e., at / after pivot_root), or will the kernel attempt
to move it to some location like /bar/foo? Because the latter might not
even exist, I suspect that everything is just moved to new_root, but
this could be stated explicitly by replacing "to the new root
filesystem" in the above paragraph with "to the new root directory"
(after checking whether this is true).

> EXAMPLE>        The program below demonstrates the use of  pivot_root()  inside  a
>        mount namespace that is created using clone(2).  After pivoting to
>        the root directory named in the program's first command-line argu‐
>        ment,  the  child  created  by  clone(2) then executes the program
>        named in the remaining command-line arguments.

Why not use the pivot_root(".", ".") in the example program?
It would make the example shorter, and also works if the process cannot
write to new_root (e..g., in a user namespace).

Regards,
Philipp

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: For review: rewritten pivot_root(2) manual page
  2019-09-23 13:42 ` Philipp Wendler
@ 2019-10-09  7:41   ` Michael Kerrisk (man-pages)
  2019-10-09  8:18     ` G. Branden Robinson
  2019-10-09  8:46     ` Eric W. Biederman
  0 siblings, 2 replies; 9+ messages in thread
From: Michael Kerrisk (man-pages) @ 2019-10-09  7:41 UTC (permalink / raw)
  To: Philipp Wendler, Eric W. Biederman, Serge E. Hallyn,
	Christian Brauner, Aleksa Sarai, Reid Priedhorsky,
	Andy Lutomirski, Yang Bo, Jakub Wilk, Joseph Sible, Al Viro,
	werner
  Cc: mtk.manpages, linux-man, lkml, Containers, Stéphane Graber

Hello Philipp,

My apologies that it has taken a while to reply. (I had been hoping
and waiting that a few more people might weigh in on this thread.)

On 9/23/19 3:42 PM, Philipp Wendler wrote:
> Hello Michael,
> 
> Am 23.09.19 um 14:04 schrieb Michael Kerrisk (man-pages):
> 
>> I'm considering to rewrite these pieces to exactly
>> describe what the system call does (which I already
>> do in the third paragraph) and remove the "may or may not"
>> pieces in the second paragraph. I'd welcome comments
>> on making that change.
> 
> I think that it would make the man page significantly easier to
> understand if if the vague wording and the meta discussion about it are
> removed.

It is my inclination to make this change, but I'd love to get more
feedback on this point.

>> DESCRIPTION
> [...]>        pivot_root()  changes  the
>>        root  directory  and the current working directory of each process
>>        or thread in the same mount namespace to new_root if they point to
>>        the  old  root  directory.   (See also NOTES.)  On the other hand,
>>        pivot_root() does not change the caller's current  working  direc‐
>>        tory  (unless it is on the old root directory), and thus it should
>>        be followed by a chdir("/") call.
> 
> There is a contradiction here with the NOTES (cf. below).

See below.

>>        The following restrictions apply:
>>
>>        -  new_root and put_old must be directories.
>>
>>        -  new_root and put_old must not be on the same filesystem as  the
>>           current root.  In particular, new_root can't be "/" (but can be
>>           a bind mounted directory on the current root filesystem).
> 
> Wouldn't "must not be on the same mountpoint" or something similar be
> more clear, at least for new_root? The note in parentheses indicates
> that new_root can actually be on the same filesystem as the current
> note. However, ...

For 'put_old', it really is "filesystem".

For 'new_root', see below.

>>        -  put_old must be at or underneath new_root; that  is,  adding  a
>>           nonnegative  number  of /.. to the string pointed to by put_old
>>           must yield the same directory as new_root.
>>
>>        -  new_root must be a mount point.  (If  it  is  not  otherwise  a
>>           mount  point,  it  suffices  to  bind  mount new_root on top of
>>           itself.)
> 
> ... this item actually makes the above item almost redundant regarding
> new_root (except for the "/") case. So one could replace this item with
> something like this:
> 
> - new_root must be a mount point different from "/". (If it is not
>   otherwise a mount point, it suffices  to bind mount new_root on top
>   of itself.)
> 
> The above item would then only mention put_old (and maybe use clarified
> wording on whether actually a different file system is necessary for
> put_old or whether a different mount point is enough).

Thanks. That's a good suggestion. I simplified the earlier bullet
point as you suggested, and changed the text here to say:

       -  new_root must be a mount point, but can't be "/".  If it is not
          otherwise  a mount point, it suffices to bind mount new_root on
          top of itself.  (new_root can be a bind  mounted  directory  on
          the current root filesystem.)

>> NOTES
> [...]
>>        pivot_root() allows the caller to switch to a new root  filesystem
>>        while  at  the  same time placing the old root mount at a location
>>        under new_root from where it can subsequently be unmounted.   (The
>>        fact  that  it  moves  all processes that have a root directory or
>>        current working directory on the old root filesystem  to  the  new
>>        root  filesystem  frees the old root filesystem of users, allowing
>>        it to be unmounted more easily.)
> 
> Here is the contradiction:
> The DESCRIPTION says that root and current working dir are only changed
> "if they point to the old root directory". Here in the NOTES it says
> that any root or working directories on the old root file system (i.e.,
> even if somewhere below the root) are changed.
> 
> Which is correct?

The first text is correct. I must have accidentally inserted
"filesystem" into the paragraph just here during a global edit.
Thanks for catching that.

> If it indeed affects all processes with root and/or current working
> directory below the old root, the text here does not clearly state what
> the new root/current working directory of theses processes is.
> E.g., if a process is at /foo and we pivot to /bar, will the process be
> moved to /bar (i.e., at / after pivot_root), or will the kernel attempt
> to move it to some location like /bar/foo? Because the latter might not
> even exist, I suspect that everything is just moved to new_root, but
> this could be stated explicitly by replacing "to the new root
> filesystem" in the above paragraph with "to the new root directory"
> (after checking whether this is true).

The text here now reads:

       pivot_root() allows the caller to switch to a new root  filesystem
       while  at  the  same time placing the old root mount at a location
       under new_root from where it can subsequently be unmounted.   (The
       fact  that  it  moves  all processes that have a root directory or
       current working directory on the old root  directory  to  the  new
       root  frees the old root directory of users, allowing the old root
       filesystem to be unmounted more easily.)


>> EXAMPLE>        The program below demonstrates the use of  pivot_root()  inside  a
>>        mount namespace that is created using clone(2).  After pivoting to
>>        the root directory named in the program's first command-line argu‐
>>        ment,  the  child  created  by  clone(2) then executes the program
>>        named in the remaining command-line arguments.
> 
> Why not use the pivot_root(".", ".") in the example program?
> It would make the example shorter, and also works if the process cannot
> write to new_root (e..g., in a user namespace).

I'm not sure. Some people have a bit of trouble to wrap their head
around the pivot_root(".", ".") idea. (I possibly am one of them.)
I'd be quite keen to hear other opinions on this. Unfortunately,
few people have commented on this manual page rewrite.

Thanks,

Michael


-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: For review: rewritten pivot_root(2) manual page
  2019-10-09  7:41   ` Michael Kerrisk (man-pages)
@ 2019-10-09  8:18     ` G. Branden Robinson
  2019-10-09  8:46     ` Eric W. Biederman
  1 sibling, 0 replies; 9+ messages in thread
From: G. Branden Robinson @ 2019-10-09  8:18 UTC (permalink / raw)
  To: Michael Kerrisk (man-pages)
  Cc: Philipp Wendler, Eric W. Biederman, Serge E. Hallyn,
	Christian Brauner, Aleksa Sarai, Reid Priedhorsky,
	Andy Lutomirski, Yang Bo, Jakub Wilk, Joseph Sible, Al Viro,
	werner, linux-man, lkml, Containers, Stéphane Graber

[-- Attachment #1: Type: text/plain, Size: 2511 bytes --]

At 2019-10-09T09:41:34+0200, Michael Kerrisk (man-pages) wrote:
> I'm not sure. Some people have a bit of trouble to wrap their head
> around the pivot_root(".", ".") idea. (I possibly am one of them.)
> I'd be quite keen to hear other opinions on this. Unfortunately,
> few people have commented on this manual page rewrite.

pivot_root(".", ".") seems as ineffable to me as chdir(".").

Meaning mostly, but not completely.

I have an external drive with a USB cable that's a little dodgy.  If it
moves around a bit the external drive gets auto-unmounted, and then
remounted in the same place, so I can experience the otherwise-baffling
shell experience of:

[disconnect/reconnect happens; the device is mounted again now]
$ ls .
Input/output error
$ cd .
$ ls .
[perfectly fine listing]

What's happened is that the meaning of "." has subtly changed in a way
that I suppose would never have been seen back in Version 7 Unix days.
Maybe I've been reading too much historical documentation (I'm currently
enjoying McKusick et al.'s _Design and Implementation of the 4.4BSD
Operations System_), but the way we describe and teach Unixlike systems
in operating systems classes and, more to the point, in our man pages I
think continues to be strongly informed by the invariants we learned in
our youth, and which are slowly but steadily being invalidated.

Concretely, I recommend having pivot_root(".", ".") in the man page as
an example, but perhaps as an alternate.  Because it is
counterintuitive (to some minds), it's worth spending some time to
explain it.  But I would offer it because it's a valid use of the system
call and because it makes sense to a domain expert (Eric Biedermann).

I would try to offer an explanation myself but I lack the understanding.
_If_ I'm following the discussion correctly, which I doubt, then what I
imagine to happen is that a sequence point occurs between the function
parameters, and "." changes its meaning as with my "cd ." example above.
I am probably reasoning by analogy, and perhaps not by a good one.

Also, it is okay if the language of this page continues to evolve over
time.  I appreciate your desire to get it "perfect" (or at least to some
local optimum) now since you're most of the way through an overhaul of
it, but it is not just the system that changes with time--the audience
does too.

Maybe in 5 or 10 years, the kids will be au fait with pivot_root(".",
".") and only some graybeards will continue to think of it as a bit
strange.

Regards,
Branden

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: For review: rewritten pivot_root(2) manual page
  2019-10-09  7:41   ` Michael Kerrisk (man-pages)
  2019-10-09  8:18     ` G. Branden Robinson
@ 2019-10-09  8:46     ` Eric W. Biederman
  2019-10-09 10:22       ` Michael Kerrisk (man-pages)
  1 sibling, 1 reply; 9+ messages in thread
From: Eric W. Biederman @ 2019-10-09  8:46 UTC (permalink / raw)
  To: Michael Kerrisk (man-pages)
  Cc: Philipp Wendler, Serge E. Hallyn, Christian Brauner,
	Aleksa Sarai, Reid Priedhorsky, Andy Lutomirski, Yang Bo,
	Jakub Wilk, Joseph Sible, Al Viro, werner, linux-man, lkml,
	Containers, Stéphane Graber

"Michael Kerrisk (man-pages)" <mtk.manpages@gmail.com> writes:

> Hello Philipp,
>
> My apologies that it has taken a while to reply. (I had been hoping
> and waiting that a few more people might weigh in on this thread.)
>
> On 9/23/19 3:42 PM, Philipp Wendler wrote:
>> Hello Michael,
>> 
>> Am 23.09.19 um 14:04 schrieb Michael Kerrisk (man-pages):
>> 
>>> I'm considering to rewrite these pieces to exactly
>>> describe what the system call does (which I already
>>> do in the third paragraph) and remove the "may or may not"
>>> pieces in the second paragraph. I'd welcome comments
>>> on making that change.
>> 
>> I think that it would make the man page significantly easier to
>> understand if if the vague wording and the meta discussion about it are
>> removed.
>
> It is my inclination to make this change, but I'd love to get more
> feedback on this point.
>
>>> DESCRIPTION
>> [...]>        pivot_root()  changes  the
>>>        root  directory  and the current working directory of each process
>>>        or thread in the same mount namespace to new_root if they point to
>>>        the  old  root  directory.   (See also NOTES.)  On the other hand,
>>>        pivot_root() does not change the caller's current  working  direc‐
>>>        tory  (unless it is on the old root directory), and thus it should
>>>        be followed by a chdir("/") call.
>> 
>> There is a contradiction here with the NOTES (cf. below).
>
> See below.
>
>>>        The following restrictions apply:
>>>
>>>        -  new_root and put_old must be directories.
>>>
>>>        -  new_root and put_old must not be on the same filesystem as  the
>>>           current root.  In particular, new_root can't be "/" (but can be
>>>           a bind mounted directory on the current root filesystem).
>> 
>> Wouldn't "must not be on the same mountpoint" or something similar be
>> more clear, at least for new_root? The note in parentheses indicates
>> that new_root can actually be on the same filesystem as the current
>> note. However, ...
>
> For 'put_old', it really is "filesystem".

If we are going to be pedantic "filesystem" is really the wrong concept
here.  The section about bind mount clarifies it, but I wonder if there
is a better term.

I think I would say: "new_root and put_old must not be on the same mount
as the current root."

I think using "mount" instead of "filesystem" keeps the concepts less
confusing.

As I am reading through this email and seeing text that is trying to be
precise and clear then hitting the term "filesystem" is a bit jarring.
pivot_root doesn't care a thing for file systems.  pivot_root only cares
about mounts.

And by a "mount" I mean the thing that you get when you create a bind
mount or you call mount normally.

Michael do you have man pages for the new mount api yet?

> For 'new_root', see below.
>
>>>        -  put_old must be at or underneath new_root; that  is,  adding  a
>>>           nonnegative  number  of /.. to the string pointed to by put_old
>>>           must yield the same directory as new_root.
>>>
>>>        -  new_root must be a mount point.  (If  it  is  not  otherwise  a
>>>           mount  point,  it  suffices  to  bind  mount new_root on top of
>>>           itself.)
>> 
>> ... this item actually makes the above item almost redundant regarding
>> new_root (except for the "/") case. So one could replace this item with
>> something like this:
>> 
>> - new_root must be a mount point different from "/". (If it is not
>>   otherwise a mount point, it suffices  to bind mount new_root on top
>>   of itself.)
>> 
>> The above item would then only mention put_old (and maybe use clarified
>> wording on whether actually a different file system is necessary for
>> put_old or whether a different mount point is enough).
>
> Thanks. That's a good suggestion. I simplified the earlier bullet
> point as you suggested, and changed the text here to say:
>
>        -  new_root must be a mount point, but can't be "/".  If it is not
>           otherwise  a mount point, it suffices to bind mount new_root on
>           top of itself.  (new_root can be a bind  mounted  directory  on
>           the current root filesystem.)

How about:
          - new_root must be the path to a mount, but can't be "/".  Any
          path that is not already a mount can be converted into one by
          bind mounting the path onto itself.
          
>>> NOTES
>> [...]
>>>        pivot_root() allows the caller to switch to a new root  filesystem
>>>        while  at  the  same time placing the old root mount at a location
>>>        under new_root from where it can subsequently be unmounted.   (The
>>>        fact  that  it  moves  all processes that have a root directory or
>>>        current working directory on the old root filesystem  to  the  new
>>>        root  filesystem  frees the old root filesystem of users, allowing
>>>        it to be unmounted more easily.)
>> 
>> Here is the contradiction:
>> The DESCRIPTION says that root and current working dir are only changed
>> "if they point to the old root directory". Here in the NOTES it says
>> that any root or working directories on the old root file system (i.e.,
>> even if somewhere below the root) are changed.
>> 
>> Which is correct?
>
> The first text is correct. I must have accidentally inserted
> "filesystem" into the paragraph just here during a global edit.
> Thanks for catching that.
>
>> If it indeed affects all processes with root and/or current working
>> directory below the old root, the text here does not clearly state what
>> the new root/current working directory of theses processes is.
>> E.g., if a process is at /foo and we pivot to /bar, will the process be
>> moved to /bar (i.e., at / after pivot_root), or will the kernel attempt
>> to move it to some location like /bar/foo? Because the latter might not
>> even exist, I suspect that everything is just moved to new_root, but
>> this could be stated explicitly by replacing "to the new root
>> filesystem" in the above paragraph with "to the new root directory"
>> (after checking whether this is true).
>
> The text here now reads:
>
>        pivot_root() allows the caller to switch to a new root  filesystem
>        while  at  the  same time placing the old root mount at a location
>        under new_root from where it can subsequently be unmounted.   (The
>        fact  that  it  moves  all processes that have a root directory or
>        current working directory on the old root  directory  to  the  new
>        root  frees the old root directory of users, allowing the old root
>        filesystem to be unmounted more easily.)


Please "mount" instead of "filesystem".

>>> EXAMPLE>        The program below demonstrates the use of  pivot_root()  inside  a
>>>        mount namespace that is created using clone(2).  After pivoting to
>>>        the root directory named in the program's first command-line argu‐
>>>        ment,  the  child  created  by  clone(2) then executes the program
>>>        named in the remaining command-line arguments.
>> 
>> Why not use the pivot_root(".", ".") in the example program?
>> It would make the example shorter, and also works if the process cannot
>> write to new_root (e..g., in a user namespace).
>
> I'm not sure. Some people have a bit of trouble to wrap their head
> around the pivot_root(".", ".") idea. (I possibly am one of them.)
> I'd be quite keen to hear other opinions on this. Unfortunately,
> few people have commented on this manual page rewrite.

I am happy as long as it is pivot_root(".", ".") is documented
somewhere.  There is real code that uses it so it is not going away.
Plus pivot_root(".", ".") is really what is desired in a lot of
situations where the caller of pivot_root is an intermediary and
does not control the new root filesystem.  At which point the only
path you can be guaranteed to exit on the new root filesystem is "/".

Eric

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: For review: rewritten pivot_root(2) manual page
  2019-10-09  8:46     ` Eric W. Biederman
@ 2019-10-09 10:22       ` Michael Kerrisk (man-pages)
  2019-10-09 16:00         ` Eric W. Biederman
  0 siblings, 1 reply; 9+ messages in thread
From: Michael Kerrisk (man-pages) @ 2019-10-09 10:22 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: mtk.manpages, Philipp Wendler, Serge E. Hallyn,
	Christian Brauner, Aleksa Sarai, Reid Priedhorsky,
	Andy Lutomirski, Yang Bo, Jakub Wilk, Joseph Sible, Al Viro,
	werner, linux-man, lkml, Containers, Stéphane Graber

Hello Eric,

Thank you. I was hoping you might jump in on this thread.

Please see below.

On 10/9/19 10:46 AM, Eric W. Biederman wrote:
> "Michael Kerrisk (man-pages)" <mtk.manpages@gmail.com> writes:
> 
>> Hello Philipp,
>>
>> My apologies that it has taken a while to reply. (I had been hoping
>> and waiting that a few more people might weigh in on this thread.)
>>
>> On 9/23/19 3:42 PM, Philipp Wendler wrote:
>>> Hello Michael,
>>>
>>> Am 23.09.19 um 14:04 schrieb Michael Kerrisk (man-pages):
>>>
>>>> I'm considering to rewrite these pieces to exactly
>>>> describe what the system call does (which I already
>>>> do in the third paragraph) and remove the "may or may not"
>>>> pieces in the second paragraph. I'd welcome comments
>>>> on making that change.

What did you think about my proposal above? To put it in context,
this was my initial comment in the mail:

[[
One area of the page that I'm still not really happy with
is the "vague" wording in the second paragraph and the note
in the third paragraph about the system call possibly
changing. These pieces survive (in somewhat modified form)
from the original page, which was written before the
system call was released, and it seems there was some
question about whether the system call might still change
its behavior with respect to the root directory and current
working directory of other processes. However, after 19
years, nothing has changed, and surely it will not in the
future, since that would constitute an ABI breakage.
I'm considering to rewrite these pieces to exactly
describe what the system call does (which I already
do in the third paragraph) and remove the "may or may not"
pieces in the second paragraph. I'd welcome comments
on making that change.
]]

And the second and third paragraphs of the manual page currently
read:

[[
       pivot_root()  may  or may not change the current root and the cur‐
       rent working directory of any processes or threads  that  use  the
       old  root  directory  and which are in the same mount namespace as
       the caller of pivot_root().  The  caller  of  pivot_root()  should
       ensure  that  processes  with root or current working directory at
       the old root operate correctly in either case.   An  easy  way  to
       ensure  this is to change their root and current working directory
       to  new_root  before  invoking  pivot_root().   Note   also   that
       pivot_root()  may  or may not affect the calling process's current
       working directory.  It is therefore recommended to call chdir("/")
       immediately after pivot_root().

       The  paragraph  above  is  intentionally vague because at the time
       when pivot_root() was first implemented, it  was  unclear  whether
       its  affect  on  other process's root and current working directo‐
       ries—and the caller's current working  directory—might  change  in
       the  future.   However, the behavior has remained consistent since
       this system call was first implemented: pivot_root()  changes  the
       root  directory  and the current working directory of each process
       or thread in the same mount namespace to new_root if they point to
       the  old  root  directory.   (See also NOTES.)  On the other hand,
       pivot_root() does not change the caller's current  working  direc‐
       tory  (unless it is on the old root directory), and thus it should
       be followed by a chdir("/") call.
]]

>>> I think that it would make the man page significantly easier to
>>> understand if if the vague wording and the meta discussion about it are
>>> removed.
>>
>> It is my inclination to make this change, but I'd love to get more
>> feedback on this point.
>>
>>>> DESCRIPTION
>>> [...]>        pivot_root()  changes  the
>>>>        root  directory  and the current working directory of each process
>>>>        or thread in the same mount namespace to new_root if they point to
>>>>        the  old  root  directory.   (See also NOTES.)  On the other hand,
>>>>        pivot_root() does not change the caller's current  working  direc‐
>>>>        tory  (unless it is on the old root directory), and thus it should
>>>>        be followed by a chdir("/") call.
>>>
>>> There is a contradiction here with the NOTES (cf. below).
>>
>> See below.
>>
>>>>        The following restrictions apply:
>>>>
>>>>        -  new_root and put_old must be directories.
>>>>
>>>>        -  new_root and put_old must not be on the same filesystem as  the
>>>>           current root.  In particular, new_root can't be "/" (but can be
>>>>           a bind mounted directory on the current root filesystem).
>>>
>>> Wouldn't "must not be on the same mountpoint" or something similar be
>>> more clear, at least for new_root? The note in parentheses indicates
>>> that new_root can actually be on the same filesystem as the current
>>> note. However, ...
>>
>> For 'put_old', it really is "filesystem".
> 
> If we are going to be pedantic "filesystem" is really the wrong concept
> here.  The section about bind mount clarifies it, but I wonder if there
> is a better term.

Thanks. My aim was to try to distinguish "mount point" from
"a mount somewhere inside the file system associated with a
certain mount point"--in other words, I wanted to make it clear
that 'put_old' (and 'new_root') could not be subdirectories
under the current root mount point (which is correct, right?).

Using "mount" does seem better. (My only concern is that some
people may take it to mean "the mount point", but perhaps that
just my own confusion.)

> I think I would say: "new_root and put_old must not be on the same mount
> as the current root."

I've made that change.

> I think using "mount" instead of "filesystem" keeps the concepts less
> confusing.
> 
> As I am reading through this email and seeing text that is trying to be
> precise and clear then hitting the term "filesystem" is a bit jarring.
> pivot_root doesn't care a thing for file systems.  pivot_root only cares
> about mounts.
> 
> And by a "mount" I mean the thing that you get when you create a bind
> mount or you call mount normally.

Thanks for the above comments.

Hmm, doI need to make similar changes in the initial paragraph of
the manual page as well? It currently reads:

       pivot_root() changes the root filesystem in the mount namespace of
       the calling process.  More precisely, it moves the root filesystem
       to  the directory put_old and makes new_root the new root filesys‐
       tem.  The calling process must have the  CAP_SYS_ADMIN  capability
       in the user namespace that owns the caller's mount namespace.

Furthermore the one line NAME of the man page reads:

       pivot_root - change the root filesystem

Is a change needed there also?

> Michael do you have man pages for the new mount api yet?

David Howells wrote pages in mid-2018, well before the syscalls got
merged in the kernel (in mid-2019). I did not merge them because
the code was not yet in the kernel, and lacking time, I never chased
David when the syscalls did get merged to see if the pages were still
up to date. I pinged David just now.

>> For 'new_root', see below.
>>
>>>>        -  put_old must be at or underneath new_root; that  is,  adding  a
>>>>           nonnegative  number  of /.. to the string pointed to by put_old
>>>>           must yield the same directory as new_root.
>>>>
>>>>        -  new_root must be a mount point.  (If  it  is  not  otherwise  a
>>>>           mount  point,  it  suffices  to  bind  mount new_root on top of
>>>>           itself.)
>>>
>>> ... this item actually makes the above item almost redundant regarding
>>> new_root (except for the "/") case. So one could replace this item with
>>> something like this:
>>>
>>> - new_root must be a mount point different from "/". (If it is not
>>>   otherwise a mount point, it suffices  to bind mount new_root on top
>>>   of itself.)
>>>
>>> The above item would then only mention put_old (and maybe use clarified
>>> wording on whether actually a different file system is necessary for
>>> put_old or whether a different mount point is enough).
>>
>> Thanks. That's a good suggestion. I simplified the earlier bullet
>> point as you suggested, and changed the text here to say:
>>
>>        -  new_root must be a mount point, but can't be "/".  If it is not
>>           otherwise  a mount point, it suffices to bind mount new_root on
>>           top of itself.  (new_root can be a bind  mounted  directory  on
>>           the current root filesystem.)
> 
> How about:
>           - new_root must be the path to a mount, but can't be "/".  Any

Surely here it must be "mount point" not "mount"? (See my discussion
above.)

>           path that is not already a mount can be converted into one by
>           bind mounting the path onto itself.
>>>> NOTES
>>> [...]
>>>>        pivot_root() allows the caller to switch to a new root  filesystem
>>>>        while  at  the  same time placing the old root mount at a location
>>>>        under new_root from where it can subsequently be unmounted.   (The
>>>>        fact  that  it  moves  all processes that have a root directory or
>>>>        current working directory on the old root filesystem  to  the  new
>>>>        root  filesystem  frees the old root filesystem of users, allowing
>>>>        it to be unmounted more easily.)
>>>
>>> Here is the contradiction:
>>> The DESCRIPTION says that root and current working dir are only changed
>>> "if they point to the old root directory". Here in the NOTES it says
>>> that any root or working directories on the old root file system (i.e.,
>>> even if somewhere below the root) are changed.
>>>
>>> Which is correct?
>>
>> The first text is correct. I must have accidentally inserted
>> "filesystem" into the paragraph just here during a global edit.
>> Thanks for catching that.
>>
>>> If it indeed affects all processes with root and/or current working
>>> directory below the old root, the text here does not clearly state what
>>> the new root/current working directory of theses processes is.
>>> E.g., if a process is at /foo and we pivot to /bar, will the process be
>>> moved to /bar (i.e., at / after pivot_root), or will the kernel attempt
>>> to move it to some location like /bar/foo? Because the latter might not
>>> even exist, I suspect that everything is just moved to new_root, but
>>> this could be stated explicitly by replacing "to the new root
>>> filesystem" in the above paragraph with "to the new root directory"
>>> (after checking whether this is true).
>>
>> The text here now reads:
>>
>>        pivot_root() allows the caller to switch to a new root  filesystem
>>        while  at  the  same time placing the old root mount at a location
>>        under new_root from where it can subsequently be unmounted.   (The
>>        fact  that  it  moves  all processes that have a root directory or
>>        current working directory on the old root  directory  to  the  new
>>        root  frees the old root directory of users, allowing the old root
>>        filesystem to be unmounted more easily.)
> 
> 
> Please "mount" instead of "filesystem".

Changed.


>>>> EXAMPLE>        The program below demonstrates the use of  pivot_root()  inside  a
>>>>        mount namespace that is created using clone(2).  After pivoting to
>>>>        the root directory named in the program's first command-line argu‐
>>>>        ment,  the  child  created  by  clone(2) then executes the program
>>>>        named in the remaining command-line arguments.
>>>
>>> Why not use the pivot_root(".", ".") in the example program?
>>> It would make the example shorter, and also works if the process cannot
>>> write to new_root (e..g., in a user namespace).
>>
>> I'm not sure. Some people have a bit of trouble to wrap their head
>> around the pivot_root(".", ".") idea. (I possibly am one of them.)
>> I'd be quite keen to hear other opinions on this. Unfortunately,
>> few people have commented on this manual page rewrite.
> 
> I am happy as long as it is pivot_root(".", ".") is documented
> somewhere.  There is real code that uses it so it is not going away.
> Plus pivot_root(".", ".") is really what is desired in a lot of
> situations where the caller of pivot_root is an intermediary and
> does not control the new root filesystem.  At which point the only
> path you can be guaranteed to exit on the new root filesystem is "/".

Good. There is documentation of pivot_root(".", ".") i the page!

Thanks,

Michael

-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: For review: rewritten pivot_root(2) manual page
  2019-10-09 10:22       ` Michael Kerrisk (man-pages)
@ 2019-10-09 16:00         ` Eric W. Biederman
  2019-10-09 21:01           ` Michael Kerrisk (man-pages)
  0 siblings, 1 reply; 9+ messages in thread
From: Eric W. Biederman @ 2019-10-09 16:00 UTC (permalink / raw)
  To: Michael Kerrisk (man-pages)
  Cc: Philipp Wendler, Serge E. Hallyn, Christian Brauner,
	Aleksa Sarai, Reid Priedhorsky, Andy Lutomirski, Yang Bo,
	Jakub Wilk, Joseph Sible, Al Viro, werner, linux-man, lkml,
	Containers, Stéphane Graber

"Michael Kerrisk (man-pages)" <mtk.manpages@gmail.com> writes:

> Hello Eric,
>
> Thank you. I was hoping you might jump in on this thread.
>
> Please see below.
>
> On 10/9/19 10:46 AM, Eric W. Biederman wrote:
>> "Michael Kerrisk (man-pages)" <mtk.manpages@gmail.com> writes:
>> 
>>> Hello Philipp,
>>>
>>> My apologies that it has taken a while to reply. (I had been hoping
>>> and waiting that a few more people might weigh in on this thread.)
>>>
>>> On 9/23/19 3:42 PM, Philipp Wendler wrote:
>>>> Hello Michael,
>>>>
>>>> Am 23.09.19 um 14:04 schrieb Michael Kerrisk (man-pages):
>>>>
>>>>> I'm considering to rewrite these pieces to exactly
>>>>> describe what the system call does (which I already
>>>>> do in the third paragraph) and remove the "may or may not"
>>>>> pieces in the second paragraph. I'd welcome comments
>>>>> on making that change.
>
> What did you think about my proposal above? To put it in context,
> this was my initial comment in the mail:
>
> [[
> One area of the page that I'm still not really happy with
> is the "vague" wording in the second paragraph and the note
> in the third paragraph about the system call possibly
> changing. These pieces survive (in somewhat modified form)
> from the original page, which was written before the
> system call was released, and it seems there was some
> question about whether the system call might still change
> its behavior with respect to the root directory and current
> working directory of other processes. However, after 19
> years, nothing has changed, and surely it will not in the
> future, since that would constitute an ABI breakage.
> I'm considering to rewrite these pieces to exactly
> describe what the system call does (which I already
> do in the third paragraph) and remove the "may or may not"
> pieces in the second paragraph. I'd welcome comments
> on making that change.
> ]]
>
> And the second and third paragraphs of the manual page currently
> read:
>
> [[
>        pivot_root()  may  or may not change the current root and the cur‐
>        rent working directory of any processes or threads  that  use  the
>        old  root  directory  and which are in the same mount namespace as
>        the caller of pivot_root().  The  caller  of  pivot_root()  should
>        ensure  that  processes  with root or current working directory at
>        the old root operate correctly in either case.   An  easy  way  to
>        ensure  this is to change their root and current working directory
>        to  new_root  before  invoking  pivot_root().   Note   also   that
>        pivot_root()  may  or may not affect the calling process's current
>        working directory.  It is therefore recommended to call chdir("/")
>        immediately after pivot_root().
>
>        The  paragraph  above  is  intentionally vague because at the time
>        when pivot_root() was first implemented, it  was  unclear  whether
>        its  affect  on  other process's root and current working directo‐
>        ries—and the caller's current working  directory—might  change  in
>        the  future.   However, the behavior has remained consistent since
>        this system call was first implemented: pivot_root()  changes  the
>        root  directory  and the current working directory of each process
>        or thread in the same mount namespace to new_root if they point to
>        the  old  root  directory.   (See also NOTES.)  On the other hand,
>        pivot_root() does not change the caller's current  working  direc‐
>        tory  (unless it is on the old root directory), and thus it should
>        be followed by a chdir("/") call.
> ]]

Apologies I saw that concern I didn't realize it was a questio

I think it is very reasonable to remove warning the behavior might
change.  We have pivot_root(8) in common use that to use it requires
the semantic of changing processes other than the current process.
Which means any attempt to noticably change the behavior of
pivot_root(2) will break userspace.

Now the documented semantics in behavior above are not quite what
pivot_root(2) does.  It walks all processes on the system and if the
working directory or the root directory refer to the root mount that is
being replaced, then pivot_root(2) will update them.

In practice the above is limited to a mount namespace.  But something as
simple as "cd /proc/<somepid>/root" can allow a process to have a
working directory in a different mount namespace.

Because ``unprivileged'' users can now use pivot_root(2) we may want to
rethink the implementation at some point to be cheaper than a global
process walk.  So far that process walk has not been a problem in
practice.

If we had to write pivot_root(2) from scratch limiting it to just
changing the root directory of the process that calls pivot_root(2)
would have been the superior semantic.  That would have required run
pivot_root(8) like: "exec pivot_root . . -- /bin/bash ..."  but it would
not have required walking every thread in the system.

>>>> I think that it would make the man page significantly easier to
>>>> understand if if the vague wording and the meta discussion about it are
>>>> removed.
>>>
>>> It is my inclination to make this change, but I'd love to get more
>>> feedback on this point.
>>>
>>>>> DESCRIPTION
>>>> [...]>        pivot_root()  changes  the
>>>>>        root  directory  and the current working directory of each process
>>>>>        or thread in the same mount namespace to new_root if they point to
>>>>>        the  old  root  directory.   (See also NOTES.)  On the other hand,
>>>>>        pivot_root() does not change the caller's current  working  direc‐
>>>>>        tory  (unless it is on the old root directory), and thus it should
>>>>>        be followed by a chdir("/") call.
>>>>
>>>> There is a contradiction here with the NOTES (cf. below).
>>>
>>> See below.
>>>
>>>>>        The following restrictions apply:
>>>>>
>>>>>        -  new_root and put_old must be directories.
>>>>>
>>>>>        -  new_root and put_old must not be on the same filesystem as  the
>>>>>           current root.  In particular, new_root can't be "/" (but can be
>>>>>           a bind mounted directory on the current root filesystem).
>>>>
>>>> Wouldn't "must not be on the same mountpoint" or something similar be
>>>> more clear, at least for new_root? The note in parentheses indicates
>>>> that new_root can actually be on the same filesystem as the current
>>>> note. However, ...
>>>
>>> For 'put_old', it really is "filesystem".
>> 
>> If we are going to be pedantic "filesystem" is really the wrong concept
>> here.  The section about bind mount clarifies it, but I wonder if there
>> is a better term.
>
> Thanks. My aim was to try to distinguish "mount point" from
> "a mount somewhere inside the file system associated with a
> certain mount point"--in other words, I wanted to make it clear
> that 'put_old' (and 'new_root') could not be subdirectories
> under the current root mount point (which is correct, right?).
>
> Using "mount" does seem better. (My only concern is that some
> people may take it to mean "the mount point", but perhaps that
> just my own confusion.)

I am open to better terms.  But mount or vfsmount is what we are using
internal to the kernel and is really a distinct concept from filesystem.
And it is starting to leak out in system calls like move_mount.

>> I think I would say: "new_root and put_old must not be on the same mount
>> as the current root."
>
> I've made that change.
>
>> I think using "mount" instead of "filesystem" keeps the concepts less
>> confusing.
>> 
>> As I am reading through this email and seeing text that is trying to be
>> precise and clear then hitting the term "filesystem" is a bit jarring.
>> pivot_root doesn't care a thing for file systems.  pivot_root only cares
>> about mounts.
>> 
>> And by a "mount" I mean the thing that you get when you create a bind
>> mount or you call mount normally.
>
> Thanks for the above comments.
>
> Hmm, doI need to make similar changes in the initial paragraph of
> the manual page as well? It currently reads:
>
>        pivot_root() changes the root filesystem in the mount namespace of
>        the calling process.  More precisely, it moves the root filesystem
>        to  the directory put_old and makes new_root the new root filesys‐
>        tem.  The calling process must have the  CAP_SYS_ADMIN  capability
>        in the user namespace that owns the caller's mount namespace.
>
> Furthermore the one line NAME of the man page reads:
>
>        pivot_root - change the root filesystem
>
> Is a change needed there also?

Yes please.  Both locations.

>> Michael do you have man pages for the new mount api yet?
>
> David Howells wrote pages in mid-2018, well before the syscalls got
> merged in the kernel (in mid-2019). I did not merge them because
> the code was not yet in the kernel, and lacking time, I never chased
> David when the syscalls did get merged to see if the pages were still
> up to date. I pinged David just now.

Good.  I was thinking of them because the concept of "mount" matters more
there.


>>>
>>>>>        -  put_old must be at or underneath new_root; that  is,  adding  a
>>>>>           nonnegative  number  of /.. to the string pointed to by put_old
>>>>>           must yield the same directory as new_root.
>>>>>
>>>>>        -  new_root must be a mount point.  (If  it  is  not  otherwise  a
>>>>>           mount  point,  it  suffices  to  bind  mount new_root on top of
>>>>>           itself.)
>>>>
>>>> ... this item actually makes the above item almost redundant regarding
>>>> new_root (except for the "/") case. So one could replace this item with
>>>> something like this:
>>>>
>>>> - new_root must be a mount point different from "/". (If it is not
>>>>   otherwise a mount point, it suffices  to bind mount new_root on top
>>>>   of itself.)
>>>>
>>>> The above item would then only mention put_old (and maybe use clarified
>>>> wording on whether actually a different file system is necessary for
>>>> put_old or whether a different mount point is enough).
>>>
>>> Thanks. That's a good suggestion. I simplified the earlier bullet
>>> point as you suggested, and changed the text here to say:
>>>
>>>        -  new_root must be a mount point, but can't be "/".  If it is not
>>>           otherwise  a mount point, it suffices to bind mount new_root on
>>>           top of itself.  (new_root can be a bind  mounted  directory  on
>>>           the current root filesystem.)
>> 
>> How about:
>>           - new_root must be the path to a mount, but can't be "/".  Any
>
> Surely here it must be "mount point" not "mount"? (See my discussion
> above.)

Sigh.  I have had my head in the code to long, where new_root is
used to refer to the mount that is mounted on that mount point as well.


>
>>           path that is not already a mount can be converted into one by
>>           bind mounting the path onto itself.
>>>>> NOTES
>>>> [...]
>>>>>        pivot_root() allows the caller to switch to a new root  filesystem
>>>>>        while  at  the  same time placing the old root mount at a location
>>>>>        under new_root from where it can subsequently be unmounted.   (The
>>>>>        fact  that  it  moves  all processes that have a root directory or
>>>>>        current working directory on the old root filesystem  to  the  new
>>>>>        root  filesystem  frees the old root filesystem of users, allowing
>>>>>        it to be unmounted more easily.)
>>>>
>>>> Here is the contradiction:
>>>> The DESCRIPTION says that root and current working dir are only changed
>>>> "if they point to the old root directory". Here in the NOTES it says
>>>> that any root or working directories on the old root file system (i.e.,
>>>> even if somewhere below the root) are changed.
>>>>
>>>> Which is correct?
>>>
>>> The first text is correct. I must have accidentally inserted
>>> "filesystem" into the paragraph just here during a global edit.
>>> Thanks for catching that.
>>>
>>>> If it indeed affects all processes with root and/or current working
>>>> directory below the old root, the text here does not clearly state what
>>>> the new root/current working directory of theses processes is.
>>>> E.g., if a process is at /foo and we pivot to /bar, will the process be
>>>> moved to /bar (i.e., at / after pivot_root), or will the kernel attempt
>>>> to move it to some location like /bar/foo? Because the latter might not
>>>> even exist, I suspect that everything is just moved to new_root, but
>>>> this could be stated explicitly by replacing "to the new root
>>>> filesystem" in the above paragraph with "to the new root directory"
>>>> (after checking whether this is true).
>>>
>>> The text here now reads:
>>>
>>>        pivot_root() allows the caller to switch to a new root  filesystem
>>>        while  at  the  same time placing the old root mount at a location
>>>        under new_root from where it can subsequently be unmounted.   (The
>>>        fact  that  it  moves  all processes that have a root directory or
>>>        current working directory on the old root  directory  to  the  new
>>>        root  frees the old root directory of users, allowing the old root
>>>        filesystem to be unmounted more easily.)
>> 
>> 
>> Please "mount" instead of "filesystem".
>
> Changed.
>
>
>>>>> EXAMPLE>        The program below demonstrates the use of  pivot_root()  inside  a
>>>>>        mount namespace that is created using clone(2).  After pivoting to
>>>>>        the root directory named in the program's first command-line argu‐
>>>>>        ment,  the  child  created  by  clone(2) then executes the program
>>>>>        named in the remaining command-line arguments.
>>>>
>>>> Why not use the pivot_root(".", ".") in the example program?
>>>> It would make the example shorter, and also works if the process cannot
>>>> write to new_root (e..g., in a user namespace).
>>>
>>> I'm not sure. Some people have a bit of trouble to wrap their head
>>> around the pivot_root(".", ".") idea. (I possibly am one of them.)
>>> I'd be quite keen to hear other opinions on this. Unfortunately,
>>> few people have commented on this manual page rewrite.
>> 
>> I am happy as long as it is pivot_root(".", ".") is documented
>> somewhere.  There is real code that uses it so it is not going away.
>> Plus pivot_root(".", ".") is really what is desired in a lot of
>> situations where the caller of pivot_root is an intermediary and
>> does not control the new root filesystem.  At which point the only
>> path you can be guaranteed to exit on the new root filesystem is "/".
>
> Good. There is documentation of pivot_root(".", ".") i the page!

Yeah!

Eric

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: For review: rewritten pivot_root(2) manual page
  2019-10-09 16:00         ` Eric W. Biederman
@ 2019-10-09 21:01           ` Michael Kerrisk (man-pages)
  2019-10-10  7:42             ` Michael Kerrisk (man-pages)
  0 siblings, 1 reply; 9+ messages in thread
From: Michael Kerrisk (man-pages) @ 2019-10-09 21:01 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: mtk.manpages, Philipp Wendler, Serge E. Hallyn,
	Christian Brauner, Aleksa Sarai, Reid Priedhorsky,
	Andy Lutomirski, Yang Bo, Jakub Wilk, Joseph Sible, Al Viro,
	werner, linux-man, lkml, Containers, Stéphane Graber

Hello Eric,

On 10/9/19 6:00 PM, Eric W. Biederman wrote:
> "Michael Kerrisk (man-pages)" <mtk.manpages@gmail.com> writes:
> 
>> Hello Eric,
>>
>> Thank you. I was hoping you might jump in on this thread.
>>
>> Please see below.
>>
>> On 10/9/19 10:46 AM, Eric W. Biederman wrote:
>>> "Michael Kerrisk (man-pages)" <mtk.manpages@gmail.com> writes:
>>>
>>>> Hello Philipp,
>>>>
>>>> My apologies that it has taken a while to reply. (I had been hoping
>>>> and waiting that a few more people might weigh in on this thread.)
>>>>
>>>> On 9/23/19 3:42 PM, Philipp Wendler wrote:
>>>>> Hello Michael,
>>>>>
>>>>> Am 23.09.19 um 14:04 schrieb Michael Kerrisk (man-pages):
>>>>>
>>>>>> I'm considering to rewrite these pieces to exactly
>>>>>> describe what the system call does (which I already
>>>>>> do in the third paragraph) and remove the "may or may not"
>>>>>> pieces in the second paragraph. I'd welcome comments
>>>>>> on making that change.
>>
>> What did you think about my proposal above? To put it in context,
>> this was my initial comment in the mail:
>>
>> [[
>> One area of the page that I'm still not really happy with
>> is the "vague" wording in the second paragraph and the note
>> in the third paragraph about the system call possibly
>> changing. These pieces survive (in somewhat modified form)
>> from the original page, which was written before the
>> system call was released, and it seems there was some
>> question about whether the system call might still change
>> its behavior with respect to the root directory and current
>> working directory of other processes. However, after 19
>> years, nothing has changed, and surely it will not in the
>> future, since that would constitute an ABI breakage.
>> I'm considering to rewrite these pieces to exactly
>> describe what the system call does (which I already
>> do in the third paragraph) and remove the "may or may not"
>> pieces in the second paragraph. I'd welcome comments
>> on making that change.
>> ]]
>>
>> And the second and third paragraphs of the manual page currently
>> read:
>>
>> [[
>>        pivot_root()  may  or may not change the current root and the cur‐
>>        rent working directory of any processes or threads  that  use  the
>>        old  root  directory  and which are in the same mount namespace as
>>        the caller of pivot_root().  The  caller  of  pivot_root()  should
>>        ensure  that  processes  with root or current working directory at
>>        the old root operate correctly in either case.   An  easy  way  to
>>        ensure  this is to change their root and current working directory
>>        to  new_root  before  invoking  pivot_root().   Note   also   that
>>        pivot_root()  may  or may not affect the calling process's current
>>        working directory.  It is therefore recommended to call chdir("/")
>>        immediately after pivot_root().
>>
>>        The  paragraph  above  is  intentionally vague because at the time
>>        when pivot_root() was first implemented, it  was  unclear  whether
>>        its  affect  on  other process's root and current working directo‐
>>        ries—and the caller's current working  directory—might  change  in
>>        the  future.   However, the behavior has remained consistent since
>>        this system call was first implemented: pivot_root()  changes  the
>>        root  directory  and the current working directory of each process
>>        or thread in the same mount namespace to new_root if they point to
>>        the  old  root  directory.   (See also NOTES.)  On the other hand,
>>        pivot_root() does not change the caller's current  working  direc‐
>>        tory  (unless it is on the old root directory), and thus it should
>>        be followed by a chdir("/") call.
>> ]]
> 
> Apologies I saw that concern I didn't realize it was a questio
> 
> I think it is very reasonable to remove warning the behavior might
> change.  We have pivot_root(8) in common use that to use it requires
> the semantic of changing processes other than the current process.
> Which means any attempt to noticably change the behavior of
> pivot_root(2) will break userspace.

Thanks for the confirmation that this change would be okay.
I will make this change soon, unless I hear a counterargument.

> Now the documented semantics in behavior above are not quite what
> pivot_root(2) does.  It walks all processes on the system and if the
> working directory or the root directory refer to the root mount that is
> being replaced, then pivot_root(2) will update them.
> 
> In practice the above is limited to a mount namespace.  But something as
> simple as "cd /proc/<somepid>/root" can allow a process to have a
> working directory in a different mount namespace.

So, I'm not quite clear. Do you mean that something in the existing
manual page text should change? If so, could you describe the
needed change please?

> Because ``unprivileged'' users can now use pivot_root(2) we may want to
> rethink the implementation at some point to be cheaper than a global
> process walk.  So far that process walk has not been a problem in
> practice.
> 
> If we had to write pivot_root(2) from scratch limiting it to just
> changing the root directory of the process that calls pivot_root(2)
> would have been the superior semantic.  That would have required run
> pivot_root(8) like: "exec pivot_root . . -- /bin/bash ..."  but it would
> not have required walking every thread in the system.

Okay.

[...]

>>>>>> DESCRIPTION
>>>>> [...]>        pivot_root()  changes  the
>>>>>>        root  directory  and the current working directory of each process
>>>>>>        or thread in the same mount namespace to new_root if they point to
>>>>>>        the  old  root  directory.   (See also NOTES.)  On the other hand,
>>>>>>        pivot_root() does not change the caller's current  working  direc‐
>>>>>>        tory  (unless it is on the old root directory), and thus it should
>>>>>>        be followed by a chdir("/") call.
>>>>>
>>>>> There is a contradiction here with the NOTES (cf. below).
>>>>
>>>> See below.
>>>>
>>>>>>        The following restrictions apply:
>>>>>>
>>>>>>        -  new_root and put_old must be directories.
>>>>>>
>>>>>>        -  new_root and put_old must not be on the same filesystem as  the
>>>>>>           current root.  In particular, new_root can't be "/" (but can be
>>>>>>           a bind mounted directory on the current root filesystem).
>>>>>
>>>>> Wouldn't "must not be on the same mountpoint" or something similar be
>>>>> more clear, at least for new_root? The note in parentheses indicates
>>>>> that new_root can actually be on the same filesystem as the current
>>>>> note. However, ...
>>>>
>>>> For 'put_old', it really is "filesystem".
>>>
>>> If we are going to be pedantic "filesystem" is really the wrong concept
>>> here.  The section about bind mount clarifies it, but I wonder if there
>>> is a better term.
>>
>> Thanks. My aim was to try to distinguish "mount point" from
>> "a mount somewhere inside the file system associated with a
>> certain mount point"--in other words, I wanted to make it clear
>> that 'put_old' (and 'new_root') could not be subdirectories
>> under the current root mount point (which is correct, right?).
>>
>> Using "mount" does seem better. (My only concern is that some
>> people may take it to mean "the mount point", but perhaps that
>> just my own confusion.)
> 
> I am open to better terms.  But mount or vfsmount is what we are using
> internal to the kernel and is really a distinct concept from filesystem.
> And it is starting to leak out in system calls like move_mount.

I have no better term to propose.

[...]

>> Thanks for the above comments.
>>
>> Hmm, doI need to make similar changes in the initial paragraph of
>> the manual page as well? It currently reads:
>>
>>        pivot_root() changes the root filesystem in the mount namespace of
>>        the calling process.  More precisely, it moves the root filesystem
>>        to  the directory put_old and makes new_root the new root filesys‐
>>        tem.  The calling process must have the  CAP_SYS_ADMIN  capability
>>        in the user namespace that owns the caller's mount namespace.
>>
>> Furthermore the one line NAME of the man page reads:
>>
>>        pivot_root - change the root filesystem
>>
>> Is a change needed there also?
> 
> Yes please.  Both locations.

Okay. So would the following be okay:

[[
NAME
       pivot_root - change the root mount
...
DESCRIPTION
       pivot_root()  changes the root mount in the mount namespace of the
       calling process.  More precisely, it moves the root mount  to  the
       directory  put_old  and  makes  new_root  the new root mount.  The
       calling process must have the CAP_SYS_ADMIN capability in the user
       namespace that owns the caller's mount namespace.
]]

?

[...]

>>>>>>        -  new_root must be a mount point.  (If  it  is  not  otherwise  a
>>>>>>           mount  point,  it  suffices  to  bind  mount new_root on top of
>>>>>>           itself.)
>>>>>
>>>>> ... this item actually makes the above item almost redundant regarding
>>>>> new_root (except for the "/") case. So one could replace this item with
>>>>> something like this:
>>>>>
>>>>> - new_root must be a mount point different from "/". (If it is not
>>>>>   otherwise a mount point, it suffices  to bind mount new_root on top
>>>>>   of itself.)
>>>>>
>>>>> The above item would then only mention put_old (and maybe use clarified
>>>>> wording on whether actually a different file system is necessary for
>>>>> put_old or whether a different mount point is enough).
>>>>
>>>> Thanks. That's a good suggestion. I simplified the earlier bullet
>>>> point as you suggested, and changed the text here to say:
>>>>
>>>>        -  new_root must be a mount point, but can't be "/".  If it is not
>>>>           otherwise  a mount point, it suffices to bind mount new_root on
>>>>           top of itself.  (new_root can be a bind  mounted  directory  on
>>>>           the current root filesystem.)
>>>
>>> How about:
>>>           - new_root must be the path to a mount, but can't be "/".  Any
>>
>> Surely here it must be "mount point" not "mount"? (See my discussion
>> above.)
> 
> Sigh.  I have had my head in the code to long, where new_root is
> used to refer to the mount that is mounted on that mount point as well.

Okay -- so I made the text here:

       -  new_root  must be a path to a mount point, but can't be "/".  A
          path that is not already a mount point can  be  converted  into
          one by bind mounting the path onto itself.

Okay?

[...]

Thanks, Eric. As always, your input for the man pages is so
valuable. (My only challenge is to keep up with you...)

Cheers,

Michael

-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: For review: rewritten pivot_root(2) manual page
  2019-10-09 21:01           ` Michael Kerrisk (man-pages)
@ 2019-10-10  7:42             ` Michael Kerrisk (man-pages)
  0 siblings, 0 replies; 9+ messages in thread
From: Michael Kerrisk (man-pages) @ 2019-10-10  7:42 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: mtk.manpages, Philipp Wendler, Serge E. Hallyn,
	Christian Brauner, Aleksa Sarai, Reid Priedhorsky,
	Andy Lutomirski, Yang Bo, Jakub Wilk, Joseph Sible, Al Viro,
	werner, linux-man, lkml, Containers, Stéphane Graber

Hello Eric,

I think I just understood something. See below.

On 10/9/19 11:01 PM, Michael Kerrisk (man-pages) wrote:
> Hello Eric,
> 
> On 10/9/19 6:00 PM, Eric W. Biederman wrote:
>> "Michael Kerrisk (man-pages)" <mtk.manpages@gmail.com> writes:
>>
>>> Hello Eric,
>>>
>>> Thank you. I was hoping you might jump in on this thread.
>>>
>>> Please see below.
>>>
>>> On 10/9/19 10:46 AM, Eric W. Biederman wrote:
>>>> "Michael Kerrisk (man-pages)" <mtk.manpages@gmail.com> writes:
>>>>
>>>>> Hello Philipp,
>>>>>
>>>>> My apologies that it has taken a while to reply. (I had been hoping
>>>>> and waiting that a few more people might weigh in on this thread.)
>>>>>
>>>>> On 9/23/19 3:42 PM, Philipp Wendler wrote:
>>>>>> Hello Michael,
>>>>>>
>>>>>> Am 23.09.19 um 14:04 schrieb Michael Kerrisk (man-pages):
>>>>>>
>>>>>>> I'm considering to rewrite these pieces to exactly
>>>>>>> describe what the system call does (which I already
>>>>>>> do in the third paragraph) and remove the "may or may not"
>>>>>>> pieces in the second paragraph. I'd welcome comments
>>>>>>> on making that change.
>>>
>>> What did you think about my proposal above? To put it in context,
>>> this was my initial comment in the mail:
>>>
>>> [[
>>> One area of the page that I'm still not really happy with
>>> is the "vague" wording in the second paragraph and the note
>>> in the third paragraph about the system call possibly
>>> changing. These pieces survive (in somewhat modified form)
>>> from the original page, which was written before the
>>> system call was released, and it seems there was some
>>> question about whether the system call might still change
>>> its behavior with respect to the root directory and current
>>> working directory of other processes. However, after 19
>>> years, nothing has changed, and surely it will not in the
>>> future, since that would constitute an ABI breakage.
>>> I'm considering to rewrite these pieces to exactly
>>> describe what the system call does (which I already
>>> do in the third paragraph) and remove the "may or may not"
>>> pieces in the second paragraph. I'd welcome comments
>>> on making that change.
>>> ]]
>>>
>>> And the second and third paragraphs of the manual page currently
>>> read:
>>>
>>> [[
>>>        pivot_root()  may  or may not change the current root and the cur‐
>>>        rent working directory of any processes or threads  that  use  the
>>>        old  root  directory  and which are in the same mount namespace as
>>>        the caller of pivot_root().  The  caller  of  pivot_root()  should
>>>        ensure  that  processes  with root or current working directory at
>>>        the old root operate correctly in either case.   An  easy  way  to
>>>        ensure  this is to change their root and current working directory
>>>        to  new_root  before  invoking  pivot_root().   Note   also   that
>>>        pivot_root()  may  or may not affect the calling process's current
>>>        working directory.  It is therefore recommended to call chdir("/")
>>>        immediately after pivot_root().
>>>
>>>        The  paragraph  above  is  intentionally vague because at the time
>>>        when pivot_root() was first implemented, it  was  unclear  whether
>>>        its  affect  on  other process's root and current working directo‐
>>>        ries—and the caller's current working  directory—might  change  in
>>>        the  future.   However, the behavior has remained consistent since
>>>        this system call was first implemented: pivot_root()  changes  the
>>>        root  directory  and the current working directory of each process
>>>        or thread in the same mount namespace to new_root if they point to
>>>        the  old  root  directory.   (See also NOTES.)  On the other hand,
>>>        pivot_root() does not change the caller's current  working  direc‐
>>>        tory  (unless it is on the old root directory), and thus it should
>>>        be followed by a chdir("/") call.
>>> ]]
>>
>> Apologies I saw that concern I didn't realize it was a questio
>>
>> I think it is very reasonable to remove warning the behavior might
>> change.  We have pivot_root(8) in common use that to use it requires
>> the semantic of changing processes other than the current process.
>> Which means any attempt to noticably change the behavior of
>> pivot_root(2) will break userspace.
> 
> Thanks for the confirmation that this change would be okay.
> I will make this change soon, unless I hear a counterargument.
> 
>> Now the documented semantics in behavior above are not quite what
>> pivot_root(2) does.  It walks all processes on the system and if the
>> working directory or the root directory refer to the root mount that is
>> being replaced, then pivot_root(2) will update them.
>>
>> In practice the above is limited to a mount namespace.  But something as
>> simple as "cd /proc/<somepid>/root" can allow a process to have a
>> working directory in a different mount namespace.
> 
> So, I'm not quite clear. Do you mean that something in the existing
> manual page text should change? If so, could you describe the
> needed change please?

Okay, I had to sleep on this one. I think what you are saying is
that is some process, pidX, in mountns X does a "cd /proc/<pidY>/root"
where pidY is a process in mountns Y, and then some
process in mountns Y does a pivot_root(), the the CWD of pidX will
be changed, even though it is in a different mountns. Right?

Thanks,

Michael

-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2019-10-10  7:42 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-09-23 12:04 For review: rewritten pivot_root(2) manual page Michael Kerrisk (man-pages)
2019-09-23 13:42 ` Philipp Wendler
2019-10-09  7:41   ` Michael Kerrisk (man-pages)
2019-10-09  8:18     ` G. Branden Robinson
2019-10-09  8:46     ` Eric W. Biederman
2019-10-09 10:22       ` Michael Kerrisk (man-pages)
2019-10-09 16:00         ` Eric W. Biederman
2019-10-09 21:01           ` Michael Kerrisk (man-pages)
2019-10-10  7:42             ` Michael Kerrisk (man-pages)

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).