linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v1] devpts: resolve devpts bind-mounts
@ 2018-03-09 10:57 Christian Brauner
  2018-03-09 18:37 ` Linus Torvalds
  0 siblings, 1 reply; 4+ messages in thread
From: Christian Brauner @ 2018-03-09 10:57 UTC (permalink / raw)
  To: linux-kernel, ebiederm, torvalds; +Cc: Christian Brauner

Most libcs will still look at /dev/ptmx when opening the master fd of a pty
device. When /dev/ptmx is a bind-mount of /dev/pts/ptmx and the TIOCGPTPEER
ioctl() is used to safely retrieve a file descriptor for the slave side of
the pty based on the master fd, the /proc/self/fd/{0,1,2} symlinks will
point to /.

When the kernel tries to look up the root mount of the dentry for the slave
file descriptor it will detect that the dentry is escaping its bind-mount
since the root mount of the dentry is /dev/pts where the devpts is mounted
but the root mount of /dev/ptmx is /dev.

Having bind-mounts of /dev/pts/ptmx to /dev/ptmx not working correctly is a
regression. In addition, it is also a fairly common scenario in containers
employing user namespaces.

To handle bind-mounts of /dev/pts/ptmx to /dev/ptmx correctly we need to
walk up the bind-mounts for /dev/ptmx in devpts_mntget(). This also means
we need to remove the
"if (path->mnt->mnt_sb->s_magic == DEVPTS_SUPER_MAGIC)" check from
devpts_ptmx_path(). However, devpts_acquire() needs to perform that check.
The reason is that while the devpts mounted at /dev/pts has devtmpfs
mounted at /dev as its parent mount:

21 -- -- / /dev
-- 21 -- / /dev/pts

they are on different devices

-- -- 0:6  / /dev
-- -- 0:20 / /dev/pts

which has the consequence that the pathname of the directory which forms
the root of the /dev/pts mount is /. So if we bind-mount /dev/pts/ptmx to
/dev/ptmx we will end up on the same device as the devtmpfs mount on
/dev/pts

-- -- 0:20 /ptmx /dev/ptmx

but given that / is the pathname of the directory which forms the root of
the /dev/pts bind-mount, the resulting source will be /ptmx. So in
devpts_acquire() we need to check if
"(path->mnt->mnt_sb->s_magic == DEVPTS_SUPER_MAGIC)" and exit early before
we try to call path_pts(). If we call path_pts() in the bind-mount case we
will hit path_parent_directory() which will **not** yield /dev as parent
directory but rather / and fail. We should still perform the path_pts()
check in devpts_acquire() if we haven't found a suitable devpts filesystem
at open time but we should always try to exit early in devpts_acquire() and
defer the bind-mount resolution to devpts_mntget(). If we fail in
devpts_mntget() we will end up with a wrong /proc/<pid>/fd/{0,1,2} symlink,
if we fail in devpts_acquire() we will end up with a failed
open("/dev/ptmx") which is more troublesome.

Here's a little reproducer that presupposes a libc that uses TIOCGPTPEER in
its openpty() implementation:

unshare --mount
mount --bind /dev/pts/ptmx /dev/ptmx
chmod 666 /dev/ptmx
script
ls -al /proc/self/fd/0

with output:

lrwx------ 1 chb chb 64 Mar  7 16:41 /proc/self/fd/0 -> /

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Suggested-by: Eric Biederman <ebiederm@xmission.com>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
---
ChangeLog v0->v1:
- remove
        /* Has the devpts filesystem already been found? */
        if (path->mnt->mnt_sb->s_magic == DEVPTS_SUPER_MAGIC)
	                return 0
  from devpts_ptmx_path()
- check superblock after devpts_ptmx_path() returned
---
 fs/devpts/inode.c | 41 +++++++++++++++++++++++++++++++++--------
 1 file changed, 33 insertions(+), 8 deletions(-)

diff --git a/fs/devpts/inode.c b/fs/devpts/inode.c
index e31d6ed3ec32..89ee17733087 100644
--- a/fs/devpts/inode.c
+++ b/fs/devpts/inode.c
@@ -138,10 +138,6 @@ static int devpts_ptmx_path(struct path *path)
 	struct super_block *sb;
 	int err;
 
-	/* Has the devpts filesystem already been found? */
-	if (path->mnt->mnt_sb->s_magic == DEVPTS_SUPER_MAGIC)
-		return 0;
-
 	/* Is a devpts filesystem at "pts" in the same directory? */
 	err = path_pts(path);
 	if (err)
@@ -163,6 +159,26 @@ struct vfsmount *devpts_mntget(struct file *filp, struct pts_fs_info *fsi)
 
 	path = filp->f_path;
 	path_get(&path);
+	if ((DEVPTS_SB(path.mnt->mnt_sb) == fsi) &&
+	    (path.mnt->mnt_root == fsi->ptmx_dentry)) {
+		/* Walk upward while the start point is a bind mount of a single
+		 * file.
+		 */
+		while (path.mnt->mnt_root == path.dentry)
+			if (follow_up(&path) == 0)
+				break;
+
+		/* Is this path a valid devpts filesystem? */
+		err = devpts_ptmx_path(&path);
+		dput(path.dentry);
+		if (err == 0)
+			goto check_devpts_sb;
+
+		path_put(&path);
+		path = filp->f_path;
+		path_get(&path);
+		goto check_devpts_sb;
+	}
 
 	err = devpts_ptmx_path(&path);
 	dput(path.dentry);
@@ -170,10 +186,13 @@ struct vfsmount *devpts_mntget(struct file *filp, struct pts_fs_info *fsi)
 		mntput(path.mnt);
 		return ERR_PTR(err);
 	}
+
+check_devpts_sb:
 	if (DEVPTS_SB(path.mnt->mnt_sb) != fsi) {
 		mntput(path.mnt);
 		return ERR_PTR(-ENODEV);
 	}
+
 	return path.mnt;
 }
 
@@ -187,10 +206,16 @@ struct pts_fs_info *devpts_acquire(struct file *filp)
 	path = filp->f_path;
 	path_get(&path);
 
-	err = devpts_ptmx_path(&path);
-	if (err) {
-		result = ERR_PTR(err);
-		goto out;
+	/* Has the devpts filesystem already been found? */
+	if (path.mnt->mnt_sb->s_magic != DEVPTS_SUPER_MAGIC) {
+		/* Is there an appropriate devpts filesystem in the parent
+		 * directory?
+		 */
+		err = devpts_ptmx_path(&path);
+		if (err) {
+			result = ERR_PTR(err);
+			goto out;
+		}
 	}
 
 	/*
-- 
2.15.1

^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH v1] devpts: resolve devpts bind-mounts
  2018-03-09 10:57 [PATCH v1] devpts: resolve devpts bind-mounts Christian Brauner
@ 2018-03-09 18:37 ` Linus Torvalds
  2018-03-11 11:30   ` Christian Brauner
  0 siblings, 1 reply; 4+ messages in thread
From: Linus Torvalds @ 2018-03-09 18:37 UTC (permalink / raw)
  To: Christian Brauner; +Cc: Linux Kernel Mailing List, Eric W. Biederman

Hmm. This hunk annoys me and makes me go "Whaa?":

On Fri, Mar 9, 2018 at 2:57 AM, Christian Brauner
<christian.brauner@ubuntu.com> wrote:
> @@ -163,6 +159,26 @@ struct vfsmount *devpts_mntget(struct file *filp, struct pts_fs_info *fsi)
>
>         path = filp->f_path;
>         path_get(&path);
> +       if ((DEVPTS_SB(path.mnt->mnt_sb) == fsi) &&
> +           (path.mnt->mnt_root == fsi->ptmx_dentry)) {
> +               /* Walk upward while the start point is a bind mount of a single
> +                * file.
> +                */
> +               while (path.mnt->mnt_root == path.dentry)
> +                       if (follow_up(&path) == 0)
> +                               break;
> +
> +               /* Is this path a valid devpts filesystem? */
> +               err = devpts_ptmx_path(&path);
> +               dput(path.dentry);
> +               if (err == 0)
> +                       goto check_devpts_sb;
> +
> +               path_put(&path);
> +               path = filp->f_path;
> +               path_get(&path);
> +               goto check_devpts_sb;
> +       }
>
>         err = devpts_ptmx_path(&path);
>         dput(path.dentry);

why did you duplicate the devpts_ptmx_path() and then do that odd
error handling?

We only go into that "if()" statement if
DEVPTS_SB(filp->f_path.mnt->mnt_sb) == fsi, so then when you do that
"put path and re-get it, and go to check_devpts_sb", the
check_devpts_sb won't actually _do_ anything, because it has

> +check_devpts_sb:
>         if (DEVPTS_SB(path.mnt->mnt_sb) != fsi) {

and we know that "if()" there cannot trigger, since we just checked it earlier.

So abou two thirds of the above seems unnecessary.

Why isn't the code just doing


       if ((DEVPTS_SB(path.mnt->mnt_sb) == fsi) &&
           (path.mnt->mnt_root == fsi->ptmx_dentry)) {
               /* Walk upward while the start point is a bind mount of a single
                * file.
                */
               while (path.mnt->mnt_root == path.dentry)
                       if (follow_up(&path) == 0)
                               break;
        }

and then just falling through to the existing "devpts_ptmx_path()" etc
code? Duplicating it seems wrong, and the error handling in the
duplicated path seems wrong too.

Am I missing something?


> @@ -187,10 +206,16 @@ struct pts_fs_info *devpts_acquire(struct file *filp)
>         path = filp->f_path;
>         path_get(&path);
>
> -       err = devpts_ptmx_path(&path);
> -       if (err) {
> -               result = ERR_PTR(err);
> -               goto out;
> +       /* Has the devpts filesystem already been found? */
> +       if (path.mnt->mnt_sb->s_magic != DEVPTS_SUPER_MAGIC) {
> +               /* Is there an appropriate devpts filesystem in the parent
> +                * directory?
> +                */
> +               err = devpts_ptmx_path(&path);
> +               if (err) {
> +                       result = ERR_PTR(err);
> +                       goto out;
> +               }
>         }

This part (and the accompanying removal from devpts_ptmx_path() should
just have been a separate preparatory patch that doesn't change
semantics, no? Also, the scope of 'err' is now entirely inside that
if(), so I think it should just be declared there too.

I didn't actually test the patch, this is just from reading it, so I
might have missed something.

                Linus

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH v1] devpts: resolve devpts bind-mounts
  2018-03-09 18:37 ` Linus Torvalds
@ 2018-03-11 11:30   ` Christian Brauner
  0 siblings, 0 replies; 4+ messages in thread
From: Christian Brauner @ 2018-03-11 11:30 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Linux Kernel Mailing List, Eric W. Biederman

On Fri, Mar 09, 2018 at 10:37:34AM -0800, Linus Torvalds wrote:
> Hmm. This hunk annoys me and makes me go "Whaa?":
> 
> On Fri, Mar 9, 2018 at 2:57 AM, Christian Brauner
> <christian.brauner@ubuntu.com> wrote:
> > @@ -163,6 +159,26 @@ struct vfsmount *devpts_mntget(struct file *filp, struct pts_fs_info *fsi)
> >
> >         path = filp->f_path;
> >         path_get(&path);
> > +       if ((DEVPTS_SB(path.mnt->mnt_sb) == fsi) &&
> > +           (path.mnt->mnt_root == fsi->ptmx_dentry)) {
> > +               /* Walk upward while the start point is a bind mount of a single
> > +                * file.
> > +                */
> > +               while (path.mnt->mnt_root == path.dentry)
> > +                       if (follow_up(&path) == 0)
> > +                               break;
> > +
> > +               /* Is this path a valid devpts filesystem? */
> > +               err = devpts_ptmx_path(&path);
> > +               dput(path.dentry);
> > +               if (err == 0)
> > +                       goto check_devpts_sb;
> > +
> > +               path_put(&path);
> > +               path = filp->f_path;
> > +               path_get(&path);
> > +               goto check_devpts_sb;
> > +       }
> >
> >         err = devpts_ptmx_path(&path);
> >         dput(path.dentry);
> 
> why did you duplicate the devpts_ptmx_path() and then do that odd
> error handling?
> 
> We only go into that "if()" statement if
> DEVPTS_SB(filp->f_path.mnt->mnt_sb) == fsi, so then when you do that
> "put path and re-get it, and go to check_devpts_sb", the
> check_devpts_sb won't actually _do_ anything, because it has
> 
> > +check_devpts_sb:
> >         if (DEVPTS_SB(path.mnt->mnt_sb) != fsi) {
> 
> and we know that "if()" there cannot trigger, since we just checked it earlier.
> 
> So abou two thirds of the above seems unnecessary.
> 
> Why isn't the code just doing
> 
> 
>        if ((DEVPTS_SB(path.mnt->mnt_sb) == fsi) &&
>            (path.mnt->mnt_root == fsi->ptmx_dentry)) {
>                /* Walk upward while the start point is a bind mount of a single
>                 * file.
>                 */
>                while (path.mnt->mnt_root == path.dentry)
>                        if (follow_up(&path) == 0)
>                                break;
>         }
> 
> and then just falling through to the existing "devpts_ptmx_path()" etc
> code? Duplicating it seems wrong, and the error handling in the
> duplicated path seems wrong too.
> 
> Am I missing something?

Right, the sb information can't be changed by follow_up() so we can
actually do it simpler. In the first iteration of the patch I wasn't
sure of that when I thought through the whole source pathname vs.
location on a different device mess for /dev, /dev/pts, and /dev/ptmx
being a bind-mount.

> 
> 
> > @@ -187,10 +206,16 @@ struct pts_fs_info *devpts_acquire(struct file *filp)
> >         path = filp->f_path;
> >         path_get(&path);
> >
> > -       err = devpts_ptmx_path(&path);
> > -       if (err) {
> > -               result = ERR_PTR(err);
> > -               goto out;
> > +       /* Has the devpts filesystem already been found? */
> > +       if (path.mnt->mnt_sb->s_magic != DEVPTS_SUPER_MAGIC) {
> > +               /* Is there an appropriate devpts filesystem in the parent
> > +                * directory?
> > +                */
> > +               err = devpts_ptmx_path(&path);
> > +               if (err) {
> > +                       result = ERR_PTR(err);
> > +                       goto out;
> > +               }
> >         }
> 
> This part (and the accompanying removal from devpts_ptmx_path() should
> just have been a separate preparatory patch that doesn't change
> semantics, no? Also, the scope of 'err' is now entirely inside that
> if(), so I think it should just be declared there too.

Yeah, I split this out into a non-functional change in the new version
of the patch.

Christian

> 
> I didn't actually test the patch, this is just from reading it, so I

> might have missed something.

^ permalink raw reply	[flat|nested] 4+ messages in thread

* [PATCH v1] devpts: resolve devpts bind-mounts
@ 2018-03-09 10:24 Christian Brauner
  0 siblings, 0 replies; 4+ messages in thread
From: Christian Brauner @ 2018-03-09 10:24 UTC (permalink / raw)
  To: linux-kernel, torvalds, ebiederm; +Cc: Christian Brauner

Most libcs will still look at /dev/ptmx when opening the master fd of a pty
device. When /dev/ptmx is a bind-mount of /dev/pts/ptmx and the TIOCGPTPEER
ioctl() is used to safely retrieve a file descriptor for the slave side of
the pty based on the master fd, the /proc/self/fd/{0,1,2} symlinks will
point to /.

When the kernel tries to look up the root mount of the dentry for the slave
file descriptor it will detect that the dentry is escaping its bind-mount
since the root mount of the dentry is /dev/pts where the devpts is mounted
but the root mount of /dev/ptmx is /dev.

Having bind-mounts of /dev/pts/ptmx to /dev/ptmx not working correctly is a
regression. In addition, it is also a fairly common scenario in containers
employing user namespaces.

To handle bind-mounts of /dev/pts/ptmx to /dev/ptmx correctly we need to
walk up the bind-mounts for /dev/ptmx in devpts_mntget(). This also means
we need to remove the
"if (path->mnt->mnt_sb->s_magic == DEVPTS_SUPER_MAGIC)" check from
devpts_ptmx_path(). However, devpts_acquire() needs to perform that check.
The reason is that while the devpts mounted at /dev/pts has devtmpfs
mounted at /dev as its parent mount:

21 -- -- / /dev
-- 21 -- / /dev/pts

they are on different devices

-- -- 0:6  / /dev
-- -- 0:20 / /dev/pts

which has the consequence that the pathname of the directory which forms
the root of the /dev/pts mount is /. So if we bind-mount /dev/pts/ptmx to
/dev/ptmx we will end up on the same device as the devtmpfs mount on
/dev/pts

-- -- 0:20 /ptmx /dev/ptmx

but given that / is the pathname of the directory which forms the root of
the /dev/pts bind-mount, the resulting source will be /ptmx. So in
devpts_acquire() we need to check if
"(path->mnt->mnt_sb->s_magic == DEVPTS_SUPER_MAGIC)" and exit early before
we try to call path_pts(). If we call path_pts() in the bind-mount case we
will hit path_parent_directory() which will **not** yield /dev as parent
directory but rather / and fail. We should still perform the path_pts()
check in devpts_acquire() if we haven't found a suitable devpts filesystem
at open time but we should always try to exit early in devpts_acquire() and
defer the bind-mount resolution to devpts_mntget(). If we fail in
devpts_mntget() we will end up with a wrong /proc/<pid>/fd/{0,1,2} symlink,
if we fail in devpts_acquire() we will end up with a failed
open("/dev/ptmx") which is more troublesome.

Here's a little reproducer that presupposes a libc that uses TIOCGPTPEER in
its openpty() implementation:

unshare --mount
mount --bind /dev/pts/ptmx /dev/ptmx
chmod 666 /dev/ptmx
script
ls -al /proc/self/fd/0

with output:

lrwx------ 1 chb chb 64 Mar  7 16:41 /proc/self/fd/0 -> /

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Suggested-by: Eric Biederman <ebiederm@xmission.com>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
---
ChangeLog v0->v1:
- remove
        /* Has the devpts filesystem already been found? */
        if (path->mnt->mnt_sb->s_magic == DEVPTS_SUPER_MAGIC)
	                return 0
  from devpts_ptmx_path()
- check superblock after devpts_ptmx_path() returned
---
 fs/devpts/inode.c | 41 +++++++++++++++++++++++++++++++++--------
 1 file changed, 33 insertions(+), 8 deletions(-)

diff --git a/fs/devpts/inode.c b/fs/devpts/inode.c
index e31d6ed3ec32..89ee17733087 100644
--- a/fs/devpts/inode.c
+++ b/fs/devpts/inode.c
@@ -138,10 +138,6 @@ static int devpts_ptmx_path(struct path *path)
 	struct super_block *sb;
 	int err;
 
-	/* Has the devpts filesystem already been found? */
-	if (path->mnt->mnt_sb->s_magic == DEVPTS_SUPER_MAGIC)
-		return 0;
-
 	/* Is a devpts filesystem at "pts" in the same directory? */
 	err = path_pts(path);
 	if (err)
@@ -163,6 +159,26 @@ struct vfsmount *devpts_mntget(struct file *filp, struct pts_fs_info *fsi)
 
 	path = filp->f_path;
 	path_get(&path);
+	if ((DEVPTS_SB(path.mnt->mnt_sb) == fsi) &&
+	    (path.mnt->mnt_root == fsi->ptmx_dentry)) {
+		/* Walk upward while the start point is a bind mount of a single
+		 * file.
+		 */
+		while (path.mnt->mnt_root == path.dentry)
+			if (follow_up(&path) == 0)
+				break;
+
+		/* Is this path a valid devpts filesystem? */
+		err = devpts_ptmx_path(&path);
+		dput(path.dentry);
+		if (err == 0)
+			goto check_devpts_sb;
+
+		path_put(&path);
+		path = filp->f_path;
+		path_get(&path);
+		goto check_devpts_sb;
+	}
 
 	err = devpts_ptmx_path(&path);
 	dput(path.dentry);
@@ -170,10 +186,13 @@ struct vfsmount *devpts_mntget(struct file *filp, struct pts_fs_info *fsi)
 		mntput(path.mnt);
 		return ERR_PTR(err);
 	}
+
+check_devpts_sb:
 	if (DEVPTS_SB(path.mnt->mnt_sb) != fsi) {
 		mntput(path.mnt);
 		return ERR_PTR(-ENODEV);
 	}
+
 	return path.mnt;
 }
 
@@ -187,10 +206,16 @@ struct pts_fs_info *devpts_acquire(struct file *filp)
 	path = filp->f_path;
 	path_get(&path);
 
-	err = devpts_ptmx_path(&path);
-	if (err) {
-		result = ERR_PTR(err);
-		goto out;
+	/* Has the devpts filesystem already been found? */
+	if (path.mnt->mnt_sb->s_magic != DEVPTS_SUPER_MAGIC) {
+		/* Is there an appropriate devpts filesystem in the parent
+		 * directory?
+		 */
+		err = devpts_ptmx_path(&path);
+		if (err) {
+			result = ERR_PTR(err);
+			goto out;
+		}
 	}
 
 	/*
-- 
2.15.1

^ permalink raw reply related	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2018-03-11 11:30 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-03-09 10:57 [PATCH v1] devpts: resolve devpts bind-mounts Christian Brauner
2018-03-09 18:37 ` Linus Torvalds
2018-03-11 11:30   ` Christian Brauner
  -- strict thread matches above, loose matches on Subject: below --
2018-03-09 10:24 Christian Brauner

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).