linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/3 v2] devpts: handle /dev/ptmx bind-mounts
@ 2018-03-11 21:05 Christian Brauner
  2018-03-11 21:05 ` [PATCH 1/3 v2] devpts: hoist out check for DEVPTS_SUPER_MAGIC Christian Brauner
                   ` (3 more replies)
  0 siblings, 4 replies; 6+ messages in thread
From: Christian Brauner @ 2018-03-11 21:05 UTC (permalink / raw)
  To: linux-kernel, ebiederm, torvalds; +Cc: Christian Brauner

Hey everyone,

This is the second iteration of this patch. Please note, that I added a
selftest for correct /proc/<pid>/fd/{0,1,2} symlinks and added it to
tools/testing/selftests/filesystem because the root cause of the problem is
located in the devpts filesystem. But I'm not sure if this is the right place.
It might also be argued that this should be under
tools/testing/selftests/drivers/pty.

Most libcs will still look at /dev/ptmx when opening the master fd of a pty
device. When /dev/ptmx is a bind-mount of /dev/pts/ptmx and the TIOCGPTPEER
ioctl() is used to safely retrieve a file descriptor for the slave side of
the pty based on the master fd, the /proc/self/fd/{0,1,2} symlinks will
point to /.

When the kernel tries to look up the root mount of the dentry for the slave
file descriptor it will detect that the dentry is escaping its bind-mount
since the root mount of the dentry is /dev/pts where the devpts is mounted
but the root mount of /dev/ptmx is /dev.

Having bind-mounts of /dev/pts/ptmx to /dev/ptmx not working correctly is a
regression. In addition, it is also a fairly common scenario in containers
employing user namespaces.

To handle bind-mounts of /dev/pts/ptmx to /dev/ptmx correctly we need to
walk up the bind-mounts for /dev/ptmx in devpts_mntget(). This also means
we need to remove the
"if (path->mnt->mnt_sb->s_magic == DEVPTS_SUPER_MAGIC)" check from
devpts_mntget(). However, devpts_acquire() needs to perform that check.
The reason is that while the devpts mounted at /dev/pts has devtmpfs
mounted at /dev as its parent mount:

21 -- -- / /dev
-- 21 -- / /dev/pts

they are on different devices

-- -- 0:6  / /dev
-- -- 0:20 / /dev/pts

which has the consequence that the pathname of the directory which forms
the root of the /dev/pts mount is /. So if we bind-mount /dev/pts/ptmx to
/dev/ptmx we will end up on the same device as the devtmpfs mount on
/dev/pts

-- -- 0:20 /ptmx /dev/ptmx

but given that / is the pathname of the directory which forms the root of
the /dev/pts bind-mount, the resulting source will be /ptmx. So in
devpts_acquire() we still need to check if
"(path->mnt->mnt_sb->s_magic == DEVPTS_SUPER_MAGIC)" and exit early before
we try to call path_pts(). If we call path_pts() in the bind-mount case we
will hit path_parent_directory() which will **not** yield /dev as parent
directory but rather / which will cause path_pts() to fail since it will
not find a "pts" directory underneath /. We should still perform the
path_pts() check in devpts_acquire() if we haven't found a suitable devpts
filesystem at open time but we should always try to exit early in
devpts_acquire() and defer resolving bind-mounts until devpts_mntget(). If
we fail in devpts_mntget() we will end up with a wrong
/proc/<pid>/fd/{0,1,2} symlink, if we fail in devpts_acquire() we will end
up with a failed open("/dev/ptmx") which is more troublesome.

Here's a little reproducer that presupposes a libc that uses TIOCGPTPEER in
its openpty() implementation:

unshare --mount
mount --bind /dev/pts/ptmx /dev/ptmx
chmod 666 /dev/ptmx
script
ls -al /proc/self/fd/0

with output:

lrwx------ 1 chb chb 64 Mar  7 16:41 /proc/self/fd/0 -> /

Christian Brauner (3):
  devpts: hoist out check for DEVPTS_SUPER_MAGIC
  devpts: resolve devpts bind-mounts
  selftests: add devpts selftests

 fs/devpts/inode.c                                |  32 ++--
 tools/testing/selftests/Makefile                 |   1 +
 tools/testing/selftests/filesystems/.gitignore   |   1 +
 tools/testing/selftests/filesystems/Makefile     |   3 +-
 tools/testing/selftests/filesystems/devpts_pts.c | 191 +++++++++++++++++++++++
 5 files changed, 217 insertions(+), 11 deletions(-)
 create mode 100644 tools/testing/selftests/filesystems/devpts_pts.c

---
ChangeLog v1->v2:
* patch for non-functional changes added
* patch for tests added
ChangeLog v0->v1:
* remove
        /* Has the devpts filesystem already been found? */
        if (path->mnt->mnt_sb->s_magic == DEVPTS_SUPER_MAGIC)
	                return 0
  from devpts_ptmx_path()
* check superblock after devpts_ptmx_path() returned

-- 
2.15.1

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH 1/3 v2] devpts: hoist out check for DEVPTS_SUPER_MAGIC
  2018-03-11 21:05 [PATCH 0/3 v2] devpts: handle /dev/ptmx bind-mounts Christian Brauner
@ 2018-03-11 21:05 ` Christian Brauner
  2018-03-11 21:05 ` [PATCH 2/3 v2] devpts: resolve devpts bind-mounts Christian Brauner
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 6+ messages in thread
From: Christian Brauner @ 2018-03-11 21:05 UTC (permalink / raw)
  To: linux-kernel, ebiederm, torvalds; +Cc: Christian Brauner

Hoist the check whether we have already found a suitable devpts filesystem
out of devpts_ptmx_path() in preparation for the devpts bind-mount
resolution patch. This is a non-functional change.

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
---
ChangeLog v1->v2:
* patch added
ChangeLog v0->v1:
* patch not present
---
 fs/devpts/inode.c | 35 ++++++++++++++++++++---------------
 1 file changed, 20 insertions(+), 15 deletions(-)

diff --git a/fs/devpts/inode.c b/fs/devpts/inode.c
index e31d6ed3ec32..d3ddbb888874 100644
--- a/fs/devpts/inode.c
+++ b/fs/devpts/inode.c
@@ -138,10 +138,6 @@ static int devpts_ptmx_path(struct path *path)
 	struct super_block *sb;
 	int err;
 
-	/* Has the devpts filesystem already been found? */
-	if (path->mnt->mnt_sb->s_magic == DEVPTS_SUPER_MAGIC)
-		return 0;
-
 	/* Is a devpts filesystem at "pts" in the same directory? */
 	err = path_pts(path);
 	if (err)
@@ -159,17 +155,22 @@ static int devpts_ptmx_path(struct path *path)
 struct vfsmount *devpts_mntget(struct file *filp, struct pts_fs_info *fsi)
 {
 	struct path path;
-	int err;
 
 	path = filp->f_path;
 	path_get(&path);
 
-	err = devpts_ptmx_path(&path);
-	dput(path.dentry);
-	if (err) {
-		mntput(path.mnt);
-		return ERR_PTR(err);
+	/* Has the devpts filesystem already been found? */
+	if (path.mnt->mnt_sb->s_magic != DEVPTS_SUPER_MAGIC) {
+		int err;
+
+		err = devpts_ptmx_path(&path);
+		dput(path.dentry);
+		if (err) {
+			mntput(path.mnt);
+			return ERR_PTR(err);
+		}
 	}
+
 	if (DEVPTS_SB(path.mnt->mnt_sb) != fsi) {
 		mntput(path.mnt);
 		return ERR_PTR(-ENODEV);
@@ -182,15 +183,19 @@ struct pts_fs_info *devpts_acquire(struct file *filp)
 	struct pts_fs_info *result;
 	struct path path;
 	struct super_block *sb;
-	int err;
 
 	path = filp->f_path;
 	path_get(&path);
 
-	err = devpts_ptmx_path(&path);
-	if (err) {
-		result = ERR_PTR(err);
-		goto out;
+	/* Has the devpts filesystem already been found? */
+	if (path.mnt->mnt_sb->s_magic != DEVPTS_SUPER_MAGIC) {
+		int err;
+
+		err = devpts_ptmx_path(&path);
+		if (err) {
+			result = ERR_PTR(err);
+			goto out;
+		}
 	}
 
 	/*
-- 
2.15.1

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH 2/3 v2] devpts: resolve devpts bind-mounts
  2018-03-11 21:05 [PATCH 0/3 v2] devpts: handle /dev/ptmx bind-mounts Christian Brauner
  2018-03-11 21:05 ` [PATCH 1/3 v2] devpts: hoist out check for DEVPTS_SUPER_MAGIC Christian Brauner
@ 2018-03-11 21:05 ` Christian Brauner
  2018-03-11 21:05 ` [PATCH 3/3 v2] selftests: add devpts selftests Christian Brauner
  2018-03-11 21:46 ` [PATCH 0/3 v2] devpts: handle /dev/ptmx bind-mounts Linus Torvalds
  3 siblings, 0 replies; 6+ messages in thread
From: Christian Brauner @ 2018-03-11 21:05 UTC (permalink / raw)
  To: linux-kernel, ebiederm, torvalds; +Cc: Christian Brauner

Most libcs will still look at /dev/ptmx when opening the master fd of a pty
device. When /dev/ptmx is a bind-mount of /dev/pts/ptmx and the TIOCGPTPEER
ioctl() is used to safely retrieve a file descriptor for the slave side of
the pty based on the master fd, the /proc/self/fd/{0,1,2} symlinks will
point to /.

When the kernel tries to look up the root mount of the dentry for the slave
file descriptor it will detect that the dentry is escaping its bind-mount
since the root mount of the dentry is /dev/pts where the devpts is mounted
but the root mount of /dev/ptmx is /dev.

Having bind-mounts of /dev/pts/ptmx to /dev/ptmx not working correctly is a
regression. In addition, it is also a fairly common scenario in containers
employing user namespaces.

To handle bind-mounts of /dev/pts/ptmx to /dev/ptmx correctly we need to
walk up the bind-mounts for /dev/ptmx in devpts_mntget(). This also means
we need to remove the
"if (path->mnt->mnt_sb->s_magic == DEVPTS_SUPER_MAGIC)" check from
devpts_mntget(). However, devpts_acquire() needs to perform that check.
The reason is that while the devpts mounted at /dev/pts has devtmpfs
mounted at /dev as its parent mount:

21 -- -- / /dev
-- 21 -- / /dev/pts

they are on different devices

-- -- 0:6  / /dev
-- -- 0:20 / /dev/pts

which has the consequence that the pathname of the directory which forms
the root of the /dev/pts mount is /. So if we bind-mount /dev/pts/ptmx to
/dev/ptmx we will end up on the same device as the devtmpfs mount on
/dev/pts

-- -- 0:20 /ptmx /dev/ptmx

but given that / is the pathname of the directory which forms the root of
the /dev/pts bind-mount, the resulting source will be /ptmx. So in
devpts_acquire() we still need to check if
"(path->mnt->mnt_sb->s_magic == DEVPTS_SUPER_MAGIC)" and exit early before
we try to call path_pts(). If we call path_pts() in the bind-mount case we
will hit path_parent_directory() which will **not** yield /dev as parent
directory but rather / which will cause path_pts() to fail since it will
not find a "pts" directory underneath /. We should still perform the
path_pts() check in devpts_acquire() if we haven't found a suitable devpts
filesystem at open time but we should always try to exit early in
devpts_acquire() and defer resolving bind-mounts until devpts_mntget(). If
we fail in devpts_mntget() we will end up with a wrong
/proc/<pid>/fd/{0,1,2} symlink, if we fail in devpts_acquire() we will end
up with a failed open("/dev/ptmx") which is more troublesome.

Here's a little reproducer that presupposes a libc that uses TIOCGPTPEER in
its openpty() implementation:

unshare --mount
mount --bind /dev/pts/ptmx /dev/ptmx
chmod 666 /dev/ptmx
script
ls -al /proc/self/fd/0

with output:

lrwx------ 1 chb chb 64 Mar  7 16:41 /proc/self/fd/0 -> /

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Suggested-by: Eric Biederman <ebiederm@xmission.com>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
---
ChangeLog v1->v2:
* move removal of if (path->mnt->mnt_sb->s_magic == DEVPTS_SUPER_MAGIC)
  condition to separate patch with non-functional changes
ChangeLog v0->v1:
* remove
        /* Has the devpts filesystem already been found? */
        if (path->mnt->mnt_sb->s_magic == DEVPTS_SUPER_MAGIC)
	                return 0
  from devpts_ptmx_path()
* check superblock after devpts_ptmx_path() returned
---
 fs/devpts/inode.c | 25 ++++++++++++++++---------
 1 file changed, 16 insertions(+), 9 deletions(-)

diff --git a/fs/devpts/inode.c b/fs/devpts/inode.c
index d3ddbb888874..249aba08e9e6 100644
--- a/fs/devpts/inode.c
+++ b/fs/devpts/inode.c
@@ -154,27 +154,34 @@ static int devpts_ptmx_path(struct path *path)
 
 struct vfsmount *devpts_mntget(struct file *filp, struct pts_fs_info *fsi)
 {
+	int err;
 	struct path path;
 
 	path = filp->f_path;
 	path_get(&path);
 
-	/* Has the devpts filesystem already been found? */
-	if (path.mnt->mnt_sb->s_magic != DEVPTS_SUPER_MAGIC) {
-		int err;
+	if ((DEVPTS_SB(path.mnt->mnt_sb) == fsi) &&
+	    (path.mnt->mnt_root == fsi->ptmx_dentry)) {
+		/* Walk upward while the start point is a bind mount of
+		 * a single file.
+		 */
+		while (path.mnt->mnt_root == path.dentry)
+			if (follow_up(&path) == 0)
+				break;
+	}
 
-		err = devpts_ptmx_path(&path);
-		dput(path.dentry);
-		if (err) {
-			mntput(path.mnt);
-			return ERR_PTR(err);
-		}
+	err = devpts_ptmx_path(&path);
+	dput(path.dentry);
+	if (err) {
+		mntput(path.mnt);
+		return ERR_PTR(err);
 	}
 
 	if (DEVPTS_SB(path.mnt->mnt_sb) != fsi) {
 		mntput(path.mnt);
 		return ERR_PTR(-ENODEV);
 	}
+
 	return path.mnt;
 }
 
-- 
2.15.1

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH 3/3 v2] selftests: add devpts selftests
  2018-03-11 21:05 [PATCH 0/3 v2] devpts: handle /dev/ptmx bind-mounts Christian Brauner
  2018-03-11 21:05 ` [PATCH 1/3 v2] devpts: hoist out check for DEVPTS_SUPER_MAGIC Christian Brauner
  2018-03-11 21:05 ` [PATCH 2/3 v2] devpts: resolve devpts bind-mounts Christian Brauner
@ 2018-03-11 21:05 ` Christian Brauner
  2018-03-11 21:46 ` [PATCH 0/3 v2] devpts: handle /dev/ptmx bind-mounts Linus Torvalds
  3 siblings, 0 replies; 6+ messages in thread
From: Christian Brauner @ 2018-03-11 21:05 UTC (permalink / raw)
  To: linux-kernel, ebiederm, torvalds; +Cc: Christian Brauner

This adds a simple test to check whether /proc/<pid>/fd/<nr> symlinks are
correctly pointing to /dev/pts/<idx> devices when attached to a terminal.

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
---
ChangeLog v1->v2:
* patch added
ChangeLog v0->v1:
* patch not present
---
 tools/testing/selftests/Makefile                 |   1 +
 tools/testing/selftests/filesystems/.gitignore   |   1 +
 tools/testing/selftests/filesystems/Makefile     |   3 +-
 tools/testing/selftests/filesystems/devpts_pts.c | 191 +++++++++++++++++++++++
 4 files changed, 195 insertions(+), 1 deletion(-)
 create mode 100644 tools/testing/selftests/filesystems/devpts_pts.c

diff --git a/tools/testing/selftests/Makefile b/tools/testing/selftests/Makefile
index 7442dfb73b7f..dbda89c9d9b9 100644
--- a/tools/testing/selftests/Makefile
+++ b/tools/testing/selftests/Makefile
@@ -7,6 +7,7 @@ TARGETS += cpufreq
 TARGETS += cpu-hotplug
 TARGETS += efivarfs
 TARGETS += exec
+TARGETS += filesystems
 TARGETS += firmware
 TARGETS += ftrace
 TARGETS += futex
diff --git a/tools/testing/selftests/filesystems/.gitignore b/tools/testing/selftests/filesystems/.gitignore
index 31d6e426b6d4..8449cf6716ce 100644
--- a/tools/testing/selftests/filesystems/.gitignore
+++ b/tools/testing/selftests/filesystems/.gitignore
@@ -1 +1,2 @@
 dnotify_test
+devpts_pts
diff --git a/tools/testing/selftests/filesystems/Makefile b/tools/testing/selftests/filesystems/Makefile
index 13a73bf725b5..b9b5d2f68990 100644
--- a/tools/testing/selftests/filesystems/Makefile
+++ b/tools/testing/selftests/filesystems/Makefile
@@ -1,5 +1,6 @@
 # SPDX-License-Identifier: GPL-2.0
-TEST_PROGS := dnotify_test
+TEST_PROGS := dnotify_test \
+	devpts_pts
 all: $(TEST_PROGS)
 
 include ../lib.mk
diff --git a/tools/testing/selftests/filesystems/devpts_pts.c b/tools/testing/selftests/filesystems/devpts_pts.c
new file mode 100644
index 000000000000..10a73e422600
--- /dev/null
+++ b/tools/testing/selftests/filesystems/devpts_pts.c
@@ -0,0 +1,191 @@
+// SPDX-License-Identifier: GPL-2.0
+#define _GNU_SOURCE
+#include <errno.h>
+#include <fcntl.h>
+#include <sched.h>
+#include <stdbool.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <unistd.h>
+#include <sys/ioctl.h>
+#include <sys/mount.h>
+#include <sys/wait.h>
+
+static bool terminal_dup2(int duplicate, int original)
+{
+	int ret;
+
+	ret = dup2(duplicate, original);
+	if (ret < 0)
+		return false;
+
+	return true;
+}
+
+static int lxc_terminal_set_stdfds(int fd)
+{
+	int i;
+
+	if (fd < 0)
+		return 0;
+
+	for (i = 0; i < 3; i++)
+		if (!terminal_dup2(fd, (int[]){STDIN_FILENO, STDOUT_FILENO,
+					       STDERR_FILENO}[i]))
+			return -1;
+
+	return 0;
+}
+
+static int login_pty(int fd)
+{
+	int ret;
+
+	setsid();
+
+	ret = ioctl(fd, TIOCSCTTY, NULL);
+	if (ret < 0)
+		return -1;
+
+	ret = lxc_terminal_set_stdfds(fd);
+	if (ret < 0)
+		return -1;
+
+	if (fd > STDERR_FILENO)
+		close(fd);
+
+	return 0;
+}
+
+static int wait_for_pid(pid_t pid)
+{
+	int status, ret;
+
+again:
+	ret = waitpid(pid, &status, 0);
+	if (ret == -1) {
+		if (errno == EINTR)
+			goto again;
+		return -1;
+	}
+	if (ret != pid)
+		goto again;
+
+	if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
+		return -1;
+
+	return 0;
+}
+
+int main(int argc, char *argv[])
+{
+	int master = -1, ret = -1, slave = -1;
+	int fret = EXIT_FAILURE;
+
+	if (!isatty(STDIN_FILENO)) {
+		fprintf(stderr, "Standard input file desciptor is not attached "
+				"to a terminal. Skipping test\n");
+		goto do_cleanup;
+	}
+
+	ret = unshare(CLONE_NEWNS);
+	if (ret < 0) {
+		fprintf(stderr, "Failed to unshare mount namespace\n");
+		goto do_cleanup;
+	}
+
+	ret = mount("", "/", NULL, MS_PRIVATE | MS_REC, 0);
+	if (ret < 0) {
+		fprintf(stderr, "Failed to make \"/\" MS_PRIVATE in new mount "
+				"namespace\n");
+		goto do_cleanup;
+	}
+
+	ret = mount("/dev/pts/ptmx", "/dev/ptmx", NULL, MS_BIND, NULL);
+	if (ret < 0) {
+		fprintf(stderr, "Failed to bind mount \"/dev/pts/ptmx\" to "
+				"\"/dev/ptmx\" mount namespace\n");
+		goto do_cleanup;
+	}
+
+	master = open("/dev/ptmx", O_RDWR | O_NOCTTY | O_CLOEXEC);
+	if (master < 0) {
+		fprintf(stderr, "Failed to open \"/dev/ptmx\"\n");
+		goto do_cleanup;
+	}
+
+	ret = grantpt(master);
+	if (ret < 0) {
+		fprintf(stderr, "Failed to grant access to terminal\n");
+		goto do_cleanup;
+	}
+
+	ret = unlockpt(master);
+	if (ret < 0) {
+		fprintf(stderr, "Failed to unlock terminal\n");
+		goto do_cleanup;
+	}
+
+#ifdef TIOCGPTPEER
+	slave = ioctl(master, TIOCGPTPEER, O_RDWR | O_NOCTTY | O_CLOEXEC);
+#endif
+	if (slave < 0) {
+		if (errno == EINVAL) {
+			fprintf(stderr, "TIOCGPTPEER is not supported. "
+					"Skipping test.\n");
+			fret = EXIT_SUCCESS;
+		}
+
+		fprintf(stderr, "Failed to perform TIOCGPTPEER ioctl\n");
+		goto do_cleanup;
+	}
+
+	pid_t pid = fork();
+	if (pid < 0)
+		goto do_cleanup;
+
+	if (pid == 0) {
+                char buf[4096];
+
+		ret = login_pty(slave);
+		if (ret < 0) {
+			fprintf(stderr, "Failed to setup terminal\n");
+			_exit(EXIT_FAILURE);
+		}
+
+		ret = readlink("/proc/self/fd/0", buf, sizeof(buf));
+		if (ret < 0 || (size_t)ret >= sizeof(buf)) {
+			fprintf(stderr, "Failed to retrieve pathname of pts "
+					"slave file descriptor\n");
+			_exit(EXIT_FAILURE);
+		}
+		buf[ret] = '\0';
+
+		if (strncmp(buf, "/dev/pts/", strlen("/dev/pts/"))) {
+			fprintf(stderr, "Received invalid contents for "
+					"\"/proc/<pid>/fd/<nr>\" symlink: %s\n",
+					buf);
+			_exit(-1);
+		}
+
+		fprintf(stderr, "Contents of \"/proc/<pid>/fd/<nr>\" "
+				"symlink are valid: %s\n", buf);
+
+		_exit(EXIT_SUCCESS);
+	}
+
+	ret = wait_for_pid(pid);
+	if (ret < 0)
+		goto do_cleanup;
+
+	fret = EXIT_SUCCESS;
+
+do_cleanup:
+	if (master >= 0)
+		close(master);
+	if (slave >= 0)
+		close(slave);
+
+	return ret;
+}
-- 
2.15.1

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH 0/3 v2] devpts: handle /dev/ptmx bind-mounts
  2018-03-11 21:05 [PATCH 0/3 v2] devpts: handle /dev/ptmx bind-mounts Christian Brauner
                   ` (2 preceding siblings ...)
  2018-03-11 21:05 ` [PATCH 3/3 v2] selftests: add devpts selftests Christian Brauner
@ 2018-03-11 21:46 ` Linus Torvalds
  2018-03-12 14:07   ` Christian Brauner
  3 siblings, 1 reply; 6+ messages in thread
From: Linus Torvalds @ 2018-03-11 21:46 UTC (permalink / raw)
  To: Christian Brauner, Al Viro; +Cc: Linux Kernel Mailing List, Eric W. Biederman

On Sun, Mar 11, 2018 at 2:05 PM, Christian Brauner
<christian.brauner@ubuntu.com> wrote:
>
> This is the second iteration of this patch.

This looks good to me. Just wondering how this should be merged, and
whether we should have a Cc: stable for it?

.. and, just in case, maybe Al can verify that there's nothing subtle
about follow_up() that we need to worry about. That said, NFS already
has that exact same loop for follow_to_parent(), just syntactically
slightly different version.

In fact, I wonder if we even need to do that

        if ((DEVPTS_SB(path.mnt->mnt_sb) == fsi) &&
            (path.mnt->mnt_root == fsi->ptmx_dentry)) {

and maybe we could do the follow_up() loop unconditionally?

Because if the ptmx dentry is *not* a bind mount, then the loop will
be a no-op, and if it *is* a bind-mount, then I'm not convinced we
should even try to just limit it to the devpts case - maybe somebody
did a bind-mount on just a legacy ptmx device node?

So that "if()" actually seems to be to be superfluous, and only limit
the "follow bind mounts' case unnecessarily. Hmm?

           Linus

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 0/3 v2] devpts: handle /dev/ptmx bind-mounts
  2018-03-11 21:46 ` [PATCH 0/3 v2] devpts: handle /dev/ptmx bind-mounts Linus Torvalds
@ 2018-03-12 14:07   ` Christian Brauner
  0 siblings, 0 replies; 6+ messages in thread
From: Christian Brauner @ 2018-03-12 14:07 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Christian Brauner, Al Viro, Linux Kernel Mailing List, Eric W. Biederman

On Sun, Mar 11, 2018 at 02:46:26PM -0700, Linus Torvalds wrote:
> On Sun, Mar 11, 2018 at 2:05 PM, Christian Brauner
> <christian.brauner@ubuntu.com> wrote:
> >
> > This is the second iteration of this patch.
> 
> This looks good to me. Just wondering how this should be merged, and
> whether we should have a Cc: stable for it?
> 
> .. and, just in case, maybe Al can verify that there's nothing subtle
> about follow_up() that we need to worry about. That said, NFS already

Right, sorry I forgot to CC him again after the first version of the
patch.

> has that exact same loop for follow_to_parent(), just syntactically
> slightly different version.

Once we got devpts worked out I can send a follow-up patch that adds a
helper to namei.c that walks up bind-mounts. Seems it would make sense
as I can see this being useful in general. To be honest, before this I
had no real concept what resolving a bind-mount inside the kernel means
which was why I was worried to just bang out a patch without discussing
it first. Anyway, that should be a small follow-up.

> 
> In fact, I wonder if we even need to do that
> 
>         if ((DEVPTS_SB(path.mnt->mnt_sb) == fsi) &&
>             (path.mnt->mnt_root == fsi->ptmx_dentry)) {
> 
> and maybe we could do the follow_up() loop unconditionally?
> 
> Because if the ptmx dentry is *not* a bind mount, then the loop will
> be a no-op, and if it *is* a bind-mount, then I'm not convinced we
> should even try to just limit it to the devpts case - maybe somebody
> did a bind-mount on just a legacy ptmx device node?

Hm, I think we want to keep the condition and the reasoning points to a
snag in the current patch I think:
So in case that e.g. /dev/ptmx or /some/other/place is indeed a
bind-mount of a devpts mounted somewhere else we want to give userspace
a chance and check whether we find a suitable "pts" directory in the
same directory where the bind-mount is. This ensures that the paths in
/proc/<pid>/fd/<nr> are correct and that operations like
readlink("/proc/<pid>/fd/<nr>", buf, sizeof(buf))
chown(buf, 0, 0)
are actually sane and don't suddenly chown() / instead of
/<devpts>/<idx>. This also let's us easily support libcs still going
through /dev/ptmx instead of /dev/pts/ptmx [1].
But in case the passed in path is not a bind-mount we actually don't
want to call devpts_ptmx_path() without the

/* Has the devpts filesystem already been found? */
if (path->mnt->mnt_sb->s_magic != DEVPTS_SUPER_MAGIC)

check being performed because if (path->mnt->mnt_sb->s_magic == DEVPTS_SUPER_MAGIC)
holds and we still call path_pts() it will try to look for a "pts" mount
in the parent directory but there's no requirement that I know of for a
devpts to only be mounted at /dev/pts or /<somewhere>/pts. This means
that the current logic would cause us to e.g. break stuff like
mount -t devpts devpts /wherever which we shouldn't do. (That's a
discussion we had in the previous round of devpts fixes.) So I think we
actually want devpts_mntget() to do:

struct vfsmount *devpts_mntget(struct file *filp, struct pts_fs_info *fsi)
{
	bool unwind;
	struct path path;
	int err = 0;

	path = filp->f_path;
	path_get(&path);

	unwind = (DEVPTS_SB(path.mnt->mnt_sb) == fsi) &&
		 (path.mnt->mnt_root == fsi->ptmx_dentry);
	/* Walk upward while the start point is a bind mount of
	 * a single file.
	 */
	while (path.mnt->mnt_root == path.dentry && unwind)
		if (follow_up(&path) == 0)
			break;

	if ((path.mnt->mnt_sb->s_magic != DEVPTS_SUPER_MAGIC) || unwind)
		err = devpts_ptmx_path(&path);
	dput(path.dentry);
	if (err) {
		mntput(path.mnt);
		return ERR_PTR(err);
	}

	if (DEVPTS_SB(path.mnt->mnt_sb) != fsi) {
		mntput(path.mnt);
		return ERR_PTR(-ENODEV);
	}

	return path.mnt;
}

Does that look sane?

I'll send a new version of the patch today and will add an additional
test for devpts being mounted at a different location.

Christian

> 
> So that "if()" actually seems to be to be superfluous, and only limit
> the "follow bind mounts' case unnecessarily. Hmm?

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2018-03-12 14:07 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-03-11 21:05 [PATCH 0/3 v2] devpts: handle /dev/ptmx bind-mounts Christian Brauner
2018-03-11 21:05 ` [PATCH 1/3 v2] devpts: hoist out check for DEVPTS_SUPER_MAGIC Christian Brauner
2018-03-11 21:05 ` [PATCH 2/3 v2] devpts: resolve devpts bind-mounts Christian Brauner
2018-03-11 21:05 ` [PATCH 3/3 v2] selftests: add devpts selftests Christian Brauner
2018-03-11 21:46 ` [PATCH 0/3 v2] devpts: handle /dev/ptmx bind-mounts Linus Torvalds
2018-03-12 14:07   ` Christian Brauner

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).