linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC] proc: don't show kernel-internal mnt_id in fdinfo (if possible)
@ 2016-11-07 16:06 Dmitry Safonov
  2016-11-07 17:06 ` Eric W. Biederman
  0 siblings, 1 reply; 3+ messages in thread
From: Dmitry Safonov @ 2016-11-07 16:06 UTC (permalink / raw)
  To: linux-kernel
  Cc: 0x7f454c46, Dmitry Safonov, Al Viro, Andrey Vagin,
	Eric W. Biederman, Rob Landley

Some kernel special fs could be mounted by userspace, let's show
userspace mnt_id in those cases.

Otherwise, I got:
  [~]# cat /proc/11299/fdinfo/3
  pos:	0
  flags:	02000002
  mnt_id:	14
  [~]# cat /proc/11299/mountinfo | grep '^14'
  [~]# ls -l /proc/11299/fd/3
  lrwx------. 1 root root 64 Nov  7 18:30 /proc/11299/fd/3 -> /test-queue
  [~]# ls /dev/mqueue/
  test-queue
  [~]# cat /proc/11299/mountinfo | grep mqueue
  32 18 0:14 / /dev/mqueue rw,relatime shared:17 - mqueue mqueue rw,seclabel

This happens because mqueue fs is mounted twice:
- the first is kernel-internal mnt on init:
  kernel_init=>do_one_initcall=>init_mqueue_fs=>mq_init_ns=>vfs_kern_mount
- the second time it's systemd's mount-unit:
  entry_SYSCALL_64_fastpath=>SyS_mount=>do_mount=>vfs_kern_mount

For the purpose of userspace parsing, having in-kernel mnt_id is less
useful then mnt_id of mount point: afterwards I'm able to see fs type,
path, etc:
  [~]# cat /proc/11152/mountinfo | grep mqueue
  32 18 0:14 / /dev/mqueue rw,relatime shared:18 - mqueue mqueue rw,seclabel
  [~]# cat /proc/11152/fdinfo/3
  pos:	0
  flags:	02000002
  mnt_id:	32

On a bad side - if we've no userspace mount, we still can't tell a thing
about opened fd..

Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andrey Vagin <avagin@openvz.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Rob Landley <rob@landley.net>
Signed-off-by: Dmitry Safonov <dsafonov@virtuozzo.com>
---
 fs/proc/fd.c | 24 +++++++++++++++++++++---
 1 file changed, 21 insertions(+), 3 deletions(-)

diff --git a/fs/proc/fd.c b/fs/proc/fd.c
index d21dafef3102..bfa8699bcd8e 100644
--- a/fs/proc/fd.c
+++ b/fs/proc/fd.c
@@ -9,6 +9,7 @@
 #include <linux/file.h>
 #include <linux/seq_file.h>
 #include <linux/fs.h>
+#include <linux/nsproxy.h>
 
 #include <linux/proc_fs.h>
 
@@ -19,9 +20,10 @@
 static int seq_show(struct seq_file *m, void *v)
 {
 	struct files_struct *files = NULL;
-	int f_flags = 0, ret = -ENOENT;
+	int f_flags = 0, ret = -ENOENT, mnt_id = 0;
 	struct file *file = NULL;
 	struct task_struct *task;
+	struct mount *mount;
 
 	task = get_proc_task(m->private);
 	if (!task)
@@ -52,9 +54,25 @@ static int seq_show(struct seq_file *m, void *v)
 	if (ret)
 		return ret;
 
+	mount = real_mount(file->f_path.mnt);
+	if (mount->mnt_ns == MNT_NS_INTERNAL) {
+		struct mount *mnt;
+
+		lock_mount_hash();
+		list_for_each_entry(mnt, &mount->mnt_instance, mnt_instance) {
+			if (current->nsproxy->mnt_ns == mnt->mnt_ns) {
+				mnt_id = mnt->mnt_id;
+				break;
+			}
+		}
+		unlock_mount_hash();
+	}
+
+	if (mnt_id == 0)
+		mnt_id = mount->mnt_id;
+
 	seq_printf(m, "pos:\t%lli\nflags:\t0%o\nmnt_id:\t%i\n",
-		   (long long)file->f_pos, f_flags,
-		   real_mount(file->f_path.mnt)->mnt_id);
+		   (long long)file->f_pos, f_flags, mnt_id);
 
 	show_fd_locks(m, file, files);
 	if (seq_has_overflowed(m))
-- 
2.10.2

^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [RFC] proc: don't show kernel-internal mnt_id in fdinfo (if possible)
  2016-11-07 16:06 [RFC] proc: don't show kernel-internal mnt_id in fdinfo (if possible) Dmitry Safonov
@ 2016-11-07 17:06 ` Eric W. Biederman
  2016-11-07 17:30   ` Dmitry Safonov
  0 siblings, 1 reply; 3+ messages in thread
From: Eric W. Biederman @ 2016-11-07 17:06 UTC (permalink / raw)
  To: Dmitry Safonov
  Cc: linux-kernel, 0x7f454c46, Al Viro, Andrey Vagin, Rob Landley

Dmitry Safonov <dsafonov@virtuozzo.com> writes:

> Some kernel special fs could be mounted by userspace, let's show
> userspace mnt_id in those cases.

You are asking the kernel to lie to userspace in the case when
the truth in inconvenient.  That seems blantantly wrong.

In the case of an internal mount you may want to use the device id
of the device the filesystem is mounted on so you can tie all of
the mounts together.  That would allow restore and other things
to do something useful.

That information is available with fstat on a file descriptor so I don't
think we need to export it.  But if it is not available I can see
the point of a patch.

Outright ling to userpsace.  No.  That is just horrible.

Nacked-by: "Eric W. Biederman" <ebiederm@xmission.com>

>
> Otherwise, I got:
>   [~]# cat /proc/11299/fdinfo/3
>   pos:	0
>   flags:	02000002
>   mnt_id:	14
>   [~]# cat /proc/11299/mountinfo | grep '^14'
>   [~]# ls -l /proc/11299/fd/3
>   lrwx------. 1 root root 64 Nov  7 18:30 /proc/11299/fd/3 -> /test-queue
>   [~]# ls /dev/mqueue/
>   test-queue
>   [~]# cat /proc/11299/mountinfo | grep mqueue
>   32 18 0:14 / /dev/mqueue rw,relatime shared:17 - mqueue mqueue rw,seclabel
>
> This happens because mqueue fs is mounted twice:
> - the first is kernel-internal mnt on init:
>   kernel_init=>do_one_initcall=>init_mqueue_fs=>mq_init_ns=>vfs_kern_mount
> - the second time it's systemd's mount-unit:
>   entry_SYSCALL_64_fastpath=>SyS_mount=>do_mount=>vfs_kern_mount
>
> For the purpose of userspace parsing, having in-kernel mnt_id is less
> useful then mnt_id of mount point: afterwards I'm able to see fs type,
> path, etc:
>   [~]# cat /proc/11152/mountinfo | grep mqueue
>   32 18 0:14 / /dev/mqueue rw,relatime shared:18 - mqueue mqueue rw,seclabel
>   [~]# cat /proc/11152/fdinfo/3
>   pos:	0
>   flags:	02000002
>   mnt_id:	32
>
> On a bad side - if we've no userspace mount, we still can't tell a thing
> about opened fd..
>
> Cc: Al Viro <viro@zeniv.linux.org.uk>
> Cc: Andrey Vagin <avagin@openvz.org>
> Cc: "Eric W. Biederman" <ebiederm@xmission.com>
> Cc: Rob Landley <rob@landley.net>
> Signed-off-by: Dmitry Safonov <dsafonov@virtuozzo.com>
> ---
>  fs/proc/fd.c | 24 +++++++++++++++++++++---
>  1 file changed, 21 insertions(+), 3 deletions(-)
>
> diff --git a/fs/proc/fd.c b/fs/proc/fd.c
> index d21dafef3102..bfa8699bcd8e 100644
> --- a/fs/proc/fd.c
> +++ b/fs/proc/fd.c
> @@ -9,6 +9,7 @@
>  #include <linux/file.h>
>  #include <linux/seq_file.h>
>  #include <linux/fs.h>
> +#include <linux/nsproxy.h>
>  
>  #include <linux/proc_fs.h>
>  
> @@ -19,9 +20,10 @@
>  static int seq_show(struct seq_file *m, void *v)
>  {
>  	struct files_struct *files = NULL;
> -	int f_flags = 0, ret = -ENOENT;
> +	int f_flags = 0, ret = -ENOENT, mnt_id = 0;
>  	struct file *file = NULL;
>  	struct task_struct *task;
> +	struct mount *mount;
>  
>  	task = get_proc_task(m->private);
>  	if (!task)
> @@ -52,9 +54,25 @@ static int seq_show(struct seq_file *m, void *v)
>  	if (ret)
>  		return ret;
>  
> +	mount = real_mount(file->f_path.mnt);
> +	if (mount->mnt_ns == MNT_NS_INTERNAL) {
> +		struct mount *mnt;
> +
> +		lock_mount_hash();
> +		list_for_each_entry(mnt, &mount->mnt_instance, mnt_instance) {
> +			if (current->nsproxy->mnt_ns == mnt->mnt_ns) {
> +				mnt_id = mnt->mnt_id;
> +				break;
> +			}
> +		}
> +		unlock_mount_hash();
> +	}
> +
> +	if (mnt_id == 0)
> +		mnt_id = mount->mnt_id;
> +
>  	seq_printf(m, "pos:\t%lli\nflags:\t0%o\nmnt_id:\t%i\n",
> -		   (long long)file->f_pos, f_flags,
> -		   real_mount(file->f_path.mnt)->mnt_id);
> +		   (long long)file->f_pos, f_flags, mnt_id);
>  
>  	show_fd_locks(m, file, files);
>  	if (seq_has_overflowed(m))

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [RFC] proc: don't show kernel-internal mnt_id in fdinfo (if possible)
  2016-11-07 17:06 ` Eric W. Biederman
@ 2016-11-07 17:30   ` Dmitry Safonov
  0 siblings, 0 replies; 3+ messages in thread
From: Dmitry Safonov @ 2016-11-07 17:30 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: linux-kernel, 0x7f454c46, Al Viro, Andrey Vagin, Rob Landley

On 11/07/2016 08:06 PM, Eric W. Biederman wrote:
> Dmitry Safonov <dsafonov@virtuozzo.com> writes:
>
>> Some kernel special fs could be mounted by userspace, let's show
>> userspace mnt_id in those cases.
>
> You are asking the kernel to lie to userspace in the case when
> the truth in inconvenient.  That seems blantantly wrong.
>
> In the case of an internal mount you may want to use the device id
> of the device the filesystem is mounted on so you can tie all of
> the mounts together.  That would allow restore and other things
> to do something useful.
>
> That information is available with fstat on a file descriptor so I don't
> think we need to export it.  But if it is not available I can see
> the point of a patch.
>
> Outright ling to userpsace.  No.  That is just horrible.
>
> Nacked-by: "Eric W. Biederman" <ebiederm@xmission.com>

Ah, ok right, I see your point - then let's just drop the patch.

Thanks,
              Dmitry

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2016-11-07 19:07 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-11-07 16:06 [RFC] proc: don't show kernel-internal mnt_id in fdinfo (if possible) Dmitry Safonov
2016-11-07 17:06 ` Eric W. Biederman
2016-11-07 17:30   ` Dmitry Safonov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).