All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC PATCH] procfs: Add file path and size to /proc/<pid>/fdinfo
@ 2022-05-19 21:40 ` Kalesh Singh
  0 siblings, 0 replies; 16+ messages in thread
From: Kalesh Singh @ 2022-05-19 21:40 UTC (permalink / raw)
  Cc: ilkos, tjmercier, surenb, kernel-team, Kalesh Singh,
	Jonathan Corbet, Sumit Semwal, Christian König,
	Andrew Morton, Christoph Anton Mitterer, Kees Cook,
	Mike Rapoport, Colin Cross, Randy Dunlap, linux-kernel,
	linux-fsdevel, linux-doc, linux-media, dri-devel, linaro-mm-sig

Processes can pin shared memory by keeping a handle to it through a
file descriptor; for instance dmabufs, memfd, and ashsmem (in Android).

In the case of a memory leak, to identify the process pinning the
memory, userspace needs to:
  - Iterate the /proc/<pid>/fd/* for each process
  - Do a readlink on each entry to identify the type of memory from
    the file path.
  - stat() each entry to get the size of the memory.

The file permissions on /proc/<pid>/fd/* only allows for the owner
or root to perform the operations above; and so is not suitable for
capturing the system-wide state in a production environment.

This issue was addressed for dmabufs by making /proc/*/fdinfo/*
accessible to a process with PTRACE_MODE_READ_FSCREDS credentials[1]
To allow the same kind of tracking for other types of shared memory,
add the following fields to /proc/<pid>/fdinfo/<fd>:

path - This allows identifying the type of memory based on common
       prefixes: e.g. "/memfd...", "/dmabuf...", "/dev/ashmem..."

       This was not an issued when dmabuf tracking was introduced
       because the exp_name field of dmabuf fdinfo could be used
       to distinguish dmabuf fds from other types.

size - To track the amount of memory that is being pinned.

       dmabufs expose size as an additional field in fdinfo. Remove
       this and make it a common field for all fds.

Access to /proc/<pid>/fdinfo is governed by PTRACE_MODE_READ_FSCREDS
-- the same as for /proc/<pid>/maps which also exposes the path and
size for mapped memory regions.

This allows for a system process with PTRACE_MODE_READ_FSCREDS to
account the pinned per-process memory via fdinfo.

[1] https://lore.kernel.org/lkml/20210308170651.919148-1-kaleshsingh@google.com/

Signed-off-by: Kalesh Singh <kaleshsingh@google.com>
---
 Documentation/filesystems/proc.rst | 22 ++++++++++++++++++++--
 drivers/dma-buf/dma-buf.c          |  1 -
 fs/proc/fd.c                       |  9 +++++++--
 3 files changed, 27 insertions(+), 5 deletions(-)

diff --git a/Documentation/filesystems/proc.rst b/Documentation/filesystems/proc.rst
index 061744c436d9..ad66d78aca51 100644
--- a/Documentation/filesystems/proc.rst
+++ b/Documentation/filesystems/proc.rst
@@ -1922,13 +1922,16 @@ if precise results are needed.
 3.8	/proc/<pid>/fdinfo/<fd> - Information about opened file
 ---------------------------------------------------------------
 This file provides information associated with an opened file. The regular
-files have at least four fields -- 'pos', 'flags', 'mnt_id' and 'ino'.
+files have at least six fields -- 'pos', 'flags', 'mnt_id', 'ino', 'size',
+and 'path'.
+
 The 'pos' represents the current offset of the opened file in decimal
 form [see lseek(2) for details], 'flags' denotes the octal O_xxx mask the
 file has been created with [see open(2) for details] and 'mnt_id' represents
 mount ID of the file system containing the opened file [see 3.5
 /proc/<pid>/mountinfo for details]. 'ino' represents the inode number of
-the file.
+the file, 'size' represents the size of the file in bytes, and 'path'
+represents the file path.
 
 A typical output is::
 
@@ -1936,6 +1939,8 @@ A typical output is::
 	flags:	0100002
 	mnt_id:	19
 	ino:	63107
+        size:   0
+        path:   /dev/null
 
 All locks associated with a file descriptor are shown in its fdinfo too::
 
@@ -1953,6 +1958,8 @@ Eventfd files
 	flags:	04002
 	mnt_id:	9
 	ino:	63107
+        size:   0
+        path:   anon_inode:[eventfd]
 	eventfd-count:	5a
 
 where 'eventfd-count' is hex value of a counter.
@@ -1966,6 +1973,8 @@ Signalfd files
 	flags:	04002
 	mnt_id:	9
 	ino:	63107
+        size:   0
+        path:   anon_inode:[signalfd]
 	sigmask:	0000000000000200
 
 where 'sigmask' is hex value of the signal mask associated
@@ -1980,6 +1989,8 @@ Epoll files
 	flags:	02
 	mnt_id:	9
 	ino:	63107
+        size:   0
+        path:   anon_inode:[eventpoll]
 	tfd:        5 events:       1d data: ffffffffffffffff pos:0 ino:61af sdev:7
 
 where 'tfd' is a target file descriptor number in decimal form,
@@ -1998,6 +2009,8 @@ For inotify files the format is the following::
 	flags:	02000000
 	mnt_id:	9
 	ino:	63107
+        size:   0
+        path:   anon_inode:inotify
 	inotify wd:3 ino:9e7e sdev:800013 mask:800afce ignored_mask:0 fhandle-bytes:8 fhandle-type:1 f_handle:7e9e0000640d1b6d
 
 where 'wd' is a watch descriptor in decimal form, i.e. a target file
@@ -2021,6 +2034,8 @@ For fanotify files the format is::
 	flags:	02
 	mnt_id:	9
 	ino:	63107
+        size:   0
+        path:   anon_inode:[fanotify]
 	fanotify flags:10 event-flags:0
 	fanotify mnt_id:12 mflags:40 mask:38 ignored_mask:40000003
 	fanotify ino:4f969 sdev:800013 mflags:0 mask:3b ignored_mask:40000000 fhandle-bytes:8 fhandle-type:1 f_handle:69f90400c275b5b4
@@ -2046,6 +2061,8 @@ Timerfd files
 	flags:	02
 	mnt_id:	9
 	ino:	63107
+        size:   0
+        path:   anon_inode:[timerfd]
 	clockid: 0
 	ticks: 0
 	settime flags: 01
@@ -2070,6 +2087,7 @@ DMA Buffer files
 	mnt_id:	9
 	ino:	63107
 	size:   32768
+        path:   /dmabuf:
 	count:  2
 	exp_name:  system-heap
 
diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index b1e25ae98302..d61183ff3c30 100644
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -377,7 +377,6 @@ static void dma_buf_show_fdinfo(struct seq_file *m, struct file *file)
 {
 	struct dma_buf *dmabuf = file->private_data;
 
-	seq_printf(m, "size:\t%zu\n", dmabuf->size);
 	/* Don't count the temporary reference taken inside procfs seq_show */
 	seq_printf(m, "count:\t%ld\n", file_count(dmabuf->file) - 1);
 	seq_printf(m, "exp_name:\t%s\n", dmabuf->exp_name);
diff --git a/fs/proc/fd.c b/fs/proc/fd.c
index 913bef0d2a36..a8a968bc58f0 100644
--- a/fs/proc/fd.c
+++ b/fs/proc/fd.c
@@ -54,10 +54,15 @@ static int seq_show(struct seq_file *m, void *v)
 	if (ret)
 		return ret;
 
-	seq_printf(m, "pos:\t%lli\nflags:\t0%o\nmnt_id:\t%i\nino:\t%lu\n",
+	seq_printf(m, "pos:\t%lli\nflags:\t0%o\nmnt_id:\t%i\nino:\t%lu\nsize:\t%zu\n",
 		   (long long)file->f_pos, f_flags,
 		   real_mount(file->f_path.mnt)->mnt_id,
-		   file_inode(file)->i_ino);
+		   file_inode(file)->i_ino,
+		   file_inode(file)->i_size);
+
+	seq_puts(m, "path:\t");
+	seq_file_path(m, file, "\n");
+	seq_putc(m, '\n');
 
 	/* show_fd_locks() never deferences files so a stale value is safe */
 	show_fd_locks(m, file, files);

base-commit: b015dcd62b86d298829990f8261d5d154b8d7af5
-- 
2.36.1.124.g0e6072fb45-goog


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [RFC PATCH] procfs: Add file path and size to /proc/<pid>/fdinfo
@ 2022-05-19 21:40 ` Kalesh Singh
  0 siblings, 0 replies; 16+ messages in thread
From: Kalesh Singh @ 2022-05-19 21:40 UTC (permalink / raw)
  Cc: linux-doc, Andrew Morton, Jonathan Corbet,
	Christoph Anton Mitterer, kernel-team, Randy Dunlap, ilkos,
	linux-kernel, Colin Cross, Sumit Semwal, linaro-mm-sig,
	dri-devel, Kalesh Singh, linux-fsdevel, Christian König,
	linux-media, surenb, tjmercier, Mike Rapoport, Kees Cook

Processes can pin shared memory by keeping a handle to it through a
file descriptor; for instance dmabufs, memfd, and ashsmem (in Android).

In the case of a memory leak, to identify the process pinning the
memory, userspace needs to:
  - Iterate the /proc/<pid>/fd/* for each process
  - Do a readlink on each entry to identify the type of memory from
    the file path.
  - stat() each entry to get the size of the memory.

The file permissions on /proc/<pid>/fd/* only allows for the owner
or root to perform the operations above; and so is not suitable for
capturing the system-wide state in a production environment.

This issue was addressed for dmabufs by making /proc/*/fdinfo/*
accessible to a process with PTRACE_MODE_READ_FSCREDS credentials[1]
To allow the same kind of tracking for other types of shared memory,
add the following fields to /proc/<pid>/fdinfo/<fd>:

path - This allows identifying the type of memory based on common
       prefixes: e.g. "/memfd...", "/dmabuf...", "/dev/ashmem..."

       This was not an issued when dmabuf tracking was introduced
       because the exp_name field of dmabuf fdinfo could be used
       to distinguish dmabuf fds from other types.

size - To track the amount of memory that is being pinned.

       dmabufs expose size as an additional field in fdinfo. Remove
       this and make it a common field for all fds.

Access to /proc/<pid>/fdinfo is governed by PTRACE_MODE_READ_FSCREDS
-- the same as for /proc/<pid>/maps which also exposes the path and
size for mapped memory regions.

This allows for a system process with PTRACE_MODE_READ_FSCREDS to
account the pinned per-process memory via fdinfo.

[1] https://lore.kernel.org/lkml/20210308170651.919148-1-kaleshsingh@google.com/

Signed-off-by: Kalesh Singh <kaleshsingh@google.com>
---
 Documentation/filesystems/proc.rst | 22 ++++++++++++++++++++--
 drivers/dma-buf/dma-buf.c          |  1 -
 fs/proc/fd.c                       |  9 +++++++--
 3 files changed, 27 insertions(+), 5 deletions(-)

diff --git a/Documentation/filesystems/proc.rst b/Documentation/filesystems/proc.rst
index 061744c436d9..ad66d78aca51 100644
--- a/Documentation/filesystems/proc.rst
+++ b/Documentation/filesystems/proc.rst
@@ -1922,13 +1922,16 @@ if precise results are needed.
 3.8	/proc/<pid>/fdinfo/<fd> - Information about opened file
 ---------------------------------------------------------------
 This file provides information associated with an opened file. The regular
-files have at least four fields -- 'pos', 'flags', 'mnt_id' and 'ino'.
+files have at least six fields -- 'pos', 'flags', 'mnt_id', 'ino', 'size',
+and 'path'.
+
 The 'pos' represents the current offset of the opened file in decimal
 form [see lseek(2) for details], 'flags' denotes the octal O_xxx mask the
 file has been created with [see open(2) for details] and 'mnt_id' represents
 mount ID of the file system containing the opened file [see 3.5
 /proc/<pid>/mountinfo for details]. 'ino' represents the inode number of
-the file.
+the file, 'size' represents the size of the file in bytes, and 'path'
+represents the file path.
 
 A typical output is::
 
@@ -1936,6 +1939,8 @@ A typical output is::
 	flags:	0100002
 	mnt_id:	19
 	ino:	63107
+        size:   0
+        path:   /dev/null
 
 All locks associated with a file descriptor are shown in its fdinfo too::
 
@@ -1953,6 +1958,8 @@ Eventfd files
 	flags:	04002
 	mnt_id:	9
 	ino:	63107
+        size:   0
+        path:   anon_inode:[eventfd]
 	eventfd-count:	5a
 
 where 'eventfd-count' is hex value of a counter.
@@ -1966,6 +1973,8 @@ Signalfd files
 	flags:	04002
 	mnt_id:	9
 	ino:	63107
+        size:   0
+        path:   anon_inode:[signalfd]
 	sigmask:	0000000000000200
 
 where 'sigmask' is hex value of the signal mask associated
@@ -1980,6 +1989,8 @@ Epoll files
 	flags:	02
 	mnt_id:	9
 	ino:	63107
+        size:   0
+        path:   anon_inode:[eventpoll]
 	tfd:        5 events:       1d data: ffffffffffffffff pos:0 ino:61af sdev:7
 
 where 'tfd' is a target file descriptor number in decimal form,
@@ -1998,6 +2009,8 @@ For inotify files the format is the following::
 	flags:	02000000
 	mnt_id:	9
 	ino:	63107
+        size:   0
+        path:   anon_inode:inotify
 	inotify wd:3 ino:9e7e sdev:800013 mask:800afce ignored_mask:0 fhandle-bytes:8 fhandle-type:1 f_handle:7e9e0000640d1b6d
 
 where 'wd' is a watch descriptor in decimal form, i.e. a target file
@@ -2021,6 +2034,8 @@ For fanotify files the format is::
 	flags:	02
 	mnt_id:	9
 	ino:	63107
+        size:   0
+        path:   anon_inode:[fanotify]
 	fanotify flags:10 event-flags:0
 	fanotify mnt_id:12 mflags:40 mask:38 ignored_mask:40000003
 	fanotify ino:4f969 sdev:800013 mflags:0 mask:3b ignored_mask:40000000 fhandle-bytes:8 fhandle-type:1 f_handle:69f90400c275b5b4
@@ -2046,6 +2061,8 @@ Timerfd files
 	flags:	02
 	mnt_id:	9
 	ino:	63107
+        size:   0
+        path:   anon_inode:[timerfd]
 	clockid: 0
 	ticks: 0
 	settime flags: 01
@@ -2070,6 +2087,7 @@ DMA Buffer files
 	mnt_id:	9
 	ino:	63107
 	size:   32768
+        path:   /dmabuf:
 	count:  2
 	exp_name:  system-heap
 
diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index b1e25ae98302..d61183ff3c30 100644
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -377,7 +377,6 @@ static void dma_buf_show_fdinfo(struct seq_file *m, struct file *file)
 {
 	struct dma_buf *dmabuf = file->private_data;
 
-	seq_printf(m, "size:\t%zu\n", dmabuf->size);
 	/* Don't count the temporary reference taken inside procfs seq_show */
 	seq_printf(m, "count:\t%ld\n", file_count(dmabuf->file) - 1);
 	seq_printf(m, "exp_name:\t%s\n", dmabuf->exp_name);
diff --git a/fs/proc/fd.c b/fs/proc/fd.c
index 913bef0d2a36..a8a968bc58f0 100644
--- a/fs/proc/fd.c
+++ b/fs/proc/fd.c
@@ -54,10 +54,15 @@ static int seq_show(struct seq_file *m, void *v)
 	if (ret)
 		return ret;
 
-	seq_printf(m, "pos:\t%lli\nflags:\t0%o\nmnt_id:\t%i\nino:\t%lu\n",
+	seq_printf(m, "pos:\t%lli\nflags:\t0%o\nmnt_id:\t%i\nino:\t%lu\nsize:\t%zu\n",
 		   (long long)file->f_pos, f_flags,
 		   real_mount(file->f_path.mnt)->mnt_id,
-		   file_inode(file)->i_ino);
+		   file_inode(file)->i_ino,
+		   file_inode(file)->i_size);
+
+	seq_puts(m, "path:\t");
+	seq_file_path(m, file, "\n");
+	seq_putc(m, '\n');
 
 	/* show_fd_locks() never deferences files so a stale value is safe */
 	show_fd_locks(m, file, files);

base-commit: b015dcd62b86d298829990f8261d5d154b8d7af5
-- 
2.36.1.124.g0e6072fb45-goog


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [RFC PATCH] procfs: Add file path and size to /proc/<pid>/fdinfo
  2022-05-19 21:40 ` Kalesh Singh
@ 2022-05-19 21:46   ` Randy Dunlap
  -1 siblings, 0 replies; 16+ messages in thread
From: Randy Dunlap @ 2022-05-19 21:46 UTC (permalink / raw)
  To: Kalesh Singh
  Cc: ilkos, tjmercier, surenb, kernel-team, Jonathan Corbet,
	Sumit Semwal, Christian König, Andrew Morton,
	Christoph Anton Mitterer, Kees Cook, Mike Rapoport, Colin Cross,
	linux-kernel, linux-fsdevel, linux-doc, linux-media, dri-devel,
	linaro-mm-sig

Hi--

On 5/19/22 14:40, Kalesh Singh wrote:
> diff --git a/Documentation/filesystems/proc.rst b/Documentation/filesystems/proc.rst
> index 061744c436d9..ad66d78aca51 100644
> --- a/Documentation/filesystems/proc.rst
> +++ b/Documentation/filesystems/proc.rst
> @@ -1922,13 +1922,16 @@ if precise results are needed.
>  3.8	/proc/<pid>/fdinfo/<fd> - Information about opened file
>  ---------------------------------------------------------------
>  This file provides information associated with an opened file. The regular
> -files have at least four fields -- 'pos', 'flags', 'mnt_id' and 'ino'.
> +files have at least six fields -- 'pos', 'flags', 'mnt_id', 'ino', 'size',
> +and 'path'.
> +
>  The 'pos' represents the current offset of the opened file in decimal
>  form [see lseek(2) for details], 'flags' denotes the octal O_xxx mask the
>  file has been created with [see open(2) for details] and 'mnt_id' represents
>  mount ID of the file system containing the opened file [see 3.5
>  /proc/<pid>/mountinfo for details]. 'ino' represents the inode number of
> -the file.
> +the file, 'size' represents the size of the file in bytes, and 'path'
> +represents the file path.
>  
>  A typical output is::
>  
> @@ -1936,6 +1939,8 @@ A typical output is::
>  	flags:	0100002
>  	mnt_id:	19
>  	ino:	63107
> +        size:   0
> +        path:   /dev/null
>  
>  All locks associated with a file descriptor are shown in its fdinfo too::
>  
> @@ -1953,6 +1958,8 @@ Eventfd files
>  	flags:	04002
>  	mnt_id:	9
>  	ino:	63107
> +        size:   0
> +        path:   anon_inode:[eventfd]
>  	eventfd-count:	5a
>  
>  where 'eventfd-count' is hex value of a counter.
> @@ -1966,6 +1973,8 @@ Signalfd files
>  	flags:	04002
>  	mnt_id:	9
>  	ino:	63107
> +        size:   0
> +        path:   anon_inode:[signalfd]
>  	sigmask:	0000000000000200
>  
>  where 'sigmask' is hex value of the signal mask associated
> @@ -1980,6 +1989,8 @@ Epoll files
>  	flags:	02
>  	mnt_id:	9
>  	ino:	63107
> +        size:   0
> +        path:   anon_inode:[eventpoll]
>  	tfd:        5 events:       1d data: ffffffffffffffff pos:0 ino:61af sdev:7
>  
>  where 'tfd' is a target file descriptor number in decimal form,
> @@ -1998,6 +2009,8 @@ For inotify files the format is the following::
>  	flags:	02000000
>  	mnt_id:	9
>  	ino:	63107
> +        size:   0
> +        path:   anon_inode:inotify
>  	inotify wd:3 ino:9e7e sdev:800013 mask:800afce ignored_mask:0 fhandle-bytes:8 fhandle-type:1 f_handle:7e9e0000640d1b6d
>  
>  where 'wd' is a watch descriptor in decimal form, i.e. a target file
> @@ -2021,6 +2034,8 @@ For fanotify files the format is::
>  	flags:	02
>  	mnt_id:	9
>  	ino:	63107
> +        size:   0
> +        path:   anon_inode:[fanotify]
>  	fanotify flags:10 event-flags:0
>  	fanotify mnt_id:12 mflags:40 mask:38 ignored_mask:40000003
>  	fanotify ino:4f969 sdev:800013 mflags:0 mask:3b ignored_mask:40000000 fhandle-bytes:8 fhandle-type:1 f_handle:69f90400c275b5b4
> @@ -2046,6 +2061,8 @@ Timerfd files
>  	flags:	02
>  	mnt_id:	9
>  	ino:	63107
> +        size:   0
> +        path:   anon_inode:[timerfd]
>  	clockid: 0
>  	ticks: 0
>  	settime flags: 01
> @@ -2070,6 +2087,7 @@ DMA Buffer files
>  	mnt_id:	9
>  	ino:	63107
>  	size:   32768
> +        path:   /dmabuf:
>  	count:  2
>  	exp_name:  system-heap

All of these added lines should be indented with a tab instead of spaces.

thanks.
-- 
~Randy

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC PATCH] procfs: Add file path and size to /proc/<pid>/fdinfo
@ 2022-05-19 21:46   ` Randy Dunlap
  0 siblings, 0 replies; 16+ messages in thread
From: Randy Dunlap @ 2022-05-19 21:46 UTC (permalink / raw)
  To: Kalesh Singh
  Cc: Andrew Morton, Jonathan Corbet, Christoph Anton Mitterer,
	kernel-team, linux-doc, ilkos, linux-kernel, Colin Cross,
	tjmercier, linaro-mm-sig, dri-devel, linux-fsdevel,
	Christian König, linux-media, surenb, Sumit Semwal,
	Mike Rapoport, Kees Cook

Hi--

On 5/19/22 14:40, Kalesh Singh wrote:
> diff --git a/Documentation/filesystems/proc.rst b/Documentation/filesystems/proc.rst
> index 061744c436d9..ad66d78aca51 100644
> --- a/Documentation/filesystems/proc.rst
> +++ b/Documentation/filesystems/proc.rst
> @@ -1922,13 +1922,16 @@ if precise results are needed.
>  3.8	/proc/<pid>/fdinfo/<fd> - Information about opened file
>  ---------------------------------------------------------------
>  This file provides information associated with an opened file. The regular
> -files have at least four fields -- 'pos', 'flags', 'mnt_id' and 'ino'.
> +files have at least six fields -- 'pos', 'flags', 'mnt_id', 'ino', 'size',
> +and 'path'.
> +
>  The 'pos' represents the current offset of the opened file in decimal
>  form [see lseek(2) for details], 'flags' denotes the octal O_xxx mask the
>  file has been created with [see open(2) for details] and 'mnt_id' represents
>  mount ID of the file system containing the opened file [see 3.5
>  /proc/<pid>/mountinfo for details]. 'ino' represents the inode number of
> -the file.
> +the file, 'size' represents the size of the file in bytes, and 'path'
> +represents the file path.
>  
>  A typical output is::
>  
> @@ -1936,6 +1939,8 @@ A typical output is::
>  	flags:	0100002
>  	mnt_id:	19
>  	ino:	63107
> +        size:   0
> +        path:   /dev/null
>  
>  All locks associated with a file descriptor are shown in its fdinfo too::
>  
> @@ -1953,6 +1958,8 @@ Eventfd files
>  	flags:	04002
>  	mnt_id:	9
>  	ino:	63107
> +        size:   0
> +        path:   anon_inode:[eventfd]
>  	eventfd-count:	5a
>  
>  where 'eventfd-count' is hex value of a counter.
> @@ -1966,6 +1973,8 @@ Signalfd files
>  	flags:	04002
>  	mnt_id:	9
>  	ino:	63107
> +        size:   0
> +        path:   anon_inode:[signalfd]
>  	sigmask:	0000000000000200
>  
>  where 'sigmask' is hex value of the signal mask associated
> @@ -1980,6 +1989,8 @@ Epoll files
>  	flags:	02
>  	mnt_id:	9
>  	ino:	63107
> +        size:   0
> +        path:   anon_inode:[eventpoll]
>  	tfd:        5 events:       1d data: ffffffffffffffff pos:0 ino:61af sdev:7
>  
>  where 'tfd' is a target file descriptor number in decimal form,
> @@ -1998,6 +2009,8 @@ For inotify files the format is the following::
>  	flags:	02000000
>  	mnt_id:	9
>  	ino:	63107
> +        size:   0
> +        path:   anon_inode:inotify
>  	inotify wd:3 ino:9e7e sdev:800013 mask:800afce ignored_mask:0 fhandle-bytes:8 fhandle-type:1 f_handle:7e9e0000640d1b6d
>  
>  where 'wd' is a watch descriptor in decimal form, i.e. a target file
> @@ -2021,6 +2034,8 @@ For fanotify files the format is::
>  	flags:	02
>  	mnt_id:	9
>  	ino:	63107
> +        size:   0
> +        path:   anon_inode:[fanotify]
>  	fanotify flags:10 event-flags:0
>  	fanotify mnt_id:12 mflags:40 mask:38 ignored_mask:40000003
>  	fanotify ino:4f969 sdev:800013 mflags:0 mask:3b ignored_mask:40000000 fhandle-bytes:8 fhandle-type:1 f_handle:69f90400c275b5b4
> @@ -2046,6 +2061,8 @@ Timerfd files
>  	flags:	02
>  	mnt_id:	9
>  	ino:	63107
> +        size:   0
> +        path:   anon_inode:[timerfd]
>  	clockid: 0
>  	ticks: 0
>  	settime flags: 01
> @@ -2070,6 +2087,7 @@ DMA Buffer files
>  	mnt_id:	9
>  	ino:	63107
>  	size:   32768
> +        path:   /dmabuf:
>  	count:  2
>  	exp_name:  system-heap

All of these added lines should be indented with a tab instead of spaces.

thanks.
-- 
~Randy

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC PATCH] procfs: Add file path and size to /proc/<pid>/fdinfo
  2022-05-19 21:46   ` Randy Dunlap
@ 2022-05-19 21:53     ` Kalesh Singh
  -1 siblings, 0 replies; 16+ messages in thread
From: Kalesh Singh @ 2022-05-19 21:53 UTC (permalink / raw)
  To: Randy Dunlap
  Cc: Ioannis Ilkos, T.J. Mercier, Suren Baghdasaryan,
	Cc: Android Kernel, Jonathan Corbet, Sumit Semwal,
	Christian König, Andrew Morton, Christoph Anton Mitterer,
	Kees Cook, Mike Rapoport, Colin Cross, LKML, linux-fsdevel,
	open list:DOCUMENTATION, Linux Media Mailing List,
	DRI mailing list, moderated list:DMA BUFFER SHARING FRAMEWORK

On Thu, May 19, 2022 at 2:47 PM Randy Dunlap <rdunlap@infradead.org> wrote:
>
> Hi--
>
> On 5/19/22 14:40, Kalesh Singh wrote:
> > diff --git a/Documentation/filesystems/proc.rst b/Documentation/filesystems/proc.rst
> > index 061744c436d9..ad66d78aca51 100644
> > --- a/Documentation/filesystems/proc.rst
> > +++ b/Documentation/filesystems/proc.rst
> > @@ -1922,13 +1922,16 @@ if precise results are needed.
> >  3.8  /proc/<pid>/fdinfo/<fd> - Information about opened file
> >  ---------------------------------------------------------------
> >  This file provides information associated with an opened file. The regular
> > -files have at least four fields -- 'pos', 'flags', 'mnt_id' and 'ino'.
> > +files have at least six fields -- 'pos', 'flags', 'mnt_id', 'ino', 'size',
> > +and 'path'.
> > +
> >  The 'pos' represents the current offset of the opened file in decimal
> >  form [see lseek(2) for details], 'flags' denotes the octal O_xxx mask the
> >  file has been created with [see open(2) for details] and 'mnt_id' represents
> >  mount ID of the file system containing the opened file [see 3.5
> >  /proc/<pid>/mountinfo for details]. 'ino' represents the inode number of
> > -the file.
> > +the file, 'size' represents the size of the file in bytes, and 'path'
> > +represents the file path.
> >
> >  A typical output is::
> >
> > @@ -1936,6 +1939,8 @@ A typical output is::
> >       flags:  0100002
> >       mnt_id: 19
> >       ino:    63107
> > +        size:   0
> > +        path:   /dev/null
> >
> >  All locks associated with a file descriptor are shown in its fdinfo too::
> >
> > @@ -1953,6 +1958,8 @@ Eventfd files
> >       flags:  04002
> >       mnt_id: 9
> >       ino:    63107
> > +        size:   0
> > +        path:   anon_inode:[eventfd]
> >       eventfd-count:  5a
> >
> >  where 'eventfd-count' is hex value of a counter.
> > @@ -1966,6 +1973,8 @@ Signalfd files
> >       flags:  04002
> >       mnt_id: 9
> >       ino:    63107
> > +        size:   0
> > +        path:   anon_inode:[signalfd]
> >       sigmask:        0000000000000200
> >
> >  where 'sigmask' is hex value of the signal mask associated
> > @@ -1980,6 +1989,8 @@ Epoll files
> >       flags:  02
> >       mnt_id: 9
> >       ino:    63107
> > +        size:   0
> > +        path:   anon_inode:[eventpoll]
> >       tfd:        5 events:       1d data: ffffffffffffffff pos:0 ino:61af sdev:7
> >
> >  where 'tfd' is a target file descriptor number in decimal form,
> > @@ -1998,6 +2009,8 @@ For inotify files the format is the following::
> >       flags:  02000000
> >       mnt_id: 9
> >       ino:    63107
> > +        size:   0
> > +        path:   anon_inode:inotify
> >       inotify wd:3 ino:9e7e sdev:800013 mask:800afce ignored_mask:0 fhandle-bytes:8 fhandle-type:1 f_handle:7e9e0000640d1b6d
> >
> >  where 'wd' is a watch descriptor in decimal form, i.e. a target file
> > @@ -2021,6 +2034,8 @@ For fanotify files the format is::
> >       flags:  02
> >       mnt_id: 9
> >       ino:    63107
> > +        size:   0
> > +        path:   anon_inode:[fanotify]
> >       fanotify flags:10 event-flags:0
> >       fanotify mnt_id:12 mflags:40 mask:38 ignored_mask:40000003
> >       fanotify ino:4f969 sdev:800013 mflags:0 mask:3b ignored_mask:40000000 fhandle-bytes:8 fhandle-type:1 f_handle:69f90400c275b5b4
> > @@ -2046,6 +2061,8 @@ Timerfd files
> >       flags:  02
> >       mnt_id: 9
> >       ino:    63107
> > +        size:   0
> > +        path:   anon_inode:[timerfd]
> >       clockid: 0
> >       ticks: 0
> >       settime flags: 01
> > @@ -2070,6 +2087,7 @@ DMA Buffer files
> >       mnt_id: 9
> >       ino:    63107
> >       size:   32768
> > +        path:   /dmabuf:
> >       count:  2
> >       exp_name:  system-heap
>
> All of these added lines should be indented with a tab instead of spaces.

Ahh. Thanks for catching it. WIll update in the next version.

-Kalesh

>
> thanks.
> --
> ~Randy
>
> --
> To unsubscribe from this group and stop receiving emails from it, send an email to kernel-team+unsubscribe@android.com.
>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC PATCH] procfs: Add file path and size to /proc/<pid>/fdinfo
@ 2022-05-19 21:53     ` Kalesh Singh
  0 siblings, 0 replies; 16+ messages in thread
From: Kalesh Singh @ 2022-05-19 21:53 UTC (permalink / raw)
  To: Randy Dunlap
  Cc: Andrew Morton, Jonathan Corbet, Christoph Anton Mitterer,
	Cc: Android Kernel, open list:DOCUMENTATION, Ioannis Ilkos, LKML,
	Colin Cross, T.J. Mercier,
	moderated list:DMA BUFFER SHARING FRAMEWORK, DRI mailing list,
	linux-fsdevel, Christian König, Linux Media Mailing List,
	Suren Baghdasaryan, Sumit Semwal, Mike Rapoport, Kees Cook

On Thu, May 19, 2022 at 2:47 PM Randy Dunlap <rdunlap@infradead.org> wrote:
>
> Hi--
>
> On 5/19/22 14:40, Kalesh Singh wrote:
> > diff --git a/Documentation/filesystems/proc.rst b/Documentation/filesystems/proc.rst
> > index 061744c436d9..ad66d78aca51 100644
> > --- a/Documentation/filesystems/proc.rst
> > +++ b/Documentation/filesystems/proc.rst
> > @@ -1922,13 +1922,16 @@ if precise results are needed.
> >  3.8  /proc/<pid>/fdinfo/<fd> - Information about opened file
> >  ---------------------------------------------------------------
> >  This file provides information associated with an opened file. The regular
> > -files have at least four fields -- 'pos', 'flags', 'mnt_id' and 'ino'.
> > +files have at least six fields -- 'pos', 'flags', 'mnt_id', 'ino', 'size',
> > +and 'path'.
> > +
> >  The 'pos' represents the current offset of the opened file in decimal
> >  form [see lseek(2) for details], 'flags' denotes the octal O_xxx mask the
> >  file has been created with [see open(2) for details] and 'mnt_id' represents
> >  mount ID of the file system containing the opened file [see 3.5
> >  /proc/<pid>/mountinfo for details]. 'ino' represents the inode number of
> > -the file.
> > +the file, 'size' represents the size of the file in bytes, and 'path'
> > +represents the file path.
> >
> >  A typical output is::
> >
> > @@ -1936,6 +1939,8 @@ A typical output is::
> >       flags:  0100002
> >       mnt_id: 19
> >       ino:    63107
> > +        size:   0
> > +        path:   /dev/null
> >
> >  All locks associated with a file descriptor are shown in its fdinfo too::
> >
> > @@ -1953,6 +1958,8 @@ Eventfd files
> >       flags:  04002
> >       mnt_id: 9
> >       ino:    63107
> > +        size:   0
> > +        path:   anon_inode:[eventfd]
> >       eventfd-count:  5a
> >
> >  where 'eventfd-count' is hex value of a counter.
> > @@ -1966,6 +1973,8 @@ Signalfd files
> >       flags:  04002
> >       mnt_id: 9
> >       ino:    63107
> > +        size:   0
> > +        path:   anon_inode:[signalfd]
> >       sigmask:        0000000000000200
> >
> >  where 'sigmask' is hex value of the signal mask associated
> > @@ -1980,6 +1989,8 @@ Epoll files
> >       flags:  02
> >       mnt_id: 9
> >       ino:    63107
> > +        size:   0
> > +        path:   anon_inode:[eventpoll]
> >       tfd:        5 events:       1d data: ffffffffffffffff pos:0 ino:61af sdev:7
> >
> >  where 'tfd' is a target file descriptor number in decimal form,
> > @@ -1998,6 +2009,8 @@ For inotify files the format is the following::
> >       flags:  02000000
> >       mnt_id: 9
> >       ino:    63107
> > +        size:   0
> > +        path:   anon_inode:inotify
> >       inotify wd:3 ino:9e7e sdev:800013 mask:800afce ignored_mask:0 fhandle-bytes:8 fhandle-type:1 f_handle:7e9e0000640d1b6d
> >
> >  where 'wd' is a watch descriptor in decimal form, i.e. a target file
> > @@ -2021,6 +2034,8 @@ For fanotify files the format is::
> >       flags:  02
> >       mnt_id: 9
> >       ino:    63107
> > +        size:   0
> > +        path:   anon_inode:[fanotify]
> >       fanotify flags:10 event-flags:0
> >       fanotify mnt_id:12 mflags:40 mask:38 ignored_mask:40000003
> >       fanotify ino:4f969 sdev:800013 mflags:0 mask:3b ignored_mask:40000000 fhandle-bytes:8 fhandle-type:1 f_handle:69f90400c275b5b4
> > @@ -2046,6 +2061,8 @@ Timerfd files
> >       flags:  02
> >       mnt_id: 9
> >       ino:    63107
> > +        size:   0
> > +        path:   anon_inode:[timerfd]
> >       clockid: 0
> >       ticks: 0
> >       settime flags: 01
> > @@ -2070,6 +2087,7 @@ DMA Buffer files
> >       mnt_id: 9
> >       ino:    63107
> >       size:   32768
> > +        path:   /dmabuf:
> >       count:  2
> >       exp_name:  system-heap
>
> All of these added lines should be indented with a tab instead of spaces.

Ahh. Thanks for catching it. WIll update in the next version.

-Kalesh

>
> thanks.
> --
> ~Randy
>
> --
> To unsubscribe from this group and stop receiving emails from it, send an email to kernel-team+unsubscribe@android.com.
>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC PATCH] procfs: Add file path and size to /proc/<pid>/fdinfo
  2022-05-19 21:40 ` Kalesh Singh
@ 2022-05-20  1:49   ` Kees Cook
  -1 siblings, 0 replies; 16+ messages in thread
From: Kees Cook @ 2022-05-20  1:49 UTC (permalink / raw)
  To: Kalesh Singh
  Cc: ilkos, tjmercier, surenb, kernel-team, Jonathan Corbet,
	Sumit Semwal, Christian König, Andrew Morton,
	Christoph Anton Mitterer, Mike Rapoport, Colin Cross,
	Randy Dunlap, linux-kernel, linux-fsdevel, linux-doc,
	linux-media, dri-devel, linaro-mm-sig

On Thu, May 19, 2022 at 02:40:15PM -0700, Kalesh Singh wrote:
> [...]
> +	seq_file_path(m, file, "\n");
> +	seq_putc(m, '\n');
>  
>  	/* show_fd_locks() never deferences files so a stale value is safe */
>  	show_fd_locks(m, file, files);

This comment implies "file" might be stale? Does that mean anything for
the above seq_file_path()?

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC PATCH] procfs: Add file path and size to /proc/<pid>/fdinfo
@ 2022-05-20  1:49   ` Kees Cook
  0 siblings, 0 replies; 16+ messages in thread
From: Kees Cook @ 2022-05-20  1:49 UTC (permalink / raw)
  To: Kalesh Singh
  Cc: linux-doc, Andrew Morton, Jonathan Corbet,
	Christoph Anton Mitterer, kernel-team, Randy Dunlap, ilkos,
	linux-kernel, Colin Cross, tjmercier, linaro-mm-sig, dri-devel,
	linux-fsdevel, Christian König, surenb, Sumit Semwal,
	Mike Rapoport, linux-media

On Thu, May 19, 2022 at 02:40:15PM -0700, Kalesh Singh wrote:
> [...]
> +	seq_file_path(m, file, "\n");
> +	seq_putc(m, '\n');
>  
>  	/* show_fd_locks() never deferences files so a stale value is safe */
>  	show_fd_locks(m, file, files);

This comment implies "file" might be stale? Does that mean anything for
the above seq_file_path()?

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC PATCH] procfs: Add file path and size to /proc/<pid>/fdinfo
  2022-05-19 21:40 ` Kalesh Singh
                   ` (2 preceding siblings ...)
  (?)
@ 2022-05-20  2:18 ` kernel test robot
  -1 siblings, 0 replies; 16+ messages in thread
From: kernel test robot @ 2022-05-20  2:18 UTC (permalink / raw)
  To: Kalesh Singh; +Cc: llvm, kbuild-all

Hi Kalesh,

[FYI, it's a private test report for your RFC patch.]
[auto build test WARNING on b015dcd62b86d298829990f8261d5d154b8d7af5]

url:    https://github.com/intel-lab-lkp/linux/commits/Kalesh-Singh/procfs-Add-file-path-and-size-to-proc-pid-fdinfo/20220520-054133
base:   b015dcd62b86d298829990f8261d5d154b8d7af5
config: i386-randconfig-a013 (https://download.01.org/0day-ci/archive/20220520/202205201031.8hfAOoam-lkp@intel.com/config)
compiler: clang version 15.0.0 (https://github.com/llvm/llvm-project e00cbbec06c08dc616a0d52a20f678b8fbd4e304)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/intel-lab-lkp/linux/commit/f2c1e6a5785dd09b34671073319c5aba5a94f423
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review Kalesh-Singh/procfs-Add-file-path-and-size-to-proc-pid-fdinfo/20220520-054133
        git checkout f2c1e6a5785dd09b34671073319c5aba5a94f423
        # save the config file
        mkdir build_dir && cp config build_dir/.config
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=i386 SHELL=/bin/bash fs/proc/

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

>> fs/proc/fd.c:61:6: warning: format specifies type 'size_t' (aka 'unsigned int') but the argument has type 'loff_t' (aka 'long long') [-Wformat]
                      file_inode(file)->i_size);
                      ^~~~~~~~~~~~~~~~~~~~~~~~
   1 warning generated.


vim +61 fs/proc/fd.c

    20	
    21	static int seq_show(struct seq_file *m, void *v)
    22	{
    23		struct files_struct *files = NULL;
    24		int f_flags = 0, ret = -ENOENT;
    25		struct file *file = NULL;
    26		struct task_struct *task;
    27	
    28		task = get_proc_task(m->private);
    29		if (!task)
    30			return -ENOENT;
    31	
    32		task_lock(task);
    33		files = task->files;
    34		if (files) {
    35			unsigned int fd = proc_fd(m->private);
    36	
    37			spin_lock(&files->file_lock);
    38			file = files_lookup_fd_locked(files, fd);
    39			if (file) {
    40				struct fdtable *fdt = files_fdtable(files);
    41	
    42				f_flags = file->f_flags;
    43				if (close_on_exec(fd, fdt))
    44					f_flags |= O_CLOEXEC;
    45	
    46				get_file(file);
    47				ret = 0;
    48			}
    49			spin_unlock(&files->file_lock);
    50		}
    51		task_unlock(task);
    52		put_task_struct(task);
    53	
    54		if (ret)
    55			return ret;
    56	
    57		seq_printf(m, "pos:\t%lli\nflags:\t0%o\nmnt_id:\t%i\nino:\t%lu\nsize:\t%zu\n",
    58			   (long long)file->f_pos, f_flags,
    59			   real_mount(file->f_path.mnt)->mnt_id,
    60			   file_inode(file)->i_ino,
  > 61			   file_inode(file)->i_size);
    62	
    63		seq_puts(m, "path:\t");
    64		seq_file_path(m, file, "\n");
    65		seq_putc(m, '\n');
    66	
    67		/* show_fd_locks() never deferences files so a stale value is safe */
    68		show_fd_locks(m, file, files);
    69		if (seq_has_overflowed(m))
    70			goto out;
    71	
    72		if (file->f_op->show_fdinfo)
    73			file->f_op->show_fdinfo(m, file);
    74	
    75	out:
    76		fput(file);
    77		return 0;
    78	}
    79	

-- 
0-DAY CI Kernel Test Service
https://01.org/lkp

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC PATCH] procfs: Add file path and size to /proc/<pid>/fdinfo
  2022-05-19 21:40 ` Kalesh Singh
                   ` (3 preceding siblings ...)
  (?)
@ 2022-05-20  2:33 ` kernel test robot
  -1 siblings, 0 replies; 16+ messages in thread
From: kernel test robot @ 2022-05-20  2:33 UTC (permalink / raw)
  To: Kalesh Singh; +Cc: llvm, kbuild-all

Hi Kalesh,

[FYI, it's a private test report for your RFC patch.]
[auto build test WARNING on b015dcd62b86d298829990f8261d5d154b8d7af5]

url:    https://github.com/intel-lab-lkp/linux/commits/Kalesh-Singh/procfs-Add-file-path-and-size-to-proc-pid-fdinfo/20220520-054133
base:   b015dcd62b86d298829990f8261d5d154b8d7af5
config: x86_64-randconfig-a016 (https://download.01.org/0day-ci/archive/20220520/202205201017.8mFLfM16-lkp@intel.com/config)
compiler: clang version 15.0.0 (https://github.com/llvm/llvm-project e00cbbec06c08dc616a0d52a20f678b8fbd4e304)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/intel-lab-lkp/linux/commit/f2c1e6a5785dd09b34671073319c5aba5a94f423
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review Kalesh-Singh/procfs-Add-file-path-and-size-to-proc-pid-fdinfo/20220520-054133
        git checkout f2c1e6a5785dd09b34671073319c5aba5a94f423
        # save the config file
        mkdir build_dir && cp config build_dir/.config
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=x86_64 SHELL=/bin/bash fs/proc/

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

>> fs/proc/fd.c:61:6: warning: format specifies type 'size_t' (aka 'unsigned long') but the argument has type 'loff_t' (aka 'long long') [-Wformat]
                      file_inode(file)->i_size);
                      ^~~~~~~~~~~~~~~~~~~~~~~~
   1 warning generated.


vim +61 fs/proc/fd.c

    20	
    21	static int seq_show(struct seq_file *m, void *v)
    22	{
    23		struct files_struct *files = NULL;
    24		int f_flags = 0, ret = -ENOENT;
    25		struct file *file = NULL;
    26		struct task_struct *task;
    27	
    28		task = get_proc_task(m->private);
    29		if (!task)
    30			return -ENOENT;
    31	
    32		task_lock(task);
    33		files = task->files;
    34		if (files) {
    35			unsigned int fd = proc_fd(m->private);
    36	
    37			spin_lock(&files->file_lock);
    38			file = files_lookup_fd_locked(files, fd);
    39			if (file) {
    40				struct fdtable *fdt = files_fdtable(files);
    41	
    42				f_flags = file->f_flags;
    43				if (close_on_exec(fd, fdt))
    44					f_flags |= O_CLOEXEC;
    45	
    46				get_file(file);
    47				ret = 0;
    48			}
    49			spin_unlock(&files->file_lock);
    50		}
    51		task_unlock(task);
    52		put_task_struct(task);
    53	
    54		if (ret)
    55			return ret;
    56	
    57		seq_printf(m, "pos:\t%lli\nflags:\t0%o\nmnt_id:\t%i\nino:\t%lu\nsize:\t%zu\n",
    58			   (long long)file->f_pos, f_flags,
    59			   real_mount(file->f_path.mnt)->mnt_id,
    60			   file_inode(file)->i_ino,
  > 61			   file_inode(file)->i_size);
    62	
    63		seq_puts(m, "path:\t");
    64		seq_file_path(m, file, "\n");
    65		seq_putc(m, '\n');
    66	
    67		/* show_fd_locks() never deferences files so a stale value is safe */
    68		show_fd_locks(m, file, files);
    69		if (seq_has_overflowed(m))
    70			goto out;
    71	
    72		if (file->f_op->show_fdinfo)
    73			file->f_op->show_fdinfo(m, file);
    74	
    75	out:
    76		fput(file);
    77		return 0;
    78	}
    79	

-- 
0-DAY CI Kernel Test Service
https://01.org/lkp

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC PATCH] procfs: Add file path and size to /proc/<pid>/fdinfo
  2022-05-20  1:49   ` Kees Cook
@ 2022-05-20  4:01     ` Kalesh Singh
  -1 siblings, 0 replies; 16+ messages in thread
From: Kalesh Singh @ 2022-05-20  4:01 UTC (permalink / raw)
  To: Kees Cook
  Cc: Ioannis Ilkos, T.J. Mercier, Suren Baghdasaryan,
	Cc: Android Kernel, Jonathan Corbet, Sumit Semwal,
	Christian König, Andrew Morton, Christoph Anton Mitterer,
	Mike Rapoport, Colin Cross, Randy Dunlap, LKML, linux-fsdevel,
	open list:DOCUMENTATION, Linux Media Mailing List,
	DRI mailing list, moderated list:DMA BUFFER SHARING FRAMEWORK

On Thu, May 19, 2022 at 6:50 PM Kees Cook <keescook@chromium.org> wrote:
>
> On Thu, May 19, 2022 at 02:40:15PM -0700, Kalesh Singh wrote:
> > [...]
> > +     seq_file_path(m, file, "\n");
> > +     seq_putc(m, '\n');
> >
> >       /* show_fd_locks() never deferences files so a stale value is safe */
> >       show_fd_locks(m, file, files);
>
> This comment implies "file" might be stale? Does that mean anything for
> the above seq_file_path()?

Hi Kees.

Thanks for taking a look. The comment above says the "files" pointer
can be stale. It doesn't affect our use of "file" here. seq_show()
takes the reference with get_file() so "file" wouldn't be destroyed
from under us.

Thanks,
Kalesh
>
> --
> Kees Cook

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC PATCH] procfs: Add file path and size to /proc/<pid>/fdinfo
@ 2022-05-20  4:01     ` Kalesh Singh
  0 siblings, 0 replies; 16+ messages in thread
From: Kalesh Singh @ 2022-05-20  4:01 UTC (permalink / raw)
  To: Kees Cook
  Cc: open list:DOCUMENTATION, Andrew Morton, Jonathan Corbet,
	Christoph Anton Mitterer, Cc: Android Kernel, Randy Dunlap,
	Ioannis Ilkos, LKML, Colin Cross, T.J. Mercier,
	moderated list:DMA BUFFER SHARING FRAMEWORK, DRI mailing list,
	linux-fsdevel, Christian König, Suren Baghdasaryan,
	Sumit Semwal, Mike Rapoport, Linux Media Mailing List

On Thu, May 19, 2022 at 6:50 PM Kees Cook <keescook@chromium.org> wrote:
>
> On Thu, May 19, 2022 at 02:40:15PM -0700, Kalesh Singh wrote:
> > [...]
> > +     seq_file_path(m, file, "\n");
> > +     seq_putc(m, '\n');
> >
> >       /* show_fd_locks() never deferences files so a stale value is safe */
> >       show_fd_locks(m, file, files);
>
> This comment implies "file" might be stale? Does that mean anything for
> the above seq_file_path()?

Hi Kees.

Thanks for taking a look. The comment above says the "files" pointer
can be stale. It doesn't affect our use of "file" here. seq_show()
takes the reference with get_file() so "file" wouldn't be destroyed
from under us.

Thanks,
Kalesh
>
> --
> Kees Cook

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC PATCH] procfs: Add file path and size to /proc/<pid>/fdinfo
  2022-05-19 21:40 ` Kalesh Singh
@ 2022-05-20  6:29   ` Christian König
  -1 siblings, 0 replies; 16+ messages in thread
From: Christian König @ 2022-05-20  6:29 UTC (permalink / raw)
  To: Kalesh Singh
  Cc: ilkos, tjmercier, surenb, kernel-team, Jonathan Corbet,
	Sumit Semwal, Andrew Morton, Christoph Anton Mitterer, Kees Cook,
	Mike Rapoport, Colin Cross, Randy Dunlap, linux-kernel,
	linux-fsdevel, linux-doc, linux-media, dri-devel, linaro-mm-sig

Am 19.05.22 um 23:40 schrieb Kalesh Singh:
> Processes can pin shared memory by keeping a handle to it through a
> file descriptor; for instance dmabufs, memfd, and ashsmem (in Android).
>
> In the case of a memory leak, to identify the process pinning the
> memory, userspace needs to:
>    - Iterate the /proc/<pid>/fd/* for each process
>    - Do a readlink on each entry to identify the type of memory from
>      the file path.
>    - stat() each entry to get the size of the memory.
>
> The file permissions on /proc/<pid>/fd/* only allows for the owner
> or root to perform the operations above; and so is not suitable for
> capturing the system-wide state in a production environment.
>
> This issue was addressed for dmabufs by making /proc/*/fdinfo/*
> accessible to a process with PTRACE_MODE_READ_FSCREDS credentials[1]
> To allow the same kind of tracking for other types of shared memory,
> add the following fields to /proc/<pid>/fdinfo/<fd>:
>
> path - This allows identifying the type of memory based on common
>         prefixes: e.g. "/memfd...", "/dmabuf...", "/dev/ashmem..."
>
>         This was not an issued when dmabuf tracking was introduced
>         because the exp_name field of dmabuf fdinfo could be used
>         to distinguish dmabuf fds from other types.
>
> size - To track the amount of memory that is being pinned.
>
>         dmabufs expose size as an additional field in fdinfo. Remove
>         this and make it a common field for all fds.
>
> Access to /proc/<pid>/fdinfo is governed by PTRACE_MODE_READ_FSCREDS
> -- the same as for /proc/<pid>/maps which also exposes the path and
> size for mapped memory regions.
>
> This allows for a system process with PTRACE_MODE_READ_FSCREDS to
> account the pinned per-process memory via fdinfo.

I think this should be split into two patches, one adding the size and 
one adding the path.

Adding the size is completely unproblematic, but the path might raise 
some eyebrows.

>
> [1] https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.kernel.org%2Flkml%2F20210308170651.919148-1-kaleshsingh%40google.com%2F&amp;data=05%7C01%7Cchristian.koenig%40amd.com%7C95ee7bf71c2c4aa342fa08da39e03398%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637885932392014544%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&amp;sdata=kf%2B2es12hV3z5zjOFhx3EyxI1XEMeHexqTLNpNoDhAY%3D&amp;reserved=0
>
> Signed-off-by: Kalesh Singh <kaleshsingh@google.com>
> ---
>   Documentation/filesystems/proc.rst | 22 ++++++++++++++++++++--
>   drivers/dma-buf/dma-buf.c          |  1 -
>   fs/proc/fd.c                       |  9 +++++++--
>   3 files changed, 27 insertions(+), 5 deletions(-)
>
> diff --git a/Documentation/filesystems/proc.rst b/Documentation/filesystems/proc.rst
> index 061744c436d9..ad66d78aca51 100644
> --- a/Documentation/filesystems/proc.rst
> +++ b/Documentation/filesystems/proc.rst
> @@ -1922,13 +1922,16 @@ if precise results are needed.
>   3.8	/proc/<pid>/fdinfo/<fd> - Information about opened file
>   ---------------------------------------------------------------
>   This file provides information associated with an opened file. The regular
> -files have at least four fields -- 'pos', 'flags', 'mnt_id' and 'ino'.
> +files have at least six fields -- 'pos', 'flags', 'mnt_id', 'ino', 'size',
> +and 'path'.
> +
>   The 'pos' represents the current offset of the opened file in decimal
>   form [see lseek(2) for details], 'flags' denotes the octal O_xxx mask the
>   file has been created with [see open(2) for details] and 'mnt_id' represents
>   mount ID of the file system containing the opened file [see 3.5
>   /proc/<pid>/mountinfo for details]. 'ino' represents the inode number of
> -the file.
> +the file, 'size' represents the size of the file in bytes, and 'path'
> +represents the file path.
>   
>   A typical output is::
>   
> @@ -1936,6 +1939,8 @@ A typical output is::
>   	flags:	0100002
>   	mnt_id:	19
>   	ino:	63107
> +        size:   0
> +        path:   /dev/null
>   
>   All locks associated with a file descriptor are shown in its fdinfo too::
>   
> @@ -1953,6 +1958,8 @@ Eventfd files
>   	flags:	04002
>   	mnt_id:	9
>   	ino:	63107
> +        size:   0
> +        path:   anon_inode:[eventfd]
>   	eventfd-count:	5a
>   
>   where 'eventfd-count' is hex value of a counter.
> @@ -1966,6 +1973,8 @@ Signalfd files
>   	flags:	04002
>   	mnt_id:	9
>   	ino:	63107
> +        size:   0
> +        path:   anon_inode:[signalfd]
>   	sigmask:	0000000000000200
>   
>   where 'sigmask' is hex value of the signal mask associated
> @@ -1980,6 +1989,8 @@ Epoll files
>   	flags:	02
>   	mnt_id:	9
>   	ino:	63107
> +        size:   0
> +        path:   anon_inode:[eventpoll]
>   	tfd:        5 events:       1d data: ffffffffffffffff pos:0 ino:61af sdev:7
>   
>   where 'tfd' is a target file descriptor number in decimal form,
> @@ -1998,6 +2009,8 @@ For inotify files the format is the following::
>   	flags:	02000000
>   	mnt_id:	9
>   	ino:	63107
> +        size:   0
> +        path:   anon_inode:inotify
>   	inotify wd:3 ino:9e7e sdev:800013 mask:800afce ignored_mask:0 fhandle-bytes:8 fhandle-type:1 f_handle:7e9e0000640d1b6d
>   
>   where 'wd' is a watch descriptor in decimal form, i.e. a target file
> @@ -2021,6 +2034,8 @@ For fanotify files the format is::
>   	flags:	02
>   	mnt_id:	9
>   	ino:	63107
> +        size:   0
> +        path:   anon_inode:[fanotify]
>   	fanotify flags:10 event-flags:0
>   	fanotify mnt_id:12 mflags:40 mask:38 ignored_mask:40000003
>   	fanotify ino:4f969 sdev:800013 mflags:0 mask:3b ignored_mask:40000000 fhandle-bytes:8 fhandle-type:1 f_handle:69f90400c275b5b4
> @@ -2046,6 +2061,8 @@ Timerfd files
>   	flags:	02
>   	mnt_id:	9
>   	ino:	63107
> +        size:   0
> +        path:   anon_inode:[timerfd]
>   	clockid: 0
>   	ticks: 0
>   	settime flags: 01
> @@ -2070,6 +2087,7 @@ DMA Buffer files
>   	mnt_id:	9
>   	ino:	63107
>   	size:   32768
> +        path:   /dmabuf:
>   	count:  2
>   	exp_name:  system-heap
>   
> diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
> index b1e25ae98302..d61183ff3c30 100644
> --- a/drivers/dma-buf/dma-buf.c
> +++ b/drivers/dma-buf/dma-buf.c
> @@ -377,7 +377,6 @@ static void dma_buf_show_fdinfo(struct seq_file *m, struct file *file)
>   {
>   	struct dma_buf *dmabuf = file->private_data;
>   
> -	seq_printf(m, "size:\t%zu\n", dmabuf->size);
>   	/* Don't count the temporary reference taken inside procfs seq_show */
>   	seq_printf(m, "count:\t%ld\n", file_count(dmabuf->file) - 1);
>   	seq_printf(m, "exp_name:\t%s\n", dmabuf->exp_name);
> diff --git a/fs/proc/fd.c b/fs/proc/fd.c
> index 913bef0d2a36..a8a968bc58f0 100644
> --- a/fs/proc/fd.c
> +++ b/fs/proc/fd.c
> @@ -54,10 +54,15 @@ static int seq_show(struct seq_file *m, void *v)
>   	if (ret)
>   		return ret;
>   
> -	seq_printf(m, "pos:\t%lli\nflags:\t0%o\nmnt_id:\t%i\nino:\t%lu\n",
> +	seq_printf(m, "pos:\t%lli\nflags:\t0%o\nmnt_id:\t%i\nino:\t%lu\nsize:\t%zu\n",
>   		   (long long)file->f_pos, f_flags,
>   		   real_mount(file->f_path.mnt)->mnt_id,
> -		   file_inode(file)->i_ino);
> +		   file_inode(file)->i_ino,
> +		   file_inode(file)->i_size);

We might consider splitting this into multiple seq_printf calls, one for 
each printed attribute.

It becomes a bit unreadable and the minimal additional overhead 
shouldn't matter that much.

Regards,
Christian.

> +
> +	seq_puts(m, "path:\t");
> +	seq_file_path(m, file, "\n");
> +	seq_putc(m, '\n');
>   
>   	/* show_fd_locks() never deferences files so a stale value is safe */
>   	show_fd_locks(m, file, files);
>
> base-commit: b015dcd62b86d298829990f8261d5d154b8d7af5


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC PATCH] procfs: Add file path and size to /proc/<pid>/fdinfo
@ 2022-05-20  6:29   ` Christian König
  0 siblings, 0 replies; 16+ messages in thread
From: Christian König @ 2022-05-20  6:29 UTC (permalink / raw)
  To: Kalesh Singh
  Cc: linux-doc, Kees Cook, Jonathan Corbet, Christoph Anton Mitterer,
	kernel-team, Randy Dunlap, ilkos, linux-kernel, Colin Cross,
	tjmercier, linaro-mm-sig, dri-devel, linux-fsdevel,
	Andrew Morton, surenb, Sumit Semwal, Mike Rapoport, linux-media

Am 19.05.22 um 23:40 schrieb Kalesh Singh:
> Processes can pin shared memory by keeping a handle to it through a
> file descriptor; for instance dmabufs, memfd, and ashsmem (in Android).
>
> In the case of a memory leak, to identify the process pinning the
> memory, userspace needs to:
>    - Iterate the /proc/<pid>/fd/* for each process
>    - Do a readlink on each entry to identify the type of memory from
>      the file path.
>    - stat() each entry to get the size of the memory.
>
> The file permissions on /proc/<pid>/fd/* only allows for the owner
> or root to perform the operations above; and so is not suitable for
> capturing the system-wide state in a production environment.
>
> This issue was addressed for dmabufs by making /proc/*/fdinfo/*
> accessible to a process with PTRACE_MODE_READ_FSCREDS credentials[1]
> To allow the same kind of tracking for other types of shared memory,
> add the following fields to /proc/<pid>/fdinfo/<fd>:
>
> path - This allows identifying the type of memory based on common
>         prefixes: e.g. "/memfd...", "/dmabuf...", "/dev/ashmem..."
>
>         This was not an issued when dmabuf tracking was introduced
>         because the exp_name field of dmabuf fdinfo could be used
>         to distinguish dmabuf fds from other types.
>
> size - To track the amount of memory that is being pinned.
>
>         dmabufs expose size as an additional field in fdinfo. Remove
>         this and make it a common field for all fds.
>
> Access to /proc/<pid>/fdinfo is governed by PTRACE_MODE_READ_FSCREDS
> -- the same as for /proc/<pid>/maps which also exposes the path and
> size for mapped memory regions.
>
> This allows for a system process with PTRACE_MODE_READ_FSCREDS to
> account the pinned per-process memory via fdinfo.

I think this should be split into two patches, one adding the size and 
one adding the path.

Adding the size is completely unproblematic, but the path might raise 
some eyebrows.

>
> [1] https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.kernel.org%2Flkml%2F20210308170651.919148-1-kaleshsingh%40google.com%2F&amp;data=05%7C01%7Cchristian.koenig%40amd.com%7C95ee7bf71c2c4aa342fa08da39e03398%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637885932392014544%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&amp;sdata=kf%2B2es12hV3z5zjOFhx3EyxI1XEMeHexqTLNpNoDhAY%3D&amp;reserved=0
>
> Signed-off-by: Kalesh Singh <kaleshsingh@google.com>
> ---
>   Documentation/filesystems/proc.rst | 22 ++++++++++++++++++++--
>   drivers/dma-buf/dma-buf.c          |  1 -
>   fs/proc/fd.c                       |  9 +++++++--
>   3 files changed, 27 insertions(+), 5 deletions(-)
>
> diff --git a/Documentation/filesystems/proc.rst b/Documentation/filesystems/proc.rst
> index 061744c436d9..ad66d78aca51 100644
> --- a/Documentation/filesystems/proc.rst
> +++ b/Documentation/filesystems/proc.rst
> @@ -1922,13 +1922,16 @@ if precise results are needed.
>   3.8	/proc/<pid>/fdinfo/<fd> - Information about opened file
>   ---------------------------------------------------------------
>   This file provides information associated with an opened file. The regular
> -files have at least four fields -- 'pos', 'flags', 'mnt_id' and 'ino'.
> +files have at least six fields -- 'pos', 'flags', 'mnt_id', 'ino', 'size',
> +and 'path'.
> +
>   The 'pos' represents the current offset of the opened file in decimal
>   form [see lseek(2) for details], 'flags' denotes the octal O_xxx mask the
>   file has been created with [see open(2) for details] and 'mnt_id' represents
>   mount ID of the file system containing the opened file [see 3.5
>   /proc/<pid>/mountinfo for details]. 'ino' represents the inode number of
> -the file.
> +the file, 'size' represents the size of the file in bytes, and 'path'
> +represents the file path.
>   
>   A typical output is::
>   
> @@ -1936,6 +1939,8 @@ A typical output is::
>   	flags:	0100002
>   	mnt_id:	19
>   	ino:	63107
> +        size:   0
> +        path:   /dev/null
>   
>   All locks associated with a file descriptor are shown in its fdinfo too::
>   
> @@ -1953,6 +1958,8 @@ Eventfd files
>   	flags:	04002
>   	mnt_id:	9
>   	ino:	63107
> +        size:   0
> +        path:   anon_inode:[eventfd]
>   	eventfd-count:	5a
>   
>   where 'eventfd-count' is hex value of a counter.
> @@ -1966,6 +1973,8 @@ Signalfd files
>   	flags:	04002
>   	mnt_id:	9
>   	ino:	63107
> +        size:   0
> +        path:   anon_inode:[signalfd]
>   	sigmask:	0000000000000200
>   
>   where 'sigmask' is hex value of the signal mask associated
> @@ -1980,6 +1989,8 @@ Epoll files
>   	flags:	02
>   	mnt_id:	9
>   	ino:	63107
> +        size:   0
> +        path:   anon_inode:[eventpoll]
>   	tfd:        5 events:       1d data: ffffffffffffffff pos:0 ino:61af sdev:7
>   
>   where 'tfd' is a target file descriptor number in decimal form,
> @@ -1998,6 +2009,8 @@ For inotify files the format is the following::
>   	flags:	02000000
>   	mnt_id:	9
>   	ino:	63107
> +        size:   0
> +        path:   anon_inode:inotify
>   	inotify wd:3 ino:9e7e sdev:800013 mask:800afce ignored_mask:0 fhandle-bytes:8 fhandle-type:1 f_handle:7e9e0000640d1b6d
>   
>   where 'wd' is a watch descriptor in decimal form, i.e. a target file
> @@ -2021,6 +2034,8 @@ For fanotify files the format is::
>   	flags:	02
>   	mnt_id:	9
>   	ino:	63107
> +        size:   0
> +        path:   anon_inode:[fanotify]
>   	fanotify flags:10 event-flags:0
>   	fanotify mnt_id:12 mflags:40 mask:38 ignored_mask:40000003
>   	fanotify ino:4f969 sdev:800013 mflags:0 mask:3b ignored_mask:40000000 fhandle-bytes:8 fhandle-type:1 f_handle:69f90400c275b5b4
> @@ -2046,6 +2061,8 @@ Timerfd files
>   	flags:	02
>   	mnt_id:	9
>   	ino:	63107
> +        size:   0
> +        path:   anon_inode:[timerfd]
>   	clockid: 0
>   	ticks: 0
>   	settime flags: 01
> @@ -2070,6 +2087,7 @@ DMA Buffer files
>   	mnt_id:	9
>   	ino:	63107
>   	size:   32768
> +        path:   /dmabuf:
>   	count:  2
>   	exp_name:  system-heap
>   
> diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
> index b1e25ae98302..d61183ff3c30 100644
> --- a/drivers/dma-buf/dma-buf.c
> +++ b/drivers/dma-buf/dma-buf.c
> @@ -377,7 +377,6 @@ static void dma_buf_show_fdinfo(struct seq_file *m, struct file *file)
>   {
>   	struct dma_buf *dmabuf = file->private_data;
>   
> -	seq_printf(m, "size:\t%zu\n", dmabuf->size);
>   	/* Don't count the temporary reference taken inside procfs seq_show */
>   	seq_printf(m, "count:\t%ld\n", file_count(dmabuf->file) - 1);
>   	seq_printf(m, "exp_name:\t%s\n", dmabuf->exp_name);
> diff --git a/fs/proc/fd.c b/fs/proc/fd.c
> index 913bef0d2a36..a8a968bc58f0 100644
> --- a/fs/proc/fd.c
> +++ b/fs/proc/fd.c
> @@ -54,10 +54,15 @@ static int seq_show(struct seq_file *m, void *v)
>   	if (ret)
>   		return ret;
>   
> -	seq_printf(m, "pos:\t%lli\nflags:\t0%o\nmnt_id:\t%i\nino:\t%lu\n",
> +	seq_printf(m, "pos:\t%lli\nflags:\t0%o\nmnt_id:\t%i\nino:\t%lu\nsize:\t%zu\n",
>   		   (long long)file->f_pos, f_flags,
>   		   real_mount(file->f_path.mnt)->mnt_id,
> -		   file_inode(file)->i_ino);
> +		   file_inode(file)->i_ino,
> +		   file_inode(file)->i_size);

We might consider splitting this into multiple seq_printf calls, one for 
each printed attribute.

It becomes a bit unreadable and the minimal additional overhead 
shouldn't matter that much.

Regards,
Christian.

> +
> +	seq_puts(m, "path:\t");
> +	seq_file_path(m, file, "\n");
> +	seq_putc(m, '\n');
>   
>   	/* show_fd_locks() never deferences files so a stale value is safe */
>   	show_fd_locks(m, file, files);
>
> base-commit: b015dcd62b86d298829990f8261d5d154b8d7af5


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC PATCH] procfs: Add file path and size to /proc/<pid>/fdinfo
  2022-05-20  6:29   ` Christian König
@ 2022-05-20 16:12     ` Kalesh Singh
  -1 siblings, 0 replies; 16+ messages in thread
From: Kalesh Singh @ 2022-05-20 16:12 UTC (permalink / raw)
  To: Christian König
  Cc: Ioannis Ilkos, T.J. Mercier, Suren Baghdasaryan,
	Cc: Android Kernel, Jonathan Corbet, Sumit Semwal, Andrew Morton,
	Christoph Anton Mitterer, Kees Cook, Mike Rapoport, Colin Cross,
	Randy Dunlap, LKML, linux-fsdevel, open list:DOCUMENTATION,
	Linux Media Mailing List, DRI mailing list,
	moderated list:DMA BUFFER SHARING FRAMEWORK

On Thu, May 19, 2022 at 11:29 PM Christian König
<christian.koenig@amd.com> wrote:
>
> Am 19.05.22 um 23:40 schrieb Kalesh Singh:
> > Processes can pin shared memory by keeping a handle to it through a
> > file descriptor; for instance dmabufs, memfd, and ashsmem (in Android).
> >
> > In the case of a memory leak, to identify the process pinning the
> > memory, userspace needs to:
> >    - Iterate the /proc/<pid>/fd/* for each process
> >    - Do a readlink on each entry to identify the type of memory from
> >      the file path.
> >    - stat() each entry to get the size of the memory.
> >
> > The file permissions on /proc/<pid>/fd/* only allows for the owner
> > or root to perform the operations above; and so is not suitable for
> > capturing the system-wide state in a production environment.
> >
> > This issue was addressed for dmabufs by making /proc/*/fdinfo/*
> > accessible to a process with PTRACE_MODE_READ_FSCREDS credentials[1]
> > To allow the same kind of tracking for other types of shared memory,
> > add the following fields to /proc/<pid>/fdinfo/<fd>:
> >
> > path - This allows identifying the type of memory based on common
> >         prefixes: e.g. "/memfd...", "/dmabuf...", "/dev/ashmem..."
> >
> >         This was not an issued when dmabuf tracking was introduced
> >         because the exp_name field of dmabuf fdinfo could be used
> >         to distinguish dmabuf fds from other types.
> >
> > size - To track the amount of memory that is being pinned.
> >
> >         dmabufs expose size as an additional field in fdinfo. Remove
> >         this and make it a common field for all fds.
> >
> > Access to /proc/<pid>/fdinfo is governed by PTRACE_MODE_READ_FSCREDS
> > -- the same as for /proc/<pid>/maps which also exposes the path and
> > size for mapped memory regions.
> >
> > This allows for a system process with PTRACE_MODE_READ_FSCREDS to
> > account the pinned per-process memory via fdinfo.
>
> I think this should be split into two patches, one adding the size and
> one adding the path.
>
> Adding the size is completely unproblematic, but the path might raise
> some eyebrows.

Hi Christian,

Thanks for reviewing. "path" is exposed under the same ptrace
capability as in /proc/pid/maps. If we want to be more cautious, then
perhaps only adding "path" for the applicable anon inodes (dmabuf,
memfd, ...)? But prefer to keep it generic if no one sees an issue
with that.

>
> >
> > [1] https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.kernel.org%2Flkml%2F20210308170651.919148-1-kaleshsingh%40google.com%2F&amp;data=05%7C01%7Cchristian.koenig%40amd.com%7C95ee7bf71c2c4aa342fa08da39e03398%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637885932392014544%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&amp;sdata=kf%2B2es12hV3z5zjOFhx3EyxI1XEMeHexqTLNpNoDhAY%3D&amp;reserved=0
> >
> > Signed-off-by: Kalesh Singh <kaleshsingh@google.com>
> > ---
> >   Documentation/filesystems/proc.rst | 22 ++++++++++++++++++++--
> >   drivers/dma-buf/dma-buf.c          |  1 -
> >   fs/proc/fd.c                       |  9 +++++++--
> >   3 files changed, 27 insertions(+), 5 deletions(-)
> >
> > diff --git a/Documentation/filesystems/proc.rst b/Documentation/filesystems/proc.rst
> > index 061744c436d9..ad66d78aca51 100644
> > --- a/Documentation/filesystems/proc.rst
> > +++ b/Documentation/filesystems/proc.rst
> > @@ -1922,13 +1922,16 @@ if precise results are needed.
> >   3.8 /proc/<pid>/fdinfo/<fd> - Information about opened file
> >   ---------------------------------------------------------------
> >   This file provides information associated with an opened file. The regular
> > -files have at least four fields -- 'pos', 'flags', 'mnt_id' and 'ino'.
> > +files have at least six fields -- 'pos', 'flags', 'mnt_id', 'ino', 'size',
> > +and 'path'.
> > +
> >   The 'pos' represents the current offset of the opened file in decimal
> >   form [see lseek(2) for details], 'flags' denotes the octal O_xxx mask the
> >   file has been created with [see open(2) for details] and 'mnt_id' represents
> >   mount ID of the file system containing the opened file [see 3.5
> >   /proc/<pid>/mountinfo for details]. 'ino' represents the inode number of
> > -the file.
> > +the file, 'size' represents the size of the file in bytes, and 'path'
> > +represents the file path.
> >
> >   A typical output is::
> >
> > @@ -1936,6 +1939,8 @@ A typical output is::
> >       flags:  0100002
> >       mnt_id: 19
> >       ino:    63107
> > +        size:   0
> > +        path:   /dev/null
> >
> >   All locks associated with a file descriptor are shown in its fdinfo too::
> >
> > @@ -1953,6 +1958,8 @@ Eventfd files
> >       flags:  04002
> >       mnt_id: 9
> >       ino:    63107
> > +        size:   0
> > +        path:   anon_inode:[eventfd]
> >       eventfd-count:  5a
> >
> >   where 'eventfd-count' is hex value of a counter.
> > @@ -1966,6 +1973,8 @@ Signalfd files
> >       flags:  04002
> >       mnt_id: 9
> >       ino:    63107
> > +        size:   0
> > +        path:   anon_inode:[signalfd]
> >       sigmask:        0000000000000200
> >
> >   where 'sigmask' is hex value of the signal mask associated
> > @@ -1980,6 +1989,8 @@ Epoll files
> >       flags:  02
> >       mnt_id: 9
> >       ino:    63107
> > +        size:   0
> > +        path:   anon_inode:[eventpoll]
> >       tfd:        5 events:       1d data: ffffffffffffffff pos:0 ino:61af sdev:7
> >
> >   where 'tfd' is a target file descriptor number in decimal form,
> > @@ -1998,6 +2009,8 @@ For inotify files the format is the following::
> >       flags:  02000000
> >       mnt_id: 9
> >       ino:    63107
> > +        size:   0
> > +        path:   anon_inode:inotify
> >       inotify wd:3 ino:9e7e sdev:800013 mask:800afce ignored_mask:0 fhandle-bytes:8 fhandle-type:1 f_handle:7e9e0000640d1b6d
> >
> >   where 'wd' is a watch descriptor in decimal form, i.e. a target file
> > @@ -2021,6 +2034,8 @@ For fanotify files the format is::
> >       flags:  02
> >       mnt_id: 9
> >       ino:    63107
> > +        size:   0
> > +        path:   anon_inode:[fanotify]
> >       fanotify flags:10 event-flags:0
> >       fanotify mnt_id:12 mflags:40 mask:38 ignored_mask:40000003
> >       fanotify ino:4f969 sdev:800013 mflags:0 mask:3b ignored_mask:40000000 fhandle-bytes:8 fhandle-type:1 f_handle:69f90400c275b5b4
> > @@ -2046,6 +2061,8 @@ Timerfd files
> >       flags:  02
> >       mnt_id: 9
> >       ino:    63107
> > +        size:   0
> > +        path:   anon_inode:[timerfd]
> >       clockid: 0
> >       ticks: 0
> >       settime flags: 01
> > @@ -2070,6 +2087,7 @@ DMA Buffer files
> >       mnt_id: 9
> >       ino:    63107
> >       size:   32768
> > +        path:   /dmabuf:
> >       count:  2
> >       exp_name:  system-heap
> >
> > diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
> > index b1e25ae98302..d61183ff3c30 100644
> > --- a/drivers/dma-buf/dma-buf.c
> > +++ b/drivers/dma-buf/dma-buf.c
> > @@ -377,7 +377,6 @@ static void dma_buf_show_fdinfo(struct seq_file *m, struct file *file)
> >   {
> >       struct dma_buf *dmabuf = file->private_data;
> >
> > -     seq_printf(m, "size:\t%zu\n", dmabuf->size);
> >       /* Don't count the temporary reference taken inside procfs seq_show */
> >       seq_printf(m, "count:\t%ld\n", file_count(dmabuf->file) - 1);
> >       seq_printf(m, "exp_name:\t%s\n", dmabuf->exp_name);
> > diff --git a/fs/proc/fd.c b/fs/proc/fd.c
> > index 913bef0d2a36..a8a968bc58f0 100644
> > --- a/fs/proc/fd.c
> > +++ b/fs/proc/fd.c
> > @@ -54,10 +54,15 @@ static int seq_show(struct seq_file *m, void *v)
> >       if (ret)
> >               return ret;
> >
> > -     seq_printf(m, "pos:\t%lli\nflags:\t0%o\nmnt_id:\t%i\nino:\t%lu\n",
> > +     seq_printf(m, "pos:\t%lli\nflags:\t0%o\nmnt_id:\t%i\nino:\t%lu\nsize:\t%zu\n",
> >                  (long long)file->f_pos, f_flags,
> >                  real_mount(file->f_path.mnt)->mnt_id,
> > -                file_inode(file)->i_ino);
> > +                file_inode(file)->i_ino,
> > +                file_inode(file)->i_size);
>
> We might consider splitting this into multiple seq_printf calls, one for
> each printed attribute.
>
> It becomes a bit unreadable and the minimal additional overhead
> shouldn't matter that much.

Agreed. WIll update in the next version.

Thanks,
Kalesh
>
> Regards,
> Christian.
>
> > +
> > +     seq_puts(m, "path:\t");
> > +     seq_file_path(m, file, "\n");
> > +     seq_putc(m, '\n');
> >
> >       /* show_fd_locks() never deferences files so a stale value is safe */
> >       show_fd_locks(m, file, files);
> >
> > base-commit: b015dcd62b86d298829990f8261d5d154b8d7af5
>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC PATCH] procfs: Add file path and size to /proc/<pid>/fdinfo
@ 2022-05-20 16:12     ` Kalesh Singh
  0 siblings, 0 replies; 16+ messages in thread
From: Kalesh Singh @ 2022-05-20 16:12 UTC (permalink / raw)
  To: Christian König
  Cc: open list:DOCUMENTATION, Kees Cook, Jonathan Corbet,
	Christoph Anton Mitterer, Cc: Android Kernel, Randy Dunlap,
	Ioannis Ilkos, LKML, Colin Cross, T.J. Mercier,
	moderated list:DMA BUFFER SHARING FRAMEWORK, DRI mailing list,
	linux-fsdevel, Andrew Morton, Suren Baghdasaryan, Sumit Semwal,
	Mike Rapoport, Linux Media Mailing List

On Thu, May 19, 2022 at 11:29 PM Christian König
<christian.koenig@amd.com> wrote:
>
> Am 19.05.22 um 23:40 schrieb Kalesh Singh:
> > Processes can pin shared memory by keeping a handle to it through a
> > file descriptor; for instance dmabufs, memfd, and ashsmem (in Android).
> >
> > In the case of a memory leak, to identify the process pinning the
> > memory, userspace needs to:
> >    - Iterate the /proc/<pid>/fd/* for each process
> >    - Do a readlink on each entry to identify the type of memory from
> >      the file path.
> >    - stat() each entry to get the size of the memory.
> >
> > The file permissions on /proc/<pid>/fd/* only allows for the owner
> > or root to perform the operations above; and so is not suitable for
> > capturing the system-wide state in a production environment.
> >
> > This issue was addressed for dmabufs by making /proc/*/fdinfo/*
> > accessible to a process with PTRACE_MODE_READ_FSCREDS credentials[1]
> > To allow the same kind of tracking for other types of shared memory,
> > add the following fields to /proc/<pid>/fdinfo/<fd>:
> >
> > path - This allows identifying the type of memory based on common
> >         prefixes: e.g. "/memfd...", "/dmabuf...", "/dev/ashmem..."
> >
> >         This was not an issued when dmabuf tracking was introduced
> >         because the exp_name field of dmabuf fdinfo could be used
> >         to distinguish dmabuf fds from other types.
> >
> > size - To track the amount of memory that is being pinned.
> >
> >         dmabufs expose size as an additional field in fdinfo. Remove
> >         this and make it a common field for all fds.
> >
> > Access to /proc/<pid>/fdinfo is governed by PTRACE_MODE_READ_FSCREDS
> > -- the same as for /proc/<pid>/maps which also exposes the path and
> > size for mapped memory regions.
> >
> > This allows for a system process with PTRACE_MODE_READ_FSCREDS to
> > account the pinned per-process memory via fdinfo.
>
> I think this should be split into two patches, one adding the size and
> one adding the path.
>
> Adding the size is completely unproblematic, but the path might raise
> some eyebrows.

Hi Christian,

Thanks for reviewing. "path" is exposed under the same ptrace
capability as in /proc/pid/maps. If we want to be more cautious, then
perhaps only adding "path" for the applicable anon inodes (dmabuf,
memfd, ...)? But prefer to keep it generic if no one sees an issue
with that.

>
> >
> > [1] https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.kernel.org%2Flkml%2F20210308170651.919148-1-kaleshsingh%40google.com%2F&amp;data=05%7C01%7Cchristian.koenig%40amd.com%7C95ee7bf71c2c4aa342fa08da39e03398%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637885932392014544%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&amp;sdata=kf%2B2es12hV3z5zjOFhx3EyxI1XEMeHexqTLNpNoDhAY%3D&amp;reserved=0
> >
> > Signed-off-by: Kalesh Singh <kaleshsingh@google.com>
> > ---
> >   Documentation/filesystems/proc.rst | 22 ++++++++++++++++++++--
> >   drivers/dma-buf/dma-buf.c          |  1 -
> >   fs/proc/fd.c                       |  9 +++++++--
> >   3 files changed, 27 insertions(+), 5 deletions(-)
> >
> > diff --git a/Documentation/filesystems/proc.rst b/Documentation/filesystems/proc.rst
> > index 061744c436d9..ad66d78aca51 100644
> > --- a/Documentation/filesystems/proc.rst
> > +++ b/Documentation/filesystems/proc.rst
> > @@ -1922,13 +1922,16 @@ if precise results are needed.
> >   3.8 /proc/<pid>/fdinfo/<fd> - Information about opened file
> >   ---------------------------------------------------------------
> >   This file provides information associated with an opened file. The regular
> > -files have at least four fields -- 'pos', 'flags', 'mnt_id' and 'ino'.
> > +files have at least six fields -- 'pos', 'flags', 'mnt_id', 'ino', 'size',
> > +and 'path'.
> > +
> >   The 'pos' represents the current offset of the opened file in decimal
> >   form [see lseek(2) for details], 'flags' denotes the octal O_xxx mask the
> >   file has been created with [see open(2) for details] and 'mnt_id' represents
> >   mount ID of the file system containing the opened file [see 3.5
> >   /proc/<pid>/mountinfo for details]. 'ino' represents the inode number of
> > -the file.
> > +the file, 'size' represents the size of the file in bytes, and 'path'
> > +represents the file path.
> >
> >   A typical output is::
> >
> > @@ -1936,6 +1939,8 @@ A typical output is::
> >       flags:  0100002
> >       mnt_id: 19
> >       ino:    63107
> > +        size:   0
> > +        path:   /dev/null
> >
> >   All locks associated with a file descriptor are shown in its fdinfo too::
> >
> > @@ -1953,6 +1958,8 @@ Eventfd files
> >       flags:  04002
> >       mnt_id: 9
> >       ino:    63107
> > +        size:   0
> > +        path:   anon_inode:[eventfd]
> >       eventfd-count:  5a
> >
> >   where 'eventfd-count' is hex value of a counter.
> > @@ -1966,6 +1973,8 @@ Signalfd files
> >       flags:  04002
> >       mnt_id: 9
> >       ino:    63107
> > +        size:   0
> > +        path:   anon_inode:[signalfd]
> >       sigmask:        0000000000000200
> >
> >   where 'sigmask' is hex value of the signal mask associated
> > @@ -1980,6 +1989,8 @@ Epoll files
> >       flags:  02
> >       mnt_id: 9
> >       ino:    63107
> > +        size:   0
> > +        path:   anon_inode:[eventpoll]
> >       tfd:        5 events:       1d data: ffffffffffffffff pos:0 ino:61af sdev:7
> >
> >   where 'tfd' is a target file descriptor number in decimal form,
> > @@ -1998,6 +2009,8 @@ For inotify files the format is the following::
> >       flags:  02000000
> >       mnt_id: 9
> >       ino:    63107
> > +        size:   0
> > +        path:   anon_inode:inotify
> >       inotify wd:3 ino:9e7e sdev:800013 mask:800afce ignored_mask:0 fhandle-bytes:8 fhandle-type:1 f_handle:7e9e0000640d1b6d
> >
> >   where 'wd' is a watch descriptor in decimal form, i.e. a target file
> > @@ -2021,6 +2034,8 @@ For fanotify files the format is::
> >       flags:  02
> >       mnt_id: 9
> >       ino:    63107
> > +        size:   0
> > +        path:   anon_inode:[fanotify]
> >       fanotify flags:10 event-flags:0
> >       fanotify mnt_id:12 mflags:40 mask:38 ignored_mask:40000003
> >       fanotify ino:4f969 sdev:800013 mflags:0 mask:3b ignored_mask:40000000 fhandle-bytes:8 fhandle-type:1 f_handle:69f90400c275b5b4
> > @@ -2046,6 +2061,8 @@ Timerfd files
> >       flags:  02
> >       mnt_id: 9
> >       ino:    63107
> > +        size:   0
> > +        path:   anon_inode:[timerfd]
> >       clockid: 0
> >       ticks: 0
> >       settime flags: 01
> > @@ -2070,6 +2087,7 @@ DMA Buffer files
> >       mnt_id: 9
> >       ino:    63107
> >       size:   32768
> > +        path:   /dmabuf:
> >       count:  2
> >       exp_name:  system-heap
> >
> > diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
> > index b1e25ae98302..d61183ff3c30 100644
> > --- a/drivers/dma-buf/dma-buf.c
> > +++ b/drivers/dma-buf/dma-buf.c
> > @@ -377,7 +377,6 @@ static void dma_buf_show_fdinfo(struct seq_file *m, struct file *file)
> >   {
> >       struct dma_buf *dmabuf = file->private_data;
> >
> > -     seq_printf(m, "size:\t%zu\n", dmabuf->size);
> >       /* Don't count the temporary reference taken inside procfs seq_show */
> >       seq_printf(m, "count:\t%ld\n", file_count(dmabuf->file) - 1);
> >       seq_printf(m, "exp_name:\t%s\n", dmabuf->exp_name);
> > diff --git a/fs/proc/fd.c b/fs/proc/fd.c
> > index 913bef0d2a36..a8a968bc58f0 100644
> > --- a/fs/proc/fd.c
> > +++ b/fs/proc/fd.c
> > @@ -54,10 +54,15 @@ static int seq_show(struct seq_file *m, void *v)
> >       if (ret)
> >               return ret;
> >
> > -     seq_printf(m, "pos:\t%lli\nflags:\t0%o\nmnt_id:\t%i\nino:\t%lu\n",
> > +     seq_printf(m, "pos:\t%lli\nflags:\t0%o\nmnt_id:\t%i\nino:\t%lu\nsize:\t%zu\n",
> >                  (long long)file->f_pos, f_flags,
> >                  real_mount(file->f_path.mnt)->mnt_id,
> > -                file_inode(file)->i_ino);
> > +                file_inode(file)->i_ino,
> > +                file_inode(file)->i_size);
>
> We might consider splitting this into multiple seq_printf calls, one for
> each printed attribute.
>
> It becomes a bit unreadable and the minimal additional overhead
> shouldn't matter that much.

Agreed. WIll update in the next version.

Thanks,
Kalesh
>
> Regards,
> Christian.
>
> > +
> > +     seq_puts(m, "path:\t");
> > +     seq_file_path(m, file, "\n");
> > +     seq_putc(m, '\n');
> >
> >       /* show_fd_locks() never deferences files so a stale value is safe */
> >       show_fd_locks(m, file, files);
> >
> > base-commit: b015dcd62b86d298829990f8261d5d154b8d7af5
>

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2022-05-20 16:13 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-05-19 21:40 [RFC PATCH] procfs: Add file path and size to /proc/<pid>/fdinfo Kalesh Singh
2022-05-19 21:40 ` Kalesh Singh
2022-05-19 21:46 ` Randy Dunlap
2022-05-19 21:46   ` Randy Dunlap
2022-05-19 21:53   ` Kalesh Singh
2022-05-19 21:53     ` Kalesh Singh
2022-05-20  1:49 ` Kees Cook
2022-05-20  1:49   ` Kees Cook
2022-05-20  4:01   ` Kalesh Singh
2022-05-20  4:01     ` Kalesh Singh
2022-05-20  2:18 ` kernel test robot
2022-05-20  2:33 ` kernel test robot
2022-05-20  6:29 ` Christian König
2022-05-20  6:29   ` Christian König
2022-05-20 16:12   ` Kalesh Singh
2022-05-20 16:12     ` Kalesh Singh

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.