[RFC][PATCH] procfs: Add /proc/<pid>/mapped_files

From: Calvin Owens @ 2015-01-14  0:20 UTC
  To: Andrew Morton, Alexey Dobriyan, Oleg Nesterov, Eric W. Biederman,
	Al Viro, Kirill A. Shutemov, Peter Feiner, Grant Likely
  Cc: Siddhesh Poyarekar, linux-kernel, kernel-team, calvinowens

Commit b76437579d1344b6 ("procfs: mark thread stack correctly in
proc/<pid>/maps") introduced logic to mark thread stacks with the
"[stack:%d]" marker in /proc/<pid>/maps.

This causes reading /proc/<pid>/maps to take O(N^2) time, where N is
the number of threads sharing an address space: each line of output
requires iterating over every thread in the group to check whether
its saved stack pointer falls within that VMA, and the number of
lines grows with the number of thread stacks. When dealing with
highly-threaded Java applications, reading this file can take hours
and trigger softlockup dumps.

Removing the "[stack:%d]" marker is not a viable option: it has been
in the output for some time, and userspace may depend on it. I also
don't see a way to make the stack check more efficient without making
the code considerably uglier.

The use case I'm specifically concerned with is the lsof command, so
this patch adds an additional file, "mapped_files", that simply
iterates over the VMAs associated with the task and outputs a
newline-delimited list of the pathnames of the files associated with
the VMAs, if any.

This gives lsof and similar tools a way to determine the pathnames of
files mapped into a process without incurring the O(N^2) behavior of
the maps file.

Signed-off-by: Calvin Owens <calvinowens@fb.com>
---
I'm also sending a simple repro program as a reply to this email.
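
To illustrate how a tool might consume the proposed file, here is a
minimal userspace sketch. It is not taken from lsof: the helper name
unique_paths() is hypothetical, and it assumes only the format
described above, one pathname per line, with one line per file-backed
VMA (so a file mapped several times appears several times). It reads
from any FILE *, so with this patch applied a caller could pass a
handle opened on /proc/<pid>/mapped_files.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* qsort() comparator for an array of C strings. */
static int cmp_str(const void *a, const void *b)
{
	return strcmp(*(char * const *)a, *(char * const *)b);
}

/*
 * Hypothetical helper: read newline-delimited pathnames from 'in' and
 * write each distinct pathname once, in sorted order, to 'dst'.
 * Returns the number of distinct pathnames, or -1 if allocation fails.
 * Pathnames are passed through as-is; no escapes are decoded.
 */
long unique_paths(FILE *in, FILE *dst)
{
	char **paths = NULL, *line = NULL;
	size_t cap = 0, n = 0, len = 0, i;
	ssize_t got;
	long ret = -1;

	while ((got = getline(&line, &len, in)) > 0) {
		/* Strip the trailing newline and skip empty lines. */
		if (line[got - 1] == '\n')
			line[--got] = '\0';
		if (got == 0)
			continue;

		/* Grow the pointer array geometrically. */
		if (n == cap) {
			size_t ncap = cap ? cap * 2 : 64;
			char **tmp = realloc(paths, ncap * sizeof(*tmp));

			if (!tmp)
				goto cleanup;
			paths = tmp;
			cap = ncap;
		}

		paths[n] = strdup(line);
		if (!paths[n])
			goto cleanup;
		n++;
	}

	/* Sort so duplicates become adjacent, then emit each path once. */
	if (n > 1)
		qsort(paths, n, sizeof(*paths), cmp_str);

	ret = 0;
	for (i = 0; i < n; i++) {
		if (i == 0 || strcmp(paths[i], paths[i - 1]) != 0) {
			fprintf(dst, "%s\n", paths[i]);
			ret++;
		}
	}

cleanup:
	for (i = 0; i < n; i++)
		free(paths[i]);
	free(paths);
	free(line);
	return ret;
}
```

A caller would fopen() the per-process file, pass it as 'in' with
stdout as 'dst', and treat a negative return as an allocation failure.
Sorting before deduplicating keeps the userspace side at O(n log n) in
the number of lines read.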

 fs/proc/base.c     |  1 +
 fs/proc/internal.h |  1 +
 fs/proc/task_mmu.c | 32 ++++++++++++++++++++++++++++++++
 3 files changed, 34 insertions(+)

diff --git a/fs/proc/base.c b/fs/proc/base.c
index 3f3d7ae..15f8bd0 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -2564,6 +2564,7 @@ static const struct pid_entry tgid_base_stuff[] = {
 	ONE("stat",       S_IRUGO, proc_tgid_stat),
 	ONE("statm",      S_IRUGO, proc_pid_statm),
 	REG("maps",       S_IRUGO, proc_pid_maps_operations),
+	REG("mapped_files", S_IRUGO, proc_mapped_files_operations),
 #ifdef CONFIG_NUMA
 	REG("numa_maps",  S_IRUGO, proc_pid_numa_maps_operations),
 #endif
diff --git a/fs/proc/internal.h b/fs/proc/internal.h
index 6fcdba5..a09bbdd 100644
--- a/fs/proc/internal.h
+++ b/fs/proc/internal.h
@@ -284,6 +284,7 @@ struct mm_struct *proc_mem_open(struct inode *inode, unsigned int mode);
 
 extern const struct file_operations proc_pid_maps_operations;
 extern const struct file_operations proc_tid_maps_operations;
+extern const struct file_operations proc_mapped_files_operations;
 extern const struct file_operations proc_pid_numa_maps_operations;
 extern const struct file_operations proc_tid_numa_maps_operations;
 extern const struct file_operations proc_pid_smaps_operations;
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 246eae8..bc101e0 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -412,6 +412,38 @@ const struct file_operations proc_tid_maps_operations = {
 	.release	= proc_map_release,
 };
 
+static int show_next_mapped_file(struct seq_file *m, void *v)
+{
+	struct vm_area_struct *vma = v;
+	struct file *file = vma->vm_file;
+
+	if (file) {
+		seq_path(m, &file->f_path, "\n");
+		seq_putc(m, '\n');
+	}
+
+	return 0;
+}
+
+static const struct seq_operations mapped_files_seq_op = {
+	.start	= m_start,
+	.next	= m_next,
+	.stop	= m_stop,
+	.show	= show_next_mapped_file,
+};
+
+static int mapped_files_open(struct inode *inode, struct file *file)
+{
+	return do_maps_open(inode, file, &mapped_files_seq_op);
+}
+
+const struct file_operations proc_mapped_files_operations = {
+	.open		= mapped_files_open,
+	.read		= seq_read,
+	.llseek		= seq_lseek,
+	.release	= proc_map_release,
+};
+
 /*
  * Proportional Set Size(PSS): my share of RSS.
  *
-- 
2.1.4

