* [PATCH 0/4] cleanups and refactor of /proc/pid/smaps*
@ 2018-07-23 11:19 Vlastimil Babka
  2018-07-23 11:19 ` [PATCH 1/4] mm: /proc/pid/*maps remove is_pid and related wrappers Vlastimil Babka
                   ` (3 more replies)
  0 siblings, 4 replies; 10+ messages in thread
From: Vlastimil Babka @ 2018-07-23 11:19 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Daniel Colascione, linux-mm, linux-kernel, linux-fsdevel,
	Alexey Dobriyan, linux-api, Vlastimil Babka

The recent regression in /proc/pid/smaps made me look more closely at the
code, especially at the issues with smaps_rollup reported in [1] and explained
in Patch 4, which fixes them by refactoring the code. Patches 2 and 3 are
preparations for that. Patch 1 is me realizing that there's a lot of
boilerplate left over from the time when we tried (unsuccessfully) to mark
thread stacks in the output.

Originally I also had plans to rework the translation from /proc/pid/*maps*
file offsets to the internal structures. Currently the offset means "vma
number", which is not really stable (vma's can come and go between read()
calls), and there's an extra cache of the last vma's address. My idea was that
offsets would be interpreted directly as addresses, which would also allow
meaningful seeks (see the ugly seek_to_smaps_entry() in
tools/testing/selftests/vm/mlock2.h). However, loff_t is a (signed) long long,
so that might be insufficient somewhere for the unsigned long addresses.
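
For illustration, a rough userspace sketch of what such address-based seeking
could look like. This is purely hypothetical - neither this series nor mainline
interprets the offsets that way, so the program below is only a sketch of the
idea, not something that works today. Note how the address has to be squeezed
into a signed off_t, which is the concern mentioned above.

/*
 * Hypothetical sketch: pretends /proc/<pid>/smaps file offsets were
 * reinterpreted as vma start addresses (they are NOT).
 */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(int argc, char **argv)
{
        unsigned long addr;
        char buf[4096];
        ssize_t len;
        int fd;

        if (argc < 2)
                return 1;
        addr = strtoul(argv[1], NULL, 0);

        fd = open("/proc/self/smaps", O_RDONLY);
        if (fd < 0)
                return 1;

        /* jump straight to the vma containing addr ... */
        if (lseek(fd, (off_t)addr, SEEK_SET) == (off_t)-1)
                return 1;

        /* ... and read just its entry, instead of scanning the whole file */
        len = read(fd, buf, sizeof(buf) - 1);
        if (len > 0) {
                buf[len] = '\0';
                fputs(buf, stdout);
        }
        close(fd);
        return 0;
}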

So the result is fixed issues with skewed /proc/pid/smaps_rollup results,
simpler smaps code, and a lot of unused code removed.

[1] https://marc.info/?l=linux-mm&m=151927723128134&w=2

Vlastimil Babka (4):
  mm: /proc/pid/*maps remove is_pid and related wrappers
  mm: proc/pid/smaps: factor out mem stats gathering
  mm: proc/pid/smaps: factor out common stats printing
  mm: proc/pid/smaps_rollup: convert to single value seq_file

 fs/proc/base.c       |   6 +-
 fs/proc/internal.h   |   3 -
 fs/proc/task_mmu.c   | 294 +++++++++++++++++++------------------------
 fs/proc/task_nommu.c |  39 +-----
 4 files changed, 133 insertions(+), 209 deletions(-)

-- 
2.18.0


* [PATCH 1/4] mm: /proc/pid/*maps remove is_pid and related wrappers
  2018-07-23 11:19 [PATCH 0/4] cleanups and refactor of /proc/pid/smaps* Vlastimil Babka
@ 2018-07-23 11:19 ` Vlastimil Babka
  2018-07-23 11:19 ` [PATCH 2/4] mm: proc/pid/smaps: factor out mem stats gathering Vlastimil Babka
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 10+ messages in thread
From: Vlastimil Babka @ 2018-07-23 11:19 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Daniel Colascione, linux-mm, linux-kernel, linux-fsdevel,
	Alexey Dobriyan, linux-api, Vlastimil Babka

Commit b76437579d13 ("procfs: mark thread stack correctly in proc/<pid>/maps")
introduced differences between /proc/PID/maps and /proc/PID/task/TID/maps to
mark thread stacks properly, and this was also done for smaps and numa_maps.
However, it didn't work properly and was ultimately removed by commit
b18cb64ead40 ("fs/proc: Stop trying to report thread stacks").

Now the is_pid parameter of the related show_*() functions is unused, so we
can remove it together with the wrapper functions and ops structures that
differ for the PID and TID cases only in this parameter.

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
---
 fs/proc/base.c       |   6 +--
 fs/proc/internal.h   |   3 --
 fs/proc/task_mmu.c   | 114 +++++--------------------------------------
 fs/proc/task_nommu.c |  39 ++-------------
 4 files changed, 18 insertions(+), 144 deletions(-)

diff --git a/fs/proc/base.c b/fs/proc/base.c
index aaffc0c30216..ad047977ed04 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -3309,12 +3309,12 @@ static const struct pid_entry tid_base_stuff[] = {
 	REG("cmdline",   S_IRUGO, proc_pid_cmdline_ops),
 	ONE("stat",      S_IRUGO, proc_tid_stat),
 	ONE("statm",     S_IRUGO, proc_pid_statm),
-	REG("maps",      S_IRUGO, proc_tid_maps_operations),
+	REG("maps",      S_IRUGO, proc_pid_maps_operations),
 #ifdef CONFIG_PROC_CHILDREN
 	REG("children",  S_IRUGO, proc_tid_children_operations),
 #endif
 #ifdef CONFIG_NUMA
-	REG("numa_maps", S_IRUGO, proc_tid_numa_maps_operations),
+	REG("numa_maps", S_IRUGO, proc_pid_numa_maps_operations),
 #endif
 	REG("mem",       S_IRUSR|S_IWUSR, proc_mem_operations),
 	LNK("cwd",       proc_cwd_link),
@@ -3324,7 +3324,7 @@ static const struct pid_entry tid_base_stuff[] = {
 	REG("mountinfo",  S_IRUGO, proc_mountinfo_operations),
 #ifdef CONFIG_PROC_PAGE_MONITOR
 	REG("clear_refs", S_IWUSR, proc_clear_refs_operations),
-	REG("smaps",     S_IRUGO, proc_tid_smaps_operations),
+	REG("smaps",     S_IRUGO, proc_pid_smaps_operations),
 	REG("smaps_rollup", S_IRUGO, proc_pid_smaps_rollup_operations),
 	REG("pagemap",    S_IRUSR, proc_pagemap_operations),
 #endif
diff --git a/fs/proc/internal.h b/fs/proc/internal.h
index da3dbfa09e79..0c538769512a 100644
--- a/fs/proc/internal.h
+++ b/fs/proc/internal.h
@@ -297,12 +297,9 @@ struct proc_maps_private {
 struct mm_struct *proc_mem_open(struct inode *inode, unsigned int mode);
 
 extern const struct file_operations proc_pid_maps_operations;
-extern const struct file_operations proc_tid_maps_operations;
 extern const struct file_operations proc_pid_numa_maps_operations;
-extern const struct file_operations proc_tid_numa_maps_operations;
 extern const struct file_operations proc_pid_smaps_operations;
 extern const struct file_operations proc_pid_smaps_rollup_operations;
-extern const struct file_operations proc_tid_smaps_operations;
 extern const struct file_operations proc_clear_refs_operations;
 extern const struct file_operations proc_pagemap_operations;
 
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index dfd73a4616ce..a3f98ca50981 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -294,7 +294,7 @@ static void show_vma_header_prefix(struct seq_file *m,
 }
 
 static void
-show_map_vma(struct seq_file *m, struct vm_area_struct *vma, int is_pid)
+show_map_vma(struct seq_file *m, struct vm_area_struct *vma)
 {
 	struct mm_struct *mm = vma->vm_mm;
 	struct file *file = vma->vm_file;
@@ -357,35 +357,18 @@ show_map_vma(struct seq_file *m, struct vm_area_struct *vma, int is_pid)
 	seq_putc(m, '\n');
 }
 
-static int show_map(struct seq_file *m, void *v, int is_pid)
+static int show_map(struct seq_file *m, void *v)
 {
-	show_map_vma(m, v, is_pid);
+	show_map_vma(m, v);
 	m_cache_vma(m, v);
 	return 0;
 }
 
-static int show_pid_map(struct seq_file *m, void *v)
-{
-	return show_map(m, v, 1);
-}
-
-static int show_tid_map(struct seq_file *m, void *v)
-{
-	return show_map(m, v, 0);
-}
-
 static const struct seq_operations proc_pid_maps_op = {
 	.start	= m_start,
 	.next	= m_next,
 	.stop	= m_stop,
-	.show	= show_pid_map
-};
-
-static const struct seq_operations proc_tid_maps_op = {
-	.start	= m_start,
-	.next	= m_next,
-	.stop	= m_stop,
-	.show	= show_tid_map
+	.show	= show_map
 };
 
 static int pid_maps_open(struct inode *inode, struct file *file)
@@ -393,11 +376,6 @@ static int pid_maps_open(struct inode *inode, struct file *file)
 	return do_maps_open(inode, file, &proc_pid_maps_op);
 }
 
-static int tid_maps_open(struct inode *inode, struct file *file)
-{
-	return do_maps_open(inode, file, &proc_tid_maps_op);
-}
-
 const struct file_operations proc_pid_maps_operations = {
 	.open		= pid_maps_open,
 	.read		= seq_read,
@@ -405,13 +383,6 @@ const struct file_operations proc_pid_maps_operations = {
 	.release	= proc_map_release,
 };
 
-const struct file_operations proc_tid_maps_operations = {
-	.open		= tid_maps_open,
-	.read		= seq_read,
-	.llseek		= seq_lseek,
-	.release	= proc_map_release,
-};
-
 /*
  * Proportional Set Size(PSS): my share of RSS.
  *
@@ -733,7 +704,7 @@ static int smaps_hugetlb_range(pte_t *pte, unsigned long hmask,
 
 #define SEQ_PUT_DEC(str, val) \
 		seq_put_decimal_ull_width(m, str, (val) >> 10, 8)
-static int show_smap(struct seq_file *m, void *v, int is_pid)
+static int show_smap(struct seq_file *m, void *v)
 {
 	struct proc_maps_private *priv = m->private;
 	struct vm_area_struct *vma = v;
@@ -796,7 +767,7 @@ static int show_smap(struct seq_file *m, void *v, int is_pid)
 		mss->pss_locked += mss->pss;
 
 	if (!rollup_mode) {
-		show_map_vma(m, vma, is_pid);
+		show_map_vma(m, vma);
 	} else if (last_vma) {
 		show_vma_header_prefix(
 			m, mss->first_vma_start, vma->vm_end, 0, 0, 0, 0);
@@ -845,28 +816,11 @@ static int show_smap(struct seq_file *m, void *v, int is_pid)
 }
 #undef SEQ_PUT_DEC
 
-static int show_pid_smap(struct seq_file *m, void *v)
-{
-	return show_smap(m, v, 1);
-}
-
-static int show_tid_smap(struct seq_file *m, void *v)
-{
-	return show_smap(m, v, 0);
-}
-
 static const struct seq_operations proc_pid_smaps_op = {
 	.start	= m_start,
 	.next	= m_next,
 	.stop	= m_stop,
-	.show	= show_pid_smap
-};
-
-static const struct seq_operations proc_tid_smaps_op = {
-	.start	= m_start,
-	.next	= m_next,
-	.stop	= m_stop,
-	.show	= show_tid_smap
+	.show	= show_smap
 };
 
 static int pid_smaps_open(struct inode *inode, struct file *file)
@@ -893,11 +847,6 @@ static int pid_smaps_rollup_open(struct inode *inode, struct file *file)
 	return 0;
 }
 
-static int tid_smaps_open(struct inode *inode, struct file *file)
-{
-	return do_maps_open(inode, file, &proc_tid_smaps_op);
-}
-
 const struct file_operations proc_pid_smaps_operations = {
 	.open		= pid_smaps_open,
 	.read		= seq_read,
@@ -912,13 +861,6 @@ const struct file_operations proc_pid_smaps_rollup_operations = {
 	.release	= proc_map_release,
 };
 
-const struct file_operations proc_tid_smaps_operations = {
-	.open		= tid_smaps_open,
-	.read		= seq_read,
-	.llseek		= seq_lseek,
-	.release	= proc_map_release,
-};
-
 enum clear_refs_types {
 	CLEAR_REFS_ALL = 1,
 	CLEAR_REFS_ANON,
@@ -1728,7 +1670,7 @@ static int gather_hugetlb_stats(pte_t *pte, unsigned long hmask,
 /*
  * Display pages allocated per node and memory policy via /proc.
  */
-static int show_numa_map(struct seq_file *m, void *v, int is_pid)
+static int show_numa_map(struct seq_file *m, void *v)
 {
 	struct numa_maps_private *numa_priv = m->private;
 	struct proc_maps_private *proc_priv = &numa_priv->proc_maps;
@@ -1812,45 +1754,17 @@ static int show_numa_map(struct seq_file *m, void *v, int is_pid)
 	return 0;
 }
 
-static int show_pid_numa_map(struct seq_file *m, void *v)
-{
-	return show_numa_map(m, v, 1);
-}
-
-static int show_tid_numa_map(struct seq_file *m, void *v)
-{
-	return show_numa_map(m, v, 0);
-}
-
 static const struct seq_operations proc_pid_numa_maps_op = {
 	.start  = m_start,
 	.next   = m_next,
 	.stop   = m_stop,
-	.show   = show_pid_numa_map,
+	.show   = show_numa_map,
 };
 
-static const struct seq_operations proc_tid_numa_maps_op = {
-	.start  = m_start,
-	.next   = m_next,
-	.stop   = m_stop,
-	.show   = show_tid_numa_map,
-};
-
-static int numa_maps_open(struct inode *inode, struct file *file,
-			  const struct seq_operations *ops)
-{
-	return proc_maps_open(inode, file, ops,
-				sizeof(struct numa_maps_private));
-}
-
 static int pid_numa_maps_open(struct inode *inode, struct file *file)
 {
-	return numa_maps_open(inode, file, &proc_pid_numa_maps_op);
-}
-
-static int tid_numa_maps_open(struct inode *inode, struct file *file)
-{
-	return numa_maps_open(inode, file, &proc_tid_numa_maps_op);
+	return proc_maps_open(inode, file, &proc_pid_numa_maps_op,
+				sizeof(struct numa_maps_private));
 }
 
 const struct file_operations proc_pid_numa_maps_operations = {
@@ -1860,10 +1774,4 @@ const struct file_operations proc_pid_numa_maps_operations = {
 	.release	= proc_map_release,
 };
 
-const struct file_operations proc_tid_numa_maps_operations = {
-	.open		= tid_numa_maps_open,
-	.read		= seq_read,
-	.llseek		= seq_lseek,
-	.release	= proc_map_release,
-};
 #endif /* CONFIG_NUMA */
diff --git a/fs/proc/task_nommu.c b/fs/proc/task_nommu.c
index 5b62f57bd9bc..0b63d68dedb2 100644
--- a/fs/proc/task_nommu.c
+++ b/fs/proc/task_nommu.c
@@ -142,8 +142,7 @@ static int is_stack(struct vm_area_struct *vma)
 /*
  * display a single VMA to a sequenced file
  */
-static int nommu_vma_show(struct seq_file *m, struct vm_area_struct *vma,
-			  int is_pid)
+static int nommu_vma_show(struct seq_file *m, struct vm_area_struct *vma)
 {
 	struct mm_struct *mm = vma->vm_mm;
 	unsigned long ino = 0;
@@ -189,22 +188,11 @@ static int nommu_vma_show(struct seq_file *m, struct vm_area_struct *vma,
 /*
  * display mapping lines for a particular process's /proc/pid/maps
  */
-static int show_map(struct seq_file *m, void *_p, int is_pid)
+static int show_map(struct seq_file *m, void *_p)
 {
 	struct rb_node *p = _p;
 
-	return nommu_vma_show(m, rb_entry(p, struct vm_area_struct, vm_rb),
-			      is_pid);
-}
-
-static int show_pid_map(struct seq_file *m, void *_p)
-{
-	return show_map(m, _p, 1);
-}
-
-static int show_tid_map(struct seq_file *m, void *_p)
-{
-	return show_map(m, _p, 0);
+	return nommu_vma_show(m, rb_entry(p, struct vm_area_struct, vm_rb));
 }
 
 static void *m_start(struct seq_file *m, loff_t *pos)
@@ -260,14 +248,7 @@ static const struct seq_operations proc_pid_maps_ops = {
 	.start	= m_start,
 	.next	= m_next,
 	.stop	= m_stop,
-	.show	= show_pid_map
-};
-
-static const struct seq_operations proc_tid_maps_ops = {
-	.start	= m_start,
-	.next	= m_next,
-	.stop	= m_stop,
-	.show	= show_tid_map
+	.show	= show_map
 };
 
 static int maps_open(struct inode *inode, struct file *file,
@@ -308,11 +289,6 @@ static int pid_maps_open(struct inode *inode, struct file *file)
 	return maps_open(inode, file, &proc_pid_maps_ops);
 }
 
-static int tid_maps_open(struct inode *inode, struct file *file)
-{
-	return maps_open(inode, file, &proc_tid_maps_ops);
-}
-
 const struct file_operations proc_pid_maps_operations = {
 	.open		= pid_maps_open,
 	.read		= seq_read,
@@ -320,10 +296,3 @@ const struct file_operations proc_pid_maps_operations = {
 	.release	= map_release,
 };
 
-const struct file_operations proc_tid_maps_operations = {
-	.open		= tid_maps_open,
-	.read		= seq_read,
-	.llseek		= seq_lseek,
-	.release	= map_release,
-};
-
-- 
2.18.0


* [PATCH 2/4] mm: proc/pid/smaps: factor out mem stats gathering
  2018-07-23 11:19 [PATCH 0/4] cleanups and refactor of /proc/pid/smaps* Vlastimil Babka
  2018-07-23 11:19 ` [PATCH 1/4] mm: /proc/pid/*maps remove is_pid and related wrappers Vlastimil Babka
@ 2018-07-23 11:19 ` Vlastimil Babka
  2018-07-30  8:45   ` Vlastimil Babka
  2018-07-23 11:19 ` [PATCH 3/4] mm: proc/pid/smaps: factor out common stats printing Vlastimil Babka
  2018-07-23 11:19 ` [PATCH 4/4] mm: proc/pid/smaps_rollup: convert to single value seq_file Vlastimil Babka
  3 siblings, 1 reply; 10+ messages in thread
From: Vlastimil Babka @ 2018-07-23 11:19 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Daniel Colascione, linux-mm, linux-kernel, linux-fsdevel,
	Alexey Dobriyan, linux-api, Vlastimil Babka

To prepare for handling /proc/pid/smaps_rollup differently from
/proc/pid/smaps, factor out the gathering of vma memory stats from
show_smap() - it will be used by both.

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
---
 fs/proc/task_mmu.c | 55 ++++++++++++++++++++++++++--------------------
 1 file changed, 31 insertions(+), 24 deletions(-)

diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index a3f98ca50981..d2ca88c92d9d 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -702,14 +702,9 @@ static int smaps_hugetlb_range(pte_t *pte, unsigned long hmask,
 }
 #endif /* HUGETLB_PAGE */
 
-#define SEQ_PUT_DEC(str, val) \
-		seq_put_decimal_ull_width(m, str, (val) >> 10, 8)
-static int show_smap(struct seq_file *m, void *v)
+static void smap_gather_stats(struct vm_area_struct *vma,
+			     struct mem_size_stats *mss)
 {
-	struct proc_maps_private *priv = m->private;
-	struct vm_area_struct *vma = v;
-	struct mem_size_stats mss_stack;
-	struct mem_size_stats *mss;
 	struct mm_walk smaps_walk = {
 		.pmd_entry = smaps_pte_range,
 #ifdef CONFIG_HUGETLB_PAGE
@@ -717,23 +712,6 @@ static int show_smap(struct seq_file *m, void *v)
 #endif
 		.mm = vma->vm_mm,
 	};
-	int ret = 0;
-	bool rollup_mode;
-	bool last_vma;
-
-	if (priv->rollup) {
-		rollup_mode = true;
-		mss = priv->rollup;
-		if (mss->first) {
-			mss->first_vma_start = vma->vm_start;
-			mss->first = false;
-		}
-		last_vma = !m_next_vma(priv, vma);
-	} else {
-		rollup_mode = false;
-		memset(&mss_stack, 0, sizeof(mss_stack));
-		mss = &mss_stack;
-	}
 
 	smaps_walk.private = mss;
 
@@ -765,6 +743,35 @@ static int show_smap(struct seq_file *m, void *v)
 	walk_page_vma(vma, &smaps_walk);
 	if (vma->vm_flags & VM_LOCKED)
 		mss->pss_locked += mss->pss;
+}
+
+#define SEQ_PUT_DEC(str, val) \
+		seq_put_decimal_ull_width(m, str, (val) >> 10, 8)
+static int show_smap(struct seq_file *m, void *v)
+{
+	struct proc_maps_private *priv = m->private;
+	struct vm_area_struct *vma = v;
+	struct mem_size_stats mss_stack;
+	struct mem_size_stats *mss;
+	int ret = 0;
+	bool rollup_mode;
+	bool last_vma;
+
+	if (priv->rollup) {
+		rollup_mode = true;
+		mss = priv->rollup;
+		if (mss->first) {
+			mss->first_vma_start = vma->vm_start;
+			mss->first = false;
+		}
+		last_vma = !m_next_vma(priv, vma);
+	} else {
+		rollup_mode = false;
+		memset(&mss_stack, 0, sizeof(mss_stack));
+		mss = &mss_stack;
+	}
+
+	smap_gather_stats(vma, mss);
 
 	if (!rollup_mode) {
 		show_map_vma(m, vma);
-- 
2.18.0


* [PATCH 3/4] mm: proc/pid/smaps: factor out common stats printing
  2018-07-23 11:19 [PATCH 0/4] cleanups and refactor of /proc/pid/smaps* Vlastimil Babka
  2018-07-23 11:19 ` [PATCH 1/4] mm: /proc/pid/*maps remove is_pid and related wrappers Vlastimil Babka
  2018-07-23 11:19 ` [PATCH 2/4] mm: proc/pid/smaps: factor out mem stats gathering Vlastimil Babka
@ 2018-07-23 11:19 ` Vlastimil Babka
  2018-07-23 11:19 ` [PATCH 4/4] mm: proc/pid/smaps_rollup: convert to single value seq_file Vlastimil Babka
  3 siblings, 0 replies; 10+ messages in thread
From: Vlastimil Babka @ 2018-07-23 11:19 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Daniel Colascione, linux-mm, linux-kernel, linux-fsdevel,
	Alexey Dobriyan, linux-api, Vlastimil Babka

To prepare for handling /proc/pid/smaps_rollup differently from
/proc/pid/smaps, factor out of show_smap() the printing of the parts of the
output that are common to both variants, i.e. the bulk of the gathered
memory stats.

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
---
 fs/proc/task_mmu.c | 51 ++++++++++++++++++++++++++--------------------
 1 file changed, 29 insertions(+), 22 deletions(-)

diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index d2ca88c92d9d..1d6d315fd31b 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -747,6 +747,32 @@ static void smap_gather_stats(struct vm_area_struct *vma,
 
 #define SEQ_PUT_DEC(str, val) \
 		seq_put_decimal_ull_width(m, str, (val) >> 10, 8)
+
+/* Show the contents common for smaps and smaps_rollup */
+static void __show_smap(struct seq_file *m, struct mem_size_stats *mss)
+{
+	SEQ_PUT_DEC("Rss:            ", mss->resident);
+	SEQ_PUT_DEC(" kB\nPss:            ", mss->pss >> PSS_SHIFT);
+	SEQ_PUT_DEC(" kB\nShared_Clean:   ", mss->shared_clean);
+	SEQ_PUT_DEC(" kB\nShared_Dirty:   ", mss->shared_dirty);
+	SEQ_PUT_DEC(" kB\nPrivate_Clean:  ", mss->private_clean);
+	SEQ_PUT_DEC(" kB\nPrivate_Dirty:  ", mss->private_dirty);
+	SEQ_PUT_DEC(" kB\nReferenced:     ", mss->referenced);
+	SEQ_PUT_DEC(" kB\nAnonymous:      ", mss->anonymous);
+	SEQ_PUT_DEC(" kB\nLazyFree:       ", mss->lazyfree);
+	SEQ_PUT_DEC(" kB\nAnonHugePages:  ", mss->anonymous_thp);
+	SEQ_PUT_DEC(" kB\nShmemPmdMapped: ", mss->shmem_thp);
+	SEQ_PUT_DEC(" kB\nShared_Hugetlb: ", mss->shared_hugetlb);
+	seq_put_decimal_ull_width(m, " kB\nPrivate_Hugetlb: ",
+				  mss->private_hugetlb >> 10, 7);
+	SEQ_PUT_DEC(" kB\nSwap:           ", mss->swap);
+	SEQ_PUT_DEC(" kB\nSwapPss:        ",
+					mss->swap_pss >> PSS_SHIFT);
+	SEQ_PUT_DEC(" kB\nLocked:         ",
+					mss->pss_locked >> PSS_SHIFT);
+	seq_puts(m, " kB\n");
+}
+
 static int show_smap(struct seq_file *m, void *v)
 {
 	struct proc_maps_private *priv = m->private;
@@ -791,28 +817,9 @@ static int show_smap(struct seq_file *m, void *v)
 		seq_puts(m, " kB\n");
 	}
 
-	if (!rollup_mode || last_vma) {
-		SEQ_PUT_DEC("Rss:            ", mss->resident);
-		SEQ_PUT_DEC(" kB\nPss:            ", mss->pss >> PSS_SHIFT);
-		SEQ_PUT_DEC(" kB\nShared_Clean:   ", mss->shared_clean);
-		SEQ_PUT_DEC(" kB\nShared_Dirty:   ", mss->shared_dirty);
-		SEQ_PUT_DEC(" kB\nPrivate_Clean:  ", mss->private_clean);
-		SEQ_PUT_DEC(" kB\nPrivate_Dirty:  ", mss->private_dirty);
-		SEQ_PUT_DEC(" kB\nReferenced:     ", mss->referenced);
-		SEQ_PUT_DEC(" kB\nAnonymous:      ", mss->anonymous);
-		SEQ_PUT_DEC(" kB\nLazyFree:       ", mss->lazyfree);
-		SEQ_PUT_DEC(" kB\nAnonHugePages:  ", mss->anonymous_thp);
-		SEQ_PUT_DEC(" kB\nShmemPmdMapped: ", mss->shmem_thp);
-		SEQ_PUT_DEC(" kB\nShared_Hugetlb: ", mss->shared_hugetlb);
-		seq_put_decimal_ull_width(m, " kB\nPrivate_Hugetlb: ",
-					  mss->private_hugetlb >> 10, 7);
-		SEQ_PUT_DEC(" kB\nSwap:           ", mss->swap);
-		SEQ_PUT_DEC(" kB\nSwapPss:        ",
-						mss->swap_pss >> PSS_SHIFT);
-		SEQ_PUT_DEC(" kB\nLocked:         ",
-						mss->pss_locked >> PSS_SHIFT);
-		seq_puts(m, " kB\n");
-	}
+	if (!rollup_mode || last_vma)
+		__show_smap(m, mss);
+
 	if (!rollup_mode) {
 		if (arch_pkeys_enabled())
 			seq_printf(m, "ProtectionKey:  %8u\n", vma_pkey(vma));
-- 
2.18.0


* [PATCH 4/4] mm: proc/pid/smaps_rollup: convert to single value seq_file
  2018-07-23 11:19 [PATCH 0/4] cleanups and refactor of /proc/pid/smaps* Vlastimil Babka
                   ` (2 preceding siblings ...)
  2018-07-23 11:19 ` [PATCH 3/4] mm: proc/pid/smaps: factor out common stats printing Vlastimil Babka
@ 2018-07-23 11:19 ` Vlastimil Babka
  2018-07-25  6:53   ` Vlastimil Babka
  3 siblings, 1 reply; 10+ messages in thread
From: Vlastimil Babka @ 2018-07-23 11:19 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Daniel Colascione, linux-mm, linux-kernel, linux-fsdevel,
	Alexey Dobriyan, linux-api, Vlastimil Babka

The /proc/pid/smaps_rollup file is currently implemented via the
m_start/m_next/m_stop seq_file iterators shared with the other maps files,
which iterate over vma's. However, the rollup file doesn't print anything
for each vma; it only accumulates the stats.

There are some issues with the current code as reported in [1] - the
accumulated stats can get skewed if the seq_file start()/stop() ops are
called multiple times, if show() is called multiple times, or after seeks
to a non-zero position.

Patch [1] fixed those within the existing design, but I believe it is
fundamentally wrong to expose the vma iterators to the seq_file mechanism
when smaps_rollup logically shows a single set of values for the whole
address space.

This patch thus refactors the code to provide a single "value" at offset 0,
with the vma iteration to gather the stats done internally. This fixes the
situations where results are skewed, and simplifies the code, especially
in show_smap(), at the expense of somewhat less code reuse.

[1] https://marc.info/?l=linux-mm&m=151927723128134&w=2

Reported-by: Daniel Colascione <dancol@google.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
---
 fs/proc/task_mmu.c | 136 ++++++++++++++++++++++++++++-----------------
 1 file changed, 86 insertions(+), 50 deletions(-)

diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 1d6d315fd31b..31109e67804c 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -404,7 +404,6 @@ const struct file_operations proc_pid_maps_operations = {
 
 #ifdef CONFIG_PROC_PAGE_MONITOR
 struct mem_size_stats {
-	bool first;
 	unsigned long resident;
 	unsigned long shared_clean;
 	unsigned long shared_dirty;
@@ -418,11 +417,12 @@ struct mem_size_stats {
 	unsigned long swap;
 	unsigned long shared_hugetlb;
 	unsigned long private_hugetlb;
-	unsigned long first_vma_start;
+	unsigned long last_vma_end;
 	u64 pss;
 	u64 pss_locked;
 	u64 swap_pss;
 	bool check_shmem_swap;
+	bool finished;
 };
 
 static void smaps_account(struct mem_size_stats *mss, struct page *page,
@@ -775,58 +775,57 @@ static void __show_smap(struct seq_file *m, struct mem_size_stats *mss)
 
 static int show_smap(struct seq_file *m, void *v)
 {
-	struct proc_maps_private *priv = m->private;
 	struct vm_area_struct *vma = v;
-	struct mem_size_stats mss_stack;
-	struct mem_size_stats *mss;
-	int ret = 0;
-	bool rollup_mode;
-	bool last_vma;
-
-	if (priv->rollup) {
-		rollup_mode = true;
-		mss = priv->rollup;
-		if (mss->first) {
-			mss->first_vma_start = vma->vm_start;
-			mss->first = false;
-		}
-		last_vma = !m_next_vma(priv, vma);
-	} else {
-		rollup_mode = false;
-		memset(&mss_stack, 0, sizeof(mss_stack));
-		mss = &mss_stack;
-	}
+	struct mem_size_stats mss;
 
-	smap_gather_stats(vma, mss);
+	memset(&mss, 0, sizeof(mss));
 
-	if (!rollup_mode) {
-		show_map_vma(m, vma);
-	} else if (last_vma) {
-		show_vma_header_prefix(
-			m, mss->first_vma_start, vma->vm_end, 0, 0, 0, 0);
-		seq_pad(m, ' ');
-		seq_puts(m, "[rollup]\n");
-	} else {
-		ret = SEQ_SKIP;
-	}
+	smap_gather_stats(vma, &mss);
 
-	if (!rollup_mode) {
-		SEQ_PUT_DEC("Size:           ", vma->vm_end - vma->vm_start);
-		SEQ_PUT_DEC(" kB\nKernelPageSize: ", vma_kernel_pagesize(vma));
-		SEQ_PUT_DEC(" kB\nMMUPageSize:    ", vma_mmu_pagesize(vma));
-		seq_puts(m, " kB\n");
-	}
+	show_map_vma(m, vma);
 
-	if (!rollup_mode || last_vma)
-		__show_smap(m, mss);
+	SEQ_PUT_DEC("Size:           ", vma->vm_end - vma->vm_start);
+	SEQ_PUT_DEC(" kB\nKernelPageSize: ", vma_kernel_pagesize(vma));
+	SEQ_PUT_DEC(" kB\nMMUPageSize:    ", vma_mmu_pagesize(vma));
+	seq_puts(m, " kB\n");
+
+	__show_smap(m, &mss);
+
+	if (arch_pkeys_enabled())
+		seq_printf(m, "ProtectionKey:  %8u\n", vma_pkey(vma));
+	show_smap_vma_flags(m, vma);
 
-	if (!rollup_mode) {
-		if (arch_pkeys_enabled())
-			seq_printf(m, "ProtectionKey:  %8u\n", vma_pkey(vma));
-		show_smap_vma_flags(m, vma);
-	}
 	m_cache_vma(m, vma);
-	return ret;
+
+	return 0;
+}
+
+static int show_smaps_rollup(struct seq_file *m, void *v)
+{
+	struct proc_maps_private *priv = m->private;
+	struct mem_size_stats *mss = priv->rollup;
+	struct vm_area_struct *vma;
+
+	/*
+	 * We might be called multiple times when e.g. the seq buffer
+	 * overflows. Gather the stats only once.
+	 */
+	if (!mss->finished) {
+		for (vma = priv->mm->mmap; vma; vma = vma->vm_next) {
+			smap_gather_stats(vma, mss);
+			mss->last_vma_end = vma->vm_end;
+		}
+		mss->finished = true;
+	}
+
+	show_vma_header_prefix(m, priv->mm->mmap->vm_start,
+			       mss->last_vma_end, 0, 0, 0, 0);
+	seq_pad(m, ' ');
+	seq_puts(m, "[rollup]\n");
+
+	__show_smap(m, mss);
+
+	return 0;
 }
 #undef SEQ_PUT_DEC
 
@@ -837,6 +836,44 @@ static const struct seq_operations proc_pid_smaps_op = {
 	.show	= show_smap
 };
 
+static void *smaps_rollup_start(struct seq_file *m, loff_t *ppos)
+{
+	struct proc_maps_private *priv = m->private;
+	struct mm_struct *mm;
+
+	if (*ppos != 0)
+		return NULL;
+
+	priv->task = get_proc_task(priv->inode);
+	if (!priv->task)
+		return ERR_PTR(-ESRCH);
+
+	mm = priv->mm;
+	if (!mm || !mmget_not_zero(mm))
+		return NULL;
+
+	memset(priv->rollup, 0, sizeof(*priv->rollup));
+
+	down_read(&mm->mmap_sem);
+	hold_task_mempolicy(priv);
+
+	return mm;
+}
+
+static void *smaps_rollup_next(struct seq_file *m, void *v, loff_t *pos)
+{
+	(*pos)++;
+	vma_stop(m->private);
+	return NULL;
+}
+
+static const struct seq_operations proc_pid_smaps_rollup_op = {
+	.start	= smaps_rollup_start,
+	.next	= smaps_rollup_next,
+	.stop	= m_stop,
+	.show	= show_smaps_rollup
+};
+
 static int pid_smaps_open(struct inode *inode, struct file *file)
 {
 	return do_maps_open(inode, file, &proc_pid_smaps_op);
@@ -846,18 +883,17 @@ static int pid_smaps_rollup_open(struct inode *inode, struct file *file)
 {
 	struct seq_file *seq;
 	struct proc_maps_private *priv;
-	int ret = do_maps_open(inode, file, &proc_pid_smaps_op);
+	int ret = do_maps_open(inode, file, &proc_pid_smaps_rollup_op);
 
 	if (ret < 0)
 		return ret;
 	seq = file->private_data;
 	priv = seq->private;
-	priv->rollup = kzalloc(sizeof(*priv->rollup), GFP_KERNEL);
+	priv->rollup = kmalloc(sizeof(*priv->rollup), GFP_KERNEL);
 	if (!priv->rollup) {
 		proc_map_release(inode, file);
 		return -ENOMEM;
 	}
-	priv->rollup->first = true;
 	return 0;
 }
 
-- 
2.18.0


* Re: [PATCH 4/4] mm: proc/pid/smaps_rollup: convert to single value seq_file
  2018-07-23 11:19 ` [PATCH 4/4] mm: proc/pid/smaps_rollup: convert to single value seq_file Vlastimil Babka
@ 2018-07-25  6:53   ` Vlastimil Babka
  2018-07-26 16:26     ` Alexey Dobriyan
  0 siblings, 1 reply; 10+ messages in thread
From: Vlastimil Babka @ 2018-07-25  6:53 UTC (permalink / raw)
  To: Andrew Morton, Alexey Dobriyan
  Cc: Daniel Colascione, linux-mm, linux-kernel, linux-fsdevel, linux-api

I moved the reply to this thread since the "added to -mm tree"
notification Alexey replied to in <20180724182908.GD27053@avx2> has a
reduced CC list and is not linked to the patch postings.

On 07/24/2018 08:29 PM, Alexey Dobriyan wrote:
> On Mon, Jul 23, 2018 at 04:55:48PM -0700, akpm@linux-foundation.org wrote:
>> The patch titled
>>      Subject: mm: /proc/pid/smaps_rollup: convert to single value seq_file
>> has been added to the -mm tree.  Its filename is
>>      mm-proc-pid-smaps_rollup-convert-to-single-value-seq_file.patch
> 
>> Subject: mm: /proc/pid/smaps_rollup: convert to single value seq_file
>>
>> The /proc/pid/smaps_rollup file is currently implemented via the
>> m_start/m_next/m_stop seq_file iterators shared with the other maps files,
>> that iterate over vma's.  However, the rollup file doesn't print anything
>> for each vma, only accumulate the stats.
> 
> What I don't understand why keep seq_ops then and not do all the work in
> ->show hook.  Currently /proc/*/smaps_rollup is at ~500 bytes so with
> minimum 1 page seq buffer, no buffer resizing is possible.

Hmm IIUC seq_file also provides the buffer and handles feeding the data
from there to the user process, which might have called read() with a smaller
buffer than that. So I would rather not avoid the seq_file infrastructure.
Or you're saying it could be converted to single_open()? Maybe, with more work.
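
For illustration of the small-read case mentioned above, a throwaway userspace
reader that consumes the file in chunks much smaller than the seq_file buffer -
the kind of caller seq_read() has to cope with (the path and chunk size here
are arbitrary):

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
        char buf[64];   /* deliberately much smaller than the 1-page seq buffer */
        ssize_t n;
        int fd = open("/proc/self/smaps_rollup", O_RDONLY);

        if (fd < 0)
                return 1;
        while ((n = read(fd, buf, sizeof(buf))) > 0)
                fwrite(buf, 1, n, stdout);
        close(fd);
        return 0;
}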

>> +static int show_smaps_rollup(struct seq_file *m, void *v)
>> +{
>> +	struct proc_maps_private *priv = m->private;
>> +	struct mem_size_stats *mss = priv->rollup;
>> +	struct vm_area_struct *vma;
>> +
>> +	/*
>> +	 * We might be called multiple times when e.g. the seq buffer
>> +	 * overflows. Gather the stats only once.
> 
> It doesn't!

Because the buffer is 1 page and the data is ~500 bytes as you said above?
Agreed, but I wouldn't want to depend on data not growing in the future or
the initial buffer not getting smaller. I could extend the comment that this
is theoretical for now?
 
>> +	if (!mss->finished) {
>> +		for (vma = priv->mm->mmap; vma; vma = vma->vm_next) {
>> +			smap_gather_stats(vma, mss);
>> +			mss->last_vma_end = vma->vm_end;
>>  		}
>> -		last_vma = !m_next_vma(priv, vma);
>> -	} else {
>> -		rollup_mode = false;
>> -		memset(&mss_stack, 0, sizeof(mss_stack));
>> -		mss = &mss_stack;


* Re: [PATCH 4/4] mm: proc/pid/smaps_rollup: convert to single value seq_file
  2018-07-25  6:53   ` Vlastimil Babka
@ 2018-07-26 16:26     ` Alexey Dobriyan
  2018-07-30  8:53       ` Vlastimil Babka
  0 siblings, 1 reply; 10+ messages in thread
From: Alexey Dobriyan @ 2018-07-26 16:26 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Andrew Morton, Daniel Colascione, linux-mm, linux-kernel,
	linux-fsdevel, linux-api

On Wed, Jul 25, 2018 at 08:53:53AM +0200, Vlastimil Babka wrote:
> I moved the reply to this thread since the "added to -mm tree"
> notification Alexey replied to in <20180724182908.GD27053@avx2> has
> reduced CC list and is not linked to the patch postings.
> 
> On 07/24/2018 08:29 PM, Alexey Dobriyan wrote:
> > On Mon, Jul 23, 2018 at 04:55:48PM -0700, akpm@linux-foundation.org wrote:
> >> The patch titled
> >>      Subject: mm: /proc/pid/smaps_rollup: convert to single value seq_file
> >> has been added to the -mm tree.  Its filename is
> >>      mm-proc-pid-smaps_rollup-convert-to-single-value-seq_file.patch
> > 
> >> Subject: mm: /proc/pid/smaps_rollup: convert to single value seq_file
> >>
> >> The /proc/pid/smaps_rollup file is currently implemented via the
> >> m_start/m_next/m_stop seq_file iterators shared with the other maps files,
> >> that iterate over vma's.  However, the rollup file doesn't print anything
> >> for each vma, only accumulate the stats.
> > 
> > What I don't understand why keep seq_ops then and not do all the work in
> > ->show hook.  Currently /proc/*/smaps_rollup is at ~500 bytes so with
> > minimum 1 page seq buffer, no buffer resizing is possible.
> 
> Hmm IIUC seq_file also provides the buffer and handles feeding the data
> from there to the user process, which might have called read() with a smaller
> buffer than that. So I would rather not avoid the seq_file infrastructure.
> Or you're saying it could be converted to single_open()? Maybe, with more work.

Preferably yes.

There are 2 ways of using seq_file:
* introduce seq_operations and iterate over objects, printing them one by one,
* use single_open and one ->show hook, doing all the work of collecting
  data there and printing it once.

  /proc/*/smaps_rollup is suited to variant 2 because variant 1 is
  designed for printing an arbitrary amount of data.
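
To make variant 2 concrete, a minimal generic sketch of the single_open()
pattern (the foo_* names are made up and not from this series; private data
and error paths are omitted):

#include <linux/fs.h>
#include <linux/seq_file.h>

/* variant 2: one ->show hook gathers everything and prints it once */
static int foo_show(struct seq_file *m, void *v)
{
        /* collect whatever data is needed here ... */
        seq_printf(m, "Total:  %8lu kB\n", 0UL);    /* ... then print it */
        return 0;
}

static int foo_open(struct inode *inode, struct file *file)
{
        return single_open(file, foo_show, NULL);
}

static const struct file_operations foo_fops = {
        .open           = foo_open,
        .read           = seq_read,
        .llseek         = seq_lseek,
        .release        = single_release,
};

The v2 of patch 4 later in this thread follows this shape, with the extra
proc_maps_private setup and teardown folded into its own open and release
hooks.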


> >> +static int show_smaps_rollup(struct seq_file *m, void *v)
> >> +{
> >> +	struct proc_maps_private *priv = m->private;
> >> +	struct mem_size_stats *mss = priv->rollup;
> >> +	struct vm_area_struct *vma;
> >> +
> >> +	/*
> >> +	 * We might be called multiple times when e.g. the seq buffer
> >> +	 * overflows. Gather the stats only once.
> > 
> > It doesn't!
> 
> Because the buffer is 1 page and the data is ~500 bytes as you said above?
> Agreed, but I wouldn't want to depend on data not growing in the future or
> the initial buffer not getting smaller. I could extend the comment that this
> is theoretical for now?

Given the rate of growth I wouldn't be concerned.

> >> +	if (!mss->finished) {
> >> +		for (vma = priv->mm->mmap; vma; vma = vma->vm_next) {
> >> +			smap_gather_stats(vma, mss);
> >> +			mss->last_vma_end = vma->vm_end;
> >>  		}
> >> -		last_vma = !m_next_vma(priv, vma);
> >> -	} else {
> >> -		rollup_mode = false;
> >> -		memset(&mss_stack, 0, sizeof(mss_stack));
> >> -		mss = &mss_stack;


* Re: [PATCH 2/4] mm: proc/pid/smaps: factor out mem stats gathering
  2018-07-23 11:19 ` [PATCH 2/4] mm: proc/pid/smaps: factor out mem stats gathering Vlastimil Babka
@ 2018-07-30  8:45   ` Vlastimil Babka
  0 siblings, 0 replies; 10+ messages in thread
From: Vlastimil Babka @ 2018-07-30  8:45 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Daniel Colascione, linux-mm, linux-kernel, linux-fsdevel,
	Alexey Dobriyan, linux-api

(moved thread here)

On 07/26/2018 06:21 PM, Alexey Dobriyan wrote:
> On Wed, Jul 25, 2018 at 08:55:17AM +0200, Vlastimil Babka wrote:
>> On 07/24/2018 08:24 PM, Alexey Dobriyan wrote:
>>> On Mon, Jul 23, 2018 at 04:55:46PM -0700, akpm@linux-foundation.org wrote:
>>>> The patch titled
>>>>      Subject: mm: /proc/pid/smaps: factor out common stats printing
>>>> has been added to the -mm tree.  Its filename is
>>>>      mm-proc-pid-smaps-factor-out-common-stats-printing.patch
>>>> +/* Show the contents common for smaps and smaps_rollup */
>>>> +static void __show_smap(struct seq_file *m, struct mem_size_stats *mss)
>>> This can be "const".
>> What exactly, mss?
> Yes, of course.
> seq_file is changed by virtue of priting to it.

----8<----
From 71a7c912496db1847e3b265cd30922c4e687b7c2 Mon Sep 17 00:00:00 2001
From: Vlastimil Babka <vbabka@suse.cz>
Date: Fri, 27 Jul 2018 13:56:00 +0200
Subject: [PATCH] mm: proc/pid/smaps: factor out common stats printing-fix

Add const, per Alexey.

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
---
 fs/proc/task_mmu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 1d6d315fd31b..c47f3cab70a1 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -749,7 +749,7 @@ static void smap_gather_stats(struct vm_area_struct *vma,
 		seq_put_decimal_ull_width(m, str, (val) >> 10, 8)
 
 /* Show the contents common for smaps and smaps_rollup */
-static void __show_smap(struct seq_file *m, struct mem_size_stats *mss)
+static void __show_smap(struct seq_file *m, const struct mem_size_stats *mss)
 {
 	SEQ_PUT_DEC("Rss:            ", mss->resident);
 	SEQ_PUT_DEC(" kB\nPss:            ", mss->pss >> PSS_SHIFT);
-- 
2.18.0


* Re: [PATCH 4/4] mm: proc/pid/smaps_rollup: convert to single value seq_file
  2018-07-26 16:26     ` Alexey Dobriyan
@ 2018-07-30  8:53       ` Vlastimil Babka
  2018-08-16 14:20         ` Alexey Dobriyan
  0 siblings, 1 reply; 10+ messages in thread
From: Vlastimil Babka @ 2018-07-30  8:53 UTC (permalink / raw)
  To: Alexey Dobriyan
  Cc: Andrew Morton, Daniel Colascione, linux-mm, linux-kernel,
	linux-fsdevel, linux-api

On 07/26/2018 06:26 PM, Alexey Dobriyan wrote:
> On Wed, Jul 25, 2018 at 08:53:53AM +0200, Vlastimil Babka wrote:
>> I moved the reply to this thread since the "added to -mm tree"
>> notification Alexey replied to in <20180724182908.GD27053@avx2> has
>> reduced CC list and is not linked to the patch postings.
>>
>> On 07/24/2018 08:29 PM, Alexey Dobriyan wrote:
>>> On Mon, Jul 23, 2018 at 04:55:48PM -0700, akpm@linux-foundation.org wrote:
>>>> The patch titled
>>>>      Subject: mm: /proc/pid/smaps_rollup: convert to single value seq_file
>>>> has been added to the -mm tree.  Its filename is
>>>>      mm-proc-pid-smaps_rollup-convert-to-single-value-seq_file.patch
>>>
>>>> Subject: mm: /proc/pid/smaps_rollup: convert to single value seq_file
>>>>
>>>> The /proc/pid/smaps_rollup file is currently implemented via the
>>>> m_start/m_next/m_stop seq_file iterators shared with the other maps files,
>>>> that iterate over vma's.  However, the rollup file doesn't print anything
>>>> for each vma, only accumulate the stats.
>>>
>>> What I don't understand why keep seq_ops then and not do all the work in
>>> ->show hook.  Currently /proc/*/smaps_rollup is at ~500 bytes so with
>>> minimum 1 page seq buffer, no buffer resizing is possible.
>>
>> Hmm IIUC seq_file also provides the buffer and handles feeding the data
>> from there to the user process, which might have called read() with a smaller
>> buffer than that. So I would rather not avoid the seq_file infrastructure.
>> Or you're saying it could be converted to single_open()? Maybe, with more work.
> 
> Prefereably yes.

OK, here it is. Sending it as a new patch instead of a delta, as that's easier
to review - the delta is significant. Line-stats-wise it's the same.
Again a bit less boilerplate thanks to no special seq_ops, and a bit more
copy/paste in the open and release functions. But I guess it's better
overall.

----8>----
From c6a2eaf3bb3546509d6b7c42f8bcc56cd7e92f90 Mon Sep 17 00:00:00 2001
From: Vlastimil Babka <vbabka@suse.cz>
Date: Wed, 18 Jul 2018 13:14:30 +0200
Subject: [PATCH] mm: proc/pid/smaps_rollup: convert to single value seq_file

The /proc/pid/smaps_rollup file is currently implemented via the
m_start/m_next/m_stop seq_file iterators shared with the other maps files,
which iterate over vma's. However, the rollup file doesn't print anything
for each vma; it only accumulates the stats.

There are some issues with the current code as reported in [1] - the
accumulated stats can get skewed if the seq_file start()/stop() ops are
called multiple times, if show() is called multiple times, or after seeks
to a non-zero position.

Patch [1] fixed those within the existing design, but I believe it is
fundamentally wrong to expose the vma iterators to the seq_file mechanism
when smaps_rollup logically shows a single set of values for the whole
address space.

This patch thus refactors the code to provide a single "value" at offset 0,
with the vma iteration to gather the stats done internally. This fixes the
situations where results are skewed, and simplifies the code, especially
in show_smap(), at the expense of somewhat less code reuse.

[1] https://marc.info/?l=linux-mm&m=151927723128134&w=2

Reported-by: Daniel Colascione <dancol@google.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
---
 fs/proc/internal.h |   1 -
 fs/proc/task_mmu.c | 155 ++++++++++++++++++++++++++++-----------------
 2 files changed, 96 insertions(+), 60 deletions(-)

diff --git a/fs/proc/internal.h b/fs/proc/internal.h
index 0c538769512a..f5b75d258d22 100644
--- a/fs/proc/internal.h
+++ b/fs/proc/internal.h
@@ -285,7 +285,6 @@ struct proc_maps_private {
 	struct inode *inode;
 	struct task_struct *task;
 	struct mm_struct *mm;
-	struct mem_size_stats *rollup;
 #ifdef CONFIG_MMU
 	struct vm_area_struct *tail_vma;
 #endif
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index c47f3cab70a1..5ea1d64cb0b4 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -247,7 +247,6 @@ static int proc_map_release(struct inode *inode, struct file *file)
 	if (priv->mm)
 		mmdrop(priv->mm);
 
-	kfree(priv->rollup);
 	return seq_release_private(inode, file);
 }
 
@@ -404,7 +403,6 @@ const struct file_operations proc_pid_maps_operations = {
 
 #ifdef CONFIG_PROC_PAGE_MONITOR
 struct mem_size_stats {
-	bool first;
 	unsigned long resident;
 	unsigned long shared_clean;
 	unsigned long shared_dirty;
@@ -418,7 +416,6 @@ struct mem_size_stats {
 	unsigned long swap;
 	unsigned long shared_hugetlb;
 	unsigned long private_hugetlb;
-	unsigned long first_vma_start;
 	u64 pss;
 	u64 pss_locked;
 	u64 swap_pss;
@@ -775,57 +772,75 @@ static void __show_smap(struct seq_file *m, const struct mem_size_stats *mss)
 
 static int show_smap(struct seq_file *m, void *v)
 {
-	struct proc_maps_private *priv = m->private;
 	struct vm_area_struct *vma = v;
-	struct mem_size_stats mss_stack;
-	struct mem_size_stats *mss;
+	struct mem_size_stats mss;
+
+	memset(&mss, 0, sizeof(mss));
+
+	smap_gather_stats(vma, &mss);
+
+	show_map_vma(m, vma);
+
+	SEQ_PUT_DEC("Size:           ", vma->vm_end - vma->vm_start);
+	SEQ_PUT_DEC(" kB\nKernelPageSize: ", vma_kernel_pagesize(vma));
+	SEQ_PUT_DEC(" kB\nMMUPageSize:    ", vma_mmu_pagesize(vma));
+	seq_puts(m, " kB\n");
+
+	__show_smap(m, &mss);
+
+	if (arch_pkeys_enabled())
+		seq_printf(m, "ProtectionKey:  %8u\n", vma_pkey(vma));
+	show_smap_vma_flags(m, vma);
+
+	m_cache_vma(m, vma);
+
+	return 0;
+}
+
+static int show_smaps_rollup(struct seq_file *m, void *v)
+{
+	struct proc_maps_private *priv = m->private;
+	struct mem_size_stats mss;
+	struct mm_struct *mm;
+	struct vm_area_struct *vma;
+	unsigned long last_vma_end = 0;
 	int ret = 0;
-	bool rollup_mode;
-	bool last_vma;
-
-	if (priv->rollup) {
-		rollup_mode = true;
-		mss = priv->rollup;
-		if (mss->first) {
-			mss->first_vma_start = vma->vm_start;
-			mss->first = false;
-		}
-		last_vma = !m_next_vma(priv, vma);
-	} else {
-		rollup_mode = false;
-		memset(&mss_stack, 0, sizeof(mss_stack));
-		mss = &mss_stack;
-	}
 
-	smap_gather_stats(vma, mss);
+	priv->task = get_proc_task(priv->inode);
+	if (!priv->task)
+		return -ESRCH;
 
-	if (!rollup_mode) {
-		show_map_vma(m, vma);
-	} else if (last_vma) {
-		show_vma_header_prefix(
-			m, mss->first_vma_start, vma->vm_end, 0, 0, 0, 0);
-		seq_pad(m, ' ');
-		seq_puts(m, "[rollup]\n");
-	} else {
-		ret = SEQ_SKIP;
+	mm = priv->mm;
+	if (!mm || !mmget_not_zero(mm)) {
+		ret = -ESRCH;
+		goto out_put_task;
 	}
 
-	if (!rollup_mode) {
-		SEQ_PUT_DEC("Size:           ", vma->vm_end - vma->vm_start);
-		SEQ_PUT_DEC(" kB\nKernelPageSize: ", vma_kernel_pagesize(vma));
-		SEQ_PUT_DEC(" kB\nMMUPageSize:    ", vma_mmu_pagesize(vma));
-		seq_puts(m, " kB\n");
-	}
+	memset(&mss, 0, sizeof(mss));
 
-	if (!rollup_mode || last_vma)
-		__show_smap(m, mss);
+	down_read(&mm->mmap_sem);
+	hold_task_mempolicy(priv);
 
-	if (!rollup_mode) {
-		if (arch_pkeys_enabled())
-			seq_printf(m, "ProtectionKey:  %8u\n", vma_pkey(vma));
-		show_smap_vma_flags(m, vma);
+	for (vma = priv->mm->mmap; vma; vma = vma->vm_next) {
+		smap_gather_stats(vma, &mss);
+		last_vma_end = vma->vm_end;
 	}
-	m_cache_vma(m, vma);
+
+	show_vma_header_prefix(m, priv->mm->mmap->vm_start,
+			       last_vma_end, 0, 0, 0, 0);
+	seq_pad(m, ' ');
+	seq_puts(m, "[rollup]\n");
+
+	__show_smap(m, &mss);
+
+	release_task_mempolicy(priv);
+	up_read(&mm->mmap_sem);
+	mmput(mm);
+
+out_put_task:
+	put_task_struct(priv->task);
+	priv->task = NULL;
+
 	return ret;
 }
 #undef SEQ_PUT_DEC
@@ -842,23 +857,45 @@ static int pid_smaps_open(struct inode *inode, struct file *file)
 	return do_maps_open(inode, file, &proc_pid_smaps_op);
 }
 
-static int pid_smaps_rollup_open(struct inode *inode, struct file *file)
+static int smaps_rollup_open(struct inode *inode, struct file *file)
 {
-	struct seq_file *seq;
+	int ret;
 	struct proc_maps_private *priv;
-	int ret = do_maps_open(inode, file, &proc_pid_smaps_op);
-
-	if (ret < 0)
-		return ret;
-	seq = file->private_data;
-	priv = seq->private;
-	priv->rollup = kzalloc(sizeof(*priv->rollup), GFP_KERNEL);
-	if (!priv->rollup) {
-		proc_map_release(inode, file);
+
+	priv = kzalloc(sizeof(*priv), GFP_KERNEL_ACCOUNT);
+	if (!priv)
 		return -ENOMEM;
+
+	ret = single_open(file, show_smaps_rollup, priv);
+	if (ret)
+		goto out_free;
+
+	priv->inode = inode;
+	priv->mm = proc_mem_open(inode, PTRACE_MODE_READ);
+	if (IS_ERR(priv->mm)) {
+		ret = PTR_ERR(priv->mm);
+
+		single_release(inode, file);
+		goto out_free;
 	}
-	priv->rollup->first = true;
+
 	return 0;
+
+out_free:
+	kfree(priv);
+	return ret;
+}
+
+static int smaps_rollup_release(struct inode *inode, struct file *file)
+{
+	struct seq_file *seq = file->private_data;
+	struct proc_maps_private *priv = seq->private;
+
+	if (priv->mm)
+		mmdrop(priv->mm);
+
+	kfree(priv);
+	return single_release(inode, file);
 }
 
 const struct file_operations proc_pid_smaps_operations = {
@@ -869,10 +906,10 @@ const struct file_operations proc_pid_smaps_operations = {
 };
 
 const struct file_operations proc_pid_smaps_rollup_operations = {
-	.open		= pid_smaps_rollup_open,
+	.open		= smaps_rollup_open,
 	.read		= seq_read,
 	.llseek		= seq_lseek,
-	.release	= proc_map_release,
+	.release	= smaps_rollup_release,
 };
 
 enum clear_refs_types {
-- 
2.18.0


* Re: [PATCH 4/4] mm: proc/pid/smaps_rollup: convert to single value seq_file
  2018-07-30  8:53       ` Vlastimil Babka
@ 2018-08-16 14:20         ` Alexey Dobriyan
  0 siblings, 0 replies; 10+ messages in thread
From: Alexey Dobriyan @ 2018-08-16 14:20 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Andrew Morton, Daniel Colascione, linux-mm, linux-kernel,
	linux-fsdevel, linux-api

On Mon, Jul 30, 2018 at 10:53:53AM +0200, Vlastimil Babka wrote:
> On 07/26/2018 06:26 PM, Alexey Dobriyan wrote:
> > On Wed, Jul 25, 2018 at 08:53:53AM +0200, Vlastimil Babka wrote:
> >> I moved the reply to this thread since the "added to -mm tree"
> >> notification Alexey replied to in <20180724182908.GD27053@avx2> has
> >> reduced CC list and is not linked to the patch postings.
> >>
> >> On 07/24/2018 08:29 PM, Alexey Dobriyan wrote:
> >>> On Mon, Jul 23, 2018 at 04:55:48PM -0700, akpm@linux-foundation.org wrote:
> >>>> The patch titled
> >>>>      Subject: mm: /proc/pid/smaps_rollup: convert to single value seq_file
> >>>> has been added to the -mm tree.  Its filename is
> >>>>      mm-proc-pid-smaps_rollup-convert-to-single-value-seq_file.patch
> >>>
> >>>> Subject: mm: /proc/pid/smaps_rollup: convert to single value seq_file
> >>>>
> >>>> The /proc/pid/smaps_rollup file is currently implemented via the
> >>>> m_start/m_next/m_stop seq_file iterators shared with the other maps files,
> >>>> that iterate over vma's.  However, the rollup file doesn't print anything
> >>>> for each vma, only accumulate the stats.
> >>>
> >>> What I don't understand why keep seq_ops then and not do all the work in
> >>> ->show hook.  Currently /proc/*/smaps_rollup is at ~500 bytes so with
> >>> minimum 1 page seq buffer, no buffer resizing is possible.
> >>
> >> Hmm IIUC seq_file also provides the buffer and handles feeding the data
> >> from there to the user process, which might have called read() with a smaller
> >> buffer than that. So I would rather not avoid the seq_file infrastructure.
> >> Or you're saying it could be converted to single_open()? Maybe, with more work.
> > 
> > Prefereably yes.
> 
> OK here it is. Sending as a new patch instead of delta, as that's easier
> to review - the delta is significant. Line stats wise it's the same.
> Again a bit less boilerplate thans to no special seq_ops, a bit more
> copy/paste in the open and release functions. But I guess it's better
> overall.
> 
> ----8>----
> From c6a2eaf3bb3546509d6b7c42f8bcc56cd7e92f90 Mon Sep 17 00:00:00 2001
> From: Vlastimil Babka <vbabka@suse.cz>
> Date: Wed, 18 Jul 2018 13:14:30 +0200
> Subject: [PATCH] mm: proc/pid/smaps_rollup: convert to single value seq_file

Reviewed-by: Alexey Dobriyan <adobriyan@gmail.com>

