All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v5 0/3] mm, proc: Implement /proc/<pid>/totmaps
@ 2016-09-05 20:14 ` robert.foss
  0 siblings, 0 replies; 43+ messages in thread
From: robert.foss @ 2016-09-05 20:14 UTC (permalink / raw)
  To: corbet, akpm, vbabka, hughd, mhocko, koct9i, n-horiguchi,
	robert.foss, kirill.shutemov, john.stultz, minchan, ross.zwisler,
	jmarchan, hannes, keescook, oleg, viro, mguzik, jdanis,
	calvinowens, adobriyan, ebiederm, sonnyrao, seth.forshee, tixxdz,
	linux-doc, linux-kernel, Ben Zhang, Bryan Freed,
	Filipe Brandenburger, Jann Horn, Michal Hocko, linux-api,
	Jacek Anaszewski

From: Robert Foss <robert.foss@collabora.com>

This series provides the /proc/PID/totmaps feature, which
summarizes the information provided by /proc/PID/smaps for
improved performance and usability reasons.

A use case is to speed up monitoring of memory consumption in
environments where RSS isn't precise.

For example Chrome tends to many processes which have hundreds of VMAs
with a substantial amount of shared memory, and the error of using
RSS rather than PSS tends to be very large when looking at overall
memory consumption.  PSS isn't kept as a single number that's exported
like RSS, so to calculate PSS means having to parse a very large smaps
file.

This process is slow and has to be repeated for many processes, and we
found that the just act of doing the parsing was taking up a
significant amount of CPU time, so this patch is an attempt to make
that process cheaper.

/proc/PID/totmaps provides roughly a 2x speedup compared to parsing
/proc/PID/smaps with awk.

$ ps aux | grep firefox
robertfoss   5025 24.3 13.7 3562820 2219616 ?     Rl   Aug15 277:44 /usr/lib/firefox/firefox https://allg.one/xpb
$ awk '/^[0-9a-f]/{print}' /proc/5025/smaps | wc -l
1503
$ /usr/bin/time -v -p zsh -c "(repeat 25 {cat /proc/5025/totmaps})"
[...]
	Command being timed: "zsh -c (repeat 25 {cat /proc/5025/totmaps})"
	User time (seconds): 0.00
	System time (seconds): 0.40
	Percent of CPU this job got: 90%
	Elapsed (wall clock) time (h:mm:ss or m:ss): 0:00.45

$ /usr/bin/time -v -p zsh -c "repeat 25 { awk '/^Rss/{rss+=\$2} /^Pss/{pss+=\$2} END {printf \"rss:%d pss:%d\n\", rss, pss}\' /proc/5025/smaps }"
[...]
	Command being timed: "zsh -c repeat 25 { awk '/^Rss/{rss+=$2} /^Pss/{pss+=$2} END {printf "rss:%d pss:%d\n", rss, pss}\' /proc/5025/smaps }"
	User time (seconds): 0.37
	System time (seconds): 0.45
	Percent of CPU this job got: 92%
	Elapsed (wall clock) time (h:mm:ss or m:ss): 0:00.89


Changes since v1:
- Removed IS_ERR check from get_task_mm() function
- Changed comment format
- Moved proc_totmaps_operations declaration inside internal.h
- Switched to using do_maps_open() in totmaps_open() function,
  which provides privilege checking
- Error handling reworked for totmaps_open() function
- Switched to stack allocated struct mem_size_stats mss_sum in
  totmaps_proc_show() function
- Removed get_task_mm() in totmaps_proc_show() since priv->mm
  already is available
- Added support to proc_map_release() fork priv==NULL, to allow
  function to be used for all failure cases
- Added proc_totmaps_op and for it helper functions
- Added documention in separate patch
- Removed totmaps_release() since it was just a wrapper for proc_map_release()

Changes since v2:
- Removed struct mem_size_stats *mss from struct proc_maps_private
- Removed priv->task assignment in totmaps_open() call
- Moved some assignements calls totmaps_open() around to increase code
  clarity
- Moved some function calls to unlock data structures before printing

Changes since v3:
- Fixed typo in totmaps documentation
- Fixed issue where proc_map_release wasn't called on error
- Fixed put_task_struct not being called during .release()

Changes since v4:
- Prevent access to invalid processes

Robert Foss (3):
  mm, proc: Implement /proc/<pid>/totmaps
  Documentation/filesystems: Fixed typo
  Documentation/filesystems: Added /proc/PID/totmaps documentation

 Documentation/filesystems/proc.txt |  23 +++++-
 fs/proc/base.c                     |   1 +
 fs/proc/internal.h                 |   2 +
 fs/proc/task_mmu.c                 | 148 +++++++++++++++++++++++++++++++++++++
 4 files changed, 173 insertions(+), 1 deletion(-)

-- 
2.7.4

^ permalink raw reply	[flat|nested] 43+ messages in thread

* [PATCH v5 0/3] mm, proc: Implement /proc/<pid>/totmaps
@ 2016-09-05 20:14 ` robert.foss
  0 siblings, 0 replies; 43+ messages in thread
From: robert.foss @ 2016-09-05 20:14 UTC (permalink / raw)
  To: corbet, akpm, vbabka, hughd, mhocko, koct9i, n-horiguchi,
	robert.foss, kirill.shutemov, john.stultz, minchan, ross.zwisler,
	jmarchan, hannes, keescook, oleg, viro, mguzik, jdanis,
	calvinowens, adobriyan, ebiederm, sonnyrao, seth.forshee, tixxdz,
	linux-doc, linux-kernel, Ben Zhang, Bryan Freed,
	Filipe Brandenburger, Jann Horn, Michal Hocko

From: Robert Foss <robert.foss@collabora.com>

This series provides the /proc/PID/totmaps feature, which
summarizes the information provided by /proc/PID/smaps for
improved performance and usability reasons.

A use case is to speed up monitoring of memory consumption in
environments where RSS isn't precise.

For example Chrome tends to many processes which have hundreds of VMAs
with a substantial amount of shared memory, and the error of using
RSS rather than PSS tends to be very large when looking at overall
memory consumption.  PSS isn't kept as a single number that's exported
like RSS, so to calculate PSS means having to parse a very large smaps
file.

This process is slow and has to be repeated for many processes, and we
found that the just act of doing the parsing was taking up a
significant amount of CPU time, so this patch is an attempt to make
that process cheaper.

/proc/PID/totmaps provides roughly a 2x speedup compared to parsing
/proc/PID/smaps with awk.

$ ps aux | grep firefox
robertfoss   5025 24.3 13.7 3562820 2219616 ?     Rl   Aug15 277:44 /usr/lib/firefox/firefox https://allg.one/xpb
$ awk '/^[0-9a-f]/{print}' /proc/5025/smaps | wc -l
1503
$ /usr/bin/time -v -p zsh -c "(repeat 25 {cat /proc/5025/totmaps})"
[...]
	Command being timed: "zsh -c (repeat 25 {cat /proc/5025/totmaps})"
	User time (seconds): 0.00
	System time (seconds): 0.40
	Percent of CPU this job got: 90%
	Elapsed (wall clock) time (h:mm:ss or m:ss): 0:00.45

$ /usr/bin/time -v -p zsh -c "repeat 25 { awk '/^Rss/{rss+=\$2} /^Pss/{pss+=\$2} END {printf \"rss:%d pss:%d\n\", rss, pss}\' /proc/5025/smaps }"
[...]
	Command being timed: "zsh -c repeat 25 { awk '/^Rss/{rss+=$2} /^Pss/{pss+=$2} END {printf "rss:%d pss:%d\n", rss, pss}\' /proc/5025/smaps }"
	User time (seconds): 0.37
	System time (seconds): 0.45
	Percent of CPU this job got: 92%
	Elapsed (wall clock) time (h:mm:ss or m:ss): 0:00.89


Changes since v1:
- Removed IS_ERR check from get_task_mm() function
- Changed comment format
- Moved proc_totmaps_operations declaration inside internal.h
- Switched to using do_maps_open() in totmaps_open() function,
  which provides privilege checking
- Error handling reworked for totmaps_open() function
- Switched to stack allocated struct mem_size_stats mss_sum in
  totmaps_proc_show() function
- Removed get_task_mm() in totmaps_proc_show() since priv->mm
  already is available
- Added support to proc_map_release() fork priv==NULL, to allow
  function to be used for all failure cases
- Added proc_totmaps_op and for it helper functions
- Added documention in separate patch
- Removed totmaps_release() since it was just a wrapper for proc_map_release()

Changes since v2:
- Removed struct mem_size_stats *mss from struct proc_maps_private
- Removed priv->task assignment in totmaps_open() call
- Moved some assignements calls totmaps_open() around to increase code
  clarity
- Moved some function calls to unlock data structures before printing

Changes since v3:
- Fixed typo in totmaps documentation
- Fixed issue where proc_map_release wasn't called on error
- Fixed put_task_struct not being called during .release()

Changes since v4:
- Prevent access to invalid processes

Robert Foss (3):
  mm, proc: Implement /proc/<pid>/totmaps
  Documentation/filesystems: Fixed typo
  Documentation/filesystems: Added /proc/PID/totmaps documentation

 Documentation/filesystems/proc.txt |  23 +++++-
 fs/proc/base.c                     |   1 +
 fs/proc/internal.h                 |   2 +
 fs/proc/task_mmu.c                 | 148 +++++++++++++++++++++++++++++++++++++
 4 files changed, 173 insertions(+), 1 deletion(-)

-- 
2.7.4

^ permalink raw reply	[flat|nested] 43+ messages in thread

* [PATCH v5 1/3] mm, proc: Implement /proc/<pid>/totmaps
  2016-09-05 20:14 ` robert.foss
@ 2016-09-05 20:14   ` robert.foss
  -1 siblings, 0 replies; 43+ messages in thread
From: robert.foss @ 2016-09-05 20:14 UTC (permalink / raw)
  To: corbet, akpm, vbabka, hughd, mhocko, koct9i, n-horiguchi,
	robert.foss, kirill.shutemov, john.stultz, minchan, ross.zwisler,
	jmarchan, hannes, keescook, oleg, viro, mguzik, jdanis,
	calvinowens, adobriyan, ebiederm, sonnyrao, seth.forshee, tixxdz,
	linux-doc, linux-kernel, Ben Zhang, Bryan Freed,
	Filipe Brandenburger, Jann Horn, Michal Hocko, linux-api,
	Jacek Anaszewski

From: Robert Foss <robert.foss@collabora.com>

This is based on earlier work by Thiago Goncales. It implements a new
per process proc file which summarizes the contents of the smaps file
but doesn't display any addresses.  It gives more detailed information
than statm like the PSS (proprotional set size).  It differs from the
original implementation in that it doesn't use the full blown set of
seq operations, uses a different termination condition, and doesn't
displayed "Locked" as that was broken on the original implemenation.

This new proc file provides information faster than parsing the potentially
huge smaps file.

Tested-by: Robert Foss <robert.foss@collabora.com>
Signed-off-by: Robert Foss <robert.foss@collabora.com>

Signed-off-by: Sonny Rao <sonnyrao@chromium.org>
---
 fs/proc/base.c     |   1 +
 fs/proc/internal.h |   2 +
 fs/proc/task_mmu.c | 148 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 151 insertions(+)

diff --git a/fs/proc/base.c b/fs/proc/base.c
index ac0df4d..dc7e81b7 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -2854,6 +2854,7 @@ static const struct pid_entry tgid_base_stuff[] = {
 	REG("clear_refs", S_IWUSR, proc_clear_refs_operations),
 	REG("smaps",      S_IRUGO, proc_pid_smaps_operations),
 	REG("pagemap",    S_IRUSR, proc_pagemap_operations),
+	REG("totmaps",    S_IRUGO, proc_totmaps_operations),
 #endif
 #ifdef CONFIG_SECURITY
 	DIR("attr",       S_IRUGO|S_IXUGO, proc_attr_dir_inode_operations, proc_attr_dir_operations),
diff --git a/fs/proc/internal.h b/fs/proc/internal.h
index 7931c55..3bdafe8 100644
--- a/fs/proc/internal.h
+++ b/fs/proc/internal.h
@@ -298,6 +298,8 @@ extern const struct file_operations proc_pid_smaps_operations;
 extern const struct file_operations proc_tid_smaps_operations;
 extern const struct file_operations proc_clear_refs_operations;
 extern const struct file_operations proc_pagemap_operations;
+extern const struct file_operations proc_totmaps_operations;
+
 
 extern unsigned long task_vsize(struct mm_struct *);
 extern unsigned long task_statm(struct mm_struct *,
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 187d84e..f0f4fee 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -810,6 +810,75 @@ static int show_smap(struct seq_file *m, void *v, int is_pid)
 	return 0;
 }
 
+static void add_smaps_sum(struct mem_size_stats *mss,
+		struct mem_size_stats *mss_sum)
+{
+	mss_sum->resident += mss->resident;
+	mss_sum->pss += mss->pss;
+	mss_sum->shared_clean += mss->shared_clean;
+	mss_sum->shared_dirty += mss->shared_dirty;
+	mss_sum->private_clean += mss->private_clean;
+	mss_sum->private_dirty += mss->private_dirty;
+	mss_sum->referenced += mss->referenced;
+	mss_sum->anonymous += mss->anonymous;
+	mss_sum->anonymous_thp += mss->anonymous_thp;
+	mss_sum->swap += mss->swap;
+}
+
+static int totmaps_proc_show(struct seq_file *m, void *data)
+{
+	struct proc_maps_private *priv = m->private;
+	struct mm_struct *mm = priv->mm;
+	struct vm_area_struct *vma;
+	struct mem_size_stats mss_sum;
+
+	memset(&mss_sum, 0, sizeof(mss_sum));
+	down_read(&mm->mmap_sem);
+	hold_task_mempolicy(priv);
+
+	for (vma = mm->mmap; vma != priv->tail_vma; vma = vma->vm_next) {
+		struct mem_size_stats mss;
+		struct mm_walk smaps_walk = {
+			.pmd_entry = smaps_pte_range,
+			.mm = vma->vm_mm,
+			.private = &mss,
+		};
+
+		if (vma->vm_mm && !is_vm_hugetlb_page(vma)) {
+			memset(&mss, 0, sizeof(mss));
+			walk_page_vma(vma, &smaps_walk);
+			add_smaps_sum(&mss, &mss_sum);
+		}
+	}
+
+	release_task_mempolicy(priv);
+	up_read(&mm->mmap_sem);
+
+	seq_printf(m,
+		   "Rss:            %8lu kB\n"
+		   "Pss:            %8lu kB\n"
+		   "Shared_Clean:   %8lu kB\n"
+		   "Shared_Dirty:   %8lu kB\n"
+		   "Private_Clean:  %8lu kB\n"
+		   "Private_Dirty:  %8lu kB\n"
+		   "Referenced:     %8lu kB\n"
+		   "Anonymous:      %8lu kB\n"
+		   "AnonHugePages:  %8lu kB\n"
+		   "Swap:           %8lu kB\n",
+		   mss_sum.resident >> 10,
+		   (unsigned long)(mss_sum.pss >> (10 + PSS_SHIFT)),
+		   mss_sum.shared_clean  >> 10,
+		   mss_sum.shared_dirty  >> 10,
+		   mss_sum.private_clean >> 10,
+		   mss_sum.private_dirty >> 10,
+		   mss_sum.referenced >> 10,
+		   mss_sum.anonymous >> 10,
+		   mss_sum.anonymous_thp >> 10,
+		   mss_sum.swap >> 10);
+
+	return 0;
+}
+
 static int show_pid_smap(struct seq_file *m, void *v)
 {
 	return show_smap(m, v, 1);
@@ -820,6 +889,35 @@ static int show_tid_smap(struct seq_file *m, void *v)
 	return show_smap(m, v, 0);
 }
 
+static void *m_totmaps_start(struct seq_file *m, loff_t *ppos)
+{
+	struct proc_maps_private *priv = m->private;
+	struct mm_struct *mm;
+
+	mm = priv->mm;
+	if (!mm || !atomic_inc_not_zero(&mm->mm_users))
+		return NULL;
+
+	return NULL + (*ppos == 0);
+}
+
+static void *m_totmaps_next(struct seq_file *p, void *v, loff_t *pos)
+{
+	++*pos;
+	return NULL;
+}
+
+static void m_totmaps_stop(struct seq_file *p, void *v)
+{
+}
+
+static const struct seq_operations proc_totmaps_op = {
+	.start	= m_totmaps_start,
+	.next	= m_totmaps_next,
+	.stop	= m_totmaps_stop,
+	.show	= totmaps_proc_show
+};
+
 static const struct seq_operations proc_pid_smaps_op = {
 	.start	= m_start,
 	.next	= m_next,
@@ -844,6 +942,49 @@ static int tid_smaps_open(struct inode *inode, struct file *file)
 	return do_maps_open(inode, file, &proc_tid_smaps_op);
 }
 
+static int totmaps_open(struct inode *inode, struct file *file)
+{
+	struct proc_maps_private *priv = NULL;
+	struct seq_file *seq;
+	int ret;
+
+	ret = do_maps_open(inode, file, &proc_totmaps_op);
+	if (ret)
+		goto error;
+
+	/*
+	 * We need to grab references to the task_struct
+	 * at open time, because there's a potential information
+	 * leak where the totmaps file is opened and held open
+	 * while the underlying pid to task mapping changes
+	 * underneath it
+	 */
+	seq = file->private_data;
+	priv = seq->private;
+	priv->task = get_proc_task(inode);
+	if (!priv->task) {
+		ret = -ESRCH;
+		goto error_free;
+	}
+
+	return 0;
+
+error_free:
+	proc_map_release(inode, file);
+error:
+	return ret;
+}
+
+static int totmaps_release(struct inode *inode, struct file *file)
+{
+	struct seq_file *seq = file->private_data;
+	struct proc_maps_private *priv = seq->private;
+
+	put_task_struct(priv->task);
+
+	return proc_map_release(inode, file);
+}
+
 const struct file_operations proc_pid_smaps_operations = {
 	.open		= pid_smaps_open,
 	.read		= seq_read,
@@ -858,6 +999,13 @@ const struct file_operations proc_tid_smaps_operations = {
 	.release	= proc_map_release,
 };
 
+const struct file_operations proc_totmaps_operations = {
+	.open		= totmaps_open,
+	.read		= seq_read,
+	.llseek		= seq_lseek,
+	.release	= totmaps_release,
+};
+
 enum clear_refs_types {
 	CLEAR_REFS_ALL = 1,
 	CLEAR_REFS_ANON,
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH v5 1/3] mm, proc: Implement /proc/<pid>/totmaps
@ 2016-09-05 20:14   ` robert.foss
  0 siblings, 0 replies; 43+ messages in thread
From: robert.foss @ 2016-09-05 20:14 UTC (permalink / raw)
  To: corbet, akpm, vbabka, hughd, mhocko, koct9i, n-horiguchi,
	robert.foss, kirill.shutemov, john.stultz, minchan, ross.zwisler,
	jmarchan, hannes, keescook, oleg, viro, mguzik, jdanis,
	calvinowens, adobriyan, ebiederm, sonnyrao, seth.forshee, tixxdz,
	linux-doc, linux-kernel, Ben Zhang, Bryan Freed,
	Filipe Brandenburger, Jann Horn, Michal Hocko

From: Robert Foss <robert.foss@collabora.com>

This is based on earlier work by Thiago Goncales. It implements a new
per process proc file which summarizes the contents of the smaps file
but doesn't display any addresses.  It gives more detailed information
than statm like the PSS (proprotional set size).  It differs from the
original implementation in that it doesn't use the full blown set of
seq operations, uses a different termination condition, and doesn't
displayed "Locked" as that was broken on the original implemenation.

This new proc file provides information faster than parsing the potentially
huge smaps file.

Tested-by: Robert Foss <robert.foss@collabora.com>
Signed-off-by: Robert Foss <robert.foss@collabora.com>

Signed-off-by: Sonny Rao <sonnyrao@chromium.org>
---
 fs/proc/base.c     |   1 +
 fs/proc/internal.h |   2 +
 fs/proc/task_mmu.c | 148 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 151 insertions(+)

diff --git a/fs/proc/base.c b/fs/proc/base.c
index ac0df4d..dc7e81b7 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -2854,6 +2854,7 @@ static const struct pid_entry tgid_base_stuff[] = {
 	REG("clear_refs", S_IWUSR, proc_clear_refs_operations),
 	REG("smaps",      S_IRUGO, proc_pid_smaps_operations),
 	REG("pagemap",    S_IRUSR, proc_pagemap_operations),
+	REG("totmaps",    S_IRUGO, proc_totmaps_operations),
 #endif
 #ifdef CONFIG_SECURITY
 	DIR("attr",       S_IRUGO|S_IXUGO, proc_attr_dir_inode_operations, proc_attr_dir_operations),
diff --git a/fs/proc/internal.h b/fs/proc/internal.h
index 7931c55..3bdafe8 100644
--- a/fs/proc/internal.h
+++ b/fs/proc/internal.h
@@ -298,6 +298,8 @@ extern const struct file_operations proc_pid_smaps_operations;
 extern const struct file_operations proc_tid_smaps_operations;
 extern const struct file_operations proc_clear_refs_operations;
 extern const struct file_operations proc_pagemap_operations;
+extern const struct file_operations proc_totmaps_operations;
+
 
 extern unsigned long task_vsize(struct mm_struct *);
 extern unsigned long task_statm(struct mm_struct *,
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 187d84e..f0f4fee 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -810,6 +810,75 @@ static int show_smap(struct seq_file *m, void *v, int is_pid)
 	return 0;
 }
 
+static void add_smaps_sum(struct mem_size_stats *mss,
+		struct mem_size_stats *mss_sum)
+{
+	mss_sum->resident += mss->resident;
+	mss_sum->pss += mss->pss;
+	mss_sum->shared_clean += mss->shared_clean;
+	mss_sum->shared_dirty += mss->shared_dirty;
+	mss_sum->private_clean += mss->private_clean;
+	mss_sum->private_dirty += mss->private_dirty;
+	mss_sum->referenced += mss->referenced;
+	mss_sum->anonymous += mss->anonymous;
+	mss_sum->anonymous_thp += mss->anonymous_thp;
+	mss_sum->swap += mss->swap;
+}
+
+static int totmaps_proc_show(struct seq_file *m, void *data)
+{
+	struct proc_maps_private *priv = m->private;
+	struct mm_struct *mm = priv->mm;
+	struct vm_area_struct *vma;
+	struct mem_size_stats mss_sum;
+
+	memset(&mss_sum, 0, sizeof(mss_sum));
+	down_read(&mm->mmap_sem);
+	hold_task_mempolicy(priv);
+
+	for (vma = mm->mmap; vma != priv->tail_vma; vma = vma->vm_next) {
+		struct mem_size_stats mss;
+		struct mm_walk smaps_walk = {
+			.pmd_entry = smaps_pte_range,
+			.mm = vma->vm_mm,
+			.private = &mss,
+		};
+
+		if (vma->vm_mm && !is_vm_hugetlb_page(vma)) {
+			memset(&mss, 0, sizeof(mss));
+			walk_page_vma(vma, &smaps_walk);
+			add_smaps_sum(&mss, &mss_sum);
+		}
+	}
+
+	release_task_mempolicy(priv);
+	up_read(&mm->mmap_sem);
+
+	seq_printf(m,
+		   "Rss:            %8lu kB\n"
+		   "Pss:            %8lu kB\n"
+		   "Shared_Clean:   %8lu kB\n"
+		   "Shared_Dirty:   %8lu kB\n"
+		   "Private_Clean:  %8lu kB\n"
+		   "Private_Dirty:  %8lu kB\n"
+		   "Referenced:     %8lu kB\n"
+		   "Anonymous:      %8lu kB\n"
+		   "AnonHugePages:  %8lu kB\n"
+		   "Swap:           %8lu kB\n",
+		   mss_sum.resident >> 10,
+		   (unsigned long)(mss_sum.pss >> (10 + PSS_SHIFT)),
+		   mss_sum.shared_clean  >> 10,
+		   mss_sum.shared_dirty  >> 10,
+		   mss_sum.private_clean >> 10,
+		   mss_sum.private_dirty >> 10,
+		   mss_sum.referenced >> 10,
+		   mss_sum.anonymous >> 10,
+		   mss_sum.anonymous_thp >> 10,
+		   mss_sum.swap >> 10);
+
+	return 0;
+}
+
 static int show_pid_smap(struct seq_file *m, void *v)
 {
 	return show_smap(m, v, 1);
@@ -820,6 +889,35 @@ static int show_tid_smap(struct seq_file *m, void *v)
 	return show_smap(m, v, 0);
 }
 
+static void *m_totmaps_start(struct seq_file *m, loff_t *ppos)
+{
+	struct proc_maps_private *priv = m->private;
+	struct mm_struct *mm;
+
+	mm = priv->mm;
+	if (!mm || !atomic_inc_not_zero(&mm->mm_users))
+		return NULL;
+
+	return NULL + (*ppos == 0);
+}
+
+static void *m_totmaps_next(struct seq_file *p, void *v, loff_t *pos)
+{
+	++*pos;
+	return NULL;
+}
+
+static void m_totmaps_stop(struct seq_file *p, void *v)
+{
+}
+
+static const struct seq_operations proc_totmaps_op = {
+	.start	= m_totmaps_start,
+	.next	= m_totmaps_next,
+	.stop	= m_totmaps_stop,
+	.show	= totmaps_proc_show
+};
+
 static const struct seq_operations proc_pid_smaps_op = {
 	.start	= m_start,
 	.next	= m_next,
@@ -844,6 +942,49 @@ static int tid_smaps_open(struct inode *inode, struct file *file)
 	return do_maps_open(inode, file, &proc_tid_smaps_op);
 }
 
+static int totmaps_open(struct inode *inode, struct file *file)
+{
+	struct proc_maps_private *priv = NULL;
+	struct seq_file *seq;
+	int ret;
+
+	ret = do_maps_open(inode, file, &proc_totmaps_op);
+	if (ret)
+		goto error;
+
+	/*
+	 * We need to grab references to the task_struct
+	 * at open time, because there's a potential information
+	 * leak where the totmaps file is opened and held open
+	 * while the underlying pid to task mapping changes
+	 * underneath it
+	 */
+	seq = file->private_data;
+	priv = seq->private;
+	priv->task = get_proc_task(inode);
+	if (!priv->task) {
+		ret = -ESRCH;
+		goto error_free;
+	}
+
+	return 0;
+
+error_free:
+	proc_map_release(inode, file);
+error:
+	return ret;
+}
+
+static int totmaps_release(struct inode *inode, struct file *file)
+{
+	struct seq_file *seq = file->private_data;
+	struct proc_maps_private *priv = seq->private;
+
+	put_task_struct(priv->task);
+
+	return proc_map_release(inode, file);
+}
+
 const struct file_operations proc_pid_smaps_operations = {
 	.open		= pid_smaps_open,
 	.read		= seq_read,
@@ -858,6 +999,13 @@ const struct file_operations proc_tid_smaps_operations = {
 	.release	= proc_map_release,
 };
 
+const struct file_operations proc_totmaps_operations = {
+	.open		= totmaps_open,
+	.read		= seq_read,
+	.llseek		= seq_lseek,
+	.release	= totmaps_release,
+};
+
 enum clear_refs_types {
 	CLEAR_REFS_ALL = 1,
 	CLEAR_REFS_ANON,
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH v5 2/3] Documentation/filesystems: Fixed typo
@ 2016-09-05 20:14   ` robert.foss-ZGY8ohtN/8qB+jHODAdFcQ
  0 siblings, 0 replies; 43+ messages in thread
From: robert.foss @ 2016-09-05 20:14 UTC (permalink / raw)
  To: corbet, akpm, vbabka, hughd, mhocko, koct9i, n-horiguchi,
	robert.foss, kirill.shutemov, john.stultz, minchan, ross.zwisler,
	jmarchan, hannes, keescook, oleg, viro, mguzik, jdanis,
	calvinowens, adobriyan, ebiederm, sonnyrao, seth.forshee, tixxdz,
	linux-doc, linux-kernel, Ben Zhang, Bryan Freed,
	Filipe Brandenburger, Jann Horn, Michal Hocko, linux-api,
	Jacek Anaszewski

From: Robert Foss <robert.foss@collabora.com>

Fixed a -> an typo.

Signed-off-by: Robert Foss <robert.foss@collabora.com>
---
 Documentation/filesystems/proc.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/filesystems/proc.txt b/Documentation/filesystems/proc.txt
index 68080ad..fcc1ac0 100644
--- a/Documentation/filesystems/proc.txt
+++ b/Documentation/filesystems/proc.txt
@@ -145,7 +145,7 @@ Table 1-1: Process specific entries in /proc
 		symbol the task is blocked in - or "0" if not blocked.
  pagemap	Page table
  stack		Report full stack trace, enable via CONFIG_STACKTRACE
- smaps		a extension based on maps, showing the memory consumption of
+ smaps		an extension based on maps, showing the memory consumption of
 		each mapping and flags associated with it
  numa_maps	an extension based on maps, showing the memory locality and
 		binding policy as well as mem usage (in pages) of each mapping.
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH v5 2/3] Documentation/filesystems: Fixed typo
@ 2016-09-05 20:14   ` robert.foss-ZGY8ohtN/8qB+jHODAdFcQ
  0 siblings, 0 replies; 43+ messages in thread
From: robert.foss-ZGY8ohtN/8qB+jHODAdFcQ @ 2016-09-05 20:14 UTC (permalink / raw)
  To: corbet-T1hC0tSOHrs, akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b,
	vbabka-AlSwsSmVLrQ, hughd-hpIqsD4AKlfQT0dZR+AlfA,
	mhocko-IBi9RG/b67k, koct9i-Re5JQEeQqe8AvxtiuMwx3w,
	n-horiguchi-PaJj6Psr51x8UrSeD/g0lQ,
	robert.foss-ZGY8ohtN/8qB+jHODAdFcQ,
	kirill.shutemov-VuQAYsv1563Yd54FQh9/CA,
	john.stultz-QSEj5FYQhm4dnm+yROfE0A,
	minchan-DgEjT+Ai2ygdnm+yROfE0A,
	ross.zwisler-VuQAYsv1563Yd54FQh9/CA,
	jmarchan-H+wXaHxf7aLQT0dZR+AlfA, hannes-druUgvl0LCNAfugRpC6u6w,
	keescook-F7+t8E8rja9g9hUCZPvPmw, oleg-H+wXaHxf7aLQT0dZR+AlfA,
	viro-RmSDqhL/yNMiFSDQTTA3OLVCufUGDwFn,
	mguzik-H+wXaHxf7aLQT0dZR+AlfA, jdanis-hpIqsD4AKlfQT0dZR+AlfA,
	calvinowens-b10kYP2dOMg, adobriyan-Re5JQEeQqe8AvxtiuMwx3w,
	ebiederm-aS9lmoZGLiVWk0Htik3J/w, sonnyrao-F7+t8E8rja9g9hUCZPvPmw,
	seth.forshee-Z7WLFzj8eWMS+FvcfC7Uqw,
	tixxdz-Re5JQEeQqe8AvxtiuMwx3w, linux-doc-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, Ben Zhang, Bryan Freed,
	Filipe Brandenburger, Jann Horn, Michal Hocko

From: Robert Foss <robert.foss-ZGY8ohtN/8qB+jHODAdFcQ@public.gmane.org>

Fixed a -> an typo.

Signed-off-by: Robert Foss <robert.foss-ZGY8ohtN/8qB+jHODAdFcQ@public.gmane.org>
---
 Documentation/filesystems/proc.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/filesystems/proc.txt b/Documentation/filesystems/proc.txt
index 68080ad..fcc1ac0 100644
--- a/Documentation/filesystems/proc.txt
+++ b/Documentation/filesystems/proc.txt
@@ -145,7 +145,7 @@ Table 1-1: Process specific entries in /proc
 		symbol the task is blocked in - or "0" if not blocked.
  pagemap	Page table
  stack		Report full stack trace, enable via CONFIG_STACKTRACE
- smaps		a extension based on maps, showing the memory consumption of
+ smaps		an extension based on maps, showing the memory consumption of
 		each mapping and flags associated with it
  numa_maps	an extension based on maps, showing the memory locality and
 		binding policy as well as mem usage (in pages) of each mapping.
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH v5 3/3] Documentation/filesystems: Added /proc/PID/totmaps documentation
  2016-09-05 20:14 ` robert.foss
@ 2016-09-05 20:14   ` robert.foss
  -1 siblings, 0 replies; 43+ messages in thread
From: robert.foss @ 2016-09-05 20:14 UTC (permalink / raw)
  To: corbet, akpm, vbabka, hughd, mhocko, koct9i, n-horiguchi,
	robert.foss, kirill.shutemov, john.stultz, minchan, ross.zwisler,
	jmarchan, hannes, keescook, oleg, viro, mguzik, jdanis,
	calvinowens, adobriyan, ebiederm, sonnyrao, seth.forshee, tixxdz,
	linux-doc, linux-kernel, Ben Zhang, Bryan Freed,
	Filipe Brandenburger, Jann Horn, Michal Hocko, linux-api,
	Jacek Anaszewski

From: Robert Foss <robert.foss@collabora.com>

Added documentation covering /proc/PID/totmaps.

Signed-off-by: Robert Foss <robert.foss@collabora.com>
---
 Documentation/filesystems/proc.txt | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

diff --git a/Documentation/filesystems/proc.txt b/Documentation/filesystems/proc.txt
index fcc1ac0..49a8483 100644
--- a/Documentation/filesystems/proc.txt
+++ b/Documentation/filesystems/proc.txt
@@ -11,6 +11,7 @@ Version 1.3                                              Kernel version 2.2.12
 					      Kernel version 2.4.0-test11-pre4
 ------------------------------------------------------------------------------
 fixes/update part 1.1  Stefani Seibold <stefani@seibold.net>       June 9 2009
+add totmaps            Robert Foss <robert.foss@collabora.com>  August 12 2016
 
 Table of Contents
 -----------------
@@ -147,6 +148,8 @@ Table 1-1: Process specific entries in /proc
  stack		Report full stack trace, enable via CONFIG_STACKTRACE
  smaps		an extension based on maps, showing the memory consumption of
 		each mapping and flags associated with it
+ totmaps        an extension based on maps, showing the total memory
+                consumption of all mappings
  numa_maps	an extension based on maps, showing the memory locality and
 		binding policy as well as mem usage (in pages) of each mapping.
 ..............................................................................
@@ -515,6 +518,24 @@ be vanished or the reverse -- new added.
 This file is only present if the CONFIG_MMU kernel configuration option is
 enabled.
 
+The /proc/PID/totmaps is an extension based on maps, showing the memory
+consumption totals for all of the process's mappings. It lists the sums of the
+same statistics as /proc/PID/smaps.
+
+The process' mappings will be summarized as a series of lines like the
+following:
+
+Rss:                4256 kB
+Pss:                1170 kB
+Shared_Clean:       2720 kB
+Shared_Dirty:       1136 kB
+Private_Clean:         0 kB
+Private_Dirty:       400 kB
+Referenced:         4256 kB
+Anonymous:          1536 kB
+AnonHugePages:         0 kB
+Swap:                  0 kB
+
 The /proc/PID/clear_refs is used to reset the PG_Referenced and ACCESSED/YOUNG
 bits on both physical and virtual pages associated with a process, and the
 soft-dirty bit on pte (see Documentation/vm/soft-dirty.txt for details).
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH v5 3/3] Documentation/filesystems: Added /proc/PID/totmaps documentation
@ 2016-09-05 20:14   ` robert.foss
  0 siblings, 0 replies; 43+ messages in thread
From: robert.foss @ 2016-09-05 20:14 UTC (permalink / raw)
  To: corbet, akpm, vbabka, hughd, mhocko, koct9i, n-horiguchi,
	robert.foss, kirill.shutemov, john.stultz, minchan, ross.zwisler,
	jmarchan, hannes, keescook, oleg, viro, mguzik, jdanis,
	calvinowens, adobriyan, ebiederm, sonnyrao, seth.forshee, tixxdz,
	linux-doc, linux-kernel, Ben Zhang, Bryan Freed,
	Filipe Brandenburger, Jann Horn, Michal Hocko

From: Robert Foss <robert.foss@collabora.com>

Added documentation covering /proc/PID/totmaps.

Signed-off-by: Robert Foss <robert.foss@collabora.com>
---
 Documentation/filesystems/proc.txt | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

diff --git a/Documentation/filesystems/proc.txt b/Documentation/filesystems/proc.txt
index fcc1ac0..49a8483 100644
--- a/Documentation/filesystems/proc.txt
+++ b/Documentation/filesystems/proc.txt
@@ -11,6 +11,7 @@ Version 1.3                                              Kernel version 2.2.12
 					      Kernel version 2.4.0-test11-pre4
 ------------------------------------------------------------------------------
 fixes/update part 1.1  Stefani Seibold <stefani@seibold.net>       June 9 2009
+add totmaps            Robert Foss <robert.foss@collabora.com>  August 12 2016
 
 Table of Contents
 -----------------
@@ -147,6 +148,8 @@ Table 1-1: Process specific entries in /proc
  stack		Report full stack trace, enable via CONFIG_STACKTRACE
  smaps		an extension based on maps, showing the memory consumption of
 		each mapping and flags associated with it
+ totmaps        an extension based on maps, showing the total memory
+                consumption of all mappings
  numa_maps	an extension based on maps, showing the memory locality and
 		binding policy as well as mem usage (in pages) of each mapping.
 ..............................................................................
@@ -515,6 +518,24 @@ be vanished or the reverse -- new added.
 This file is only present if the CONFIG_MMU kernel configuration option is
 enabled.
 
+The /proc/PID/totmaps is an extension based on maps, showing the memory
+consumption totals for all of the process's mappings. It lists the sums of the
+same statistics as /proc/PID/smaps.
+
+The process' mappings will be summarized as a series of lines like the
+following:
+
+Rss:                4256 kB
+Pss:                1170 kB
+Shared_Clean:       2720 kB
+Shared_Dirty:       1136 kB
+Private_Clean:         0 kB
+Private_Dirty:       400 kB
+Referenced:         4256 kB
+Anonymous:          1536 kB
+AnonHugePages:         0 kB
+Swap:                  0 kB
+
 The /proc/PID/clear_refs is used to reset the PG_Referenced and ACCESSED/YOUNG
 bits on both physical and virtual pages associated with a process, and the
 soft-dirty bit on pte (see Documentation/vm/soft-dirty.txt for details).
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* Re: [PATCH v5 1/3] mm, proc: Implement /proc/<pid>/totmaps
@ 2016-09-07 12:58     ` Oleg Nesterov
  0 siblings, 0 replies; 43+ messages in thread
From: Oleg Nesterov @ 2016-09-07 12:58 UTC (permalink / raw)
  To: robert.foss
  Cc: corbet, akpm, vbabka, hughd, mhocko, koct9i, n-horiguchi,
	kirill.shutemov, john.stultz, minchan, ross.zwisler, jmarchan,
	hannes, keescook, viro, mguzik, jdanis, calvinowens, adobriyan,
	ebiederm, sonnyrao, seth.forshee, tixxdz, linux-doc,
	linux-kernel, Ben Zhang, Bryan Freed, Filipe Brandenburger,
	Jann Horn, Michal Hocko, linux-api, Jacek Anaszewski

On 09/05, robert.foss@collabora.com wrote:
>
> @@ -2854,6 +2854,7 @@ static const struct pid_entry tgid_base_stuff[] = {
>  	REG("clear_refs", S_IWUSR, proc_clear_refs_operations),
>  	REG("smaps",      S_IRUGO, proc_pid_smaps_operations),
>  	REG("pagemap",    S_IRUSR, proc_pagemap_operations),
> +	REG("totmaps",    S_IRUGO, proc_totmaps_operations),

I must have missed something, but I fail to understand why this patch
is so complicated.

Just use ONE("totmaps", S_IRUGO, proc_totmaps_operations) ?

> +static int totmaps_proc_show(struct seq_file *m, void *data)
> +{
> +	struct proc_maps_private *priv = m->private;
> +	struct mm_struct *mm = priv->mm;
> +	struct vm_area_struct *vma;
> +	struct mem_size_stats mss_sum;
> +
> +	memset(&mss_sum, 0, sizeof(mss_sum));
> +	down_read(&mm->mmap_sem);
> +	hold_task_mempolicy(priv);
        ^^^^^^^^^^^^^^^^^^^^^^^^^
why?

> +	for (vma = mm->mmap; vma != priv->tail_vma; vma = vma->vm_next) {

Hmm. the usage of ->tail_vma looks just wrong. I guess the code should
work because it is NULL but still.

> +		struct mem_size_stats mss;
> +		struct mm_walk smaps_walk = {
> +			.pmd_entry = smaps_pte_range,
> +			.mm = vma->vm_mm,
> +			.private = &mss,
> +		};
> +
> +		if (vma->vm_mm && !is_vm_hugetlb_page(vma)) {
> +			memset(&mss, 0, sizeof(mss));
> +			walk_page_vma(vma, &smaps_walk);
> +			add_smaps_sum(&mss, &mss_sum);
> +		}
> +	}

Why? I mean, why not walk_page_range() ? You do not need this for-each-vma
loop at all? At least if you change this patch to use the ONE() helper, and
everything else looks unneeded in this case.

Oleg.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v5 1/3] mm, proc: Implement /proc/<pid>/totmaps
@ 2016-09-07 12:58     ` Oleg Nesterov
  0 siblings, 0 replies; 43+ messages in thread
From: Oleg Nesterov @ 2016-09-07 12:58 UTC (permalink / raw)
  To: robert.foss-ZGY8ohtN/8qB+jHODAdFcQ
  Cc: corbet-T1hC0tSOHrs, akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b,
	vbabka-AlSwsSmVLrQ, hughd-hpIqsD4AKlfQT0dZR+AlfA,
	mhocko-IBi9RG/b67k, koct9i-Re5JQEeQqe8AvxtiuMwx3w,
	n-horiguchi-PaJj6Psr51x8UrSeD/g0lQ,
	kirill.shutemov-VuQAYsv1563Yd54FQh9/CA,
	john.stultz-QSEj5FYQhm4dnm+yROfE0A,
	minchan-DgEjT+Ai2ygdnm+yROfE0A,
	ross.zwisler-VuQAYsv1563Yd54FQh9/CA,
	jmarchan-H+wXaHxf7aLQT0dZR+AlfA, hannes-druUgvl0LCNAfugRpC6u6w,
	keescook-F7+t8E8rja9g9hUCZPvPmw,
	viro-RmSDqhL/yNMiFSDQTTA3OLVCufUGDwFn,
	mguzik-H+wXaHxf7aLQT0dZR+AlfA, jdanis-hpIqsD4AKlfQT0dZR+AlfA,
	calvinowens-b10kYP2dOMg, adobriyan-Re5JQEeQqe8AvxtiuMwx3w,
	ebiederm-aS9lmoZGLiVWk0Htik3J/w, sonnyrao-F7+t8E8rja9g9hUCZPvPmw,
	seth.forshee-Z7WLFzj8eWMS+FvcfC7Uqw,
	tixxdz-Re5JQEeQqe8AvxtiuMwx3w, linux-doc-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, Ben Zhang, Bryan Freed,
	Filipe Brandenburger, Jann Horn, Michal Hocko,
	linux-api-u79uwXL29TY76Z2rM5mHXA, Jacek

On 09/05, robert.foss-ZGY8ohtN/8qB+jHODAdFcQ@public.gmane.org wrote:
>
> @@ -2854,6 +2854,7 @@ static const struct pid_entry tgid_base_stuff[] = {
>  	REG("clear_refs", S_IWUSR, proc_clear_refs_operations),
>  	REG("smaps",      S_IRUGO, proc_pid_smaps_operations),
>  	REG("pagemap",    S_IRUSR, proc_pagemap_operations),
> +	REG("totmaps",    S_IRUGO, proc_totmaps_operations),

I must have missed something, but I fail to understand why this patch
is so complicated.

Just use ONE("totmaps", S_IRUGO, proc_totmaps_operations) ?

> +static int totmaps_proc_show(struct seq_file *m, void *data)
> +{
> +	struct proc_maps_private *priv = m->private;
> +	struct mm_struct *mm = priv->mm;
> +	struct vm_area_struct *vma;
> +	struct mem_size_stats mss_sum;
> +
> +	memset(&mss_sum, 0, sizeof(mss_sum));
> +	down_read(&mm->mmap_sem);
> +	hold_task_mempolicy(priv);
        ^^^^^^^^^^^^^^^^^^^^^^^^^
why?

> +	for (vma = mm->mmap; vma != priv->tail_vma; vma = vma->vm_next) {

Hmm. the usage of ->tail_vma looks just wrong. I guess the code should
work because it is NULL but still.

> +		struct mem_size_stats mss;
> +		struct mm_walk smaps_walk = {
> +			.pmd_entry = smaps_pte_range,
> +			.mm = vma->vm_mm,
> +			.private = &mss,
> +		};
> +
> +		if (vma->vm_mm && !is_vm_hugetlb_page(vma)) {
> +			memset(&mss, 0, sizeof(mss));
> +			walk_page_vma(vma, &smaps_walk);
> +			add_smaps_sum(&mss, &mss_sum);
> +		}
> +	}

Why? I mean, why not walk_page_range() ? You do not need this for-each-vma
loop at all? At least if you change this patch to use the ONE() helper, and
everything else looks unneeded in this case.

Oleg.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v5 2/3] Documentation/filesystems: Fixed typo
  2016-09-05 20:14   ` robert.foss-ZGY8ohtN/8qB+jHODAdFcQ
@ 2016-09-07 23:22     ` Kees Cook
  -1 siblings, 0 replies; 43+ messages in thread
From: Kees Cook @ 2016-09-07 23:22 UTC (permalink / raw)
  To: robert.foss, Jonathan Corbet
  Cc: Andrew Morton, Vlastimil Babka, Hugh Dickins, Michal Hocko,
	Konstantin Khlebnikov, n-horiguchi, Kirill A. Shutemov,
	John Stultz, Minchan Kim, ross.zwisler, Jerome Marchand,
	Johannes Weiner, Oleg Nesterov, Al Viro, mguzik,
	Janis Danisevskis, Calvin Owens, Alexey Dobriyan,
	Eric W. Biederman, Sonny Rao, Seth Forshee, Djalal Harouni,
	linux-doc, LKML, Ben Zhang, Bryan Freed, Filipe Brandenburger,
	Jann Horn, Michal Hocko, Linux API, Jacek Anaszewski

On Mon, Sep 5, 2016 at 1:14 PM,  <robert.foss@collabora.com> wrote:
> From: Robert Foss <robert.foss@collabora.com>
>
> Fixed a -> an typo.
>
> Signed-off-by: Robert Foss <robert.foss@collabora.com>

Acked-by: Kees Cook <keescook@chromium.org>

This could be taken directly into the docs tree, I think -- no reason
to make it depend on the rest of the series.

-Kees

> ---
>  Documentation/filesystems/proc.txt | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/Documentation/filesystems/proc.txt b/Documentation/filesystems/proc.txt
> index 68080ad..fcc1ac0 100644
> --- a/Documentation/filesystems/proc.txt
> +++ b/Documentation/filesystems/proc.txt
> @@ -145,7 +145,7 @@ Table 1-1: Process specific entries in /proc
>                 symbol the task is blocked in - or "0" if not blocked.
>   pagemap       Page table
>   stack         Report full stack trace, enable via CONFIG_STACKTRACE
> - smaps         a extension based on maps, showing the memory consumption of
> + smaps         an extension based on maps, showing the memory consumption of
>                 each mapping and flags associated with it
>   numa_maps     an extension based on maps, showing the memory locality and
>                 binding policy as well as mem usage (in pages) of each mapping.
> --
> 2.7.4
>



-- 
Kees Cook
Nexus Security

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v5 2/3] Documentation/filesystems: Fixed typo
@ 2016-09-07 23:22     ` Kees Cook
  0 siblings, 0 replies; 43+ messages in thread
From: Kees Cook @ 2016-09-07 23:22 UTC (permalink / raw)
  To: robert.foss, Jonathan Corbet
  Cc: Andrew Morton, Vlastimil Babka, Hugh Dickins, Michal Hocko,
	Konstantin Khlebnikov, n-horiguchi, Kirill A. Shutemov,
	John Stultz, Minchan Kim, ross.zwisler, Jerome Marchand,
	Johannes Weiner, Oleg Nesterov, Al Viro, mguzik,
	Janis Danisevskis, Calvin Owens, Alexey Dobriyan,
	Eric W. Biederman, Sonny Rao, Seth Forshee

On Mon, Sep 5, 2016 at 1:14 PM,  <robert.foss@collabora.com> wrote:
> From: Robert Foss <robert.foss@collabora.com>
>
> Fixed a -> an typo.
>
> Signed-off-by: Robert Foss <robert.foss@collabora.com>

Acked-by: Kees Cook <keescook@chromium.org>

This could be taken directly into the docs tree, I think -- no reason
to make it depend on the rest of the series.

-Kees

> ---
>  Documentation/filesystems/proc.txt | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/Documentation/filesystems/proc.txt b/Documentation/filesystems/proc.txt
> index 68080ad..fcc1ac0 100644
> --- a/Documentation/filesystems/proc.txt
> +++ b/Documentation/filesystems/proc.txt
> @@ -145,7 +145,7 @@ Table 1-1: Process specific entries in /proc
>                 symbol the task is blocked in - or "0" if not blocked.
>   pagemap       Page table
>   stack         Report full stack trace, enable via CONFIG_STACKTRACE
> - smaps         a extension based on maps, showing the memory consumption of
> + smaps         an extension based on maps, showing the memory consumption of
>                 each mapping and flags associated with it
>   numa_maps     an extension based on maps, showing the memory locality and
>                 binding policy as well as mem usage (in pages) of each mapping.
> --
> 2.7.4
>



-- 
Kees Cook
Nexus Security

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v5 2/3] Documentation/filesystems: Fixed typo
@ 2016-09-08  0:22       ` Robert Foss
  0 siblings, 0 replies; 43+ messages in thread
From: Robert Foss @ 2016-09-08  0:22 UTC (permalink / raw)
  To: Kees Cook, Jonathan Corbet
  Cc: Andrew Morton, Vlastimil Babka, Hugh Dickins, Michal Hocko,
	Konstantin Khlebnikov, n-horiguchi, Kirill A. Shutemov,
	John Stultz, Minchan Kim, ross.zwisler, Jerome Marchand,
	Johannes Weiner, Oleg Nesterov, Al Viro, mguzik,
	Janis Danisevskis, Calvin Owens, Alexey Dobriyan,
	Eric W. Biederman, Sonny Rao, Seth Forshee, Djalal Harouni,
	linux-doc, LKML, Ben Zhang, Bryan Freed, Filipe Brandenburger,
	Jann Horn, Michal Hocko, Linux API, Jacek Anaszewski



On 2016-09-07 07:22 PM, Kees Cook wrote:
> On Mon, Sep 5, 2016 at 1:14 PM,  <robert.foss@collabora.com> wrote:
>> From: Robert Foss <robert.foss@collabora.com>
>>
>> Fixed a -> an typo.
>>
>> Signed-off-by: Robert Foss <robert.foss@collabora.com>
>
> Acked-by: Kees Cook <keescook@chromium.org>
>
> This could be taken directly into the docs tree, I think -- no reason
> to make it depend on the rest of the series.

Agreed. Would you like a separate submission for that?


Rob.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v5 2/3] Documentation/filesystems: Fixed typo
@ 2016-09-08  0:22       ` Robert Foss
  0 siblings, 0 replies; 43+ messages in thread
From: Robert Foss @ 2016-09-08  0:22 UTC (permalink / raw)
  To: Kees Cook, Jonathan Corbet
  Cc: Andrew Morton, Vlastimil Babka, Hugh Dickins, Michal Hocko,
	Konstantin Khlebnikov, n-horiguchi-PaJj6Psr51x8UrSeD/g0lQ,
	Kirill A. Shutemov, John Stultz, Minchan Kim,
	ross.zwisler-VuQAYsv1563Yd54FQh9/CA, Jerome Marchand,
	Johannes Weiner, Oleg Nesterov, Al Viro,
	mguzik-H+wXaHxf7aLQT0dZR+AlfA, Janis Danisevskis, Calvin Owens,
	Alexey Dobriyan, Eric W. Biederman, Sonny Rao, Seth Forshee



On 2016-09-07 07:22 PM, Kees Cook wrote:
> On Mon, Sep 5, 2016 at 1:14 PM,  <robert.foss-ZGY8ohtN/8qB+jHODAdFcQ@public.gmane.org> wrote:
>> From: Robert Foss <robert.foss-ZGY8ohtN/8qB+jHODAdFcQ@public.gmane.org>
>>
>> Fixed a -> an typo.
>>
>> Signed-off-by: Robert Foss <robert.foss-ZGY8ohtN/8qB+jHODAdFcQ@public.gmane.org>
>
> Acked-by: Kees Cook <keescook-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>
>
> This could be taken directly into the docs tree, I think -- no reason
> to make it depend on the rest of the series.

Agreed. Would you like a separate submission for that?


Rob.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v5 2/3] Documentation/filesystems: Fixed typo
  2016-09-08  0:22       ` Robert Foss
@ 2016-09-08  6:23         ` Jonathan Corbet
  -1 siblings, 0 replies; 43+ messages in thread
From: Jonathan Corbet @ 2016-09-08  6:23 UTC (permalink / raw)
  To: Robert Foss
  Cc: Kees Cook, Andrew Morton, Vlastimil Babka, Hugh Dickins,
	Michal Hocko, Konstantin Khlebnikov, n-horiguchi,
	Kirill A. Shutemov, John Stultz, Minchan Kim, ross.zwisler,
	Jerome Marchand, Johannes Weiner, Oleg Nesterov, Al Viro, mguzik,
	Janis Danisevskis, Calvin Owens, Alexey Dobriyan,
	Eric W. Biederman, Sonny Rao, Seth Forshee, Djalal Harouni,
	linux-doc, LKML, Ben Zhang, Bryan Freed, Filipe Brandenburger,
	Jann Horn, Michal Hocko, Linux API, Jacek Anaszewski

On Wed, 7 Sep 2016 20:22:00 -0400
Robert Foss <robert.foss@collabora.com> wrote:

> > This could be taken directly into the docs tree, I think -- no reason
> > to make it depend on the rest of the series.  
> 
> Agreed. Would you like a separate submission for that?

Please, I've lost track of the original...

Thanks,

jon

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v5 2/3] Documentation/filesystems: Fixed typo
@ 2016-09-08  6:23         ` Jonathan Corbet
  0 siblings, 0 replies; 43+ messages in thread
From: Jonathan Corbet @ 2016-09-08  6:23 UTC (permalink / raw)
  To: Robert Foss
  Cc: Kees Cook, Andrew Morton, Vlastimil Babka, Hugh Dickins,
	Michal Hocko, Konstantin Khlebnikov, n-horiguchi,
	Kirill A. Shutemov, John Stultz, Minchan Kim, ross.zwisler,
	Jerome Marchand, Johannes Weiner, Oleg Nesterov, Al Viro, mguzik,
	Janis Danisevskis, Calvin Owens, Alexey Dobriyan,
	Eric W. Biederman, Sonny Rao

On Wed, 7 Sep 2016 20:22:00 -0400
Robert Foss <robert.foss@collabora.com> wrote:

> > This could be taken directly into the docs tree, I think -- no reason
> > to make it depend on the rest of the series.  
> 
> Agreed. Would you like a separate submission for that?

Please, I've lost track of the original...

Thanks,

jon

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v5 0/3] mm, proc: Implement /proc/<pid>/totmaps
  2016-09-05 20:14 ` robert.foss
                   ` (3 preceding siblings ...)
  (?)
@ 2016-09-12 12:02 ` Michal Hocko
  2016-09-12 15:31     ` Sonny Rao
  -1 siblings, 1 reply; 43+ messages in thread
From: Michal Hocko @ 2016-09-12 12:02 UTC (permalink / raw)
  To: robert.foss
  Cc: corbet, akpm, vbabka, hughd, koct9i, n-horiguchi,
	kirill.shutemov, john.stultz, minchan, ross.zwisler, jmarchan,
	hannes, keescook, oleg, viro, mguzik, jdanis, calvinowens,
	adobriyan, ebiederm, sonnyrao, seth.forshee, tixxdz, linux-doc,
	linux-kernel, Ben Zhang, Bryan Freed, Filipe Brandenburger,
	Jann Horn, linux-api, Jacek Anaszewski

On Mon 05-09-16 16:14:06, robert.foss@collabora.com wrote:
> From: Robert Foss <robert.foss@collabora.com>
> 
> This series provides the /proc/PID/totmaps feature, which
> summarizes the information provided by /proc/PID/smaps for
> improved performance and usability reasons.
> 
> A use case is to speed up monitoring of memory consumption in
> environments where RSS isn't precise.
> 
> For example Chrome tends to many processes which have hundreds of VMAs
> with a substantial amount of shared memory, and the error of using
> RSS rather than PSS tends to be very large when looking at overall
> memory consumption.  PSS isn't kept as a single number that's exported
> like RSS, so to calculate PSS means having to parse a very large smaps
> file.
> 
> This process is slow and has to be repeated for many processes, and we
> found that the just act of doing the parsing was taking up a
> significant amount of CPU time, so this patch is an attempt to make
> that process cheaper.

I still maintain my concerns about a single pss value. It might work in
a very specific situations where the consumer knows what is shared but
other than that the value can be more misleading than helpful. So a NACK
from me until I am shown that this is usable in general and still
helpful.

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v5 0/3] mm, proc: Implement /proc/<pid>/totmaps
@ 2016-09-12 15:31     ` Sonny Rao
  0 siblings, 0 replies; 43+ messages in thread
From: Sonny Rao @ 2016-09-12 15:31 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Robert Foss, Jonathan Corbet, Andrew Morton, Vlastimil Babka,
	Hugh Dickins, Konstantin Khlebnikov, Naoya Horiguchi,
	Kirill A. Shutemov, John Stultz, Minchan Kim, ross.zwisler,
	jmarchan, Johannes Weiner, Kees Cook, oleg, Al Viro,
	Mateusz Guzik, Janis Danisevskis, calvinowens, Alexey Dobriyan,
	ebiederm, seth.forshee, tixxdz, linux-doc, linux-kernel,
	Ben Zhang, Bryan Freed, Filipe Brandenburger, Jann Horn,
	linux-api, Jacek Anaszewski

On Mon, Sep 12, 2016 at 5:02 AM, Michal Hocko <mhocko@kernel.org> wrote:
> On Mon 05-09-16 16:14:06, robert.foss@collabora.com wrote:
>> From: Robert Foss <robert.foss@collabora.com>
>>
>> This series provides the /proc/PID/totmaps feature, which
>> summarizes the information provided by /proc/PID/smaps for
>> improved performance and usability reasons.
>>
>> A use case is to speed up monitoring of memory consumption in
>> environments where RSS isn't precise.
>>
>> For example Chrome tends to many processes which have hundreds of VMAs
>> with a substantial amount of shared memory, and the error of using
>> RSS rather than PSS tends to be very large when looking at overall
>> memory consumption.  PSS isn't kept as a single number that's exported
>> like RSS, so to calculate PSS means having to parse a very large smaps
>> file.
>>
>> This process is slow and has to be repeated for many processes, and we
>> found that the just act of doing the parsing was taking up a
>> significant amount of CPU time, so this patch is an attempt to make
>> that process cheaper.
>
> I still maintain my concerns about a single pss value. It might work in
> a very specific situations where the consumer knows what is shared but
> other than that the value can be more misleading than helpful. So a NACK
> from me until I am shown that this is usable in general and still
> helpful.

I know you think Pss isn't useful in general (though I'll point out
two other independent people said they found it useful) but how about
the other fields like Swap, Private_Dirty and Private_Shared?

If we removed Pss would you still NACK it?

>
> --
> Michal Hocko
> SUSE Labs

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v5 0/3] mm, proc: Implement /proc/<pid>/totmaps
@ 2016-09-12 15:31     ` Sonny Rao
  0 siblings, 0 replies; 43+ messages in thread
From: Sonny Rao @ 2016-09-12 15:31 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Robert Foss, Jonathan Corbet, Andrew Morton, Vlastimil Babka,
	Hugh Dickins, Konstantin Khlebnikov, Naoya Horiguchi,
	Kirill A. Shutemov, John Stultz, Minchan Kim,
	ross.zwisler-VuQAYsv1563Yd54FQh9/CA,
	jmarchan-H+wXaHxf7aLQT0dZR+AlfA, Johannes Weiner, Kees Cook,
	oleg-H+wXaHxf7aLQT0dZR+AlfA, Al Viro, Mateusz Guzik,
	Janis Danisevskis, calvinowens-b10kYP2dOMg, Alexey Dobriyan,
	ebiederm-aS9lmoZGLiVWk0Htik3J/w,
	seth.forshee-Z7WLFzj8eWMS+FvcfC7Uqw, tixxdz

On Mon, Sep 12, 2016 at 5:02 AM, Michal Hocko <mhocko-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> wrote:
> On Mon 05-09-16 16:14:06, robert.foss-ZGY8ohtN/8qB+jHODAdFcQ@public.gmane.org wrote:
>> From: Robert Foss <robert.foss-ZGY8ohtN/8qB+jHODAdFcQ@public.gmane.org>
>>
>> This series provides the /proc/PID/totmaps feature, which
>> summarizes the information provided by /proc/PID/smaps for
>> improved performance and usability reasons.
>>
>> A use case is to speed up monitoring of memory consumption in
>> environments where RSS isn't precise.
>>
>> For example Chrome tends to many processes which have hundreds of VMAs
>> with a substantial amount of shared memory, and the error of using
>> RSS rather than PSS tends to be very large when looking at overall
>> memory consumption.  PSS isn't kept as a single number that's exported
>> like RSS, so to calculate PSS means having to parse a very large smaps
>> file.
>>
>> This process is slow and has to be repeated for many processes, and we
>> found that the just act of doing the parsing was taking up a
>> significant amount of CPU time, so this patch is an attempt to make
>> that process cheaper.
>
> I still maintain my concerns about a single pss value. It might work in
> a very specific situations where the consumer knows what is shared but
> other than that the value can be more misleading than helpful. So a NACK
> from me until I am shown that this is usable in general and still
> helpful.

I know you think Pss isn't useful in general (though I'll point out
two other independent people said they found it useful) but how about
the other fields like Swap, Private_Dirty and Private_Shared?

If we removed Pss would you still NACK it?

>
> --
> Michal Hocko
> SUSE Labs

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v5 0/3] mm, proc: Implement /proc/<pid>/totmaps
  2016-09-12 15:31     ` Sonny Rao
@ 2016-09-12 17:15       ` Michal Hocko
  -1 siblings, 0 replies; 43+ messages in thread
From: Michal Hocko @ 2016-09-12 17:15 UTC (permalink / raw)
  To: Sonny Rao
  Cc: Robert Foss, Jonathan Corbet, Andrew Morton, Vlastimil Babka,
	Hugh Dickins, Konstantin Khlebnikov, Naoya Horiguchi,
	Kirill A. Shutemov, John Stultz, Minchan Kim, ross.zwisler,
	jmarchan, Johannes Weiner, Kees Cook, oleg, Al Viro,
	Mateusz Guzik, Janis Danisevskis, calvinowens, Alexey Dobriyan,
	ebiederm, seth.forshee, tixxdz, linux-doc, linux-kernel,
	Ben Zhang, Bryan Freed, Filipe Brandenburger, Jann Horn,
	linux-api, Jacek Anaszewski

On Mon 12-09-16 08:31:36, Sonny Rao wrote:
> On Mon, Sep 12, 2016 at 5:02 AM, Michal Hocko <mhocko@kernel.org> wrote:
> > On Mon 05-09-16 16:14:06, robert.foss@collabora.com wrote:
> >> From: Robert Foss <robert.foss@collabora.com>
> >>
> >> This series provides the /proc/PID/totmaps feature, which
> >> summarizes the information provided by /proc/PID/smaps for
> >> improved performance and usability reasons.
> >>
> >> A use case is to speed up monitoring of memory consumption in
> >> environments where RSS isn't precise.
> >>
> >> For example Chrome tends to many processes which have hundreds of VMAs
> >> with a substantial amount of shared memory, and the error of using
> >> RSS rather than PSS tends to be very large when looking at overall
> >> memory consumption.  PSS isn't kept as a single number that's exported
> >> like RSS, so to calculate PSS means having to parse a very large smaps
> >> file.
> >>
> >> This process is slow and has to be repeated for many processes, and we
> >> found that the just act of doing the parsing was taking up a
> >> significant amount of CPU time, so this patch is an attempt to make
> >> that process cheaper.
> >
> > I still maintain my concerns about a single pss value. It might work in
> > a very specific situations where the consumer knows what is shared but
> > other than that the value can be more misleading than helpful. So a NACK
> > from me until I am shown that this is usable in general and still
> > helpful.
> 
> I know you think Pss isn't useful in general (though I'll point out
> two other independent people said they found it useful)

sure, and one of them admitted that the value is useful because they
_know_ the resource. The other was quite vague for me to understand
all the details. Please, try to understand that once you provide a user
API then it will carved in stone. If the interface is poor and ambigous
it will bite us later. One very specific usecase doesn't justify
something that might be really misleading for 90% of cases.

> but how about the other fields like Swap, Private_Dirty and
> Private_Shared?

Private_Shared can be pretty confusing as well without the whole context
as well see my other emails in the original thread (just to remind
shmem/tmpfs makes all this really confusing).

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v5 0/3] mm, proc: Implement /proc/<pid>/totmaps
@ 2016-09-12 17:15       ` Michal Hocko
  0 siblings, 0 replies; 43+ messages in thread
From: Michal Hocko @ 2016-09-12 17:15 UTC (permalink / raw)
  To: Sonny Rao
  Cc: Robert Foss, Jonathan Corbet, Andrew Morton, Vlastimil Babka,
	Hugh Dickins, Konstantin Khlebnikov, Naoya Horiguchi,
	Kirill A. Shutemov, John Stultz, Minchan Kim, ross.zwisler,
	jmarchan, Johannes Weiner, Kees Cook, oleg, Al Viro,
	Mateusz Guzik, Janis Danisevskis, calvinowens, Alexey Dobriyan,
	ebiederm, seth.forshee, tixxdz

On Mon 12-09-16 08:31:36, Sonny Rao wrote:
> On Mon, Sep 12, 2016 at 5:02 AM, Michal Hocko <mhocko@kernel.org> wrote:
> > On Mon 05-09-16 16:14:06, robert.foss@collabora.com wrote:
> >> From: Robert Foss <robert.foss@collabora.com>
> >>
> >> This series provides the /proc/PID/totmaps feature, which
> >> summarizes the information provided by /proc/PID/smaps for
> >> improved performance and usability reasons.
> >>
> >> A use case is to speed up monitoring of memory consumption in
> >> environments where RSS isn't precise.
> >>
> >> For example Chrome tends to many processes which have hundreds of VMAs
> >> with a substantial amount of shared memory, and the error of using
> >> RSS rather than PSS tends to be very large when looking at overall
> >> memory consumption.  PSS isn't kept as a single number that's exported
> >> like RSS, so to calculate PSS means having to parse a very large smaps
> >> file.
> >>
> >> This process is slow and has to be repeated for many processes, and we
> >> found that the just act of doing the parsing was taking up a
> >> significant amount of CPU time, so this patch is an attempt to make
> >> that process cheaper.
> >
> > I still maintain my concerns about a single pss value. It might work in
> > a very specific situations where the consumer knows what is shared but
> > other than that the value can be more misleading than helpful. So a NACK
> > from me until I am shown that this is usable in general and still
> > helpful.
> 
> I know you think Pss isn't useful in general (though I'll point out
> two other independent people said they found it useful)

sure, and one of them admitted that the value is useful because they
_know_ the resource. The other was quite vague for me to understand
all the details. Please, try to understand that once you provide a user
API then it will carved in stone. If the interface is poor and ambigous
it will bite us later. One very specific usecase doesn't justify
something that might be really misleading for 90% of cases.

> but how about the other fields like Swap, Private_Dirty and
> Private_Shared?

Private_Shared can be pretty confusing as well without the whole context
as well see my other emails in the original thread (just to remind
shmem/tmpfs makes all this really confusing).

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v5 0/3] mm, proc: Implement /proc/<pid>/totmaps
@ 2016-09-12 17:28         ` Sonny Rao
  0 siblings, 0 replies; 43+ messages in thread
From: Sonny Rao @ 2016-09-12 17:28 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Robert Foss, Jonathan Corbet, Andrew Morton, Vlastimil Babka,
	Hugh Dickins, Konstantin Khlebnikov, Naoya Horiguchi,
	Kirill A. Shutemov, John Stultz, Minchan Kim, ross.zwisler,
	jmarchan, Johannes Weiner, Kees Cook, oleg, Al Viro,
	Mateusz Guzik, Janis Danisevskis, calvinowens, Alexey Dobriyan,
	ebiederm, seth.forshee, tixxdz, linux-doc, linux-kernel,
	Ben Zhang, Bryan Freed, Filipe Brandenburger, Jann Horn,
	linux-api, Jacek Anaszewski

On Mon, Sep 12, 2016 at 10:15 AM, Michal Hocko <mhocko@kernel.org> wrote:
> On Mon 12-09-16 08:31:36, Sonny Rao wrote:
>> On Mon, Sep 12, 2016 at 5:02 AM, Michal Hocko <mhocko@kernel.org> wrote:
>> > On Mon 05-09-16 16:14:06, robert.foss@collabora.com wrote:
>> >> From: Robert Foss <robert.foss@collabora.com>
>> >>
>> >> This series provides the /proc/PID/totmaps feature, which
>> >> summarizes the information provided by /proc/PID/smaps for
>> >> improved performance and usability reasons.
>> >>
>> >> A use case is to speed up monitoring of memory consumption in
>> >> environments where RSS isn't precise.
>> >>
>> >> For example Chrome tends to many processes which have hundreds of VMAs
>> >> with a substantial amount of shared memory, and the error of using
>> >> RSS rather than PSS tends to be very large when looking at overall
>> >> memory consumption.  PSS isn't kept as a single number that's exported
>> >> like RSS, so to calculate PSS means having to parse a very large smaps
>> >> file.
>> >>
>> >> This process is slow and has to be repeated for many processes, and we
>> >> found that the just act of doing the parsing was taking up a
>> >> significant amount of CPU time, so this patch is an attempt to make
>> >> that process cheaper.
>> >
>> > I still maintain my concerns about a single pss value. It might work in
>> > a very specific situations where the consumer knows what is shared but
>> > other than that the value can be more misleading than helpful. So a NACK
>> > from me until I am shown that this is usable in general and still
>> > helpful.
>>
>> I know you think Pss isn't useful in general (though I'll point out
>> two other independent people said they found it useful)
>
> sure, and one of them admitted that the value is useful because they
> _know_ the resource. The other was quite vague for me to understand
> all the details. Please, try to understand that once you provide a user
> API then it will carved in stone. If the interface is poor and ambigous
> it will bite us later. One very specific usecase doesn't justify
> something that might be really misleading for 90% of cases.
>
>> but how about the other fields like Swap, Private_Dirty and
>> Private_Shared?
>
> Private_Shared can be pretty confusing as well without the whole context
> as well see my other emails in the original thread (just to remind
> shmem/tmpfs makes all this really confusing).

But this is exactly the issue -- RSS is can be just as confusing if
you don't know something about the application.  I think the issue is
how common that situation is, and you seem to believe that it's so
uncommon that it's actually better to keep the information more
difficult to get for those of us who know something about our systems.

That's fine, I guess we just have to disagree here, thanks for look at this.

>
> --
> Michal Hocko
> SUSE Labs

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v5 0/3] mm, proc: Implement /proc/<pid>/totmaps
@ 2016-09-12 17:28         ` Sonny Rao
  0 siblings, 0 replies; 43+ messages in thread
From: Sonny Rao @ 2016-09-12 17:28 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Robert Foss, Jonathan Corbet, Andrew Morton, Vlastimil Babka,
	Hugh Dickins, Konstantin Khlebnikov, Naoya Horiguchi,
	Kirill A. Shutemov, John Stultz, Minchan Kim,
	ross.zwisler-VuQAYsv1563Yd54FQh9/CA,
	jmarchan-H+wXaHxf7aLQT0dZR+AlfA, Johannes Weiner, Kees Cook,
	oleg-H+wXaHxf7aLQT0dZR+AlfA, Al Viro, Mateusz Guzik,
	Janis Danisevskis, calvinowens-b10kYP2dOMg, Alexey Dobriyan,
	ebiederm-aS9lmoZGLiVWk0Htik3J/w,
	seth.forshee-Z7WLFzj8eWMS+FvcfC7Uqw, tixxdz

On Mon, Sep 12, 2016 at 10:15 AM, Michal Hocko <mhocko-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> wrote:
> On Mon 12-09-16 08:31:36, Sonny Rao wrote:
>> On Mon, Sep 12, 2016 at 5:02 AM, Michal Hocko <mhocko-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> wrote:
>> > On Mon 05-09-16 16:14:06, robert.foss-ZGY8ohtN/8qB+jHODAdFcQ@public.gmane.org wrote:
>> >> From: Robert Foss <robert.foss-ZGY8ohtN/8qB+jHODAdFcQ@public.gmane.org>
>> >>
>> >> This series provides the /proc/PID/totmaps feature, which
>> >> summarizes the information provided by /proc/PID/smaps for
>> >> improved performance and usability reasons.
>> >>
>> >> A use case is to speed up monitoring of memory consumption in
>> >> environments where RSS isn't precise.
>> >>
>> >> For example Chrome tends to many processes which have hundreds of VMAs
>> >> with a substantial amount of shared memory, and the error of using
>> >> RSS rather than PSS tends to be very large when looking at overall
>> >> memory consumption.  PSS isn't kept as a single number that's exported
>> >> like RSS, so to calculate PSS means having to parse a very large smaps
>> >> file.
>> >>
>> >> This process is slow and has to be repeated for many processes, and we
>> >> found that the just act of doing the parsing was taking up a
>> >> significant amount of CPU time, so this patch is an attempt to make
>> >> that process cheaper.
>> >
>> > I still maintain my concerns about a single pss value. It might work in
>> > a very specific situations where the consumer knows what is shared but
>> > other than that the value can be more misleading than helpful. So a NACK
>> > from me until I am shown that this is usable in general and still
>> > helpful.
>>
>> I know you think Pss isn't useful in general (though I'll point out
>> two other independent people said they found it useful)
>
> sure, and one of them admitted that the value is useful because they
> _know_ the resource. The other was quite vague for me to understand
> all the details. Please, try to understand that once you provide a user
> API then it will carved in stone. If the interface is poor and ambigous
> it will bite us later. One very specific usecase doesn't justify
> something that might be really misleading for 90% of cases.
>
>> but how about the other fields like Swap, Private_Dirty and
>> Private_Shared?
>
> Private_Shared can be pretty confusing as well without the whole context
> as well see my other emails in the original thread (just to remind
> shmem/tmpfs makes all this really confusing).

But this is exactly the issue -- RSS is can be just as confusing if
you don't know something about the application.  I think the issue is
how common that situation is, and you seem to believe that it's so
uncommon that it's actually better to keep the information more
difficult to get for those of us who know something about our systems.

That's fine, I guess we just have to disagree here, thanks for look at this.

>
> --
> Michal Hocko
> SUSE Labs

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v5 1/3] mm, proc: Implement /proc/<pid>/totmaps
  2016-09-07 12:58     ` Oleg Nesterov
@ 2016-09-12 22:12       ` Robert Foss
  -1 siblings, 0 replies; 43+ messages in thread
From: Robert Foss @ 2016-09-12 22:12 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: corbet, akpm, vbabka, hughd, mhocko, koct9i, n-horiguchi,
	kirill.shutemov, john.stultz, minchan, ross.zwisler, jmarchan,
	hannes, keescook, viro, mguzik, jdanis, calvinowens, adobriyan,
	ebiederm, sonnyrao, seth.forshee, tixxdz, linux-doc,
	linux-kernel, Ben Zhang, Bryan Freed, Filipe Brandenburger,
	Jann Horn, Michal Hocko, linux-api, Jacek Anaszewski

Hey Oleg!

Thanks for the feedback, I'll keep it in mind, but currently it looks 
like the patch is on ice for non-implementation related reasons.


Rob.

>>
>> @@ -2854,6 +2854,7 @@ static const struct pid_entry tgid_base_stuff[] = {
>>  	REG("clear_refs", S_IWUSR, proc_clear_refs_operations),
>>  	REG("smaps",      S_IRUGO, proc_pid_smaps_operations),
>>  	REG("pagemap",    S_IRUSR, proc_pagemap_operations),
>> +	REG("totmaps",    S_IRUGO, proc_totmaps_operations),
>
> I must have missed something, but I fail to understand why this patch
> is so complicated.
>
> Just use ONE("totmaps", S_IRUGO, proc_totmaps_operations) ?
>
>> +static int totmaps_proc_show(struct seq_file *m, void *data)
>> +{
>> +	struct proc_maps_private *priv = m->private;
>> +	struct mm_struct *mm = priv->mm;
>> +	struct vm_area_struct *vma;
>> +	struct mem_size_stats mss_sum;
>> +
>> +	memset(&mss_sum, 0, sizeof(mss_sum));
>> +	down_read(&mm->mmap_sem);
>> +	hold_task_mempolicy(priv);
>         ^^^^^^^^^^^^^^^^^^^^^^^^^
> why?
>
>> +	for (vma = mm->mmap; vma != priv->tail_vma; vma = vma->vm_next) {
>
> Hmm. the usage of ->tail_vma looks just wrong. I guess the code should
> work because it is NULL but still.
>
>> +		struct mem_size_stats mss;
>> +		struct mm_walk smaps_walk = {
>> +			.pmd_entry = smaps_pte_range,
>> +			.mm = vma->vm_mm,
>> +			.private = &mss,
>> +		};
>> +
>> +		if (vma->vm_mm && !is_vm_hugetlb_page(vma)) {
>> +			memset(&mss, 0, sizeof(mss));
>> +			walk_page_vma(vma, &smaps_walk);
>> +			add_smaps_sum(&mss, &mss_sum);
>> +		}
>> +	}
>
> Why? I mean, why not walk_page_range() ? You do not need this for-each-vma
> loop at all? At least if you change this patch to use the ONE() helper, and
> everything else looks unneeded in this case.
>
> Oleg.
>

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v5 1/3] mm, proc: Implement /proc/<pid>/totmaps
@ 2016-09-12 22:12       ` Robert Foss
  0 siblings, 0 replies; 43+ messages in thread
From: Robert Foss @ 2016-09-12 22:12 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: corbet, akpm, vbabka, hughd, mhocko, koct9i, n-horiguchi,
	kirill.shutemov, john.stultz, minchan, ross.zwisler, jmarchan,
	hannes, keescook, viro, mguzik, jdanis, calvinowens, adobriyan,
	ebiederm, sonnyrao, seth.forshee, tixxdz, linux-doc,
	linux-kernel, Ben Zhang, Bryan Freed, Filipe Brandenburger,
	Jann Horn, Michal Hocko, linux-api, Jacek

Hey Oleg!

Thanks for the feedback, I'll keep it in mind, but currently it looks 
like the patch is on ice for non-implementation related reasons.


Rob.

>>
>> @@ -2854,6 +2854,7 @@ static const struct pid_entry tgid_base_stuff[] = {
>>  	REG("clear_refs", S_IWUSR, proc_clear_refs_operations),
>>  	REG("smaps",      S_IRUGO, proc_pid_smaps_operations),
>>  	REG("pagemap",    S_IRUSR, proc_pagemap_operations),
>> +	REG("totmaps",    S_IRUGO, proc_totmaps_operations),
>
> I must have missed something, but I fail to understand why this patch
> is so complicated.
>
> Just use ONE("totmaps", S_IRUGO, proc_totmaps_operations) ?
>
>> +static int totmaps_proc_show(struct seq_file *m, void *data)
>> +{
>> +	struct proc_maps_private *priv = m->private;
>> +	struct mm_struct *mm = priv->mm;
>> +	struct vm_area_struct *vma;
>> +	struct mem_size_stats mss_sum;
>> +
>> +	memset(&mss_sum, 0, sizeof(mss_sum));
>> +	down_read(&mm->mmap_sem);
>> +	hold_task_mempolicy(priv);
>         ^^^^^^^^^^^^^^^^^^^^^^^^^
> why?
>
>> +	for (vma = mm->mmap; vma != priv->tail_vma; vma = vma->vm_next) {
>
> Hmm. the usage of ->tail_vma looks just wrong. I guess the code should
> work because it is NULL but still.
>
>> +		struct mem_size_stats mss;
>> +		struct mm_walk smaps_walk = {
>> +			.pmd_entry = smaps_pte_range,
>> +			.mm = vma->vm_mm,
>> +			.private = &mss,
>> +		};
>> +
>> +		if (vma->vm_mm && !is_vm_hugetlb_page(vma)) {
>> +			memset(&mss, 0, sizeof(mss));
>> +			walk_page_vma(vma, &smaps_walk);
>> +			add_smaps_sum(&mss, &mss_sum);
>> +		}
>> +	}
>
> Why? I mean, why not walk_page_range() ? You do not need this for-each-vma
> loop at all? At least if you change this patch to use the ONE() helper, and
> everything else looks unneeded in this case.
>
> Oleg.
>

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v5 0/3] mm, proc: Implement /proc/<pid>/totmaps
  2016-09-12 17:28         ` Sonny Rao
@ 2016-09-13  7:12           ` Michal Hocko
  -1 siblings, 0 replies; 43+ messages in thread
From: Michal Hocko @ 2016-09-13  7:12 UTC (permalink / raw)
  To: Sonny Rao
  Cc: Robert Foss, Jonathan Corbet, Andrew Morton, Vlastimil Babka,
	Hugh Dickins, Konstantin Khlebnikov, Naoya Horiguchi,
	Kirill A. Shutemov, John Stultz, Minchan Kim, ross.zwisler,
	jmarchan, Johannes Weiner, Kees Cook, oleg, Al Viro,
	Mateusz Guzik, Janis Danisevskis, calvinowens, Alexey Dobriyan,
	ebiederm, seth.forshee, tixxdz, linux-doc, linux-kernel,
	Ben Zhang, Bryan Freed, Filipe Brandenburger, Jann Horn,
	linux-api, Jacek Anaszewski

On Mon 12-09-16 10:28:53, Sonny Rao wrote:
> On Mon, Sep 12, 2016 at 10:15 AM, Michal Hocko <mhocko@kernel.org> wrote:
> > On Mon 12-09-16 08:31:36, Sonny Rao wrote:
[...]
> >> but how about the other fields like Swap, Private_Dirty and
> >> Private_Shared?
> >
> > Private_Shared can be pretty confusing as well without the whole context
> > as well see my other emails in the original thread (just to remind
> > shmem/tmpfs makes all this really confusing).
> 
> But this is exactly the issue -- RSS is can be just as confusing if
> you don't know something about the application.

I agree that rss can be confusing but we will not make the situation any
better if we add yet another confusing metric.

> I think the issue is
> how common that situation is, and you seem to believe that it's so
> uncommon that it's actually better to keep the information more
> difficult to get for those of us who know something about our systems.
> 
> That's fine, I guess we just have to disagree here, thanks for look at this.

I think you should just step back and think more about what exactly
you expect from the counter(s). I believe what you want is an
estimate of a freeable memory when the particular process dies or is
killed. That would mean resident single mapped private anonymous memory
+ unlinked single mapped shareable mappings + single mapped swapped out
memory. Maybe I've missed something but it should be something along
those lines. Definitely something that the current smaps infrastructure
doesn't give you, though.
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v5 0/3] mm, proc: Implement /proc/<pid>/totmaps
@ 2016-09-13  7:12           ` Michal Hocko
  0 siblings, 0 replies; 43+ messages in thread
From: Michal Hocko @ 2016-09-13  7:12 UTC (permalink / raw)
  To: Sonny Rao
  Cc: Robert Foss, Jonathan Corbet, Andrew Morton, Vlastimil Babka,
	Hugh Dickins, Konstantin Khlebnikov, Naoya Horiguchi,
	Kirill A. Shutemov, John Stultz, Minchan Kim, ross.zwisler,
	jmarchan, Johannes Weiner, Kees Cook, oleg, Al Viro,
	Mateusz Guzik, Janis Danisevskis, calvinowens, Alexey Dobriyan,
	ebiederm, seth.forshee, tixxdz

On Mon 12-09-16 10:28:53, Sonny Rao wrote:
> On Mon, Sep 12, 2016 at 10:15 AM, Michal Hocko <mhocko@kernel.org> wrote:
> > On Mon 12-09-16 08:31:36, Sonny Rao wrote:
[...]
> >> but how about the other fields like Swap, Private_Dirty and
> >> Private_Shared?
> >
> > Private_Shared can be pretty confusing as well without the whole context
> > as well see my other emails in the original thread (just to remind
> > shmem/tmpfs makes all this really confusing).
> 
> But this is exactly the issue -- RSS is can be just as confusing if
> you don't know something about the application.

I agree that rss can be confusing but we will not make the situation any
better if we add yet another confusing metric.

> I think the issue is
> how common that situation is, and you seem to believe that it's so
> uncommon that it's actually better to keep the information more
> difficult to get for those of us who know something about our systems.
> 
> That's fine, I guess we just have to disagree here, thanks for look at this.

I think you should just step back and think more about what exactly
you expect from the counter(s). I believe what you want is an
estimate of a freeable memory when the particular process dies or is
killed. That would mean resident single mapped private anonymous memory
+ unlinked single mapped shareable mappings + single mapped swapped out
memory. Maybe I've missed something but it should be something along
those lines. Definitely something that the current smaps infrastructure
doesn't give you, though.
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v5 0/3] mm, proc: Implement /proc/<pid>/totmaps
  2016-09-13  7:12           ` Michal Hocko
@ 2016-09-13 20:27             ` Sonny Rao
  -1 siblings, 0 replies; 43+ messages in thread
From: Sonny Rao @ 2016-09-13 20:27 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Robert Foss, Jonathan Corbet, Andrew Morton, Vlastimil Babka,
	Hugh Dickins, Konstantin Khlebnikov, Naoya Horiguchi,
	Kirill A. Shutemov, John Stultz, Minchan Kim, ross.zwisler,
	jmarchan, Johannes Weiner, Kees Cook, oleg, Al Viro,
	Mateusz Guzik, Janis Danisevskis, calvinowens, Alexey Dobriyan,
	ebiederm, Seth Forshee, Djalal Harouni, linux-doc, linux-kernel,
	Ben Zhang, Bryan Freed, Filipe Brandenburger, Jann Horn,
	linux-api, Jacek Anaszewski

On Tue, Sep 13, 2016 at 12:12 AM, Michal Hocko <mhocko@kernel.org> wrote:
> On Mon 12-09-16 10:28:53, Sonny Rao wrote:
>> On Mon, Sep 12, 2016 at 10:15 AM, Michal Hocko <mhocko@kernel.org> wrote:
>> > On Mon 12-09-16 08:31:36, Sonny Rao wrote:
> [...]
>> >> but how about the other fields like Swap, Private_Dirty and
>> >> Private_Shared?
>> >
>> > Private_Shared can be pretty confusing as well without the whole context
>> > as well see my other emails in the original thread (just to remind
>> > shmem/tmpfs makes all this really confusing).
>>
>> But this is exactly the issue -- RSS is can be just as confusing if
>> you don't know something about the application.
>
> I agree that rss can be confusing but we will not make the situation any
> better if we add yet another confusing metric.
>
>> I think the issue is
>> how common that situation is, and you seem to believe that it's so
>> uncommon that it's actually better to keep the information more
>> difficult to get for those of us who know something about our systems.
>>
>> That's fine, I guess we just have to disagree here, thanks for look at this.
>
> I think you should just step back and think more about what exactly
> you expect from the counter(s). I believe what you want is an
> estimate of a freeable memory when the particular process dies or is
> killed. That would mean resident single mapped private anonymous memory
> + unlinked single mapped shareable mappings + single mapped swapped out
> memory. Maybe I've missed something but it should be something along
> those lines. Definitely something that the current smaps infrastructure
> doesn't give you, though.

Yes your description of what we want is pretty good.  Having a
reasonable lower bound on the estimate is fine, though we probably
want to break out swapped out memory separately.  Given that smaps
doesn't provide this in a straightforward way, what do you think is
the right way to provide this information?

> --
> Michal Hocko
> SUSE Labs

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v5 0/3] mm, proc: Implement /proc/<pid>/totmaps
@ 2016-09-13 20:27             ` Sonny Rao
  0 siblings, 0 replies; 43+ messages in thread
From: Sonny Rao @ 2016-09-13 20:27 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Robert Foss, Jonathan Corbet, Andrew Morton, Vlastimil Babka,
	Hugh Dickins, Konstantin Khlebnikov, Naoya Horiguchi,
	Kirill A. Shutemov, John Stultz, Minchan Kim, ross.zwisler,
	jmarchan, Johannes Weiner, Kees Cook, oleg, Al Viro,
	Mateusz Guzik, Janis Danisevskis, calvinowens, Alexey Dobriyan,
	ebiederm, Seth Forshee

On Tue, Sep 13, 2016 at 12:12 AM, Michal Hocko <mhocko@kernel.org> wrote:
> On Mon 12-09-16 10:28:53, Sonny Rao wrote:
>> On Mon, Sep 12, 2016 at 10:15 AM, Michal Hocko <mhocko@kernel.org> wrote:
>> > On Mon 12-09-16 08:31:36, Sonny Rao wrote:
> [...]
>> >> but how about the other fields like Swap, Private_Dirty and
>> >> Private_Shared?
>> >
>> > Private_Shared can be pretty confusing as well without the whole context
>> > as well see my other emails in the original thread (just to remind
>> > shmem/tmpfs makes all this really confusing).
>>
>> But this is exactly the issue -- RSS is can be just as confusing if
>> you don't know something about the application.
>
> I agree that rss can be confusing but we will not make the situation any
> better if we add yet another confusing metric.
>
>> I think the issue is
>> how common that situation is, and you seem to believe that it's so
>> uncommon that it's actually better to keep the information more
>> difficult to get for those of us who know something about our systems.
>>
>> That's fine, I guess we just have to disagree here, thanks for look at this.
>
> I think you should just step back and think more about what exactly
> you expect from the counter(s). I believe what you want is an
> estimate of a freeable memory when the particular process dies or is
> killed. That would mean resident single mapped private anonymous memory
> + unlinked single mapped shareable mappings + single mapped swapped out
> memory. Maybe I've missed something but it should be something along
> those lines. Definitely something that the current smaps infrastructure
> doesn't give you, though.

Yes your description of what we want is pretty good.  Having a
reasonable lower bound on the estimate is fine, though we probably
want to break out swapped out memory separately.  Given that smaps
doesn't provide this in a straightforward way, what do you think is
the right way to provide this information?

> --
> Michal Hocko
> SUSE Labs

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v5 0/3] mm, proc: Implement /proc/<pid>/totmaps
  2016-09-13 20:27             ` Sonny Rao
@ 2016-09-14  9:12               ` Michal Hocko
  -1 siblings, 0 replies; 43+ messages in thread
From: Michal Hocko @ 2016-09-14  9:12 UTC (permalink / raw)
  To: Sonny Rao
  Cc: Robert Foss, Jonathan Corbet, Andrew Morton, Vlastimil Babka,
	Hugh Dickins, Konstantin Khlebnikov, Naoya Horiguchi,
	Kirill A. Shutemov, John Stultz, Minchan Kim, ross.zwisler,
	jmarchan, Johannes Weiner, Kees Cook, oleg, Al Viro,
	Mateusz Guzik, Janis Danisevskis, calvinowens, Alexey Dobriyan,
	ebiederm, Seth Forshee, Djalal Harouni, linux-doc, linux-kernel,
	Ben Zhang, Bryan Freed, Filipe Brandenburger, Jann Horn,
	linux-api, Jacek Anaszewski

On Tue 13-09-16 13:27:39, Sonny Rao wrote:
> On Tue, Sep 13, 2016 at 12:12 AM, Michal Hocko <mhocko@kernel.org> wrote:
> > On Mon 12-09-16 10:28:53, Sonny Rao wrote:
> >> On Mon, Sep 12, 2016 at 10:15 AM, Michal Hocko <mhocko@kernel.org> wrote:
> >> > On Mon 12-09-16 08:31:36, Sonny Rao wrote:
> > [...]
> >> >> but how about the other fields like Swap, Private_Dirty and
> >> >> Private_Shared?
> >> >
> >> > Private_Shared can be pretty confusing as well without the whole context
> >> > as well see my other emails in the original thread (just to remind
> >> > shmem/tmpfs makes all this really confusing).
> >>
> >> But this is exactly the issue -- RSS is can be just as confusing if
> >> you don't know something about the application.
> >
> > I agree that rss can be confusing but we will not make the situation any
> > better if we add yet another confusing metric.
> >
> >> I think the issue is
> >> how common that situation is, and you seem to believe that it's so
> >> uncommon that it's actually better to keep the information more
> >> difficult to get for those of us who know something about our systems.
> >>
> >> That's fine, I guess we just have to disagree here, thanks for look at this.
> >
> > I think you should just step back and think more about what exactly
> > you expect from the counter(s). I believe what you want is an
> > estimate of a freeable memory when the particular process dies or is
> > killed. That would mean resident single mapped private anonymous memory
> > + unlinked single mapped shareable mappings + single mapped swapped out
> > memory. Maybe I've missed something but it should be something along
> > those lines. Definitely something that the current smaps infrastructure
> > doesn't give you, though.
> 
> Yes your description of what we want is pretty good.  Having a
> reasonable lower bound on the estimate is fine, though we probably
> want to break out swapped out memory separately.

Why would you want to separate that?

> Given that smaps
> doesn't provide this in a straightforward way, what do you think is
> the right way to provide this information?

I would be tempted to sneak it into /proc/<pid>/statm because that looks
like a proper place but getting this information is not for free
performance wise so I am not really sure something that relies on this
file would see unexpected stalls. Maybe this could be worked around by
some caching... I would suggest to check who is actually using this file
(top/ps etc...)

If this would be unacceptable then a new file could be considered.
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v5 0/3] mm, proc: Implement /proc/<pid>/totmaps
@ 2016-09-14  9:12               ` Michal Hocko
  0 siblings, 0 replies; 43+ messages in thread
From: Michal Hocko @ 2016-09-14  9:12 UTC (permalink / raw)
  To: Sonny Rao
  Cc: Robert Foss, Jonathan Corbet, Andrew Morton, Vlastimil Babka,
	Hugh Dickins, Konstantin Khlebnikov, Naoya Horiguchi,
	Kirill A. Shutemov, John Stultz, Minchan Kim, ross.zwisler,
	jmarchan, Johannes Weiner, Kees Cook, oleg, Al Viro,
	Mateusz Guzik, Janis Danisevskis, calvinowens, Alexey Dobriyan,
	ebiederm, Seth Forshee

On Tue 13-09-16 13:27:39, Sonny Rao wrote:
> On Tue, Sep 13, 2016 at 12:12 AM, Michal Hocko <mhocko@kernel.org> wrote:
> > On Mon 12-09-16 10:28:53, Sonny Rao wrote:
> >> On Mon, Sep 12, 2016 at 10:15 AM, Michal Hocko <mhocko@kernel.org> wrote:
> >> > On Mon 12-09-16 08:31:36, Sonny Rao wrote:
> > [...]
> >> >> but how about the other fields like Swap, Private_Dirty and
> >> >> Private_Shared?
> >> >
> >> > Private_Shared can be pretty confusing as well without the whole context
> >> > as well see my other emails in the original thread (just to remind
> >> > shmem/tmpfs makes all this really confusing).
> >>
> >> But this is exactly the issue -- RSS is can be just as confusing if
> >> you don't know something about the application.
> >
> > I agree that rss can be confusing but we will not make the situation any
> > better if we add yet another confusing metric.
> >
> >> I think the issue is
> >> how common that situation is, and you seem to believe that it's so
> >> uncommon that it's actually better to keep the information more
> >> difficult to get for those of us who know something about our systems.
> >>
> >> That's fine, I guess we just have to disagree here, thanks for look at this.
> >
> > I think you should just step back and think more about what exactly
> > you expect from the counter(s). I believe what you want is an
> > estimate of a freeable memory when the particular process dies or is
> > killed. That would mean resident single mapped private anonymous memory
> > + unlinked single mapped shareable mappings + single mapped swapped out
> > memory. Maybe I've missed something but it should be something along
> > those lines. Definitely something that the current smaps infrastructure
> > doesn't give you, though.
> 
> Yes your description of what we want is pretty good.  Having a
> reasonable lower bound on the estimate is fine, though we probably
> want to break out swapped out memory separately.

Why would you want to separate that?

> Given that smaps
> doesn't provide this in a straightforward way, what do you think is
> the right way to provide this information?

I would be tempted to sneak it into /proc/<pid>/statm because that looks
like a proper place but getting this information is not for free
performance wise so I am not really sure something that relies on this
file would see unexpected stalls. Maybe this could be worked around by
some caching... I would suggest to check who is actually using this file
(top/ps etc...)

If this would be unacceptable then a new file could be considered.
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v5 0/3] mm, proc: Implement /proc/<pid>/totmaps
  2016-09-14  9:12               ` Michal Hocko
@ 2016-09-19 15:16                 ` Robert Foss
  -1 siblings, 0 replies; 43+ messages in thread
From: Robert Foss @ 2016-09-19 15:16 UTC (permalink / raw)
  To: Michal Hocko, Sonny Rao
  Cc: Jonathan Corbet, Andrew Morton, Vlastimil Babka, Hugh Dickins,
	Konstantin Khlebnikov, Naoya Horiguchi, Kirill A. Shutemov,
	John Stultz, Minchan Kim, ross.zwisler, jmarchan,
	Johannes Weiner, Kees Cook, oleg, Al Viro, Mateusz Guzik,
	Janis Danisevskis, calvinowens, Alexey Dobriyan, ebiederm,
	Seth Forshee, Djalal Harouni, linux-doc, linux-kernel, Ben Zhang,
	Bryan Freed, Filipe Brandenburger, Jann Horn, linux-api,
	Jacek Anaszewski



On 2016-09-14 05:12 AM, Michal Hocko wrote:
> On Tue 13-09-16 13:27:39, Sonny Rao wrote:
>> On Tue, Sep 13, 2016 at 12:12 AM, Michal Hocko <mhocko@kernel.org> wrote:
>>> On Mon 12-09-16 10:28:53, Sonny Rao wrote:
>>>> On Mon, Sep 12, 2016 at 10:15 AM, Michal Hocko <mhocko@kernel.org> wrote:
>>>>> On Mon 12-09-16 08:31:36, Sonny Rao wrote:
>>> [...]
>>>>>> but how about the other fields like Swap, Private_Dirty and
>>>>>> Private_Shared?
>>>>>
>>>>> Private_Shared can be pretty confusing as well without the whole context
>>>>> as well see my other emails in the original thread (just to remind
>>>>> shmem/tmpfs makes all this really confusing).
>>>>
>>>> But this is exactly the issue -- RSS is can be just as confusing if
>>>> you don't know something about the application.
>>>
>>> I agree that rss can be confusing but we will not make the situation any
>>> better if we add yet another confusing metric.
>>>
>>>> I think the issue is
>>>> how common that situation is, and you seem to believe that it's so
>>>> uncommon that it's actually better to keep the information more
>>>> difficult to get for those of us who know something about our systems.
>>>>
>>>> That's fine, I guess we just have to disagree here, thanks for look at this.
>>>
>>> I think you should just step back and think more about what exactly
>>> you expect from the counter(s). I believe what you want is an
>>> estimate of a freeable memory when the particular process dies or is
>>> killed. That would mean resident single mapped private anonymous memory
>>> + unlinked single mapped shareable mappings + single mapped swapped out
>>> memory. Maybe I've missed something but it should be something along
>>> those lines. Definitely something that the current smaps infrastructure
>>> doesn't give you, though.
>>
>> Yes your description of what we want is pretty good.  Having a
>> reasonable lower bound on the estimate is fine, though we probably
>> want to break out swapped out memory separately.
>
> Why would you want to separate that?
>
>> Given that smaps
>> doesn't provide this in a straightforward way, what do you think is
>> the right way to provide this information?
>
> I would be tempted to sneak it into /proc/<pid>/statm because that looks
> like a proper place but getting this information is not for free
> performance wise so I am not really sure something that relies on this
> file would see unexpected stalls. Maybe this could be worked around by
> some caching... I would suggest to check who is actually using this file
> (top/ps etc...)

What would this caching look like? Can any information be re-used 
between vma walks?

>
> If this would be unacceptable then a new file could be considered.
>

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v5 0/3] mm, proc: Implement /proc/<pid>/totmaps
@ 2016-09-19 15:16                 ` Robert Foss
  0 siblings, 0 replies; 43+ messages in thread
From: Robert Foss @ 2016-09-19 15:16 UTC (permalink / raw)
  To: Michal Hocko, Sonny Rao
  Cc: Jonathan Corbet, Andrew Morton, Vlastimil Babka, Hugh Dickins,
	Konstantin Khlebnikov, Naoya Horiguchi, Kirill A. Shutemov,
	John Stultz, Minchan Kim, ross.zwisler, jmarchan,
	Johannes Weiner, Kees Cook, oleg, Al Viro, Mateusz Guzik,
	Janis Danisevskis, calvinowens, Alexey Dobriyan, ebiederm,
	Seth Forshee, Djalal Harouni



On 2016-09-14 05:12 AM, Michal Hocko wrote:
> On Tue 13-09-16 13:27:39, Sonny Rao wrote:
>> On Tue, Sep 13, 2016 at 12:12 AM, Michal Hocko <mhocko@kernel.org> wrote:
>>> On Mon 12-09-16 10:28:53, Sonny Rao wrote:
>>>> On Mon, Sep 12, 2016 at 10:15 AM, Michal Hocko <mhocko@kernel.org> wrote:
>>>>> On Mon 12-09-16 08:31:36, Sonny Rao wrote:
>>> [...]
>>>>>> but how about the other fields like Swap, Private_Dirty and
>>>>>> Private_Shared?
>>>>>
>>>>> Private_Shared can be pretty confusing as well without the whole context
>>>>> as well see my other emails in the original thread (just to remind
>>>>> shmem/tmpfs makes all this really confusing).
>>>>
>>>> But this is exactly the issue -- RSS is can be just as confusing if
>>>> you don't know something about the application.
>>>
>>> I agree that rss can be confusing but we will not make the situation any
>>> better if we add yet another confusing metric.
>>>
>>>> I think the issue is
>>>> how common that situation is, and you seem to believe that it's so
>>>> uncommon that it's actually better to keep the information more
>>>> difficult to get for those of us who know something about our systems.
>>>>
>>>> That's fine, I guess we just have to disagree here, thanks for look at this.
>>>
>>> I think you should just step back and think more about what exactly
>>> you expect from the counter(s). I believe what you want is an
>>> estimate of a freeable memory when the particular process dies or is
>>> killed. That would mean resident single mapped private anonymous memory
>>> + unlinked single mapped shareable mappings + single mapped swapped out
>>> memory. Maybe I've missed something but it should be something along
>>> those lines. Definitely something that the current smaps infrastructure
>>> doesn't give you, though.
>>
>> Yes your description of what we want is pretty good.  Having a
>> reasonable lower bound on the estimate is fine, though we probably
>> want to break out swapped out memory separately.
>
> Why would you want to separate that?
>
>> Given that smaps
>> doesn't provide this in a straightforward way, what do you think is
>> the right way to provide this information?
>
> I would be tempted to sneak it into /proc/<pid>/statm because that looks
> like a proper place but getting this information is not for free
> performance wise so I am not really sure something that relies on this
> file would see unexpected stalls. Maybe this could be worked around by
> some caching... I would suggest to check who is actually using this file
> (top/ps etc...)

What would this caching look like? Can any information be re-used 
between vma walks?

>
> If this would be unacceptable then a new file could be considered.
>

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v5 0/3] mm, proc: Implement /proc/<pid>/totmaps
  2016-09-19 15:16                 ` Robert Foss
@ 2016-09-19 19:32                   ` Michal Hocko
  -1 siblings, 0 replies; 43+ messages in thread
From: Michal Hocko @ 2016-09-19 19:32 UTC (permalink / raw)
  To: Robert Foss
  Cc: Sonny Rao, Jonathan Corbet, Andrew Morton, Vlastimil Babka,
	Hugh Dickins, Konstantin Khlebnikov, Naoya Horiguchi,
	Kirill A. Shutemov, John Stultz, Minchan Kim, ross.zwisler,
	jmarchan, Johannes Weiner, Kees Cook, oleg, Al Viro,
	Mateusz Guzik, Janis Danisevskis, calvinowens, Alexey Dobriyan,
	ebiederm, Seth Forshee, Djalal Harouni, linux-doc, linux-kernel,
	Ben Zhang, Bryan Freed, Filipe Brandenburger, Jann Horn,
	linux-api, Jacek Anaszewski

On Mon 19-09-16 11:16:31, Robert Foss wrote:
> On 2016-09-14 05:12 AM, Michal Hocko wrote:
> > On Tue 13-09-16 13:27:39, Sonny Rao wrote:
[...]
> > > Given that smaps
> > > doesn't provide this in a straightforward way, what do you think is
> > > the right way to provide this information?
> > 
> > I would be tempted to sneak it into /proc/<pid>/statm because that looks
> > like a proper place but getting this information is not for free
> > performance wise so I am not really sure something that relies on this
> > file would see unexpected stalls. Maybe this could be worked around by
> > some caching... I would suggest to check who is actually using this file
> > (top/ps etc...)
> 
> What would this caching look like? Can any information be re-used between
> vma walks?

yes basically return the same value if called within HZ or something
similar. But that assumes that statm latency really matters and it is
called often enough.

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v5 0/3] mm, proc: Implement /proc/<pid>/totmaps
@ 2016-09-19 19:32                   ` Michal Hocko
  0 siblings, 0 replies; 43+ messages in thread
From: Michal Hocko @ 2016-09-19 19:32 UTC (permalink / raw)
  To: Robert Foss
  Cc: Sonny Rao, Jonathan Corbet, Andrew Morton, Vlastimil Babka,
	Hugh Dickins, Konstantin Khlebnikov, Naoya Horiguchi,
	Kirill A. Shutemov, John Stultz, Minchan Kim, ross.zwisler,
	jmarchan, Johannes Weiner, Kees Cook, oleg, Al Viro,
	Mateusz Guzik, Janis Danisevskis, calvinowens, Alexey Dobriyan,
	ebiederm, Seth Forshee

On Mon 19-09-16 11:16:31, Robert Foss wrote:
> On 2016-09-14 05:12 AM, Michal Hocko wrote:
> > On Tue 13-09-16 13:27:39, Sonny Rao wrote:
[...]
> > > Given that smaps
> > > doesn't provide this in a straightforward way, what do you think is
> > > the right way to provide this information?
> > 
> > I would be tempted to sneak it into /proc/<pid>/statm because that looks
> > like a proper place but getting this information is not for free
> > performance wise so I am not really sure something that relies on this
> > file would see unexpected stalls. Maybe this could be worked around by
> > some caching... I would suggest to check who is actually using this file
> > (top/ps etc...)
> 
> What would this caching look like? Can any information be re-used between
> vma walks?

yes basically return the same value if called within HZ or something
similar. But that assumes that statm latency really matters and it is
called often enough.

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v5 0/3] mm, proc: Implement /proc/<pid>/totmaps
       [not found]                     ` <20160919195109.GB28639@dhcp22.suse.cz>
@ 2016-09-19 19:56                         ` Jann Horn
  0 siblings, 0 replies; 43+ messages in thread
From: Jann Horn @ 2016-09-19 19:56 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Sonny Rao, Jonathan Corbet, Andrew Morton, Vlastimil Babka,
	Hugh Dickins, Konstantin Khlebnikov, Naoya Horiguchi,
	Kirill A. Shutemov, John Stultz, Minchan Kim, ross.zwisler,
	jmarchan, Johannes Weiner, Kees Cook, oleg, Al Viro,
	Mateusz Guzik, Janis Danisevskis, calvinowens, Alexey Dobriyan,
	ebiederm, Seth Forshee, Djalal Harouni, linux-doc, linux-kernel,
	Ben Zhang, Bryan Freed, Filipe Brandenburger, Jann Horn,
	linux-api, Jacek Anaszewski

[-- Attachment #1: Type: text/plain, Size: 2419 bytes --]

On Mon, Sep 19, 2016 at 09:51:13PM +0200, Michal Hocko wrote:
> [not sure why the CC list was trimmed - do no do that please unless you
>  have a strong reason for that - if this was not intentional please
>  restpre it]

Ah, sorry, pressed the wrong key.


> On Mon 19-09-16 21:40:01, Jann Horn wrote:
> > On Mon, Sep 19, 2016 at 09:32:38PM +0200, Michal Hocko wrote:
> > > On Mon 19-09-16 11:16:31, Robert Foss wrote:
> > > > On 2016-09-14 05:12 AM, Michal Hocko wrote:
> > > > > On Tue 13-09-16 13:27:39, Sonny Rao wrote:
> > > [...]
> > > > > > Given that smaps
> > > > > > doesn't provide this in a straightforward way, what do you think is
> > > > > > the right way to provide this information?
> > > > > 
> > > > > I would be tempted to sneak it into /proc/<pid>/statm because that looks
> > > > > like a proper place but getting this information is not for free
> > > > > performance wise so I am not really sure something that relies on this
> > > > > file would see unexpected stalls. Maybe this could be worked around by
> > > > > some caching... I would suggest to check who is actually using this file
> > > > > (top/ps etc...)
> > > > 
> > > > What would this caching look like? Can any information be re-used between
> > > > vma walks?
> > > 
> > > yes basically return the same value if called within HZ or something
> > > similar. But that assumes that statm latency really matters and it is
> > > called often enough.
> > 
> > That sounds horrible. If some application decides that they want to check
> > statm directly after some action or so (like after program startup), this is
> > going to give them a very bad time. That probably doesn't happen
> > often - but still.
> > 
> > I can already imagine some developer going "yeah, that usleep()... that's
> > because the kernel API returns stale information for a couple milliseconds
> > after we do something *shrug*".
> > 
> > What are you trying to optimize for? Ten users on the same machine, each of
> > which is running "top" because it looks so great?
> 
> Please try to read what I wrote again. I didn't say this would be
> needed. The idea was that _if_ /proc/<pid>/statm is used very _often_
> than some caching might help to reduce the overhead. Especially when you
> consider that the information is not precise anyway. It can change
> anytime while you are doing the address space walk.

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v5 0/3] mm, proc: Implement /proc/<pid>/totmaps
@ 2016-09-19 19:56                         ` Jann Horn
  0 siblings, 0 replies; 43+ messages in thread
From: Jann Horn @ 2016-09-19 19:56 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Sonny Rao, Jonathan Corbet, Andrew Morton, Vlastimil Babka,
	Hugh Dickins, Konstantin Khlebnikov, Naoya Horiguchi,
	Kirill A. Shutemov, John Stultz, Minchan Kim, ross.zwisler,
	jmarchan, Johannes Weiner, Kees Cook, oleg, Al Viro,
	Mateusz Guzik, Janis Danisevskis, calvinowens, Alexey Dobriyan,
	ebiederm, Seth Forshee

[-- Attachment #1: Type: text/plain, Size: 2419 bytes --]

On Mon, Sep 19, 2016 at 09:51:13PM +0200, Michal Hocko wrote:
> [not sure why the CC list was trimmed - do no do that please unless you
>  have a strong reason for that - if this was not intentional please
>  restpre it]

Ah, sorry, pressed the wrong key.


> On Mon 19-09-16 21:40:01, Jann Horn wrote:
> > On Mon, Sep 19, 2016 at 09:32:38PM +0200, Michal Hocko wrote:
> > > On Mon 19-09-16 11:16:31, Robert Foss wrote:
> > > > On 2016-09-14 05:12 AM, Michal Hocko wrote:
> > > > > On Tue 13-09-16 13:27:39, Sonny Rao wrote:
> > > [...]
> > > > > > Given that smaps
> > > > > > doesn't provide this in a straightforward way, what do you think is
> > > > > > the right way to provide this information?
> > > > > 
> > > > > I would be tempted to sneak it into /proc/<pid>/statm because that looks
> > > > > like a proper place but getting this information is not for free
> > > > > performance wise so I am not really sure something that relies on this
> > > > > file would see unexpected stalls. Maybe this could be worked around by
> > > > > some caching... I would suggest to check who is actually using this file
> > > > > (top/ps etc...)
> > > > 
> > > > What would this caching look like? Can any information be re-used between
> > > > vma walks?
> > > 
> > > yes basically return the same value if called within HZ or something
> > > similar. But that assumes that statm latency really matters and it is
> > > called often enough.
> > 
> > That sounds horrible. If some application decides that they want to check
> > statm directly after some action or so (like after program startup), this is
> > going to give them a very bad time. That probably doesn't happen
> > often - but still.
> > 
> > I can already imagine some developer going "yeah, that usleep()... that's
> > because the kernel API returns stale information for a couple milliseconds
> > after we do something *shrug*".
> > 
> > What are you trying to optimize for? Ten users on the same machine, each of
> > which is running "top" because it looks so great?
> 
> Please try to read what I wrote again. I didn't say this would be
> needed. The idea was that _if_ /proc/<pid>/statm is used very _often_
> than some caching might help to reduce the overhead. Especially when you
> consider that the information is not precise anyway. It can change
> anytime while you are doing the address space walk.

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v5 0/3] mm, proc: Implement /proc/<pid>/totmaps
  2016-09-19 19:56                         ` Jann Horn
@ 2016-09-19 20:15                           ` Sonny Rao
  -1 siblings, 0 replies; 43+ messages in thread
From: Sonny Rao @ 2016-09-19 20:15 UTC (permalink / raw)
  To: Jann Horn
  Cc: Michal Hocko, Jonathan Corbet, Andrew Morton, Vlastimil Babka,
	Hugh Dickins, Konstantin Khlebnikov, Naoya Horiguchi,
	Kirill A. Shutemov, John Stultz, Minchan Kim, ross.zwisler,
	jmarchan, Johannes Weiner, Kees Cook, oleg, Al Viro,
	Mateusz Guzik, Janis Danisevskis, calvinowens, Alexey Dobriyan,
	ebiederm, Seth Forshee, Djalal Harouni, linux-doc, linux-kernel,
	Ben Zhang, Bryan Freed, Filipe Brandenburger, linux-api,
	Jacek Anaszewski

On Mon, Sep 19, 2016 at 12:56 PM, Jann Horn <jann@thejh.net> wrote:
> On Mon, Sep 19, 2016 at 09:51:13PM +0200, Michal Hocko wrote:
>> [not sure why the CC list was trimmed - do no do that please unless you
>>  have a strong reason for that - if this was not intentional please
>>  restpre it]
>
> Ah, sorry, pressed the wrong key.
>
>
>> On Mon 19-09-16 21:40:01, Jann Horn wrote:
>> > On Mon, Sep 19, 2016 at 09:32:38PM +0200, Michal Hocko wrote:
>> > > On Mon 19-09-16 11:16:31, Robert Foss wrote:
>> > > > On 2016-09-14 05:12 AM, Michal Hocko wrote:
>> > > > > On Tue 13-09-16 13:27:39, Sonny Rao wrote:
>> > > [...]
>> > > > > > Given that smaps
>> > > > > > doesn't provide this in a straightforward way, what do you think is
>> > > > > > the right way to provide this information?
>> > > > >
>> > > > > I would be tempted to sneak it into /proc/<pid>/statm because that looks
>> > > > > like a proper place but getting this information is not for free
>> > > > > performance wise so I am not really sure something that relies on this
>> > > > > file would see unexpected stalls. Maybe this could be worked around by
>> > > > > some caching... I would suggest to check who is actually using this file
>> > > > > (top/ps etc...)
>> > > >
>> > > > What would this caching look like? Can any information be re-used between
>> > > > vma walks?
>> > >
>> > > yes basically return the same value if called within HZ or something
>> > > similar. But that assumes that statm latency really matters and it is
>> > > called often enough.
>> >
>> > That sounds horrible. If some application decides that they want to check
>> > statm directly after some action or so (like after program startup), this is
>> > going to give them a very bad time. That probably doesn't happen
>> > often - but still.
>> >
>> > I can already imagine some developer going "yeah, that usleep()... that's
>> > because the kernel API returns stale information for a couple milliseconds
>> > after we do something *shrug*".
>> >
>> > What are you trying to optimize for? Ten users on the same machine, each of
>> > which is running "top" because it looks so great?
>>
>> Please try to read what I wrote again. I didn't say this would be
>> needed. The idea was that _if_ /proc/<pid>/statm is used very _often_
>> than some caching might help to reduce the overhead. Especially when you
>> consider that the information is not precise anyway. It can change
>> anytime while you are doing the address space walk.

Just thinking out loud here -- I haven't looked closely at the code so
please bear with me :-)

Instead of checking when the last read was and returning old data,
what about a scheme where we still have a timestamp for last stat read
on and any changes to that address space invalidate the timestamp.

The invalidation could be racy because we're not too concerned about
immediate accuracy -- so just a write.   The main issue I could see
which this is that it could cause the cacheline holding this timestamp
to bounce around a lot?  Maybe there's an existing solution in the
page table locking that could be leveraged here to at least maintain
whatever scalability enhancements are present for this type of
situation where there are many updates happening in parallel.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v5 0/3] mm, proc: Implement /proc/<pid>/totmaps
@ 2016-09-19 20:15                           ` Sonny Rao
  0 siblings, 0 replies; 43+ messages in thread
From: Sonny Rao @ 2016-09-19 20:15 UTC (permalink / raw)
  To: Jann Horn
  Cc: Michal Hocko, Jonathan Corbet, Andrew Morton, Vlastimil Babka,
	Hugh Dickins, Konstantin Khlebnikov, Naoya Horiguchi,
	Kirill A. Shutemov, John Stultz, Minchan Kim, ross.zwisler,
	jmarchan, Johannes Weiner, Kees Cook, oleg, Al Viro,
	Mateusz Guzik, Janis Danisevskis, calvinowens, Alexey Dobriyan,
	ebiederm, Seth Forshee

On Mon, Sep 19, 2016 at 12:56 PM, Jann Horn <jann@thejh.net> wrote:
> On Mon, Sep 19, 2016 at 09:51:13PM +0200, Michal Hocko wrote:
>> [not sure why the CC list was trimmed - do no do that please unless you
>>  have a strong reason for that - if this was not intentional please
>>  restpre it]
>
> Ah, sorry, pressed the wrong key.
>
>
>> On Mon 19-09-16 21:40:01, Jann Horn wrote:
>> > On Mon, Sep 19, 2016 at 09:32:38PM +0200, Michal Hocko wrote:
>> > > On Mon 19-09-16 11:16:31, Robert Foss wrote:
>> > > > On 2016-09-14 05:12 AM, Michal Hocko wrote:
>> > > > > On Tue 13-09-16 13:27:39, Sonny Rao wrote:
>> > > [...]
>> > > > > > Given that smaps
>> > > > > > doesn't provide this in a straightforward way, what do you think is
>> > > > > > the right way to provide this information?
>> > > > >
>> > > > > I would be tempted to sneak it into /proc/<pid>/statm because that looks
>> > > > > like a proper place but getting this information is not for free
>> > > > > performance wise so I am not really sure something that relies on this
>> > > > > file would see unexpected stalls. Maybe this could be worked around by
>> > > > > some caching... I would suggest to check who is actually using this file
>> > > > > (top/ps etc...)
>> > > >
>> > > > What would this caching look like? Can any information be re-used between
>> > > > vma walks?
>> > >
>> > > yes basically return the same value if called within HZ or something
>> > > similar. But that assumes that statm latency really matters and it is
>> > > called often enough.
>> >
>> > That sounds horrible. If some application decides that they want to check
>> > statm directly after some action or so (like after program startup), this is
>> > going to give them a very bad time. That probably doesn't happen
>> > often - but still.
>> >
>> > I can already imagine some developer going "yeah, that usleep()... that's
>> > because the kernel API returns stale information for a couple milliseconds
>> > after we do something *shrug*".
>> >
>> > What are you trying to optimize for? Ten users on the same machine, each of
>> > which is running "top" because it looks so great?
>>
>> Please try to read what I wrote again. I didn't say this would be
>> needed. The idea was that _if_ /proc/<pid>/statm is used very _often_
>> than some caching might help to reduce the overhead. Especially when you
>> consider that the information is not precise anyway. It can change
>> anytime while you are doing the address space walk.

Just thinking out loud here -- I haven't looked closely at the code so
please bear with me :-)

Instead of checking when the last read was and returning old data,
what about a scheme where we still have a timestamp for last stat read
on and any changes to that address space invalidate the timestamp.

The invalidation could be racy because we're not too concerned about
immediate accuracy -- so just a write.   The main issue I could see
which this is that it could cause the cacheline holding this timestamp
to bounce around a lot?  Maybe there's an existing solution in the
page table locking that could be leveraged here to at least maintain
whatever scalability enhancements are present for this type of
situation where there are many updates happening in parallel.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v5 0/3] mm, proc: Implement /proc/<pid>/totmaps
  2016-09-19 19:32                   ` Michal Hocko
@ 2016-09-20  0:27                     ` Robert Foss
  -1 siblings, 0 replies; 43+ messages in thread
From: Robert Foss @ 2016-09-20  0:27 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Sonny Rao, Jonathan Corbet, Andrew Morton, Vlastimil Babka,
	Hugh Dickins, Konstantin Khlebnikov, Naoya Horiguchi,
	Kirill A. Shutemov, John Stultz, Minchan Kim, ross.zwisler,
	jmarchan, Johannes Weiner, Kees Cook, oleg, Al Viro,
	Mateusz Guzik, Janis Danisevskis, calvinowens, Alexey Dobriyan,
	ebiederm, Seth Forshee, Djalal Harouni, linux-doc, linux-kernel,
	Ben Zhang, Bryan Freed, Filipe Brandenburger, Jann Horn,
	linux-api, Jacek Anaszewski



On 2016-09-19 03:32 PM, Michal Hocko wrote:
> On Mon 19-09-16 11:16:31, Robert Foss wrote:
>> On 2016-09-14 05:12 AM, Michal Hocko wrote:
>>> On Tue 13-09-16 13:27:39, Sonny Rao wrote:
> [...]
>>>> Given that smaps
>>>> doesn't provide this in a straightforward way, what do you think is
>>>> the right way to provide this information?
>>>
>>> I would be tempted to sneak it into /proc/<pid>/statm because that looks
>>> like a proper place but getting this information is not for free
>>> performance wise so I am not really sure something that relies on this
>>> file would see unexpected stalls. Maybe this could be worked around by
>>> some caching... I would suggest to check who is actually using this file
>>> (top/ps etc...)
>>
>> What would this caching look like? Can any information be re-used between
>> vma walks?
>
> yes basically return the same value if called within HZ or something
> similar. But that assumes that statm latency really matters and it is
> called often enough.

Any single application querying more often than HZ, would presumably do 
so for accuracy reasons.
However for multiple applications that combined query more often than 
HZ, this would most definitely be halpful in terms of performance.

@Sonny, does chromiumos fall into the first or second category?

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v5 0/3] mm, proc: Implement /proc/<pid>/totmaps
@ 2016-09-20  0:27                     ` Robert Foss
  0 siblings, 0 replies; 43+ messages in thread
From: Robert Foss @ 2016-09-20  0:27 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Sonny Rao, Jonathan Corbet, Andrew Morton, Vlastimil Babka,
	Hugh Dickins, Konstantin Khlebnikov, Naoya Horiguchi,
	Kirill A. Shutemov, John Stultz, Minchan Kim, ross.zwisler,
	jmarchan, Johannes Weiner, Kees Cook, oleg, Al Viro,
	Mateusz Guzik, Janis Danisevskis, calvinowens, Alexey Dobriyan,
	ebiederm, Seth Forshee



On 2016-09-19 03:32 PM, Michal Hocko wrote:
> On Mon 19-09-16 11:16:31, Robert Foss wrote:
>> On 2016-09-14 05:12 AM, Michal Hocko wrote:
>>> On Tue 13-09-16 13:27:39, Sonny Rao wrote:
> [...]
>>>> Given that smaps
>>>> doesn't provide this in a straightforward way, what do you think is
>>>> the right way to provide this information?
>>>
>>> I would be tempted to sneak it into /proc/<pid>/statm because that looks
>>> like a proper place but getting this information is not for free
>>> performance wise so I am not really sure something that relies on this
>>> file would see unexpected stalls. Maybe this could be worked around by
>>> some caching... I would suggest to check who is actually using this file
>>> (top/ps etc...)
>>
>> What would this caching look like? Can any information be re-used between
>> vma walks?
>
> yes basically return the same value if called within HZ or something
> similar. But that assumes that statm latency really matters and it is
> called often enough.

Any single application querying more often than HZ, would presumably do 
so for accuracy reasons.
However for multiple applications that combined query more often than 
HZ, this would most definitely be halpful in terms of performance.

@Sonny, does chromiumos fall into the first or second category?

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v5 0/3] mm, proc: Implement /proc/<pid>/totmaps
@ 2016-09-20  0:29                       ` Sonny Rao
  0 siblings, 0 replies; 43+ messages in thread
From: Sonny Rao @ 2016-09-20  0:29 UTC (permalink / raw)
  To: Robert Foss
  Cc: Michal Hocko, Jonathan Corbet, Andrew Morton, Vlastimil Babka,
	Hugh Dickins, Konstantin Khlebnikov, Naoya Horiguchi,
	Kirill A. Shutemov, John Stultz, Minchan Kim, ross.zwisler,
	jmarchan, Johannes Weiner, Kees Cook, oleg, Al Viro,
	Mateusz Guzik, Janis Danisevskis, calvinowens, Alexey Dobriyan,
	ebiederm, Seth Forshee, Djalal Harouni, linux-doc, linux-kernel,
	Ben Zhang, Bryan Freed, Filipe Brandenburger, Jann Horn,
	linux-api, Jacek Anaszewski

On Mon, Sep 19, 2016 at 5:27 PM, Robert Foss <robert.foss@collabora.com> wrote:
>
>
> On 2016-09-19 03:32 PM, Michal Hocko wrote:
>>
>> On Mon 19-09-16 11:16:31, Robert Foss wrote:
>>>
>>> On 2016-09-14 05:12 AM, Michal Hocko wrote:
>>>>
>>>> On Tue 13-09-16 13:27:39, Sonny Rao wrote:
>>
>> [...]
>>>>>
>>>>> Given that smaps
>>>>> doesn't provide this in a straightforward way, what do you think is
>>>>> the right way to provide this information?
>>>>
>>>>
>>>> I would be tempted to sneak it into /proc/<pid>/statm because that looks
>>>> like a proper place but getting this information is not for free
>>>> performance wise so I am not really sure something that relies on this
>>>> file would see unexpected stalls. Maybe this could be worked around by
>>>> some caching... I would suggest to check who is actually using this file
>>>> (top/ps etc...)
>>>
>>>
>>> What would this caching look like? Can any information be re-used between
>>> vma walks?
>>
>>
>> yes basically return the same value if called within HZ or something
>> similar. But that assumes that statm latency really matters and it is
>> called often enough.
>
>
> Any single application querying more often than HZ, would presumably do so
> for accuracy reasons.
> However for multiple applications that combined query more often than HZ,
> this would most definitely be halpful in terms of performance.
>
> @Sonny, does chromiumos fall into the first or second category?

It's a single application -- and it definitely doesn't query at HZ --
especially given how long it takes to gather the data :-)

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v5 0/3] mm, proc: Implement /proc/<pid>/totmaps
@ 2016-09-20  0:29                       ` Sonny Rao
  0 siblings, 0 replies; 43+ messages in thread
From: Sonny Rao @ 2016-09-20  0:29 UTC (permalink / raw)
  To: Robert Foss
  Cc: Michal Hocko, Jonathan Corbet, Andrew Morton, Vlastimil Babka,
	Hugh Dickins, Konstantin Khlebnikov, Naoya Horiguchi,
	Kirill A. Shutemov, John Stultz, Minchan Kim,
	ross.zwisler-VuQAYsv1563Yd54FQh9/CA,
	jmarchan-H+wXaHxf7aLQT0dZR+AlfA, Johannes Weiner, Kees Cook,
	oleg-H+wXaHxf7aLQT0dZR+AlfA, Al Viro, Mateusz Guzik,
	Janis Danisevskis, calvinowens-b10kYP2dOMg, Alexey Dobriyan,
	ebiederm-aS9lmoZGLiVWk0Htik3J/w, Seth Forshee

On Mon, Sep 19, 2016 at 5:27 PM, Robert Foss <robert.foss-ZGY8ohtN/8qB+jHODAdFcQ@public.gmane.org> wrote:
>
>
> On 2016-09-19 03:32 PM, Michal Hocko wrote:
>>
>> On Mon 19-09-16 11:16:31, Robert Foss wrote:
>>>
>>> On 2016-09-14 05:12 AM, Michal Hocko wrote:
>>>>
>>>> On Tue 13-09-16 13:27:39, Sonny Rao wrote:
>>
>> [...]
>>>>>
>>>>> Given that smaps
>>>>> doesn't provide this in a straightforward way, what do you think is
>>>>> the right way to provide this information?
>>>>
>>>>
>>>> I would be tempted to sneak it into /proc/<pid>/statm because that looks
>>>> like a proper place but getting this information is not for free
>>>> performance wise so I am not really sure something that relies on this
>>>> file would see unexpected stalls. Maybe this could be worked around by
>>>> some caching... I would suggest to check who is actually using this file
>>>> (top/ps etc...)
>>>
>>>
>>> What would this caching look like? Can any information be re-used between
>>> vma walks?
>>
>>
>> yes basically return the same value if called within HZ or something
>> similar. But that assumes that statm latency really matters and it is
>> called often enough.
>
>
> Any single application querying more often than HZ, would presumably do so
> for accuracy reasons.
> However for multiple applications that combined query more often than HZ,
> this would most definitely be halpful in terms of performance.
>
> @Sonny, does chromiumos fall into the first or second category?

It's a single application -- and it definitely doesn't query at HZ --
especially given how long it takes to gather the data :-)

^ permalink raw reply	[flat|nested] 43+ messages in thread

end of thread, other threads:[~2016-09-20  0:37 UTC | newest]

Thread overview: 43+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-09-05 20:14 [PATCH v5 0/3] mm, proc: Implement /proc/<pid>/totmaps robert.foss
2016-09-05 20:14 ` robert.foss
2016-09-05 20:14 ` [PATCH v5 1/3] " robert.foss
2016-09-05 20:14   ` robert.foss
2016-09-07 12:58   ` Oleg Nesterov
2016-09-07 12:58     ` Oleg Nesterov
2016-09-12 22:12     ` Robert Foss
2016-09-12 22:12       ` Robert Foss
2016-09-05 20:14 ` [PATCH v5 2/3] Documentation/filesystems: Fixed typo robert.foss
2016-09-05 20:14   ` robert.foss-ZGY8ohtN/8qB+jHODAdFcQ
2016-09-07 23:22   ` Kees Cook
2016-09-07 23:22     ` Kees Cook
2016-09-08  0:22     ` Robert Foss
2016-09-08  0:22       ` Robert Foss
2016-09-08  6:23       ` Jonathan Corbet
2016-09-08  6:23         ` Jonathan Corbet
2016-09-05 20:14 ` [PATCH v5 3/3] Documentation/filesystems: Added /proc/PID/totmaps documentation robert.foss
2016-09-05 20:14   ` robert.foss
2016-09-12 12:02 ` [PATCH v5 0/3] mm, proc: Implement /proc/<pid>/totmaps Michal Hocko
2016-09-12 15:31   ` Sonny Rao
2016-09-12 15:31     ` Sonny Rao
2016-09-12 17:15     ` Michal Hocko
2016-09-12 17:15       ` Michal Hocko
2016-09-12 17:28       ` Sonny Rao
2016-09-12 17:28         ` Sonny Rao
2016-09-13  7:12         ` Michal Hocko
2016-09-13  7:12           ` Michal Hocko
2016-09-13 20:27           ` Sonny Rao
2016-09-13 20:27             ` Sonny Rao
2016-09-14  9:12             ` Michal Hocko
2016-09-14  9:12               ` Michal Hocko
2016-09-19 15:16               ` Robert Foss
2016-09-19 15:16                 ` Robert Foss
2016-09-19 19:32                 ` Michal Hocko
2016-09-19 19:32                   ` Michal Hocko
     [not found]                   ` <20160919194001.GE2903@pc.thejh.net>
     [not found]                     ` <20160919195109.GB28639@dhcp22.suse.cz>
2016-09-19 19:56                       ` Jann Horn
2016-09-19 19:56                         ` Jann Horn
2016-09-19 20:15                         ` Sonny Rao
2016-09-19 20:15                           ` Sonny Rao
2016-09-20  0:27                   ` Robert Foss
2016-09-20  0:27                     ` Robert Foss
2016-09-20  0:29                     ` Sonny Rao
2016-09-20  0:29                       ` Sonny Rao

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.