linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PACTH v4 0/3] Implement /proc/<pid>/totmaps
@ 2016-08-16 22:33 robert.foss
  2016-08-16 22:33 ` [PACTH v4 1/3] mm, proc: " robert.foss
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: robert.foss @ 2016-08-16 22:33 UTC (permalink / raw)
  To: corbet, akpm, vbabka, mhocko, koct9i, hughd, robert.foss,
	n-horiguchi, minchan, john.stultz, ross.zwisler, jmarchan,
	hannes, mingo, keescook, viro, gorcunov, mnfhuang, adobriyan,
	calvinowens, jdanis, jann, sonnyrao, kirill.shutemov, ldufour,
	linux-doc, linux-kernel, Ben Zhang, Bryan Freed,
	Filipe Brandenburger, Mateusz Guzik, Michal Hocko, linux-api

From: Robert Foss <robert.foss@collabora.com>


This series provides the /proc/PID/totmaps feature, which 
summarizes the information provided by /proc/PID/smaps for
improved performance and usability reasons.

A use case is to speed up monitoring of memory consumption in 
environments where RSS isn't precise.

For example Chrome tends to many processes which have hundreds of VMAs
with a substantial amount of shared memory, and the error of using
RSS rather than PSS tends to be very large when looking at overall
memory consumption.  PSS isn't kept as a single number that's exported
like RSS, so to calculate PSS means having to parse a very large smaps
file.

This process is slow and has to be repeated for many processes, and we
found that the just act of doing the parsing was taking up a
significant amount of CPU time, so this patch is an attempt to make
that process cheaper.

/proc/PID/totmaps provides roughly a 2x speedup compared to parsing
/proc/PID/smaps with awk.

$ ps aux | grep firefox
robertfoss   5025 24.3 13.7 3562820 2219616 ?     Rl   Aug15 277:44 /usr/lib/firefox/firefox https://allg.one/xpb
$ awk '/^[0-9a-f]/{print}' /proc/5025/smaps | wc -l
1503
$ /usr/bin/time -v -p zsh -c "(repeat 25 {cat /proc/5025/totmaps})"
[...]
	Command being timed: "zsh -c (repeat 25 {cat /proc/5025/totmaps})"
	User time (seconds): 0.00
	System time (seconds): 0.40
	Percent of CPU this job got: 90%
	Elapsed (wall clock) time (h:mm:ss or m:ss): 0:00.45

$ /usr/bin/time -v -p zsh -c "repeat 25 { awk '/^Rss/{rss+=\$2} /^Pss/{pss+=\$2} END {printf \"rss:%d pss:%d\n\", rss, pss}\' /proc/5025/smaps }"
[...]
	Command being timed: "zsh -c repeat 25 { awk '/^Rss/{rss+=$2} /^Pss/{pss+=$2} END {printf "rss:%d pss:%d\n", rss, pss}\' /proc/5025/smaps }"
	User time (seconds): 0.37
	System time (seconds): 0.45
	Percent of CPU this job got: 92%
	Elapsed (wall clock) time (h:mm:ss or m:ss): 0:00.89


Changes since v1:
- Removed IS_ERR check from get_task_mm() function
- Changed comment format
- Moved proc_totmaps_operations declaration inside internal.h
- Switched to using do_maps_open() in totmaps_open() function,
  which provides privilege checking
- Error handling reworked for totmaps_open() function
- Switched to stack allocated struct mem_size_stats mss_sum in
  totmaps_proc_show() function
- Removed get_task_mm() in totmaps_proc_show() since priv->mm
  already is available
- Added support to proc_map_release() fork priv==NULL, to allow
  function to be used for all failure cases
- Added proc_totmaps_op and for it helper functions
- Added documention in separate patch
- Removed totmaps_release() since it was just a wrapper for proc_map_release()

Changes since v2:
- Removed struct mem_size_stats *mss from struct proc_maps_private
- Removed priv->task assignment in totmaps_open() call
- Moved some assignements calls totmaps_open() around to increase code
  clarity
- Moved some function calls to unlock data structures before printing

Changes since v3:
- Fixed typo in totmaps documentation
- Fixed issue where proc_map_release wasn't called on error
- Fixed put_task_struct not being called during .release()

Robert Foss (3):
  mm, proc: Implement /proc/<pid>/totmaps
  Documentation/filesystems: Fixed typo
  Documentation/filesystems: Added /proc/PID/totmaps documentation

 Documentation/filesystems/proc.txt |  23 +++++-
 fs/proc/base.c                     |   1 +
 fs/proc/internal.h                 |   2 +
 fs/proc/task_mmu.c                 | 141 +++++++++++++++++++++++++++++++++++++
 4 files changed, 166 insertions(+), 1 deletion(-)

-- 
2.7.4

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PACTH v4 1/3] mm, proc: Implement /proc/<pid>/totmaps
  2016-08-16 22:33 [PACTH v4 0/3] Implement /proc/<pid>/totmaps robert.foss
@ 2016-08-16 22:33 ` robert.foss
  2016-08-31  9:45   ` Jacek Anaszewski
  2016-08-16 22:33 ` [PACTH v4 2/3] Documentation/filesystems: Fixed typo robert.foss
  2016-08-16 22:33 ` [PACTH v4 3/3] Documentation/filesystems: Added /proc/PID/totmaps documentation robert.foss
  2 siblings, 1 reply; 8+ messages in thread
From: robert.foss @ 2016-08-16 22:33 UTC (permalink / raw)
  To: corbet, akpm, vbabka, mhocko, koct9i, hughd, robert.foss,
	n-horiguchi, minchan, john.stultz, ross.zwisler, jmarchan,
	hannes, mingo, keescook, viro, gorcunov, mnfhuang, adobriyan,
	calvinowens, jdanis, jann, sonnyrao, kirill.shutemov, ldufour,
	linux-doc, linux-kernel, Ben Zhang, Bryan Freed,
	Filipe Brandenburger, Mateusz Guzik, Michal Hocko, linux-api

From: Robert Foss <robert.foss@collabora.com>

This is based on earlier work by Thiago Goncales. It implements a new
per process proc file which summarizes the contents of the smaps file
but doesn't display any addresses.  It gives more detailed information
than statm like the PSS (proprotional set size).  It differs from the
original implementation in that it doesn't use the full blown set of
seq operations, uses a different termination condition, and doesn't
displayed "Locked" as that was broken on the original implemenation.

This new proc file provides information faster than parsing the potentially
huge smaps file.

Tested-by: Robert Foss <robert.foss@collabora.com>
Signed-off-by: Robert Foss <robert.foss@collabora.com>

Signed-off-by: Sonny Rao <sonnyrao@chromium.org>
---
 fs/proc/base.c     |   1 +
 fs/proc/internal.h |   2 +
 fs/proc/task_mmu.c | 141 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 144 insertions(+)

diff --git a/fs/proc/base.c b/fs/proc/base.c
index a11eb71..de3acdf 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -2855,6 +2855,7 @@ static const struct pid_entry tgid_base_stuff[] = {
 	REG("clear_refs", S_IWUSR, proc_clear_refs_operations),
 	REG("smaps",      S_IRUGO, proc_pid_smaps_operations),
 	REG("pagemap",    S_IRUSR, proc_pagemap_operations),
+	REG("totmaps",    S_IRUGO, proc_totmaps_operations),
 #endif
 #ifdef CONFIG_SECURITY
 	DIR("attr",       S_IRUGO|S_IXUGO, proc_attr_dir_inode_operations, proc_attr_dir_operations),
diff --git a/fs/proc/internal.h b/fs/proc/internal.h
index aa27810..99f97d7 100644
--- a/fs/proc/internal.h
+++ b/fs/proc/internal.h
@@ -297,6 +297,8 @@ extern const struct file_operations proc_pid_smaps_operations;
 extern const struct file_operations proc_tid_smaps_operations;
 extern const struct file_operations proc_clear_refs_operations;
 extern const struct file_operations proc_pagemap_operations;
+extern const struct file_operations proc_totmaps_operations;
+
 
 extern unsigned long task_vsize(struct mm_struct *);
 extern unsigned long task_statm(struct mm_struct *,
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 4648c7f..fd8fd7f 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -802,6 +802,75 @@ static int show_smap(struct seq_file *m, void *v, int is_pid)
 	return 0;
 }
 
+static void add_smaps_sum(struct mem_size_stats *mss,
+		struct mem_size_stats *mss_sum)
+{
+	mss_sum->resident += mss->resident;
+	mss_sum->pss += mss->pss;
+	mss_sum->shared_clean += mss->shared_clean;
+	mss_sum->shared_dirty += mss->shared_dirty;
+	mss_sum->private_clean += mss->private_clean;
+	mss_sum->private_dirty += mss->private_dirty;
+	mss_sum->referenced += mss->referenced;
+	mss_sum->anonymous += mss->anonymous;
+	mss_sum->anonymous_thp += mss->anonymous_thp;
+	mss_sum->swap += mss->swap;
+}
+
+static int totmaps_proc_show(struct seq_file *m, void *data)
+{
+	struct proc_maps_private *priv = m->private;
+	struct mm_struct *mm = priv->mm;
+	struct vm_area_struct *vma;
+	struct mem_size_stats mss_sum;
+
+	memset(&mss_sum, 0, sizeof(mss_sum));
+	down_read(&mm->mmap_sem);
+	hold_task_mempolicy(priv);
+
+	for (vma = mm->mmap; vma != priv->tail_vma; vma = vma->vm_next) {
+		struct mem_size_stats mss;
+		struct mm_walk smaps_walk = {
+			.pmd_entry = smaps_pte_range,
+			.mm = vma->vm_mm,
+			.private = &mss,
+		};
+
+		if (vma->vm_mm && !is_vm_hugetlb_page(vma)) {
+			memset(&mss, 0, sizeof(mss));
+			walk_page_vma(vma, &smaps_walk);
+			add_smaps_sum(&mss, &mss_sum);
+		}
+	}
+
+	release_task_mempolicy(priv);
+	up_read(&mm->mmap_sem);
+
+	seq_printf(m,
+		   "Rss:            %8lu kB\n"
+		   "Pss:            %8lu kB\n"
+		   "Shared_Clean:   %8lu kB\n"
+		   "Shared_Dirty:   %8lu kB\n"
+		   "Private_Clean:  %8lu kB\n"
+		   "Private_Dirty:  %8lu kB\n"
+		   "Referenced:     %8lu kB\n"
+		   "Anonymous:      %8lu kB\n"
+		   "AnonHugePages:  %8lu kB\n"
+		   "Swap:           %8lu kB\n",
+		   mss_sum.resident >> 10,
+		   (unsigned long)(mss_sum.pss >> (10 + PSS_SHIFT)),
+		   mss_sum.shared_clean  >> 10,
+		   mss_sum.shared_dirty  >> 10,
+		   mss_sum.private_clean >> 10,
+		   mss_sum.private_dirty >> 10,
+		   mss_sum.referenced >> 10,
+		   mss_sum.anonymous >> 10,
+		   mss_sum.anonymous_thp >> 10,
+		   mss_sum.swap >> 10);
+
+	return 0;
+}
+
 static int show_pid_smap(struct seq_file *m, void *v)
 {
 	return show_smap(m, v, 1);
@@ -812,6 +881,28 @@ static int show_tid_smap(struct seq_file *m, void *v)
 	return show_smap(m, v, 0);
 }
 
+static void *m_totmaps_start(struct seq_file *p, loff_t *pos)
+{
+	return NULL + (*pos == 0);
+}
+
+static void *m_totmaps_next(struct seq_file *p, void *v, loff_t *pos)
+{
+	++*pos;
+	return NULL;
+}
+
+static void m_totmaps_stop(struct seq_file *p, void *v)
+{
+}
+
+static const struct seq_operations proc_totmaps_op = {
+	.start	= m_totmaps_start,
+	.next	= m_totmaps_next,
+	.stop	= m_totmaps_stop,
+	.show	= totmaps_proc_show
+};
+
 static const struct seq_operations proc_pid_smaps_op = {
 	.start	= m_start,
 	.next	= m_next,
@@ -836,6 +927,49 @@ static int tid_smaps_open(struct inode *inode, struct file *file)
 	return do_maps_open(inode, file, &proc_tid_smaps_op);
 }
 
+static int totmaps_open(struct inode *inode, struct file *file)
+{
+	struct proc_maps_private *priv = NULL;
+	struct seq_file *seq;
+	int ret;
+
+	ret = do_maps_open(inode, file, &proc_totmaps_op);
+	if (ret)
+		goto error;
+
+	/*
+	 * We need to grab references to the task_struct
+	 * at open time, because there's a potential information
+	 * leak where the totmaps file is opened and held open
+	 * while the underlying pid to task mapping changes
+	 * underneath it
+	 */
+	seq = file->private_data;
+	priv = seq->private;
+	priv->task = get_proc_task(inode);
+	if (!priv->task) {
+		ret = -ESRCH;
+		goto error_free;
+	}
+
+	return 0;
+
+error_free:
+	proc_map_release(inode, file);
+error:
+	return ret;
+}
+
+static int totmaps_release(struct inode *inode, struct file *file)
+{
+	struct seq_file *seq = file->private_data;
+	struct proc_maps_private *priv = seq->private;
+
+	put_task_struct(priv->task);
+
+	return proc_map_release(inode, file);
+}
+
 const struct file_operations proc_pid_smaps_operations = {
 	.open		= pid_smaps_open,
 	.read		= seq_read,
@@ -850,6 +984,13 @@ const struct file_operations proc_tid_smaps_operations = {
 	.release	= proc_map_release,
 };
 
+const struct file_operations proc_totmaps_operations = {
+	.open		= totmaps_open,
+	.read		= seq_read,
+	.llseek		= seq_lseek,
+	.release	= totmaps_release,
+};
+
 enum clear_refs_types {
 	CLEAR_REFS_ALL = 1,
 	CLEAR_REFS_ANON,
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PACTH v4 2/3] Documentation/filesystems: Fixed typo
  2016-08-16 22:33 [PACTH v4 0/3] Implement /proc/<pid>/totmaps robert.foss
  2016-08-16 22:33 ` [PACTH v4 1/3] mm, proc: " robert.foss
@ 2016-08-16 22:33 ` robert.foss
  2016-08-16 22:33 ` [PACTH v4 3/3] Documentation/filesystems: Added /proc/PID/totmaps documentation robert.foss
  2 siblings, 0 replies; 8+ messages in thread
From: robert.foss @ 2016-08-16 22:33 UTC (permalink / raw)
  To: corbet, akpm, vbabka, mhocko, koct9i, hughd, robert.foss,
	n-horiguchi, minchan, john.stultz, ross.zwisler, jmarchan,
	hannes, mingo, keescook, viro, gorcunov, mnfhuang, adobriyan,
	calvinowens, jdanis, jann, sonnyrao, kirill.shutemov, ldufour,
	linux-doc, linux-kernel, Ben Zhang, Bryan Freed,
	Filipe Brandenburger, Mateusz Guzik, Michal Hocko, linux-api

From: Robert Foss <robert.foss@collabora.com>

Fixed a -> an typo.

Signed-off-by: Robert Foss <robert.foss@collabora.com>
---
 Documentation/filesystems/proc.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/filesystems/proc.txt b/Documentation/filesystems/proc.txt
index e8d0075..7d001be 100644
--- a/Documentation/filesystems/proc.txt
+++ b/Documentation/filesystems/proc.txt
@@ -145,7 +145,7 @@ Table 1-1: Process specific entries in /proc
 		symbol the task is blocked in - or "0" if not blocked.
  pagemap	Page table
  stack		Report full stack trace, enable via CONFIG_STACKTRACE
- smaps		a extension based on maps, showing the memory consumption of
+ smaps		an extension based on maps, showing the memory consumption of
 		each mapping and flags associated with it
  numa_maps	an extension based on maps, showing the memory locality and
 		binding policy as well as mem usage (in pages) of each mapping.
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PACTH v4 3/3] Documentation/filesystems: Added /proc/PID/totmaps documentation
  2016-08-16 22:33 [PACTH v4 0/3] Implement /proc/<pid>/totmaps robert.foss
  2016-08-16 22:33 ` [PACTH v4 1/3] mm, proc: " robert.foss
  2016-08-16 22:33 ` [PACTH v4 2/3] Documentation/filesystems: Fixed typo robert.foss
@ 2016-08-16 22:33 ` robert.foss
  2 siblings, 0 replies; 8+ messages in thread
From: robert.foss @ 2016-08-16 22:33 UTC (permalink / raw)
  To: corbet, akpm, vbabka, mhocko, koct9i, hughd, robert.foss,
	n-horiguchi, minchan, john.stultz, ross.zwisler, jmarchan,
	hannes, mingo, keescook, viro, gorcunov, mnfhuang, adobriyan,
	calvinowens, jdanis, jann, sonnyrao, kirill.shutemov, ldufour,
	linux-doc, linux-kernel, Ben Zhang, Bryan Freed,
	Filipe Brandenburger, Mateusz Guzik, Michal Hocko, linux-api

From: Robert Foss <robert.foss@collabora.com>

Added documentation covering /proc/PID/totmaps.

Signed-off-by: Robert Foss <robert.foss@collabora.com>
---
 Documentation/filesystems/proc.txt | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

diff --git a/Documentation/filesystems/proc.txt b/Documentation/filesystems/proc.txt
index 7d001be..4cb97df 100644
--- a/Documentation/filesystems/proc.txt
+++ b/Documentation/filesystems/proc.txt
@@ -11,6 +11,7 @@ Version 1.3                                              Kernel version 2.2.12
 					      Kernel version 2.4.0-test11-pre4
 ------------------------------------------------------------------------------
 fixes/update part 1.1  Stefani Seibold <stefani@seibold.net>       June 9 2009
+add totmaps            Robert Foss <robert.foss@collabora.com>  August 12 2016
 
 Table of Contents
 -----------------
@@ -147,6 +148,8 @@ Table 1-1: Process specific entries in /proc
  stack		Report full stack trace, enable via CONFIG_STACKTRACE
  smaps		an extension based on maps, showing the memory consumption of
 		each mapping and flags associated with it
+ totmaps        an extension based on maps, showing the total memory
+                consumption of all mappings
  numa_maps	an extension based on maps, showing the memory locality and
 		binding policy as well as mem usage (in pages) of each mapping.
 ..............................................................................
@@ -512,6 +515,24 @@ be vanished or the reverse -- new added.
 This file is only present if the CONFIG_MMU kernel configuration option is
 enabled.
 
+The /proc/PID/totmaps is an extension based on maps, showing the memory
+consumption totals for all of the process's mappings. It lists the sums of the
+same statistics as /proc/PID/smaps.
+
+The process' mappings will be summarized as a series of lines like the
+following:
+
+Rss:                4256 kB
+Pss:                1170 kB
+Shared_Clean:       2720 kB
+Shared_Dirty:       1136 kB
+Private_Clean:         0 kB
+Private_Dirty:       400 kB
+Referenced:         4256 kB
+Anonymous:          1536 kB
+AnonHugePages:         0 kB
+Swap:                  0 kB
+
 The /proc/PID/clear_refs is used to reset the PG_Referenced and ACCESSED/YOUNG
 bits on both physical and virtual pages associated with a process, and the
 soft-dirty bit on pte (see Documentation/vm/soft-dirty.txt for details).
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PACTH v4 1/3] mm, proc: Implement /proc/<pid>/totmaps
  2016-08-16 22:33 ` [PACTH v4 1/3] mm, proc: " robert.foss
@ 2016-08-31  9:45   ` Jacek Anaszewski
  2016-08-31 16:36     ` Robert Foss
  0 siblings, 1 reply; 8+ messages in thread
From: Jacek Anaszewski @ 2016-08-31  9:45 UTC (permalink / raw)
  To: robert.foss, corbet, akpm, vbabka, mhocko, koct9i, hughd,
	n-horiguchi, minchan, john.stultz, ross.zwisler, jmarchan,
	hannes, mingo, keescook, viro, gorcunov, mnfhuang, adobriyan,
	calvinowens, jdanis, jann, sonnyrao, kirill.shutemov, ldufour,
	linux-doc, linux-kernel, Ben Zhang, Bryan Freed,
	Filipe Brandenburger, Mateusz Guzik, Michal Hocko, linux-api

Hi Robert,

On 08/17/2016 12:33 AM, robert.foss@collabora.com wrote:
> From: Robert Foss <robert.foss@collabora.com>
>
> This is based on earlier work by Thiago Goncales. It implements a new
> per process proc file which summarizes the contents of the smaps file
> but doesn't display any addresses.  It gives more detailed information
> than statm like the PSS (proprotional set size).  It differs from the
> original implementation in that it doesn't use the full blown set of
> seq operations, uses a different termination condition, and doesn't
> displayed "Locked" as that was broken on the original implemenation.
>
> This new proc file provides information faster than parsing the potentially
> huge smaps file.
>
> Tested-by: Robert Foss <robert.foss@collabora.com>
> Signed-off-by: Robert Foss <robert.foss@collabora.com>
>
> Signed-off-by: Sonny Rao <sonnyrao@chromium.org>
> ---
>  fs/proc/base.c     |   1 +
>  fs/proc/internal.h |   2 +
>  fs/proc/task_mmu.c | 141 +++++++++++++++++++++++++++++++++++++++++++++++++++++
>  3 files changed, 144 insertions(+)
>
> diff --git a/fs/proc/base.c b/fs/proc/base.c
> index a11eb71..de3acdf 100644
> --- a/fs/proc/base.c
> +++ b/fs/proc/base.c
> @@ -2855,6 +2855,7 @@ static const struct pid_entry tgid_base_stuff[] = {
>  	REG("clear_refs", S_IWUSR, proc_clear_refs_operations),
>  	REG("smaps",      S_IRUGO, proc_pid_smaps_operations),
>  	REG("pagemap",    S_IRUSR, proc_pagemap_operations),
> +	REG("totmaps",    S_IRUGO, proc_totmaps_operations),
>  #endif
>  #ifdef CONFIG_SECURITY
>  	DIR("attr",       S_IRUGO|S_IXUGO, proc_attr_dir_inode_operations, proc_attr_dir_operations),
> diff --git a/fs/proc/internal.h b/fs/proc/internal.h
> index aa27810..99f97d7 100644
> --- a/fs/proc/internal.h
> +++ b/fs/proc/internal.h
> @@ -297,6 +297,8 @@ extern const struct file_operations proc_pid_smaps_operations;
>  extern const struct file_operations proc_tid_smaps_operations;
>  extern const struct file_operations proc_clear_refs_operations;
>  extern const struct file_operations proc_pagemap_operations;
> +extern const struct file_operations proc_totmaps_operations;
> +
>
>  extern unsigned long task_vsize(struct mm_struct *);
>  extern unsigned long task_statm(struct mm_struct *,
> diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
> index 4648c7f..fd8fd7f 100644
> --- a/fs/proc/task_mmu.c
> +++ b/fs/proc/task_mmu.c
> @@ -802,6 +802,75 @@ static int show_smap(struct seq_file *m, void *v, int is_pid)
>  	return 0;
>  }
>
> +static void add_smaps_sum(struct mem_size_stats *mss,
> +		struct mem_size_stats *mss_sum)
> +{
> +	mss_sum->resident += mss->resident;
> +	mss_sum->pss += mss->pss;
> +	mss_sum->shared_clean += mss->shared_clean;
> +	mss_sum->shared_dirty += mss->shared_dirty;
> +	mss_sum->private_clean += mss->private_clean;
> +	mss_sum->private_dirty += mss->private_dirty;
> +	mss_sum->referenced += mss->referenced;
> +	mss_sum->anonymous += mss->anonymous;
> +	mss_sum->anonymous_thp += mss->anonymous_thp;
> +	mss_sum->swap += mss->swap;
> +}
> +
> +static int totmaps_proc_show(struct seq_file *m, void *data)
> +{
> +	struct proc_maps_private *priv = m->private;
> +	struct mm_struct *mm = priv->mm;
> +	struct vm_area_struct *vma;
> +	struct mem_size_stats mss_sum;
> +
> +	memset(&mss_sum, 0, sizeof(mss_sum));
> +	down_read(&mm->mmap_sem);
> +	hold_task_mempolicy(priv);
> +
> +	for (vma = mm->mmap; vma != priv->tail_vma; vma = vma->vm_next) {
> +		struct mem_size_stats mss;
> +		struct mm_walk smaps_walk = {
> +			.pmd_entry = smaps_pte_range,
> +			.mm = vma->vm_mm,
> +			.private = &mss,
> +		};
> +
> +		if (vma->vm_mm && !is_vm_hugetlb_page(vma)) {
> +			memset(&mss, 0, sizeof(mss));
> +			walk_page_vma(vma, &smaps_walk);
> +			add_smaps_sum(&mss, &mss_sum);
> +		}
> +	}
> +
> +	release_task_mempolicy(priv);
> +	up_read(&mm->mmap_sem);
> +
> +	seq_printf(m,
> +		   "Rss:            %8lu kB\n"
> +		   "Pss:            %8lu kB\n"
> +		   "Shared_Clean:   %8lu kB\n"
> +		   "Shared_Dirty:   %8lu kB\n"
> +		   "Private_Clean:  %8lu kB\n"
> +		   "Private_Dirty:  %8lu kB\n"
> +		   "Referenced:     %8lu kB\n"
> +		   "Anonymous:      %8lu kB\n"
> +		   "AnonHugePages:  %8lu kB\n"
> +		   "Swap:           %8lu kB\n",
> +		   mss_sum.resident >> 10,
> +		   (unsigned long)(mss_sum.pss >> (10 + PSS_SHIFT)),
> +		   mss_sum.shared_clean  >> 10,
> +		   mss_sum.shared_dirty  >> 10,
> +		   mss_sum.private_clean >> 10,
> +		   mss_sum.private_dirty >> 10,
> +		   mss_sum.referenced >> 10,
> +		   mss_sum.anonymous >> 10,
> +		   mss_sum.anonymous_thp >> 10,
> +		   mss_sum.swap >> 10);
> +
> +	return 0;
> +}
> +
>  static int show_pid_smap(struct seq_file *m, void *v)
>  {
>  	return show_smap(m, v, 1);
> @@ -812,6 +881,28 @@ static int show_tid_smap(struct seq_file *m, void *v)
>  	return show_smap(m, v, 0);
>  }
>
> +static void *m_totmaps_start(struct seq_file *p, loff_t *pos)
> +{
> +	return NULL + (*pos == 0);
> +}
> +
> +static void *m_totmaps_next(struct seq_file *p, void *v, loff_t *pos)
> +{
> +	++*pos;
> +	return NULL;
> +}
> +
> +static void m_totmaps_stop(struct seq_file *p, void *v)
> +{
> +}
> +
> +static const struct seq_operations proc_totmaps_op = {
> +	.start	= m_totmaps_start,
> +	.next	= m_totmaps_next,
> +	.stop	= m_totmaps_stop,
> +	.show	= totmaps_proc_show
> +};
> +
>  static const struct seq_operations proc_pid_smaps_op = {
>  	.start	= m_start,
>  	.next	= m_next,
> @@ -836,6 +927,49 @@ static int tid_smaps_open(struct inode *inode, struct file *file)
>  	return do_maps_open(inode, file, &proc_tid_smaps_op);
>  }
>
> +static int totmaps_open(struct inode *inode, struct file *file)
> +{
> +	struct proc_maps_private *priv = NULL;
> +	struct seq_file *seq;
> +	int ret;
> +
> +	ret = do_maps_open(inode, file, &proc_totmaps_op);
> +	if (ret)
> +		goto error;
> +
> +	/*
> +	 * We need to grab references to the task_struct
> +	 * at open time, because there's a potential information
> +	 * leak where the totmaps file is opened and held open
> +	 * while the underlying pid to task mapping changes
> +	 * underneath it
> +	 */
> +	seq = file->private_data;
> +	priv = seq->private;
> +	priv->task = get_proc_task(inode);
> +	if (!priv->task) {
> +		ret = -ESRCH;
> +		goto error_free;
> +	}
> +
> +	return 0;
> +
> +error_free:
> +	proc_map_release(inode, file);
> +error:
> +	return ret;
> +}
> +
> +static int totmaps_release(struct inode *inode, struct file *file)
> +{
> +	struct seq_file *seq = file->private_data;
> +	struct proc_maps_private *priv = seq->private;
> +
> +	put_task_struct(priv->task);
> +
> +	return proc_map_release(inode, file);
> +}
> +
>  const struct file_operations proc_pid_smaps_operations = {
>  	.open		= pid_smaps_open,
>  	.read		= seq_read,
> @@ -850,6 +984,13 @@ const struct file_operations proc_tid_smaps_operations = {
>  	.release	= proc_map_release,
>  };
>
> +const struct file_operations proc_totmaps_operations = {
> +	.open		= totmaps_open,
> +	.read		= seq_read,
> +	.llseek		= seq_lseek,
> +	.release	= totmaps_release,
> +};
> +
>  enum clear_refs_types {
>  	CLEAR_REFS_ALL = 1,
>  	CLEAR_REFS_ANON,
>

When reading totmaps of kernel processes the following NULL pointer
dereference occurs:

Unable to handle kernel NULL pointer dereference at virtual address 00000044
pgd = ee6e0000
[00000044] *pgd=7b83a831
Internal error: Oops: 17 [#6] PREEMPT SMP ARM
Modules linked in:
CPU: 2 PID: 1495 Comm: cat Tainted: G      D W 
4.8.0-rc2-00010-g22fe2db-dirty #159
Hardware name: SAMSUNG EXYNOS (Flattened Device Tree)
task: ee596e00 task.stack: ee470000
PC is at down_read+0xc/0x48
LR is at totmaps_proc_show+0x2c/0x1e8
pc : [<c06c01f4>]    lr : [<c022a154>]    psr: 40000013
sp : ee471db8  ip : 00000000  fp : 00000000
r10: edfe1080  r9 : 00000001  r8 : 00000044
r7 : ee4abd00  r6 : edfe1080  r5 : edde0b80  r4 : 00000044
r3 : 00000000  r2 : 00000000  r1 : ffffffc8  r0 : 00000044
Flags: nZcv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
Control: 10c5387d  Table: 6e6e004a  DAC: 00000051
Process cat (pid: 1495, stack limit = 0xee470210)
Stack: (0xee471db8 to 0xee472000)
1da0:                                                       00000000 
c022a154
1dc0: ee596e00 024200ca 00000000 024200ca 00000000 00000081 c0b02594 
024200ca
1de0: 00000055 ee5b7e44 00000800 c019cad0 00000000 c06c1af0 00000001 
c032aa90
1e00: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 
00000000
1e20: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 
00000000
1e40: 00000000 00000000 c0a69764 c0a69758 0000000b c01afd60 eff4d000 
eff4d000
1e60: edc55f20 00000000 edfe10b0 0001c000 20000013 c06bfc9c 7ab80c7f 
c01bc060
1e80: 00000002 ef001b80 c0a695d0 024000c0 00008000 ee471ec0 00008000 
edfe1080
1ea0: ee4abd00 00000001 00000001 ee471f80 00000000 c01fe564 0001c000 
edfe10b0
1ec0: 00000000 00000000 00024e84 ee5b7e00 ee5b7e44 c0705348 0001c000 
ee4abd00
1ee0: ee471f80 00008000 ee470000 0001c000 00000000 c01dc850 c0b06aac 
ee471fb0
1f00: b6fbf220 b6fbf7c4 000001ff c0101308 386d6a0e 32e4d737 386d6a0e 
32e4d737
1f20: 00002838 00000000 ee4abd00 bec0eba0 00000000 bec0ed84 ee596e00 
00000000
1f40: ee4abd00 00008000 0001c000 00000000 ee471f80 c01ddca0 00000004 
ee478124
1f60: 00000001 00000000 00000000 ee4abd00 ee4abd00 00008000 0001c000 
c01ddd64
1f80: 00000000 00000000 00000000 00008000 0001c000 00000003 00000003 
c0107ac4
1fa0: 00000000 c0107900 00008000 0001c000 00000003 0001c000 00008000 
0001c000
1fc0: 00008000 0001c000 00000003 00000003 00008000 00000000 0000005e 
00000000
1fe0: 00000000 bec0eb0c 0000c694 b6f4248c 60000010 00000003 fdfffffb 
ffffffff
[<c06c01f4>] (down_read) from [<c022a154>] (totmaps_proc_show+0x2c/0x1e8)
[<c022a154>] (totmaps_proc_show) from [<c01fe564>] (seq_read+0x1c8/0x4b8)
[<c01fe564>] (seq_read) from [<c01dc850>] (__vfs_read+0x2c/0x110)
[<c01dc850>] (__vfs_read) from [<c01ddca0>] (vfs_read+0x8c/0x110)
[<c01ddca0>] (vfs_read) from [<c01ddd64>] (SyS_read+0x40/0x8c)
[<c01ddd64>] (SyS_read) from [<c0107900>] (ret_fast_syscall+0x0/0x3c)

It seems that some protection is needed for such processes, so that
totmaps would return empty string then, like in case of smaps.

-- 
Best regards,
Jacek Anaszewski

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PACTH v4 1/3] mm, proc: Implement /proc/<pid>/totmaps
  2016-08-31  9:45   ` Jacek Anaszewski
@ 2016-08-31 16:36     ` Robert Foss
  2016-08-31 17:04       ` Mateusz Guzik
  0 siblings, 1 reply; 8+ messages in thread
From: Robert Foss @ 2016-08-31 16:36 UTC (permalink / raw)
  To: Jacek Anaszewski, corbet, akpm, vbabka, mhocko, koct9i, hughd,
	n-horiguchi, minchan, john.stultz, ross.zwisler, jmarchan,
	hannes, mingo, keescook, viro, gorcunov, mnfhuang, adobriyan,
	calvinowens, jdanis, jann, sonnyrao, kirill.shutemov, ldufour,
	linux-doc, linux-kernel, Ben Zhang, Bryan Freed,
	Filipe Brandenburger, Mateusz Guzik, Michal Hocko, linux-api



On 2016-08-31 05:45 AM, Jacek Anaszewski wrote:
> Hi Robert,
>
> On 08/17/2016 12:33 AM, robert.foss@collabora.com wrote:
>> From: Robert Foss <robert.foss@collabora.com>
>>
>> This is based on earlier work by Thiago Goncales. It implements a new
>> per process proc file which summarizes the contents of the smaps file
>> but doesn't display any addresses.  It gives more detailed information
>> than statm like the PSS (proprotional set size).  It differs from the
>> original implementation in that it doesn't use the full blown set of
>> seq operations, uses a different termination condition, and doesn't
>> displayed "Locked" as that was broken on the original implemenation.
>>
>> This new proc file provides information faster than parsing the
>> potentially
>> huge smaps file.
>>
>> Tested-by: Robert Foss <robert.foss@collabora.com>
>> Signed-off-by: Robert Foss <robert.foss@collabora.com>
>>
>> Signed-off-by: Sonny Rao <sonnyrao@chromium.org>
>> ---
>>  fs/proc/base.c     |   1 +
>>  fs/proc/internal.h |   2 +
>>  fs/proc/task_mmu.c | 141
>> +++++++++++++++++++++++++++++++++++++++++++++++++++++
>>  3 files changed, 144 insertions(+)
>>
>> diff --git a/fs/proc/base.c b/fs/proc/base.c
>> index a11eb71..de3acdf 100644
>> --- a/fs/proc/base.c
>> +++ b/fs/proc/base.c
>> @@ -2855,6 +2855,7 @@ static const struct pid_entry tgid_base_stuff[] = {
>>      REG("clear_refs", S_IWUSR, proc_clear_refs_operations),
>>      REG("smaps",      S_IRUGO, proc_pid_smaps_operations),
>>      REG("pagemap",    S_IRUSR, proc_pagemap_operations),
>> +    REG("totmaps",    S_IRUGO, proc_totmaps_operations),
>>  #endif
>>  #ifdef CONFIG_SECURITY
>>      DIR("attr",       S_IRUGO|S_IXUGO,
>> proc_attr_dir_inode_operations, proc_attr_dir_operations),
>> diff --git a/fs/proc/internal.h b/fs/proc/internal.h
>> index aa27810..99f97d7 100644
>> --- a/fs/proc/internal.h
>> +++ b/fs/proc/internal.h
>> @@ -297,6 +297,8 @@ extern const struct file_operations
>> proc_pid_smaps_operations;
>>  extern const struct file_operations proc_tid_smaps_operations;
>>  extern const struct file_operations proc_clear_refs_operations;
>>  extern const struct file_operations proc_pagemap_operations;
>> +extern const struct file_operations proc_totmaps_operations;
>> +
>>
>>  extern unsigned long task_vsize(struct mm_struct *);
>>  extern unsigned long task_statm(struct mm_struct *,
>> diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
>> index 4648c7f..fd8fd7f 100644
>> --- a/fs/proc/task_mmu.c
>> +++ b/fs/proc/task_mmu.c
>> @@ -802,6 +802,75 @@ static int show_smap(struct seq_file *m, void *v,
>> int is_pid)
>>      return 0;
>>  }
>>
>> +static void add_smaps_sum(struct mem_size_stats *mss,
>> +        struct mem_size_stats *mss_sum)
>> +{
>> +    mss_sum->resident += mss->resident;
>> +    mss_sum->pss += mss->pss;
>> +    mss_sum->shared_clean += mss->shared_clean;
>> +    mss_sum->shared_dirty += mss->shared_dirty;
>> +    mss_sum->private_clean += mss->private_clean;
>> +    mss_sum->private_dirty += mss->private_dirty;
>> +    mss_sum->referenced += mss->referenced;
>> +    mss_sum->anonymous += mss->anonymous;
>> +    mss_sum->anonymous_thp += mss->anonymous_thp;
>> +    mss_sum->swap += mss->swap;
>> +}
>> +
>> +static int totmaps_proc_show(struct seq_file *m, void *data)
>> +{
>> +    struct proc_maps_private *priv = m->private;
>> +    struct mm_struct *mm = priv->mm;
>> +    struct vm_area_struct *vma;
>> +    struct mem_size_stats mss_sum;
>> +
>> +    memset(&mss_sum, 0, sizeof(mss_sum));
>> +    down_read(&mm->mmap_sem);
>> +    hold_task_mempolicy(priv);
>> +
>> +    for (vma = mm->mmap; vma != priv->tail_vma; vma = vma->vm_next) {
>> +        struct mem_size_stats mss;
>> +        struct mm_walk smaps_walk = {
>> +            .pmd_entry = smaps_pte_range,
>> +            .mm = vma->vm_mm,
>> +            .private = &mss,
>> +        };
>> +
>> +        if (vma->vm_mm && !is_vm_hugetlb_page(vma)) {
>> +            memset(&mss, 0, sizeof(mss));
>> +            walk_page_vma(vma, &smaps_walk);
>> +            add_smaps_sum(&mss, &mss_sum);
>> +        }
>> +    }
>> +
>> +    release_task_mempolicy(priv);
>> +    up_read(&mm->mmap_sem);
>> +
>> +    seq_printf(m,
>> +           "Rss:            %8lu kB\n"
>> +           "Pss:            %8lu kB\n"
>> +           "Shared_Clean:   %8lu kB\n"
>> +           "Shared_Dirty:   %8lu kB\n"
>> +           "Private_Clean:  %8lu kB\n"
>> +           "Private_Dirty:  %8lu kB\n"
>> +           "Referenced:     %8lu kB\n"
>> +           "Anonymous:      %8lu kB\n"
>> +           "AnonHugePages:  %8lu kB\n"
>> +           "Swap:           %8lu kB\n",
>> +           mss_sum.resident >> 10,
>> +           (unsigned long)(mss_sum.pss >> (10 + PSS_SHIFT)),
>> +           mss_sum.shared_clean  >> 10,
>> +           mss_sum.shared_dirty  >> 10,
>> +           mss_sum.private_clean >> 10,
>> +           mss_sum.private_dirty >> 10,
>> +           mss_sum.referenced >> 10,
>> +           mss_sum.anonymous >> 10,
>> +           mss_sum.anonymous_thp >> 10,
>> +           mss_sum.swap >> 10);
>> +
>> +    return 0;
>> +}
>> +
>>  static int show_pid_smap(struct seq_file *m, void *v)
>>  {
>>      return show_smap(m, v, 1);
>> @@ -812,6 +881,28 @@ static int show_tid_smap(struct seq_file *m, void
>> *v)
>>      return show_smap(m, v, 0);
>>  }
>>
>> +static void *m_totmaps_start(struct seq_file *p, loff_t *pos)
>> +{
>> +    return NULL + (*pos == 0);
>> +}
>> +
>> +static void *m_totmaps_next(struct seq_file *p, void *v, loff_t *pos)
>> +{
>> +    ++*pos;
>> +    return NULL;
>> +}
>> +
>> +static void m_totmaps_stop(struct seq_file *p, void *v)
>> +{
>> +}
>> +
>> +static const struct seq_operations proc_totmaps_op = {
>> +    .start    = m_totmaps_start,
>> +    .next    = m_totmaps_next,
>> +    .stop    = m_totmaps_stop,
>> +    .show    = totmaps_proc_show
>> +};
>> +
>>  static const struct seq_operations proc_pid_smaps_op = {
>>      .start    = m_start,
>>      .next    = m_next,
>> @@ -836,6 +927,49 @@ static int tid_smaps_open(struct inode *inode,
>> struct file *file)
>>      return do_maps_open(inode, file, &proc_tid_smaps_op);
>>  }
>>
>> +static int totmaps_open(struct inode *inode, struct file *file)
>> +{
>> +    struct proc_maps_private *priv = NULL;
>> +    struct seq_file *seq;
>> +    int ret;
>> +
>> +    ret = do_maps_open(inode, file, &proc_totmaps_op);
>> +    if (ret)
>> +        goto error;
>> +
>> +    /*
>> +     * We need to grab references to the task_struct
>> +     * at open time, because there's a potential information
>> +     * leak where the totmaps file is opened and held open
>> +     * while the underlying pid to task mapping changes
>> +     * underneath it
>> +     */
>> +    seq = file->private_data;
>> +    priv = seq->private;
>> +    priv->task = get_proc_task(inode);
>> +    if (!priv->task) {
>> +        ret = -ESRCH;
>> +        goto error_free;
>> +    }
>> +
>> +    return 0;
>> +
>> +error_free:
>> +    proc_map_release(inode, file);
>> +error:
>> +    return ret;
>> +}
>> +
>> +static int totmaps_release(struct inode *inode, struct file *file)
>> +{
>> +    struct seq_file *seq = file->private_data;
>> +    struct proc_maps_private *priv = seq->private;
>> +
>> +    put_task_struct(priv->task);
>> +
>> +    return proc_map_release(inode, file);
>> +}
>> +
>>  const struct file_operations proc_pid_smaps_operations = {
>>      .open        = pid_smaps_open,
>>      .read        = seq_read,
>> @@ -850,6 +984,13 @@ const struct file_operations
>> proc_tid_smaps_operations = {
>>      .release    = proc_map_release,
>>  };
>>
>> +const struct file_operations proc_totmaps_operations = {
>> +    .open        = totmaps_open,
>> +    .read        = seq_read,
>> +    .llseek        = seq_lseek,
>> +    .release    = totmaps_release,
>> +};
>> +
>>  enum clear_refs_types {
>>      CLEAR_REFS_ALL = 1,
>>      CLEAR_REFS_ANON,
>>
>
> When reading totmaps of kernel processes the following NULL pointer
> dereference occurs:
>
> Unable to handle kernel NULL pointer dereference at virtual address
> 00000044
> pgd = ee6e0000
> [00000044] *pgd=7b83a831
> Internal error: Oops: 17 [#6] PREEMPT SMP ARM
> Modules linked in:
> CPU: 2 PID: 1495 Comm: cat Tainted: G      D W
> 4.8.0-rc2-00010-g22fe2db-dirty #159
> Hardware name: SAMSUNG EXYNOS (Flattened Device Tree)
> task: ee596e00 task.stack: ee470000
> PC is at down_read+0xc/0x48
> LR is at totmaps_proc_show+0x2c/0x1e8
> pc : [<c06c01f4>]    lr : [<c022a154>]    psr: 40000013
> sp : ee471db8  ip : 00000000  fp : 00000000
> r10: edfe1080  r9 : 00000001  r8 : 00000044
> r7 : ee4abd00  r6 : edfe1080  r5 : edde0b80  r4 : 00000044
> r3 : 00000000  r2 : 00000000  r1 : ffffffc8  r0 : 00000044
> Flags: nZcv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
> Control: 10c5387d  Table: 6e6e004a  DAC: 00000051
> Process cat (pid: 1495, stack limit = 0xee470210)
> Stack: (0xee471db8 to 0xee472000)
> 1da0:                                                       00000000
> c022a154
> 1dc0: ee596e00 024200ca 00000000 024200ca 00000000 00000081 c0b02594
> 024200ca
> 1de0: 00000055 ee5b7e44 00000800 c019cad0 00000000 c06c1af0 00000001
> c032aa90
> 1e00: 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> 00000000
> 1e20: 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> 00000000
> 1e40: 00000000 00000000 c0a69764 c0a69758 0000000b c01afd60 eff4d000
> eff4d000
> 1e60: edc55f20 00000000 edfe10b0 0001c000 20000013 c06bfc9c 7ab80c7f
> c01bc060
> 1e80: 00000002 ef001b80 c0a695d0 024000c0 00008000 ee471ec0 00008000
> edfe1080
> 1ea0: ee4abd00 00000001 00000001 ee471f80 00000000 c01fe564 0001c000
> edfe10b0
> 1ec0: 00000000 00000000 00024e84 ee5b7e00 ee5b7e44 c0705348 0001c000
> ee4abd00
> 1ee0: ee471f80 00008000 ee470000 0001c000 00000000 c01dc850 c0b06aac
> ee471fb0
> 1f00: b6fbf220 b6fbf7c4 000001ff c0101308 386d6a0e 32e4d737 386d6a0e
> 32e4d737
> 1f20: 00002838 00000000 ee4abd00 bec0eba0 00000000 bec0ed84 ee596e00
> 00000000
> 1f40: ee4abd00 00008000 0001c000 00000000 ee471f80 c01ddca0 00000004
> ee478124
> 1f60: 00000001 00000000 00000000 ee4abd00 ee4abd00 00008000 0001c000
> c01ddd64
> 1f80: 00000000 00000000 00000000 00008000 0001c000 00000003 00000003
> c0107ac4
> 1fa0: 00000000 c0107900 00008000 0001c000 00000003 0001c000 00008000
> 0001c000
> 1fc0: 00008000 0001c000 00000003 00000003 00008000 00000000 0000005e
> 00000000
> 1fe0: 00000000 bec0eb0c 0000c694 b6f4248c 60000010 00000003 fdfffffb
> ffffffff
> [<c06c01f4>] (down_read) from [<c022a154>] (totmaps_proc_show+0x2c/0x1e8)
> [<c022a154>] (totmaps_proc_show) from [<c01fe564>] (seq_read+0x1c8/0x4b8)
> [<c01fe564>] (seq_read) from [<c01dc850>] (__vfs_read+0x2c/0x110)
> [<c01dc850>] (__vfs_read) from [<c01ddca0>] (vfs_read+0x8c/0x110)
> [<c01ddca0>] (vfs_read) from [<c01ddd64>] (SyS_read+0x40/0x8c)
> [<c01ddd64>] (SyS_read) from [<c0107900>] (ret_fast_syscall+0x0/0x3c)
>
> It seems that some protection is needed for such processes, so that
> totmaps would return empty string then, like in case of smaps.
>

Thanks for the testing Jacek!

I had a look around the corresponding smaps code, but I'm not seeing any 
checks, do you know where that check actually is made?


Rob.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PACTH v4 1/3] mm, proc: Implement /proc/<pid>/totmaps
  2016-08-31 16:36     ` Robert Foss
@ 2016-08-31 17:04       ` Mateusz Guzik
  2016-09-01 23:42         ` Robert Foss
  0 siblings, 1 reply; 8+ messages in thread
From: Mateusz Guzik @ 2016-08-31 17:04 UTC (permalink / raw)
  To: Robert Foss
  Cc: Jacek Anaszewski, corbet, akpm, vbabka, mhocko, koct9i, hughd,
	n-horiguchi, minchan, john.stultz, ross.zwisler, jmarchan,
	hannes, mingo, keescook, viro, gorcunov, mnfhuang, adobriyan,
	calvinowens, jdanis, jann, sonnyrao, kirill.shutemov, ldufour,
	linux-doc, linux-kernel, Ben Zhang, Bryan Freed,
	Filipe Brandenburger, Michal Hocko, linux-api

On Wed, Aug 31, 2016 at 12:36:26PM -0400, Robert Foss wrote:
> On 2016-08-31 05:45 AM, Jacek Anaszewski wrote:
> > > +static void *m_totmaps_start(struct seq_file *p, loff_t *pos)
> > > +{
> > > +    return NULL + (*pos == 0);
> > > +}
> > > +
> > > +static void *m_totmaps_next(struct seq_file *p, void *v, loff_t *pos)
> > > +{
> > > +    ++*pos;
> > > +    return NULL;
> > > +}
> > > +
> > 
> > When reading totmaps of kernel processes the following NULL pointer
> > dereference occurs:
> > 
> > Unable to handle kernel NULL pointer dereference at virtual address
> > 00000044
> > [<c06c01f4>] (down_read) from [<c022a154>] (totmaps_proc_show+0x2c/0x1e8)
> > [<c022a154>] (totmaps_proc_show) from [<c01fe564>] (seq_read+0x1c8/0x4b8)
> > [<c01fe564>] (seq_read) from [<c01dc850>] (__vfs_read+0x2c/0x110)
> > [<c01dc850>] (__vfs_read) from [<c01ddca0>] (vfs_read+0x8c/0x110)
> > [<c01ddca0>] (vfs_read) from [<c01ddd64>] (SyS_read+0x40/0x8c)
> > [<c01ddd64>] (SyS_read) from [<c0107900>] (ret_fast_syscall+0x0/0x3c)
> > 
> > It seems that some protection is needed for such processes, so that
> > totmaps would return empty string then, like in case of smaps.
> > 
> 
> Thanks for the testing Jacek!
> 
> I had a look around the corresponding smaps code, but I'm not seeing any
> checks, do you know where that check actually is made?
> 

See m_start in f/sproc/task_mmu.c. It not only check for non-null mm,
but also tries to bump ->mm_users and only then proceeds to walk the mm.

-- 
Mateusz Guzik

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PACTH v4 1/3] mm, proc: Implement /proc/<pid>/totmaps
  2016-08-31 17:04       ` Mateusz Guzik
@ 2016-09-01 23:42         ` Robert Foss
  0 siblings, 0 replies; 8+ messages in thread
From: Robert Foss @ 2016-09-01 23:42 UTC (permalink / raw)
  To: Mateusz Guzik
  Cc: Jacek Anaszewski, corbet, akpm, vbabka, mhocko, koct9i, hughd,
	n-horiguchi, minchan, john.stultz, ross.zwisler, jmarchan,
	hannes, mingo, keescook, viro, gorcunov, mnfhuang, adobriyan,
	calvinowens, jdanis, jann, sonnyrao, kirill.shutemov, ldufour,
	linux-doc, linux-kernel, Ben Zhang, Bryan Freed,
	Filipe Brandenburger, Michal Hocko, linux-api



On 2016-08-31 01:04 PM, Mateusz Guzik wrote:
> On Wed, Aug 31, 2016 at 12:36:26PM -0400, Robert Foss wrote:
>> On 2016-08-31 05:45 AM, Jacek Anaszewski wrote:
>>>> +static void *m_totmaps_start(struct seq_file *p, loff_t *pos)
>>>> +{
>>>> +    return NULL + (*pos == 0);
>>>> +}
>>>> +
>>>> +static void *m_totmaps_next(struct seq_file *p, void *v, loff_t *pos)
>>>> +{
>>>> +    ++*pos;
>>>> +    return NULL;
>>>> +}
>>>> +
>>>
>>> When reading totmaps of kernel processes the following NULL pointer
>>> dereference occurs:
>>>
>>> Unable to handle kernel NULL pointer dereference at virtual address
>>> 00000044
>>> [<c06c01f4>] (down_read) from [<c022a154>] (totmaps_proc_show+0x2c/0x1e8)
>>> [<c022a154>] (totmaps_proc_show) from [<c01fe564>] (seq_read+0x1c8/0x4b8)
>>> [<c01fe564>] (seq_read) from [<c01dc850>] (__vfs_read+0x2c/0x110)
>>> [<c01dc850>] (__vfs_read) from [<c01ddca0>] (vfs_read+0x8c/0x110)
>>> [<c01ddca0>] (vfs_read) from [<c01ddd64>] (SyS_read+0x40/0x8c)
>>> [<c01ddd64>] (SyS_read) from [<c0107900>] (ret_fast_syscall+0x0/0x3c)
>>>
>>> It seems that some protection is needed for such processes, so that
>>> totmaps would return empty string then, like in case of smaps.
>>>
>>
>> Thanks for the testing Jacek!
>>
>> I had a look around the corresponding smaps code, but I'm not seeing any
>> checks, do you know where that check actually is made?
>>
>
> See m_start in f/sproc/task_mmu.c. It not only check for non-null mm,
> but also tries to bump ->mm_users and only then proceeds to walk the mm.

So a m_totmaps_start that looks something like the below would be 
enough? And if so, would mm->mm_users need to be decrement inside of
m_totmaps_start?

static void *m_totmaps_start(struct seq_file *p, loff_t *pos)
{
	struct proc_maps_private *priv = m->private;
	struct mm_struct *mm;

	mm = priv->mm;
	if (!mm || !atomic_inc_not_zero(&mm->mm_users))
		return NULL;

	return NULL + (*pos == 0);
}

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2016-09-01 23:43 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-08-16 22:33 [PACTH v4 0/3] Implement /proc/<pid>/totmaps robert.foss
2016-08-16 22:33 ` [PACTH v4 1/3] mm, proc: " robert.foss
2016-08-31  9:45   ` Jacek Anaszewski
2016-08-31 16:36     ` Robert Foss
2016-08-31 17:04       ` Mateusz Guzik
2016-09-01 23:42         ` Robert Foss
2016-08-16 22:33 ` [PACTH v4 2/3] Documentation/filesystems: Fixed typo robert.foss
2016-08-16 22:33 ` [PACTH v4 3/3] Documentation/filesystems: Added /proc/PID/totmaps documentation robert.foss

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).