From: Konstantin Khlebnikov <koct9i@gmail.com>
To: robert.foss@collabora.com
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Kees Cook <keescook@chromium.org>,
	Al Viro <viro@zeniv.linux.org.uk>,
	Cyrill Gorcunov <gorcunov@openvz.org>,
	John Stultz <john.stultz@linaro.org>,
	plaguedbypenguins@gmail.com, sonnyrao@chromium.org,
	mguzik@redhat.com, Alexey Dobriyan <adobriyan@gmail.com>,
	jdanis@google.com, calvinowens@fb.com, Jann Horn <jann@thejh.net>,
	Michal Hocko <mhocko@suse.com>, Vlastimil Babka <vbabka@suse.cz>,
	Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>,
	"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
	ldufour@linux.vnet.ibm.com, Johannes Weiner <hannes@cmpxchg.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Ben Zhang <benzh@chromium.org>, Bryan Freed <bfreed@chromium.org>,
	Filipe Brandenburger <filbranden@chromium.org>
Subject: Re: [PACTH v1] mm, proc: Implement /proc/<pid>/totmaps
Date: Tue, 9 Aug 2016 22:16:38 +0300
Message-ID: <CALYGNiPjJVFWjq9C50ScBKP==ZfeYTPg-veZ9irWfCKWYFJihA@mail.gmail.com>
In-Reply-To: <1470758743-17685-1-git-send-email-robert.foss@collabora.com>

On Tue, Aug 9, 2016 at 7:05 PM,  <robert.foss@collabora.com> wrote:
> From: Sonny Rao <sonnyrao@chromium.org>
>
> This is based on earlier work by Thiago Goncales. It implements a new
> per-process proc file which summarizes the contents of the smaps file
> but doesn't display any addresses.  It gives more detailed information
> than statm, such as the PSS (proportional set size).  It differs from the
> original implementation in that it doesn't use the full-blown set of
> seq operations, uses a different termination condition, and doesn't
> display "Locked" as that was broken in the original implementation.
>
> This new proc file provides information faster than parsing the potentially
> huge smaps file.
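
For comparison, the userspace approach that the patch wants to avoid
(reading every line of smaps and adding up the per-mapping fields) might
look roughly like the sketch below. It is not code from the patch:
sum_smaps() and struct smaps_totals are hypothetical names, and only
three of the fields are summed to keep it short.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical container for a few summed smaps fields, in kB. */
struct smaps_totals {
	unsigned long rss_kb;
	unsigned long pss_kb;
	unsigned long swap_kb;
};

/*
 * Read smaps-formatted text from f and add up the "Rss", "Pss" and
 * "Swap" lines of every mapping. Returns 0 on success, -1 on a read
 * error. Lines that are not "Key: <number> kB" (or whose key is not
 * one of the three above) are skipped.
 */
static int sum_smaps(FILE *f, struct smaps_totals *out)
{
	char line[512];
	char key[64];
	unsigned long val;

	memset(out, 0, sizeof(*out));
	while (fgets(line, sizeof(line), f)) {
		if (sscanf(line, "%63[^:]: %lu kB", key, &val) != 2)
			continue;
		if (strcmp(key, "Rss") == 0)
			out->rss_kb += val;
		else if (strcmp(key, "Pss") == 0)
			out->pss_kb += val;
		else if (strcmp(key, "Swap") == 0)
			out->swap_kb += val;
	}
	return ferror(f) ? -1 : 0;
}
```

A caller would fopen() "/proc/<pid>/smaps", pass the stream to
sum_smaps() and print the totals. The cost is proportional to the number
of mappings, which is the overhead the commit message refers to.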

What statistics do you really need?

I think the performance and flexibility issues can really be solved only
by a new syscall for querying memory statistics for an address range in
any process: process_vm_stat(), or some kind of pumped-up fincore() for
/proc/$pid/mem.
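
To make the idea of a per-range query concrete: the closest existing
call is mincore(2), which reports page residency for an address range,
but only in the calling process and without PSS or sharing information.
The sketch below shows that kind of query. count_resident_pages() is a
hypothetical helper name, and this is only an analogy, not the proposed
interface.

```c
#define _DEFAULT_SOURCE
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Count the resident pages in [addr, addr + len) of the calling
 * process using mincore(2). addr must be page-aligned.
 * Returns the number of resident pages, or -1 on error.
 */
static long count_resident_pages(void *addr, size_t len)
{
	long page = sysconf(_SC_PAGESIZE);
	size_t npages, i;
	unsigned char *vec;
	long resident = 0;

	if (page <= 0 || len == 0)
		return -1;

	npages = (len + (size_t)page - 1) / (size_t)page;
	vec = malloc(npages);
	if (!vec)
		return -1;

	if (mincore(addr, len, vec) != 0) {
		free(vec);
		return -1;
	}

	/* Bit 0 of each vector byte is set if the page is resident. */
	for (i = 0; i < npages; i++)
		resident += vec[i] & 1;

	free(vec);
	return resident;
}
```

A syscall along the lines suggested above would presumably extend this
in two directions: accept a target process, and return richer
statistics than a residency bit per page.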

>
> Signed-off-by: Sonny Rao <sonnyrao@chromium.org>
>
> Tested-by: Robert Foss <robert.foss@collabora.com>
> Signed-off-by: Robert Foss <robert.foss@collabora.com>
>
> ---
>  fs/proc/base.c     |   1 +
>  fs/proc/internal.h |   4 ++
>  fs/proc/task_mmu.c | 126 +++++++++++++++++++++++++++++++++++++++++++++++++++++
>  3 files changed, 131 insertions(+)
>
> diff --git a/fs/proc/base.c b/fs/proc/base.c
> index a11eb71..de3acdf 100644
> --- a/fs/proc/base.c
> +++ b/fs/proc/base.c
> @@ -2855,6 +2855,7 @@ static const struct pid_entry tgid_base_stuff[] = {
>         REG("clear_refs", S_IWUSR, proc_clear_refs_operations),
>         REG("smaps",      S_IRUGO, proc_pid_smaps_operations),
>         REG("pagemap",    S_IRUSR, proc_pagemap_operations),
> +       REG("totmaps",    S_IRUGO, proc_totmaps_operations),
>  #endif
>  #ifdef CONFIG_SECURITY
>         DIR("attr",       S_IRUGO|S_IXUGO, proc_attr_dir_inode_operations, proc_attr_dir_operations),
> diff --git a/fs/proc/internal.h b/fs/proc/internal.h
> index aa27810..6f3540f 100644
> --- a/fs/proc/internal.h
> +++ b/fs/proc/internal.h
> @@ -58,6 +58,9 @@ union proc_op {
>                 struct task_struct *task);
>  };
>
> +
> +extern const struct file_operations proc_totmaps_operations;
> +
>  struct proc_inode {
>         struct pid *pid;
>         int fd;
> @@ -281,6 +284,7 @@ struct proc_maps_private {
>         struct mm_struct *mm;
>  #ifdef CONFIG_MMU
>         struct vm_area_struct *tail_vma;
> +       struct mem_size_stats *mss;
>  #endif
>  #ifdef CONFIG_NUMA
>         struct mempolicy *task_mempolicy;
> diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
> index 4648c7f..b61873e 100644
> --- a/fs/proc/task_mmu.c
> +++ b/fs/proc/task_mmu.c
> @@ -802,6 +802,81 @@ static int show_smap(struct seq_file *m, void *v, int is_pid)
>         return 0;
>  }
>
> +static void add_smaps_sum(struct mem_size_stats *mss,
> +               struct mem_size_stats *mss_sum)
> +{
> +       mss_sum->resident += mss->resident;
> +       mss_sum->pss += mss->pss;
> +       mss_sum->shared_clean += mss->shared_clean;
> +       mss_sum->shared_dirty += mss->shared_dirty;
> +       mss_sum->private_clean += mss->private_clean;
> +       mss_sum->private_dirty += mss->private_dirty;
> +       mss_sum->referenced += mss->referenced;
> +       mss_sum->anonymous += mss->anonymous;
> +       mss_sum->anonymous_thp += mss->anonymous_thp;
> +       mss_sum->swap += mss->swap;
> +}
> +
> +static int totmaps_proc_show(struct seq_file *m, void *data)
> +{
> +       struct proc_maps_private *priv = m->private;
> +       struct mm_struct *mm;
> +       struct vm_area_struct *vma;
> +       struct mem_size_stats *mss_sum = priv->mss;
> +
> +       /* reference to priv->task already taken */
> +       /* but need to get the mm here because */
> +       /* task could be in the process of exiting */
> +       mm = get_task_mm(priv->task);
> +       if (!mm || IS_ERR(mm))
> +               return -EINVAL;
> +
> +       down_read(&mm->mmap_sem);
> +       hold_task_mempolicy(priv);
> +
> +       for (vma = mm->mmap; vma != priv->tail_vma; vma = vma->vm_next) {
> +               struct mem_size_stats mss;
> +               struct mm_walk smaps_walk = {
> +                       .pmd_entry = smaps_pte_range,
> +                       .mm = vma->vm_mm,
> +                       .private = &mss,
> +               };
> +
> +               if (vma->vm_mm && !is_vm_hugetlb_page(vma)) {
> +                       memset(&mss, 0, sizeof(mss));
> +                       walk_page_vma(vma, &smaps_walk);
> +                       add_smaps_sum(&mss, mss_sum);
> +               }
> +       }
> +       seq_printf(m,
> +                  "Rss:            %8lu kB\n"
> +                  "Pss:            %8lu kB\n"
> +                  "Shared_Clean:   %8lu kB\n"
> +                  "Shared_Dirty:   %8lu kB\n"
> +                  "Private_Clean:  %8lu kB\n"
> +                  "Private_Dirty:  %8lu kB\n"
> +                  "Referenced:     %8lu kB\n"
> +                  "Anonymous:      %8lu kB\n"
> +                  "AnonHugePages:  %8lu kB\n"
> +                  "Swap:           %8lu kB\n",
> +                  mss_sum->resident >> 10,
> +                  (unsigned long)(mss_sum->pss >> (10 + PSS_SHIFT)),
> +                  mss_sum->shared_clean  >> 10,
> +                  mss_sum->shared_dirty  >> 10,
> +                  mss_sum->private_clean >> 10,
> +                  mss_sum->private_dirty >> 10,
> +                  mss_sum->referenced >> 10,
> +                  mss_sum->anonymous >> 10,
> +                  mss_sum->anonymous_thp >> 10,
> +                  mss_sum->swap >> 10);
> +
> +       release_task_mempolicy(priv);
> +       up_read(&mm->mmap_sem);
> +       mmput(mm);
> +
> +       return 0;
> +}
> +
>  static int show_pid_smap(struct seq_file *m, void *v)
>  {
>         return show_smap(m, v, 1);
> @@ -836,6 +911,50 @@ static int tid_smaps_open(struct inode *inode, struct file *file)
>         return do_maps_open(inode, file, &proc_tid_smaps_op);
>  }
>
> +static int totmaps_open(struct inode *inode, struct file *file)
> +{
> +       struct proc_maps_private *priv;
> +       int ret = -ENOMEM;
> +       priv = kzalloc(sizeof(*priv), GFP_KERNEL);
> +       if (priv) {
> +               priv->mss = kzalloc(sizeof(*priv->mss), GFP_KERNEL);
> +               if (!priv->mss)
> +                       return -ENOMEM;
> +
> +               /* we need to grab references to the task_struct */
> +               /* at open time, because there's a potential information */
> +               /* leak where the totmaps file is opened and held open */
> +               /* while the underlying pid to task mapping changes */
> +               /* underneath it */
> +               priv->task = get_pid_task(proc_pid(inode), PIDTYPE_PID);
> +               if (!priv->task) {
> +                       kfree(priv->mss);
> +                       kfree(priv);
> +                       return -ESRCH;
> +               }
> +
> +               ret = single_open(file, totmaps_proc_show, priv);
> +               if (ret) {
> +                       put_task_struct(priv->task);
> +                       kfree(priv->mss);
> +                       kfree(priv);
> +               }
> +       }
> +       return ret;
> +}
> +
> +static int totmaps_release(struct inode *inode, struct file *file)
> +{
> +       struct seq_file *m = file->private_data;
> +       struct proc_maps_private *priv = m->private;
> +
> +       put_task_struct(priv->task);
> +       kfree(priv->mss);
> +       kfree(priv);
> +       m->private = NULL;
> +       return single_release(inode, file);
> +}
> +
>  const struct file_operations proc_pid_smaps_operations = {
>         .open           = pid_smaps_open,
>         .read           = seq_read,
> @@ -850,6 +969,13 @@ const struct file_operations proc_tid_smaps_operations = {
>         .release        = proc_map_release,
>  };
>
> +const struct file_operations proc_totmaps_operations = {
> +       .open           = totmaps_open,
> +       .read           = seq_read,
> +       .llseek         = seq_lseek,
> +       .release        = totmaps_release,
> +};
> +
>  enum clear_refs_types {
>         CLEAR_REFS_ALL = 1,
>         CLEAR_REFS_ANON,
> --
> 2.7.4
>
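
The Pss line in the quoted totmaps_proc_show() is shifted by
(10 + PSS_SHIFT) because pss is kept as a fixed-point value: each page
contributes its size divided by its map count, and the extra fractional
bits stop small shares from rounding to zero before they are summed. A
minimal userspace sketch of that arithmetic follows. PSS_SHIFT is
assumed to be 12 here, as in fs/proc/task_mmu.c of that era, and
pss_account() and pss_to_kb() are hypothetical helper names.

```c
#include <stdint.h>

#define PSS_SHIFT 12	/* assumed number of fractional bits */

/*
 * Add one page's proportional share to the fixed-point accumulator.
 * size is the page size in bytes, mapcount the number of mappers
 * (must be non-zero).
 */
static void pss_account(uint64_t *pss, unsigned long size,
			unsigned int mapcount)
{
	*pss += ((uint64_t)size << PSS_SHIFT) / mapcount;
}

/* Convert the accumulator to whole kB, as the seq_printf() above does. */
static unsigned long pss_to_kb(uint64_t pss)
{
	return (unsigned long)(pss >> (10 + PSS_SHIFT));
}
```

With a plain integer division, a 4096-byte page shared by 5000 mappers
would contribute 0 bytes; with the shifted accumulator the fraction
survives until the final conversion to kB.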
