All of lore.kernel.org
 help / color / mirror / Atom feed
From: Sonny Rao <sonnyrao@chromium.org>
To: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Robert Foss <robert.foss@collabora.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Kees Cook <keescook@chromium.org>,
	Al Viro <viro@zeniv.linux.org.uk>,
	Cyrill Gorcunov <gorcunov@openvz.org>,
	John Stultz <john.stultz@linaro.org>,
	Robin Humble <plaguedbypenguins@gmail.com>,
	Mateusz Guzik <mguzik@redhat.com>,
	Alexey Dobriyan <adobriyan@gmail.com>,
	Janis Danisevskis <jdanis@google.com>,
	calvinowens@fb.com, Jann Horn <jann@thejh.net>,
	Michal Hocko <mhocko@suse.com>, Vlastimil Babka <vbabka@suse.cz>,
	Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>,
	"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
	ldufour@linux.vnet.ibm.com, Johannes Weiner <hannes@cmpxchg.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Ben Zhang <benzh@chromium.org>, Bryan Freed <bfreed@chromium.org>,
	Filipe Brandenburger <filbranden@chromium.org>
Subject: Re: [PACTH v1] mm, proc: Implement /proc/<pid>/totmaps
Date: Tue, 9 Aug 2016 17:30:29 -0700	[thread overview]
Message-ID: <CAPz6YkW-YkOVMQr_9ZUN9_cogM453sQ5_9EoGkofS_MtLypZ9A@mail.gmail.com> (raw)
In-Reply-To: <CALYGNiPjJVFWjq9C50ScBKP==ZfeYTPg-veZ9irWfCKWYFJihA@mail.gmail.com>

On Tue, Aug 9, 2016 at 12:16 PM, Konstantin Khlebnikov <koct9i@gmail.com> wrote:
>
> On Tue, Aug 9, 2016 at 7:05 PM,  <robert.foss@collabora.com> wrote:
> > From: Sonny Rao <sonnyrao@chromium.org>
> >
> > This is based on earlier work by Thiago Goncales. It implements a new
> > per process proc file which summarizes the contents of the smaps file
> > but doesn't display any addresses.  It gives more detailed information
> > than statm like the PSS (proprotional set size).  It differs from the
> > original implementation in that it doesn't use the full blown set of
> > seq operations, uses a different termination condition, and doesn't
> > displayed "Locked" as that was broken on the original implemenation.
> >
> > This new proc file provides information faster than parsing the potentially
> > huge smaps file.
>
> What statistics do you really need?

PSS (Proportional Set Size) and related accounting of shared pages
(swap could be shared) is where the existing summaries of memory usage
are cumbersome.

>
>
> I think, performance and flexibility issues could be really solved only by new
> syscall for querying memory statistics for address range in any process:
> process_vm_stat() or some kind of pumped fincore() for /proc/$pid/mem


That would be a good long term solution if people want similarly
complicated statistics without having to iterate through current
interfaces.
I mentioned monitoring before but I'll add that Proportional Set size,
Unique Set Size, Swap are per process are also useful because they
help us make better decisions about what processes need to be
throttled or gracefully killed.

>
> >
> > Signed-off-by: Sonny Rao <sonnyrao@chromium.org>
> >
> > Tested-by: Robert Foss <robert.foss@collabora.com>
> > Signed-off-by: Robert Foss <robert.foss@collabora.com>
> >
> > ---
> >  fs/proc/base.c     |   1 +
> >  fs/proc/internal.h |   4 ++
> >  fs/proc/task_mmu.c | 126 +++++++++++++++++++++++++++++++++++++++++++++++++++++
> >  3 files changed, 131 insertions(+)
> >
> > diff --git a/fs/proc/base.c b/fs/proc/base.c
> > index a11eb71..de3acdf 100644
> > --- a/fs/proc/base.c
> > +++ b/fs/proc/base.c
> > @@ -2855,6 +2855,7 @@ static const struct pid_entry tgid_base_stuff[] = {
> >         REG("clear_refs", S_IWUSR, proc_clear_refs_operations),
> >         REG("smaps",      S_IRUGO, proc_pid_smaps_operations),
> >         REG("pagemap",    S_IRUSR, proc_pagemap_operations),
> > +       REG("totmaps",    S_IRUGO, proc_totmaps_operations),
> >  #endif
> >  #ifdef CONFIG_SECURITY
> >         DIR("attr",       S_IRUGO|S_IXUGO, proc_attr_dir_inode_operations, proc_attr_dir_operations),
> > diff --git a/fs/proc/internal.h b/fs/proc/internal.h
> > index aa27810..6f3540f 100644
> > --- a/fs/proc/internal.h
> > +++ b/fs/proc/internal.h
> > @@ -58,6 +58,9 @@ union proc_op {
> >                 struct task_struct *task);
> >  };
> >
> > +
> > +extern const struct file_operations proc_totmaps_operations;
> > +
> >  struct proc_inode {
> >         struct pid *pid;
> >         int fd;
> > @@ -281,6 +284,7 @@ struct proc_maps_private {
> >         struct mm_struct *mm;
> >  #ifdef CONFIG_MMU
> >         struct vm_area_struct *tail_vma;
> > +       struct mem_size_stats *mss;
> >  #endif
> >  #ifdef CONFIG_NUMA
> >         struct mempolicy *task_mempolicy;
> > diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
> > index 4648c7f..b61873e 100644
> > --- a/fs/proc/task_mmu.c
> > +++ b/fs/proc/task_mmu.c
> > @@ -802,6 +802,81 @@ static int show_smap(struct seq_file *m, void *v, int is_pid)
> >         return 0;
> >  }
> >
> > +static void add_smaps_sum(struct mem_size_stats *mss,
> > +               struct mem_size_stats *mss_sum)
> > +{
> > +       mss_sum->resident += mss->resident;
> > +       mss_sum->pss += mss->pss;
> > +       mss_sum->shared_clean += mss->shared_clean;
> > +       mss_sum->shared_dirty += mss->shared_dirty;
> > +       mss_sum->private_clean += mss->private_clean;
> > +       mss_sum->private_dirty += mss->private_dirty;
> > +       mss_sum->referenced += mss->referenced;
> > +       mss_sum->anonymous += mss->anonymous;
> > +       mss_sum->anonymous_thp += mss->anonymous_thp;
> > +       mss_sum->swap += mss->swap;
> > +}
> > +
> > +static int totmaps_proc_show(struct seq_file *m, void *data)
> > +{
> > +       struct proc_maps_private *priv = m->private;
> > +       struct mm_struct *mm;
> > +       struct vm_area_struct *vma;
> > +       struct mem_size_stats *mss_sum = priv->mss;
> > +
> > +       /* reference to priv->task already taken */
> > +       /* but need to get the mm here because */
> > +       /* task could be in the process of exiting */
> > +       mm = get_task_mm(priv->task);
> > +       if (!mm || IS_ERR(mm))
> > +               return -EINVAL;
> > +
> > +       down_read(&mm->mmap_sem);
> > +       hold_task_mempolicy(priv);
> > +
> > +       for (vma = mm->mmap; vma != priv->tail_vma; vma = vma->vm_next) {
> > +               struct mem_size_stats mss;
> > +               struct mm_walk smaps_walk = {
> > +                       .pmd_entry = smaps_pte_range,
> > +                       .mm = vma->vm_mm,
> > +                       .private = &mss,
> > +               };
> > +
> > +               if (vma->vm_mm && !is_vm_hugetlb_page(vma)) {
> > +                       memset(&mss, 0, sizeof(mss));
> > +                       walk_page_vma(vma, &smaps_walk);
> > +                       add_smaps_sum(&mss, mss_sum);
> > +               }
> > +       }
> > +       seq_printf(m,
> > +                  "Rss:            %8lu kB\n"
> > +                  "Pss:            %8lu kB\n"
> > +                  "Shared_Clean:   %8lu kB\n"
> > +                  "Shared_Dirty:   %8lu kB\n"
> > +                  "Private_Clean:  %8lu kB\n"
> > +                  "Private_Dirty:  %8lu kB\n"
> > +                  "Referenced:     %8lu kB\n"
> > +                  "Anonymous:      %8lu kB\n"
> > +                  "AnonHugePages:  %8lu kB\n"
> > +                  "Swap:           %8lu kB\n",
> > +                  mss_sum->resident >> 10,
> > +                  (unsigned long)(mss_sum->pss >> (10 + PSS_SHIFT)),
> > +                  mss_sum->shared_clean  >> 10,
> > +                  mss_sum->shared_dirty  >> 10,
> > +                  mss_sum->private_clean >> 10,
> > +                  mss_sum->private_dirty >> 10,
> > +                  mss_sum->referenced >> 10,
> > +                  mss_sum->anonymous >> 10,
> > +                  mss_sum->anonymous_thp >> 10,
> > +                  mss_sum->swap >> 10);
> > +
> > +       release_task_mempolicy(priv);
> > +       up_read(&mm->mmap_sem);
> > +       mmput(mm);
> > +
> > +       return 0;
> > +}
> > +
> >  static int show_pid_smap(struct seq_file *m, void *v)
> >  {
> >         return show_smap(m, v, 1);
> > @@ -836,6 +911,50 @@ static int tid_smaps_open(struct inode *inode, struct file *file)
> >         return do_maps_open(inode, file, &proc_tid_smaps_op);
> >  }
> >
> > +static int totmaps_open(struct inode *inode, struct file *file)
> > +{
> > +       struct proc_maps_private *priv;
> > +       int ret = -ENOMEM;
> > +       priv = kzalloc(sizeof(*priv), GFP_KERNEL);
> > +       if (priv) {
> > +               priv->mss = kzalloc(sizeof(*priv->mss), GFP_KERNEL);
> > +               if (!priv->mss)
> > +                       return -ENOMEM;
> > +
> > +               /* we need to grab references to the task_struct */
> > +               /* at open time, because there's a potential information */
> > +               /* leak where the totmaps file is opened and held open */
> > +               /* while the underlying pid to task mapping changes */
> > +               /* underneath it */
> > +               priv->task = get_pid_task(proc_pid(inode), PIDTYPE_PID);
> > +               if (!priv->task) {
> > +                       kfree(priv->mss);
> > +                       kfree(priv);
> > +                       return -ESRCH;
> > +               }
> > +
> > +               ret = single_open(file, totmaps_proc_show, priv);
> > +               if (ret) {
> > +                       put_task_struct(priv->task);
> > +                       kfree(priv->mss);
> > +                       kfree(priv);
> > +               }
> > +       }
> > +       return ret;
> > +}
> > +
> > +static int totmaps_release(struct inode *inode, struct file *file)
> > +{
> > +       struct seq_file *m = file->private_data;
> > +       struct proc_maps_private *priv = m->private;
> > +
> > +       put_task_struct(priv->task);
> > +       kfree(priv->mss);
> > +       kfree(priv);
> > +       m->private = NULL;
> > +       return single_release(inode, file);
> > +}
> > +
> >  const struct file_operations proc_pid_smaps_operations = {
> >         .open           = pid_smaps_open,
> >         .read           = seq_read,
> > @@ -850,6 +969,13 @@ const struct file_operations proc_tid_smaps_operations = {
> >         .release        = proc_map_release,
> >  };
> >
> > +const struct file_operations proc_totmaps_operations = {
> > +       .open           = totmaps_open,
> > +       .read           = seq_read,
> > +       .llseek         = seq_lseek,
> > +       .release        = totmaps_release,
> > +};
> > +
> >  enum clear_refs_types {
> >         CLEAR_REFS_ALL = 1,
> >         CLEAR_REFS_ANON,
> > --
> > 2.7.4
> >

  reply	other threads:[~2016-08-10  0:30 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-08-09 16:05 [PACTH v1] mm, proc: Implement /proc/<pid>/totmaps robert.foss
2016-08-09 16:29 ` Mateusz Guzik
2016-08-09 16:56   ` Sonny Rao
2016-08-09 20:17   ` Robert Foss
2016-08-10 15:39     ` Robert Foss
2016-08-10 15:42       ` Mateusz Guzik
2016-08-10 15:50         ` Robert Foss
2016-08-09 16:58 ` Alexey Dobriyan
2016-08-09 18:28   ` Sonny Rao
2016-08-09 19:16 ` Konstantin Khlebnikov
2016-08-10  0:30   ` Sonny Rao [this message]
2016-08-09 19:24 ` Jann Horn
2016-08-09 21:01   ` Robert Foss
2016-08-09 22:30     ` Jann Horn
2016-08-10 14:16       ` Robert Foss
2016-08-10 15:02         ` Jann Horn
2016-08-10 16:24           ` Robert Foss
2016-08-10 17:23     ` Sonny Rao
2016-08-10 17:37       ` Jann Horn
2016-08-10 17:45         ` Sonny Rao
2016-08-10 18:05           ` Jann Horn
2016-08-12 16:28             ` Robert Foss
2016-08-13 12:39               ` Jann Horn
2016-08-13 12:39                 ` Jann Horn

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAPz6YkW-YkOVMQr_9ZUN9_cogM453sQ5_9EoGkofS_MtLypZ9A@mail.gmail.com \
    --to=sonnyrao@chromium.org \
    --cc=adobriyan@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=benzh@chromium.org \
    --cc=bfreed@chromium.org \
    --cc=calvinowens@fb.com \
    --cc=filbranden@chromium.org \
    --cc=gorcunov@openvz.org \
    --cc=hannes@cmpxchg.org \
    --cc=jann@thejh.net \
    --cc=jdanis@google.com \
    --cc=john.stultz@linaro.org \
    --cc=keescook@chromium.org \
    --cc=kirill.shutemov@linux.intel.com \
    --cc=koct9i@gmail.com \
    --cc=ldufour@linux.vnet.ibm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mguzik@redhat.com \
    --cc=mhocko@suse.com \
    --cc=n-horiguchi@ah.jp.nec.com \
    --cc=plaguedbypenguins@gmail.com \
    --cc=robert.foss@collabora.com \
    --cc=vbabka@suse.cz \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.