From: Sonny Rao <sonnyrao@chromium.org>
To: Robert Foss <robert.foss@collabora.com>
Cc: Jann Horn <jann@thejh.net>,
	Andrew Morton <akpm@linux-foundation.org>,
	Kees Cook <keescook@chromium.org>,
	viro@zeniv.linux.org.uk, gorcunov@openvz.org,
	John Stultz <john.stultz@linaro.org>,
	plaguedbypenguins@gmail.com, Mateusz Guzik <mguzik@redhat.com>,
	adobriyan@gmail.com, jdanis@google.com, calvinowens@fb.com,
	mhocko@suse.com, koct9i@gmail.com, vbabka@suse.cz,
	n-horiguchi@ah.jp.nec.com, kirill.shutemov@linux.intel.com,
	ldufour@linux.vnet.ibm.com, Johannes Weiner <hannes@cmpxchg.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Ben Zhang <benzh@chromium.org>, Bryan Freed <bfreed@chromium.org>,
	Filipe Brandenburger <filbranden@chromium.org>
Subject: Re: [PACTH v1] mm, proc: Implement /proc/<pid>/totmaps
Date: Wed, 10 Aug 2016 10:23:53 -0700
Message-ID: <CAPz6YkUKN9xOO=m2sLGNGkH9uSa4cvyDWAPPS6RFwheOS=opcA@mail.gmail.com>
In-Reply-To: <8ac1b493-e051-ea0e-3a71-c4476054bdb2@collabora.com>

On Tue, Aug 9, 2016 at 2:01 PM, Robert Foss <robert.foss@collabora.com> wrote:
>
>
> On 2016-08-09 03:24 PM, Jann Horn wrote:
>>
>> On Tue, Aug 09, 2016 at 12:05:43PM -0400, robert.foss@collabora.com wrote:
>>>
>>> From: Sonny Rao <sonnyrao@chromium.org>
>>>
>>> This is based on earlier work by Thiago Goncales. It implements a new
>>> per-process proc file which summarizes the contents of the smaps file
>>> but doesn't display any addresses.  It gives more detailed information
>>> than statm, such as the PSS (proportional set size).  It differs from the
>>> original implementation in that it doesn't use the full-blown set of
>>> seq operations, uses a different termination condition, and doesn't
>>> display "Locked", as that was broken in the original implementation.
>>>
>>> This new proc file provides information faster than parsing the
>>> potentially huge smaps file.
>>>
>>> Signed-off-by: Sonny Rao <sonnyrao@chromium.org>
>>>
>>> Tested-by: Robert Foss <robert.foss@collabora.com>
>>> Signed-off-by: Robert Foss <robert.foss@collabora.com>
>>
>>
>>
>>> +static int totmaps_proc_show(struct seq_file *m, void *data)
>>> +{
>>> +       struct proc_maps_private *priv = m->private;
>>> +       struct mm_struct *mm;
>>> +       struct vm_area_struct *vma;
>>> +       struct mem_size_stats *mss_sum = priv->mss;
>>> +
>>> +       /* reference to priv->task already taken */
>>> +       /* but need to get the mm here because */
>>> +       /* task could be in the process of exiting */
>>
>>
>> Can you please elaborate on this? My understanding here is that you
>> intend for the caller to be able to repeatedly read the same totmaps
>> file with pread() and still see updated information after the target
>> process has called execve() and be able to detect process death
>> (instead of simply seeing stale values). Is that accurate?
>>
>> I would prefer it if you could grab a reference to the mm_struct
>> directly at open time.
>
>
> Sonny, do you know more about the above comment?

I think right now the file gets re-opened every time, but the mode
where the file is opened once and repeatedly read is interesting
because it avoids having to open the file again and again.

I guess you could end up with a weird situation where you don't read
the entire contents of the file in one call to read() and you might
get inconsistent data across the different statistics?
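
To make the open-once mode concrete, here is a minimal userspace sketch of
a monitor that opens a proc file once and re-reads it from offset 0 with
pread(). It assumes the whole file fits in one read; read_snapshot() and
sample_twice() are names made up for illustration, and any path passed in
is just a stand-in for totmaps, which doesn't exist yet:

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Read up to len - 1 bytes from offset 0 of an already-open fd and
 * NUL-terminate the result. Returns the byte count, or -1 on error.
 */
ssize_t read_snapshot(int fd, char *buf, size_t len)
{
	ssize_t n;

	assert(len > 0);
	n = pread(fd, buf, len - 1, 0);
	if (n < 0)
		return -1;
	buf[n] = '\0';
	return n;
}

/*
 * Open the file once and sample it twice through the same fd.
 * Returns 0 if both reads produced data, -1 otherwise.
 */
int sample_twice(const char *path)
{
	char first[256], second[256];
	int fd, ok;

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;

	ok = read_snapshot(fd, first, sizeof(first)) > 0 &&
	     read_snapshot(fd, second, sizeof(second)) > 0;

	close(fd);
	return ok ? 0 : -1;
}
```

For example, sample_twice("/proc/self/statm") should return 0 on a Linux
system with procfs mounted. A read that doesn't cover the whole file is
where the inconsistency above could creep in, so a real monitor would
probably want a buffer large enough for the full contents.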

>
>>
>>
>>> +       mm = get_task_mm(priv->task);
>>> +       if (!mm || IS_ERR(mm))
>>> +               return -EINVAL;
>>
>>
>> get_task_mm() doesn't return error codes, and all other callers just
>> check whether the return value is NULL.
>>
>
> I'll have that fixed in v2, thanks for spotting it!
>
>
>>
>>> +       down_read(&mm->mmap_sem);
>>> +       hold_task_mempolicy(priv);
>>> +
>>> +       for (vma = mm->mmap; vma != priv->tail_vma; vma = vma->vm_next) {
>>> +               struct mem_size_stats mss;
>>> +               struct mm_walk smaps_walk = {
>>> +                       .pmd_entry = smaps_pte_range,
>>> +                       .mm = vma->vm_mm,
>>> +                       .private = &mss,
>>> +               };
>>> +
>>> +               if (vma->vm_mm && !is_vm_hugetlb_page(vma)) {
>>> +                       memset(&mss, 0, sizeof(mss));
>>> +                       walk_page_vma(vma, &smaps_walk);
>>> +                       add_smaps_sum(&mss, mss_sum);
>>> +               }
>>> +       }
>>
>>
>> Errrr... what? You accumulate values from mem_size_stats items into a
>> struct mss_sum that is associated with the struct file? So when you
>> read the file the second time, you get the old values plus the new ones?
>> And when you read the file in parallel, you get inconsistent values?
>>
>> For most files in procfs, the behavior is that you can just call
>> pread(fd, buf, sizeof(buf), 0) on the same fd again and again, giving
>> you the current values every time, without mutating state. I strongly
>> recommend that you get rid of priv->mss and just accumulate the state
>> in a local variable (maybe one on the stack).
>
>
> So a simple "static struct mem_size_stats" in totmaps_proc_show() would be a
> better solution?
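
A function-local "static" would still be a single object shared by every
reader of every totmaps file, so it would keep the same accumulation
problem as priv->mss. A plain automatic variable that is zeroed on each
call avoids that. Below is a rough userspace sketch of the idea; struct
stats is a hypothetical, cut-down stand-in for struct mem_size_stats and
sum_stats() is not a kernel function:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, cut-down stand-in for the kernel's struct mem_size_stats. */
struct stats {
	unsigned long resident;
	unsigned long pss;
};

/*
 * Sum n per-VMA entries into *out. The accumulator lives on the stack
 * and is zeroed on every call, so nothing carries over between reads.
 */
void sum_stats(const struct stats *vmas, size_t n, struct stats *out)
{
	struct stats sum;
	size_t i;

	assert(vmas != NULL || n == 0);
	assert(out != NULL);

	memset(&sum, 0, sizeof(sum));
	for (i = 0; i < n; i++) {
		sum.resident += vmas[i].resident;
		sum.pss += vmas[i].pss;
	}
	*out = sum;
}
```

Calling sum_stats() twice on the same input gives the same totals both
times, which is the behavior you'd want from repeated reads of the file.
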
>
>>
>>
>>> @@ -836,6 +911,50 @@ static int tid_smaps_open(struct inode *inode, struct file *file)
>>>         return do_maps_open(inode, file, &proc_tid_smaps_op);
>>>  }
>>>
>>> +static int totmaps_open(struct inode *inode, struct file *file)
>>> +{
>>> +       struct proc_maps_private *priv;
>>> +       int ret = -ENOMEM;
>>> +       priv = kzalloc(sizeof(*priv), GFP_KERNEL);
>>> +       if (priv) {
>>> +               priv->mss = kzalloc(sizeof(*priv->mss), GFP_KERNEL);
>>> +               if (!priv->mss)
>>> +                       return -ENOMEM;
>>
>>
>> Memory leak: If the first allocation works and the second one doesn't,
>> this doesn't free the first allocation.
>>
>> Please change this to use the typical goto pattern for error handling.
>
>
> Fix will be implemented in v2.
>
>>
>>> +
>>> +               /* we need to grab references to the task_struct */
>>> +               /* at open time, because there's a potential information */
>>> +               /* leak where the totmaps file is opened and held open */
>>> +               /* while the underlying pid to task mapping changes */
>>> +               /* underneath it */
>>
>>
>> Nit: That's not how comments are done in the kernel. Maybe change this to
>> a normal block comment instead of one block comment per line?
>
>
> I'm not sure how that one slipped by, but I'll change it in v2.
>
>>
>>> +               priv->task = get_pid_task(proc_pid(inode), PIDTYPE_PID);
>>
>>
>> `get_pid_task(proc_pid(inode), PIDTYPE_PID)` is exactly the definition
>> of get_proc_task(inode), maybe use that instead?
>>
>
> Will do. v2 will fix this.
>
>>> +               if (!priv->task) {
>>> +                       kfree(priv->mss);
>>> +                       kfree(priv);
>>> +                       return -ESRCH;
>>> +               }
>>> +
>>> +               ret = single_open(file, totmaps_proc_show, priv);
>>> +               if (ret) {
>>> +                       put_task_struct(priv->task);
>>> +                       kfree(priv->mss);
>>> +                       kfree(priv);
>>> +               }
>>> +       }
>>> +       return ret;
>>> +}
>>
>>
>> Please change this method to use the typical goto pattern for error
>> handling. IMO repeating the undo steps in all error cases makes
>> mistakes (like the one above) more likely and increases the amount
>> of redundant code.
>
>
> Agreed. Change queued for v2.
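
For reference, the goto pattern could look roughly like the userspace
sketch below. It is only an analogue of totmaps_open(), not the v2 code:
calloc()/free() stand in for kzalloc()/kfree(), struct priv is a
simplified hypothetical, and fail_at is a made-up knob for exercising the
error paths:

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Simplified, hypothetical stand-in for struct proc_maps_private. */
struct priv {
	unsigned long *mss;
	int task_ref;
};

/*
 * fail_at injects a failure for illustration: 0 means no failure,
 * 1..3 fail the corresponding setup step.
 */
int priv_open(struct priv **out, int fail_at)
{
	struct priv *priv;
	int ret;

	assert(out != NULL);
	*out = NULL;

	priv = (fail_at == 1) ? NULL : calloc(1, sizeof(*priv));
	if (!priv)
		return -ENOMEM;

	priv->mss = (fail_at == 2) ? NULL : calloc(1, sizeof(*priv->mss));
	if (!priv->mss) {
		ret = -ENOMEM;
		goto out_free_priv;
	}

	/* Stand-in for taking the task reference. */
	if (fail_at == 3) {
		ret = -ESRCH;
		goto out_free_mss;
	}
	priv->task_ref = 1;

	*out = priv;
	return 0;

	/* Undo steps run in reverse order of setup. */
out_free_mss:
	free(priv->mss);
out_free_priv:
	free(priv);
	return ret;
}

void priv_release(struct priv *priv)
{
	if (!priv)
		return;
	free(priv->mss);
	free(priv);
}
```

Each failure jumps to the label that undoes only what has been set up so
far, so every undo step appears once and a later step can't forget an
earlier allocation.
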
>
>>
>> Also: The smaps file is only accessible to callers with
>> PTRACE_MODE_READ privileges on the target task. Your thing doesn't
>> do any access checks, neither in the open handler nor in the read
>> handler. Can you give an analysis of why it's okay to expose this
>> data? As far as I can tell, without spending a lot of time thinking
>> about it, this kind of data looks like it might potentially be
>> useful for side-channel information leaks or so.
>>
>
> I think it should require the same permissions as smaps, so changing the
> code to require PTRACE_MODE_READ privileges is most likely a good idea. I'll
> have a look at it for v2.
