From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S932416AbcHIVBx (ORCPT ); Tue, 9 Aug 2016 17:01:53 -0400 Received: from bhuna.collabora.co.uk ([46.235.227.227]:41740 "EHLO bhuna.collabora.co.uk" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S932324AbcHIVBw (ORCPT ); Tue, 9 Aug 2016 17:01:52 -0400 Subject: Re: [PACTH v1] mm, proc: Implement /proc//totmaps To: Jann Horn , Sonny Rao References: <1470758743-17685-1-git-send-email-robert.foss@collabora.com> <20160809192414.GA19573@pc.thejh.net> Cc: akpm@linux-foundation.org, keescook@chromium.org, viro@zeniv.linux.org.uk, gorcunov@openvz.org, john.stultz@linaro.org, plaguedbypenguins@gmail.com, mguzik@redhat.com, adobriyan@gmail.com, jdanis@google.com, calvinowens@fb.com, mhocko@suse.com, koct9i@gmail.com, vbabka@suse.cz, n-horiguchi@ah.jp.nec.com, kirill.shutemov@linux.intel.com, ldufour@linux.vnet.ibm.com, hannes@cmpxchg.org, linux-kernel@vger.kernel.org, Ben Zhang , Bryan Freed , Filipe Brandenburger From: Robert Foss Message-ID: <8ac1b493-e051-ea0e-3a71-c4476054bdb2@collabora.com> Date: Tue, 9 Aug 2016 17:01:44 -0400 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.2.0 MIME-Version: 1.0 In-Reply-To: <20160809192414.GA19573@pc.thejh.net> Content-Type: text/plain; charset=windows-1252; format=flowed Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 2016-08-09 03:24 PM, Jann Horn wrote: > On Tue, Aug 09, 2016 at 12:05:43PM -0400, robert.foss@collabora.com wrote: >> From: Sonny Rao >> >> This is based on earlier work by Thiago Goncales. It implements a new >> per process proc file which summarizes the contents of the smaps file >> but doesn't display any addresses. It gives more detailed information >> than statm like the PSS (proprotional set size). It differs from the >> original implementation in that it doesn't use the full blown set of >> seq operations, uses a different termination condition, and doesn't >> displayed "Locked" as that was broken on the original implemenation. >> >> This new proc file provides information faster than parsing the potentially >> huge smaps file. >> >> Signed-off-by: Sonny Rao >> >> Tested-by: Robert Foss >> Signed-off-by: Robert Foss > > >> +static int totmaps_proc_show(struct seq_file *m, void *data) >> +{ >> + struct proc_maps_private *priv = m->private; >> + struct mm_struct *mm; >> + struct vm_area_struct *vma; >> + struct mem_size_stats *mss_sum = priv->mss; >> + >> + /* reference to priv->task already taken */ >> + /* but need to get the mm here because */ >> + /* task could be in the process of exiting */ > > Can you please elaborate on this? My understanding here is that you > intend for the caller to be able to repeatedly read the same totmaps > file with pread() and still see updated information after the target > process has called execve() and be able to detect process death > (instead of simply seeing stale values). Is that accurate? > > I would prefer it if you could grab a reference to the mm_struct > directly at open time. Sonny, do you know more about the above comment? > > >> + mm = get_task_mm(priv->task); >> + if (!mm || IS_ERR(mm)) >> + return -EINVAL; > > get_task_mm() doesn't return error codes, and all other callers just > check whether the return value is NULL. > I'll have that fixed in v2, thanks for spotting it! > >> + down_read(&mm->mmap_sem); >> + hold_task_mempolicy(priv); >> + >> + for (vma = mm->mmap; vma != priv->tail_vma; vma = vma->vm_next) { >> + struct mem_size_stats mss; >> + struct mm_walk smaps_walk = { >> + .pmd_entry = smaps_pte_range, >> + .mm = vma->vm_mm, >> + .private = &mss, >> + }; >> + >> + if (vma->vm_mm && !is_vm_hugetlb_page(vma)) { >> + memset(&mss, 0, sizeof(mss)); >> + walk_page_vma(vma, &smaps_walk); >> + add_smaps_sum(&mss, mss_sum); >> + } >> + } > > Errrr... what? You accumulate values from mem_size_stats items into a > struct mss_sum that is associated with the struct file? So when you > read the file the second time, you get the old values plus the new ones? > And when you read the file in parallel, you get inconsistent values? > > For most files in procfs, the behavior is that you can just call > pread(fd, buf, sizeof(buf), 0) on the same fd again and again, giving > you the current values every time, without mutating state. I strongly > recommend that you get rid of priv->mss and just accumulate the state > in a local variable (maybe one on the stack). So a simple "static struct mem_size_stats" in totmaps_proc_show() would be a better solution? > > >> @@ -836,6 +911,50 @@ static int tid_smaps_open(struct inode *inode, struct file *file) >> return do_maps_open(inode, file, &proc_tid_smaps_op); >> } >> >> +static int totmaps_open(struct inode *inode, struct file *file) >> +{ >> + struct proc_maps_private *priv; >> + int ret = -ENOMEM; >> + priv = kzalloc(sizeof(*priv), GFP_KERNEL); >> + if (priv) { >> + priv->mss = kzalloc(sizeof(*priv->mss), GFP_KERNEL); >> + if (!priv->mss) >> + return -ENOMEM; > > Memory leak: If the first allocation works and the second one doesn't, this > doesn't free the first allocation. > > Please change this to use the typical goto pattern for error handling. Fix will be implemented in v2. > >> + >> + /* we need to grab references to the task_struct */ >> + /* at open time, because there's a potential information */ >> + /* leak where the totmaps file is opened and held open */ >> + /* while the underlying pid to task mapping changes */ >> + /* underneath it */ > > Nit: That's not how comments are done in the kernel. Maybe change this to > a normal block comment instead of one block comment per line? I'm not sure how that one slipped by, but I'll change it in v2. > >> + priv->task = get_pid_task(proc_pid(inode), PIDTYPE_PID); > > `get_pid_task(proc_pid(inode), PIDTYPE_PID)` is exactly the definition > of get_proc_task(inode), maybe use that instead? > Will do. v2 will fix this. >> + if (!priv->task) { >> + kfree(priv->mss); >> + kfree(priv); >> + return -ESRCH; >> + } >> + >> + ret = single_open(file, totmaps_proc_show, priv); >> + if (ret) { >> + put_task_struct(priv->task); >> + kfree(priv->mss); >> + kfree(priv); >> + } >> + } >> + return ret; >> +} > > Please change this method to use the typical goto pattern for error > handling. IMO repeating the undo steps in all error cases makes > mistakes (like the one above) more likely and increases the amount > of redundant code. Agreed. Change queued for v2. > > Also: The smaps file is only accessible to callers with > PTRACE_MODE_READ privileges on the target task. Your thing doesn't > do any access checks, neither in the open handler nor in the read > handler. Can you give an analysis of why it's okay to expose this > data? As far as I can tell, without spending a lot of time thinking > about it, this kind of data looks like it might potentially be > useful for side-channel information leaks or so. > I think it should require the same permissions as smaps, so changing the code to require PTRACE_MODE_READ privileges is most likely a good idea. I'll have a look at it for v2.