From: Andrew Morton <akpm@linux-foundation.org>
To: "Joel Fernandes (Google)" <joel@joelfernandes.org>
Cc: linux-kernel@vger.kernel.org,
	Alexey Dobriyan <adobriyan@gmail.com>,
	Borislav Petkov <bp@alien8.de>,
	Brendan Gregg <bgregg@netflix.com>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Christian Hansen <chansen3@cisco.com>,
	dancol@google.com, fmayer@google.com,
	"H. Peter Anvin" <hpa@zytor.com>, Ingo Molnar <mingo@redhat.com>,
	joelaf@google.com, Jonathan Corbet <corbet@lwn.net>,
	Kees Cook <keescook@chromium.org>,
	kernel-team@android.com, linux-api@vger.kernel.org,
	linux-doc@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-mm@kvack.org, Michal Hocko <mhocko@suse.com>,
	Mike Rapoport <rppt@linux.ibm.com>,
	minchan@kernel.org, namhyung@google.com, paulmck@linux.ibm.com,
	Robin Murphy <robin.murphy@arm.com>, Roman Gushchin <guro@fb.com>,
	Stephen Rothwell <sfr@canb.auug.org.au>,
	surenb@google.com, Thomas Gleixner <tglx@linutronix.de>,
	tkjos@google.com, Vladimir Davydov <vdavydov.dev@gmail.com>,
	Vlastimil Babka <vbabka@suse.cz>, Will Deacon <will@kernel.org>,
	Brendan Gregg <brendan.d.gregg@gmail.com>
Subject: Re: [PATCH v4 1/5] mm/page_idle: Add per-pid idle page tracking using virtual indexing
Date: Tue, 6 Aug 2019 15:19:21 -0700
Message-ID: <20190806151921.edec128271caccb5214fc1bd@linux-foundation.org>
In-Reply-To: <20190805170451.26009-1-joel@joelfernandes.org>

(cc Brendan's other email address, hoping for review input ;))

On Mon,  5 Aug 2019 13:04:47 -0400 "Joel Fernandes (Google)" <joel@joelfernandes.org> wrote:

> The page_idle tracking feature currently requires looking up the pagemap
> for a process followed by interacting with /sys/kernel/mm/page_idle.
> Looking up PFN from pagemap in Android devices is not supported by
> unprivileged process and requires SYS_ADMIN and gives 0 for the PFN.
> 
> This patch adds support to directly interact with page_idle tracking at
> the PID level by introducing a /proc/<pid>/page_idle file.  It follows
> the exact same semantics as the global /sys/kernel/mm/page_idle, but now
> looking up PFN through pagemap is not needed since the interface uses
> virtual frame numbers, and at the same time also does not require
> SYS_ADMIN.
> 
> In Android, we are using this for the heap profiler (heapprofd) which
> profiles and pin points code paths which allocates and leaves memory
> idle for long periods of time. This method solves the security issue
> with userspace learning the PFN, and while at it is also shown to yield
> better results than the pagemap lookup, the theory being that the window
> where the address space can change is reduced by eliminating the
> intermediate pagemap look up stage. In virtual address indexing, the
> process's mmap_sem is held for the duration of the access.

Quite a lot of changes to the page_idle code.  Has this all been
runtime tested on architectures where
CONFIG_HAVE_ARCH_PTE_SWP_PGIDLE=n?  That could be x86 with a little
Kconfig fiddle-for-testing-purposes.

> 8 files changed, 376 insertions(+), 45 deletions(-)

Quite a lot of new code unconditionally added to major architectures. 
Are we confident that everyone will want this feature?

>
> ...
>
> +static int proc_page_idle_open(struct inode *inode, struct file *file)
> +{
> +	struct mm_struct *mm;
> +
> +	mm = proc_mem_open(inode, PTRACE_MODE_READ);
> +	if (IS_ERR(mm))
> +		return PTR_ERR(mm);
> +	file->private_data = mm;
> +	return 0;
> +}
> +
> +static int proc_page_idle_release(struct inode *inode, struct file *file)
> +{
> +	struct mm_struct *mm = file->private_data;
> +
> +	if (mm)

I suspect the test isn't needed?  proc_page_idle_release() won't be
called if proc_page_idle_open() failed?

> +		mmdrop(mm);
> +	return 0;
> +}
>
> ...
>
