linux-kernel.vger.kernel.org archive mirror
From: SeongJae Park <sjpark@amazon.com>
To: David Hildenbrand <david@redhat.com>
Cc: SeongJae Park <sjpark@amazon.com>, <akpm@linux-foundation.org>,
	"SeongJae Park" <sjpark@amazon.de>, <Jonathan.Cameron@Huawei.com>,
	<aarcange@redhat.com>, <acme@kernel.org>,
	<alexander.shishkin@linux.intel.com>, <amit@kernel.org>,
	<benh@kernel.crashing.org>, <brendan.d.gregg@gmail.com>,
	<brendanhiggins@google.com>, <cai@lca.pw>,
	<colin.king@canonical.com>, <corbet@lwn.net>, <dwmw@amazon.com>,
	<foersleo@amazon.de>, <irogers@google.com>, <jolsa@redhat.com>,
	<kirill@shutemov.name>, <mark.rutland@arm.com>, <mgorman@suse.de>,
	<minchan@kernel.org>, <mingo@redhat.com>, <namhyung@kernel.org>,
	<peterz@infradead.org>, <rdunlap@infradead.org>,
	<riel@surriel.com>, <rientjes@google.com>, <rostedt@goodmis.org>,
	<sblbir@amazon.com>, <shakeelb@google.com>, <shuah@kernel.org>,
	<sj38.park@gmail.com>, <snu@amazon.de>, <vbabka@suse.cz>,
	<vdavydov.dev@gmail.com>, <yang.shi@linux.alibaba.com>,
	<ying.huang@intel.com>, <linux-damon@amazon.com>,
	<linux-mm@kvack.org>, <linux-doc@vger.kernel.org>,
	<linux-kernel@vger.kernel.org>
Subject: Re: Re: [RFC v2 7/9] mm/damon: Implement callbacks for physical memory monitoring
Date: Thu, 4 Jun 2020 17:23:36 +0200
Message-ID: <20200604152336.4826-1-sjpark@amazon.com>
In-Reply-To: <4b5814d8-9626-e71c-9e5d-a4a61fcd12e8@redhat.com>

On Thu, 4 Jun 2020 16:58:13 +0200 David Hildenbrand <david@redhat.com> wrote:

> On 04.06.20 09:26, SeongJae Park wrote:
> > On Wed, 3 Jun 2020 18:09:21 +0200 David Hildenbrand <david@redhat.com> wrote:
> > 
> >> On 03.06.20 16:11, SeongJae Park wrote:
> >>> From: SeongJae Park <sjpark@amazon.de>
> >>>
> >>> This commit implements the four callbacks (->init_target_regions,
> >>> ->update_target_regions, ->prepare_access_check, and ->check_accesses)
> >>> for the basic access monitoring of the physical memory address space.
> >>> By setting the callback pointers to point to those, users can easily
> >>> monitor the accesses to the physical memory.
> >>>
> >>> Internally, it uses the PTE Accessed bit, similar to the virtual
> >>> memory support.  Also, it supports only page frames that are
> >>> supported by idle page tracking.  Actually, most of the code is stolen
> >>> from idle page tracking.  Users who want to use other access check
> >>> primitives, or to monitor frames that are not supported by this
> >>> implementation, can implement their own callbacks.
> >>>
> >>> Signed-off-by: SeongJae Park <sjpark@amazon.de>
> >>> ---
> >>>  include/linux/damon.h |   5 ++
> >>>  mm/damon.c            | 184 ++++++++++++++++++++++++++++++++++++++++++
> >>>  2 files changed, 189 insertions(+)
> >>>
> >>> diff --git a/include/linux/damon.h b/include/linux/damon.h
> >>> index 1a788bfd1b4e..f96503a532ea 100644
> >>> --- a/include/linux/damon.h
> >>> +++ b/include/linux/damon.h
> >>> @@ -216,6 +216,11 @@ void kdamond_update_vm_regions(struct damon_ctx *ctx);
> >>>  void kdamond_prepare_vm_access_checks(struct damon_ctx *ctx);
> >>>  unsigned int kdamond_check_vm_accesses(struct damon_ctx *ctx);
> >>>  
> >>> +void kdamond_init_phys_regions(struct damon_ctx *ctx);
> >>> +void kdamond_update_phys_regions(struct damon_ctx *ctx);
> >>> +void kdamond_prepare_phys_access_checks(struct damon_ctx *ctx);
> >>> +unsigned int kdamond_check_phys_accesses(struct damon_ctx *ctx);
> >>> +
> >>>  int damon_set_pids(struct damon_ctx *ctx, int *pids, ssize_t nr_pids);
> >>>  int damon_set_attrs(struct damon_ctx *ctx, unsigned long sample_int,
> >>>  		unsigned long aggr_int, unsigned long regions_update_int,
> >>> diff --git a/mm/damon.c b/mm/damon.c
> >>> index f5cbc97a3bbc..6a5c6d540580 100644
> >>> --- a/mm/damon.c
> >>> +++ b/mm/damon.c
> >>> @@ -19,7 +19,9 @@
> >>>  #include <linux/mm.h>
> >>>  #include <linux/module.h>
> >>>  #include <linux/page_idle.h>
> >>> +#include <linux/pagemap.h>
> >>>  #include <linux/random.h>
> >>> +#include <linux/rmap.h>
> >>>  #include <linux/sched/mm.h>
> >>>  #include <linux/sched/task.h>
> >>>  #include <linux/slab.h>
> >>> @@ -480,6 +482,11 @@ void kdamond_init_vm_regions(struct damon_ctx *ctx)
> >>>  	}
> >>>  }
> >>>  
> >>> +/* Do nothing.  Users should set the initial regions by themselves */
> >>> +void kdamond_init_phys_regions(struct damon_ctx *ctx)
> >>> +{
> >>> +}
> >>> +
> >>>  static void damon_mkold(struct mm_struct *mm, unsigned long addr)
> >>>  {
> >>>  	pte_t *pte = NULL;
> >>> @@ -611,6 +618,178 @@ unsigned int kdamond_check_vm_accesses(struct damon_ctx *ctx)
> >>>  	return max_nr_accesses;
> >>>  }
> >>>  
> >>> +/* access check functions for physical address based regions */
> >>> +
> >>> +/* This code is stolen from page_idle.c */
> >>> +static struct page *damon_phys_get_page(unsigned long pfn)
> >>> +{
> >>> +	struct page *page;
> >>> +	pg_data_t *pgdat;
> >>> +
> >>> +	if (!pfn_valid(pfn))
> >>> +		return NULL;
> >>> +
> >>
> >> Who provides these pfns? Can these be random pfns, supplied unchecked by
> >> user space? Or are they at least mapped into some user space process?
> > 
> > Your guess is right: users can give a random physical address, and that
> > will be translated into a pfn.
> > 
> 
> Note the difference to idle tracking: "Idle page tracking only considers
> user memory pages", this is very different to your use case. Note that
> this is why there is no pfn_to_online_page() check in page idle code.

My use case is the same as that of idle page tracking.  I also ignore non-user
pages.  Actually, this function is for filtering out the non-user pages, and it
is simply stolen from page_idle.

> 
> >>
> >> IOW, do we need a pfn_to_online_page() to make sure the memmap even was
> >> initialized?
> > 
> > Thank you for pointing this out!  I will use it in the next spin.  Also, this
> > code is stolen from page_idle_get_page().  It seems that function should also
> > be modified to use it.  I will send a patch for that, too.
> 
> pfn_to_online_page() will only succeed for system RAM pages, not
> dax/pmem (ZONE_DEVICE). dax/pmem needs special care.
> 
> I can spot that you are taking references to random struct pages. This
> looks dangerous to me and might mess in complicated ways with page
> migration/isolation/onlining/offlining etc. I am not sure if we want that.

AFAIU, page_idle users can also pass random pfns by randomly accessing the
bitmap file.  Am I missing something?
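
To make the point above concrete, below is a minimal user-space sketch of how
an arbitrary physical address maps to a position in the idle page tracking
bitmap file.  It relies only on the documented layout (an array of 8-byte
words, with the page at PFN #i mapped to bit #i%64 of element #i/64).  The
helper names and the 4 KiB page size (PAGE_SHIFT_ASSUMED) are my assumptions
for illustration; they are not part of this patch or of page_idle.c.

```c
#include <stdint.h>

/* Assumption for illustration: 4 KiB pages.  The real value is per-arch. */
#define PAGE_SHIFT_ASSUMED	12
/* Documented layout: the bitmap is an array of 8-byte (64-bit) words. */
#define IDLE_BITMAP_WORD_BITS	64

/* Hypothetical helper: translate a physical address into a pfn. */
static uint64_t paddr_to_pfn(uint64_t paddr)
{
	return paddr >> PAGE_SHIFT_ASSUMED;
}

/* Byte offset, in the bitmap file, of the word holding the bit for pfn. */
static uint64_t idle_bitmap_offset(uint64_t pfn)
{
	return (pfn / IDLE_BITMAP_WORD_BITS) * sizeof(uint64_t);
}

/* Mask that selects the bit for pfn within that 8-byte word. */
static uint64_t idle_bitmap_mask(uint64_t pfn)
{
	return UINT64_C(1) << (pfn % IDLE_BITMAP_WORD_BITS);
}
```

Any pfn yields a valid offset and mask this way, so nothing in the file layout
itself restricts which pfns a user can ask about; whatever filtering happens
has to be done on the kernel side.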


Thanks,
SeongJae Park

> 
> -- 
> Thanks,
> 
> David / dhildenb
