linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: SeongJae Park <sjpark@amazon.com>
To: Jonathan Cameron <Jonathan.Cameron@Huawei.com>
Cc: SeongJae Park <sjpark@amazon.com>, <akpm@linux-foundation.org>,
	"SeongJae Park" <sjpark@amazon.de>, <aarcange@redhat.com>,
	<acme@kernel.org>, <alexander.shishkin@linux.intel.com>,
	<amit@kernel.org>, <brendan.d.gregg@gmail.com>,
	<brendanhiggins@google.com>, <cai@lca.pw>,
	<colin.king@canonical.com>, <corbet@lwn.net>, <dwmw@amazon.com>,
	<jolsa@redhat.com>, <kirill@shutemov.name>,
	<mark.rutland@arm.com>, <mgorman@suse.de>, <minchan@kernel.org>,
	<mingo@redhat.com>, <namhyung@kernel.org>, <peterz@infradead.org>,
	<rdunlap@infradead.org>, <riel@surriel.com>,
	<rientjes@google.com>, <rostedt@goodmis.org>, <shuah@kernel.org>,
	<sj38.park@gmail.com>, <vbabka@suse.cz>, <vdavydov.dev@gmail.com>,
	<yang.shi@linux.alibaba.com>, <ying.huang@intel.com>,
	<linux-mm@kvack.org>, <linux-doc@vger.kernel.org>,
	<linux-kernel@vger.kernel.org>
Subject: Re: Re: [PATCH v7 05/15] mm/damon: Adaptively adjust regions
Date: Wed, 1 Apr 2020 10:22:53 +0200	[thread overview]
Message-ID: <20200401082253.21405-1-sjpark@amazon.com> (raw)
In-Reply-To: <20200331170855.0000024f@Huawei.com> (raw)

On Tue, 31 Mar 2020 17:08:55 +0100 Jonathan Cameron <Jonathan.Cameron@Huawei.com> wrote:

> On Wed, 18 Mar 2020 12:27:12 +0100
> SeongJae Park <sjpark@amazon.com> wrote:
> 
> > From: SeongJae Park <sjpark@amazon.de>
> > 
> > At the beginning of the monitoring, DAMON constructs the initial regions
> > by evenly splitting the memory mapped address space of the process into
> > the user-specified minimal number of regions.  In this initial state,
> > the assumption of the regions (pages in same region have similar access
> > frequencies) is normally not kept and thus the monitoring quality could
> > be low.  To keep the assumption as much as possible, DAMON adaptively
> > merges and splits each region.
> > 
> > For each ``aggregation interval``, it compares the access frequencies of
> > adjacent regions and merges those if the frequency difference is small.
> > Then, after it reports and clears the aggregated access frequency of
> > each region, it splits each region into two regions if the total number
> > of regions is smaller than the half of the user-specified maximum number
> > of regions.
> > 
> > In this way, DAMON provides its best-effort quality and minimal overhead
> > while keeping the bounds users set for their trade-off.
> > 
> > Signed-off-by: SeongJae Park <sjpark@amazon.de>
> 
> A few more edge cases in here, and a suggestion that might be more costly
> but lead to simpler code.

Thank you for finding those!

> 
> Jonathan
> 
> > ---
> >  include/linux/damon.h |   6 +-
> >  mm/damon.c            | 148 ++++++++++++++++++++++++++++++++++++++++--
> >  2 files changed, 145 insertions(+), 9 deletions(-)
> > 
[...]
> > diff --git a/mm/damon.c b/mm/damon.c
> > index 018016793555..23c0de3b502e 100644
> > --- a/mm/damon.c
> > +++ b/mm/damon.c
[...]
> > +
> > +/*
> > + * Split a region into two small regions
> > + *
> > + * r		the region to be split
> > + * sz_r		size of the first sub-region that will be made
> > + */
> > +static void damon_split_region_at(struct damon_ctx *ctx,
> > +		struct damon_region *r, unsigned long sz_r)
> > +{
> > +	struct damon_region *new;
> > +
> > +	new = damon_new_region(ctx, r->vm_start + sz_r, r->vm_end);
> > +	r->vm_end = new->vm_start;
> 
> We may well have a sampling address that is in the wrong region.
> It should have little effect on the stats as will fix on next sample
> but in my view still worth cleaning up.

Good catch!  I will fix this in next spin.

> 
> > +
> > +	damon_insert_region(new, r, damon_next_region(r));
> > +}
> > +
[...]
> > @@ -571,21 +689,29 @@ static int kdamond_fn(void *data)
> >  	struct damon_task *t;
> >  	struct damon_region *r, *next;
> >  	struct mm_struct *mm;
> > +	unsigned int max_nr_accesses;
> >  
> >  	pr_info("kdamond (%d) starts\n", ctx->kdamond->pid);
> >  	kdamond_init_regions(ctx);
> >  	while (!kdamond_need_stop(ctx)) {
> > +		max_nr_accesses = 0;
> >  		damon_for_each_task(ctx, t) {
> >  			mm = damon_get_mm(t);
> >  			if (!mm)
> >  				continue;
> > -			damon_for_each_region(r, t)
> > +			damon_for_each_region(r, t) {
> >  				kdamond_check_access(ctx, mm, r);
> > +				max_nr_accesses = max(r->nr_accesses,
> > +						max_nr_accesses);
> > +			}
> >  			mmput(mm);
> >  		}
> >  
> > -		if (kdamond_aggregate_interval_passed(ctx))
> > +		if (kdamond_aggregate_interval_passed(ctx)) {
> > +			kdamond_merge_regions(ctx, max_nr_accesses / 10);
> >  			kdamond_reset_aggregated(ctx);
> > +			kdamond_split_regions(ctx);
> > +		}
> 
> I wonder if it would be simpler to split the sampling address setup and
> mkold from the access check.  We would have to walk regions twice,
> but not have to bother separately dealing with updating some regions
> if they are modified in the above block.
> 
> Also, the above has some overhead, so will bias that first sample each
> time the block above runs.  If we do the mkold afterwards it will make
> much less difference.

Agreed, it will make code much more simple and easy to read.  However, I'm not
sure how much of the overhead will be biased because 'aggregate interval' is
usually larger than 'sampling interval'.  Anyway, Will change so in the next
spin!


Thanks,
SeongJae Park

[...]


  reply	other threads:[~2020-04-01  8:23 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-03-18 11:27 [PATCH v7 00/15] Introduce Data Access MONitor (DAMON) SeongJae Park
2020-03-18 11:27 ` [PATCH v7 01/15] scripts/spelling: Add a few more typos SeongJae Park
2020-03-18 11:27 ` [PATCH v7 02/15] mm/page_ext: Export lookup_page_ext() to GPL modules SeongJae Park
2020-03-18 11:27 ` [PATCH v7 03/15] mm: Introduce Data Access MONitor (DAMON) SeongJae Park
2020-03-18 11:27 ` [PATCH v7 04/15] mm/damon: Implement region based sampling SeongJae Park
2020-03-31 16:02   ` Jonathan Cameron
2020-04-01  8:22     ` SeongJae Park
2020-04-01 14:24       ` Jonathan Cameron
2020-04-01 14:46         ` SeongJae Park
2020-04-02 13:59       ` SeongJae Park
2020-04-02 17:24         ` Jonathan Cameron
2020-04-02 18:00           ` SeongJae Park
2020-04-03 10:30             ` SeongJae Park
2020-04-02 15:07   ` SeongJae Park
2020-03-18 11:27 ` [PATCH v7 05/15] mm/damon: Adaptively adjust regions SeongJae Park
2020-03-31 16:08   ` Jonathan Cameron
2020-04-01  8:22     ` SeongJae Park [this message]
2020-03-18 11:27 ` [PATCH v7 06/15] mm/damon: Apply dynamic memory mapping changes SeongJae Park
2020-03-18 11:27 ` [PATCH v7 07/15] mm/damon: Implement callbacks SeongJae Park
2020-03-18 11:27 ` [PATCH v7 08/15] mm/damon: Implement access pattern recording SeongJae Park
2020-03-18 11:27 ` [PATCH v7 09/15] mm/damon: Add debugfs interface SeongJae Park
2020-03-18 11:27 ` [PATCH v7 10/15] mm/damon: Add tracepoints SeongJae Park
2020-03-18 11:27 ` [PATCH v7 11/15] tools: Add a minimal user-space tool for DAMON SeongJae Park
2020-03-18 11:27 ` [PATCH v7 12/15] Documentation/admin-guide/mm: Add a document " SeongJae Park
2020-03-18 11:27 ` [PATCH v7 13/15] mm/damon: Add kunit tests SeongJae Park
2020-03-18 11:38 ` [PATCH v7 14/15] mm/damon: Add user space selftests SeongJae Park
2020-03-18 11:39 ` [PATCH v7 15/15] MAINTAINERS: Update for DAMON SeongJae Park
2020-03-30 10:53 ` [PATCH v7 00/15] Introduce Data Access MONitor (DAMON) SeongJae Park

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200401082253.21405-1-sjpark@amazon.com \
    --to=sjpark@amazon.com \
    --cc=Jonathan.Cameron@Huawei.com \
    --cc=aarcange@redhat.com \
    --cc=acme@kernel.org \
    --cc=akpm@linux-foundation.org \
    --cc=alexander.shishkin@linux.intel.com \
    --cc=amit@kernel.org \
    --cc=brendan.d.gregg@gmail.com \
    --cc=brendanhiggins@google.com \
    --cc=cai@lca.pw \
    --cc=colin.king@canonical.com \
    --cc=corbet@lwn.net \
    --cc=dwmw@amazon.com \
    --cc=jolsa@redhat.com \
    --cc=kirill@shutemov.name \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mark.rutland@arm.com \
    --cc=mgorman@suse.de \
    --cc=minchan@kernel.org \
    --cc=mingo@redhat.com \
    --cc=namhyung@kernel.org \
    --cc=peterz@infradead.org \
    --cc=rdunlap@infradead.org \
    --cc=riel@surriel.com \
    --cc=rientjes@google.com \
    --cc=rostedt@goodmis.org \
    --cc=shuah@kernel.org \
    --cc=sj38.park@gmail.com \
    --cc=sjpark@amazon.de \
    --cc=vbabka@suse.cz \
    --cc=vdavydov.dev@gmail.com \
    --cc=yang.shi@linux.alibaba.com \
    --cc=ying.huang@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).