From: SeongJae Park <sjpark@amazon.com>
To: Jonathan Cameron <Jonathan.Cameron@Huawei.com>
Cc: SeongJae Park <sjpark@amazon.com>, <akpm@linux-foundation.org>,
	"SeongJae Park" <sjpark@amazon.de>, <aarcange@redhat.com>,
	<yang.shi@linux.alibaba.com>, <acme@kernel.org>,
	<alexander.shishkin@linux.intel.com>, <amit@kernel.org>,
	<brendan.d.gregg@gmail.com>, <brendanhiggins@google.com>,
	<cai@lca.pw>, <colin.king@canonical.com>, <corbet@lwn.net>,
	<dwmw@amazon.com>, <jolsa@redhat.com>, <kirill@shutemov.name>,
	<mark.rutland@arm.com>, <mgorman@suse.de>, <minchan@kernel.org>,
	<mingo@redhat.com>, <namhyung@kernel.org>, <peterz@infradead.org>,
	<rdunlap@infradead.org>, <rientjes@google.com>,
	<rostedt@goodmis.org>, <shuah@kernel.org>, <sj38.park@gmail.com>,
	<vbabka@suse.cz>, <vdavydov.dev@gmail.com>, <linux-mm@kvack.org>,
	<linux-doc@vger.kernel.org>, <linux-kernel@vger.kernel.org>
Subject: Re: Re: [PATCH v6 02/14] mm/damon: Implement region based sampling
Date: Tue, 10 Mar 2020 12:52:33 +0100	[thread overview]
Message-ID: <20200310115233.23246-1-sjpark@amazon.com> (raw)
In-Reply-To: <20200310085721.00000a0f@Huawei.com> (raw)

I have replied to each of your comments inline below.  I agree with all of
your suggestions and will apply them in the next spin! :)

On Tue, 10 Mar 2020 08:57:21 +0000 Jonathan Cameron <Jonathan.Cameron@Huawei.com> wrote:

> On Mon, 24 Feb 2020 13:30:35 +0100
> SeongJae Park <sjpark@amazon.com> wrote:
> 
> > From: SeongJae Park <sjpark@amazon.de>
> > 
> > This commit implements DAMON's basic access check and region based
> > sampling mechanisms.  This change would seems make no sense, mainly
> > because it is only a part of the DAMON's logics.  Following two commits
> > will make more sense.
> > 
> > This commit also exports `lookup_page_ext()` to GPL modules because
> > DAMON uses the function but also supports the module build.
> 
> Do that as a separate patch before this one.  Makes it easy to spot.

Agreed, will do so.

> 
> > 
[...]
> 
> Various things inline. In particular, can you make use of the standard
> kthread_stop infrastructure rather than rolling your own?

Nice suggestion!  That will be much better, will use it.
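
To make sure I got the idea right, below is a rough (untested) sketch of what
I will do.  The main loop will key off kthread_should_stop():

    static int kdamond_fn(void *data)
    {
            struct damon_ctx *ctx = data;
            ...
            while (!kthread_should_stop() && !kdamond_need_stop(ctx)) {
                    /* sample and aggregate as before */
                    ...
            }
            ...
            return 0;
    }

and the controller will then stop it with kthread_stop(), which wakes the
thread and waits until kdamond_fn() returns, so no hand-rolled stop flag or
polling is needed.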

> 
> > ---
> >  mm/damon.c    | 509 ++++++++++++++++++++++++++++++++++++++++++++++++++
> >  mm/page_ext.c |   1 +
> >  2 files changed, 510 insertions(+)
> > 
> > diff --git a/mm/damon.c b/mm/damon.c
> > index aafdca35b7b8..6bdeb84d89af 100644
> > --- a/mm/damon.c
> > +++ b/mm/damon.c
> > @@ -9,9 +9,14 @@
> >  
[...]
> > +/*
> > + * Get the mm_struct of the given task
> > + *
> > + * Callser should put the mm_struct after use, unless it is NULL.
> 
> Caller 

Good eye!  Will fix it.

> 
> > + *
> > + * Returns the mm_struct of the task on success, NULL on failure
> > + */
> > +static struct mm_struct *damon_get_mm(struct damon_task *t)
> > +{
> > +	struct task_struct *task;
> > +	struct mm_struct *mm;
> > +
> > +	task = damon_get_task_struct(t);
> > +	if (!task)
> > +		return NULL;
> > +
> > +	mm = get_task_mm(task);
> > +	put_task_struct(task);
> > +	return mm;
> > +}
> > +
> > +/*
> > + * Size-evenly split a region into 'nr_pieces' small regions
> > + *
> > + * Returns 0 on success, or negative error code otherwise.
> > + */
> > +static int damon_split_region_evenly(struct damon_ctx *ctx,
> > +		struct damon_region *r, unsigned int nr_pieces)
> > +{
> > +	unsigned long sz_orig, sz_piece, orig_end;
> > +	struct damon_region *piece = NULL, *next;
> > +	unsigned long start;
> > +
> > +	if (!r || !nr_pieces)
> > +		return -EINVAL;
> > +
> > +	orig_end = r->vm_end;
> > +	sz_orig = r->vm_end - r->vm_start;
> > +	sz_piece = sz_orig / nr_pieces;
> > +
> > +	if (!sz_piece)
> > +		return -EINVAL;
> > +
> > +	r->vm_end = r->vm_start + sz_piece;
> > +	next = damon_next_region(r);
> > +	for (start = r->vm_end; start + sz_piece <= orig_end;
> > +			start += sz_piece) {
> > +		piece = damon_new_region(ctx, start, start + sz_piece);
> > +		damon_add_region(piece, r, next);
> > +		r = piece;
> > +	}
> 
> I'd add a comment here. I think this next bit is to catch any rounding error
> holes, but I'm not 100% sure.

Yes, will make it clearer.
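
Something like this, maybe (wording to be polished):

    /*
     * 'sz_piece' is the rounded-down result of the division, so the
     * pieces made in the loop above may not cover the whole region;
     * let the last piece absorb the remaining bytes up to 'orig_end'.
     */
    if (piece)
            piece->vm_end = orig_end;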

> 
> > +	if (piece)
> > +		piece->vm_end = orig_end;
> 
> blank line here.

Will add.

> 
> > +	return 0;
> > +}
[...]
> > +/*
> > + * Initialize the monitoring target regions for the given task
> > + *
> > + * t	the given target task
> > + *
> > + * Because only a number of small portions of the entire address space
> > + * is acutally mapped to the memory and accessed, monitoring the unmapped
> 
> actually

Good eye!  Will consider adding these to 'scripts/spelling.txt'.
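
For example, entries like below in the file's 'misspelling||correction'
format, if they are not already there:

    callser||caller
    acutally||actually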

> 
[...]
> > +/*
> > + * Check whether the given region has accessed since the last check
> 
> Should also make clear that this sets us up for the next access check at
> a different memory address in the region.
> 
> Given the lack of connection between activities perhaps just split this into
> two functions that are always called next to each other.

Will make the description clearer, as suggested.

Also, thanks to this comment, I found that I'm not resetting 'pte' and 'pmd'
before the 'mkold' page table walk.  Will fix that as well.
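
The fix will look something like below (untested), so that a stale pointer
from the access check above cannot survive into the 'mkold' part:

    mkold:
            /* mkold next target */
            r->sampling_addr = damon_rand(ctx, r->vm_start, r->vm_end);

            /* reset the outputs of the previous walk before reusing them */
            pte = NULL;
            pmd = NULL;
            if (follow_pte_pmd(mm, r->sampling_addr, NULL, &pte, &pmd, &ptl))
                    return;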

> 
> > + *
> > + * mm	'mm_struct' for the given virtual address space
> > + * r	the region to be checked
> > + */
> > +static void kdamond_check_access(struct damon_ctx *ctx,
> > +			struct mm_struct *mm, struct damon_region *r)
> > +{
> > +	pte_t *pte = NULL;
> > +	pmd_t *pmd = NULL;
> > +	spinlock_t *ptl;
> > +
> > +	if (follow_pte_pmd(mm, r->sampling_addr, NULL, &pte, &pmd, &ptl))
> > +		goto mkold;
> > +
> > +	/* Read the page table access bit of the page */
> > +	if (pte && pte_young(*pte))
> > +		r->nr_accesses++;
> > +#ifdef CONFIG_TRANSPARENT_HUGEPAGE
> 
> Is it worth having this protection?  Seems likely to have only a very small
> influence on performance and makes it a little harder to reason about the code.

It was needed to avoid 'implicit declaration' build errors for 'pmd_young()'
and 'pmd_mkold()' on several architectures that do not support transparent
huge pages, including User Mode Linux.

Will modularize the code for better readability.
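
For the modularization, I am thinking of a small helper along these lines
(the name is tentative), so that the call site needs no #ifdef:

    #ifdef CONFIG_TRANSPARENT_HUGEPAGE
    static bool damon_pmd_young(pmd_t *pmd)
    {
            return pmd && pmd_young(*pmd);
    }
    #else
    static bool damon_pmd_young(pmd_t *pmd)
    {
            return false;
    }
    #endif	/* CONFIG_TRANSPARENT_HUGEPAGE */

and then the check in kdamond_check_access() simply becomes
'else if (damon_pmd_young(pmd))'.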

> 
> > +	else if (pmd && pmd_young(*pmd))
> > +		r->nr_accesses++;
> > +#endif	/* CONFIG_TRANSPARENT_HUGEPAGE */
> > +
> > +	spin_unlock(ptl);
> > +
> > +mkold:
> > +	/* mkold next target */
> > +	r->sampling_addr = damon_rand(ctx, r->vm_start, r->vm_end);
> > +
> > +	if (follow_pte_pmd(mm, r->sampling_addr, NULL, &pte, &pmd, &ptl))
> > +		return;
> > +
> > +	if (pte) {
> > +		if (pte_young(*pte)) {
> > +			clear_page_idle(pte_page(*pte));
> > +			set_page_young(pte_page(*pte));
> > +		}
> > +		*pte = pte_mkold(*pte);
> > +	}
> > +#ifdef CONFIG_TRANSPARENT_HUGEPAGE
> > +	else if (pmd) {
> > +		if (pmd_young(*pmd)) {
> > +			clear_page_idle(pmd_page(*pmd));
> > +			set_page_young(pmd_page(*pmd));
> > +		}
> > +		*pmd = pmd_mkold(*pmd);
> > +	}
> > +#endif
> > +
> > +	spin_unlock(ptl);
> > +}
> > +
> > +/*
> > + * Check whether a time interval is elapsed
> 
> Another comment block that would be clearer if it was kernel-doc rather
> than nearly kernel-doc

Will apply the kernel-doc syntax.
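
For example, for this function:

    /**
     * damon_check_reset_time_interval() - Check if a time interval has elapsed.
     * @baseline:	baseline time to measure the interval from
     * @interval:	the time interval (microseconds)
     *
     * See whether the given time interval has passed since the given baseline
     * time.  If so, it also updates the baseline to the current time for the
     * next check.
     *
     * Return: true if the time interval has passed, or false otherwise.
     */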

> 
> > + *
> > + * baseline	the time to check whether the interval has elapsed since
> > + * interval	the time interval (microseconds)
> > + *
> > + * See whether the given time interval has passed since the given baseline
> > + * time.  If so, it also updates the baseline to current time for next check.
> > + *
> > + * Returns true if the time interval has passed, or false otherwise.
> > + */
> > +static bool damon_check_reset_time_interval(struct timespec64 *baseline,
> > +		unsigned long interval)
> > +{
> > +	struct timespec64 now;
> > +
> > +	ktime_get_coarse_ts64(&now);
> > +	if ((timespec64_to_ns(&now) - timespec64_to_ns(baseline)) <
> > +			interval * 1000)
> > +		return false;
> > +	*baseline = now;
> > +	return true;
> > +}
> > +
> > +/*
> > + * Check whether it is time to flush the aggregated information
> > + */
> > +static bool kdamond_aggregate_interval_passed(struct damon_ctx *ctx)
> > +{
> > +	return damon_check_reset_time_interval(&ctx->last_aggregation,
> > +			ctx->aggr_interval);
> > +}
> > +
> > +/*
> > + * Reset the aggregated monitoring results
> > + */
> > +static void kdamond_flush_aggregated(struct damon_ctx *c)
> 
> I wouldn't expect a reset function to be called flush.

It will act as a flush from the next commit on, but the name indeed makes no
sense at this point.  Will rename it.

> 
> > +{
> > +	struct damon_task *t;
> > +	struct damon_region *r;
> > +
> > +	damon_for_each_task(c, t) {
> > +		damon_for_each_region(r, t)
> > +			r->nr_accesses = 0;
> > +	}
> > +}
> > +
> > +/*
> > + * Check whether current monitoring should be stopped
> > + *
> > + * If users asked to stop, need stop.  Even though no user has asked to stop,
> > + * need stop if every target task has dead.
> > + *
> > + * Returns true if need to stop current monitoring.
> > + */
> > +static bool kdamond_need_stop(struct damon_ctx *ctx)
> > +{
> > +	struct damon_task *t;
> > +	struct task_struct *task;
> > +	bool stop;
> > +
> 
> As below comment asks, can you use kthread_should_stop?

Yes, I will.

> 
> > +	spin_lock(&ctx->kdamond_lock);
> > +	stop = ctx->kdamond_stop;
> > +	spin_unlock(&ctx->kdamond_lock);
> > +	if (stop)
> > +		return true;
> > +
> > +	damon_for_each_task(ctx, t) {
> > +		task = damon_get_task_struct(t);
> > +		if (task) {
> > +			put_task_struct(task);
> > +			return false;
> > +		}
> > +	}
> > +
> > +	return true;
> > +}
> > +
> > +/*
> > + * The monitoring daemon that runs as a kernel thread
> > + */
> > +static int kdamond_fn(void *data)
> > +{
> > +	struct damon_ctx *ctx = (struct damon_ctx *)data;
> 
> Never any need to explicitly cast a void * to some other pointer type.
> (C spec)

Ah, you're right.

> 
> 	struct damon_ctx *ctx = data;
> > +	struct damon_task *t;
> > +	struct damon_region *r, *next;
> > +	struct mm_struct *mm;
> > +
> > +	pr_info("kdamond (%d) starts\n", ctx->kdamond->pid);
> > +	kdamond_init_regions(ctx);
> > +	while (!kdamond_need_stop(ctx)) {
> > +		damon_for_each_task(ctx, t) {
> > +			mm = damon_get_mm(t);
> > +			if (!mm)
> > +				continue;
> > +			damon_for_each_region(r, t)
> > +				kdamond_check_access(ctx, mm, r);
> > +			mmput(mm);
> > +		}
> > +
> > +		if (kdamond_aggregate_interval_passed(ctx))
> > +			kdamond_flush_aggregated(ctx);
> > +
> > +		usleep_range(ctx->sample_interval, ctx->sample_interval + 1);
> 
> Is there any purpose in using a range for such a narrow window?

Actually, it only needs to sleep for 'ctx->sample_interval', so I made the
range as narrow as possible.

> 
> > +	}
> > +	damon_for_each_task(ctx, t) {
> > +		damon_for_each_region_safe(r, next, t)
> > +			damon_destroy_region(r);
> > +	}
> > +	pr_info("kdamond (%d) finishes\n", ctx->kdamond->pid);
> 
> Feels like noise.  I'd drop this to pr_debug.

Agreed, will remove it.

> 
> > +	spin_lock(&ctx->kdamond_lock);
> > +	ctx->kdamond = NULL;
> > +	spin_unlock(&ctx->kdamond_lock);
> 
> blank line.

Yup!

> 
> > +	return 0;
> > +}
> > +
> > +/*
> > + * Controller functions
> > + */
> > +
> > +/*
> > + * Start or stop the kdamond
> > + *
> > + * Returns 0 if success, negative error code otherwise.
> > + */
> > +static int damon_turn_kdamond(struct damon_ctx *ctx, bool on)
> > +{
> > +	spin_lock(&ctx->kdamond_lock);
> > +	ctx->kdamond_stop = !on;
> 
> Can't use the kthread_stop / kthread_should_stop approach?

Will use it.

> 
> > +	if (!ctx->kdamond && on) {
> > +		ctx->kdamond = kthread_run(kdamond_fn, ctx, "kdamond");
> > +		if (!ctx->kdamond)
> > +			goto fail;
> > +		goto success;
> 
> cleaner as 
> int ret = 0; above then
> 
> 		if (!ctx->kdamond)
> 			ret = -EINVAL;
> 		goto unlock;
> 
> with
> 
> unlock:
> 	spin_unlock(&ctx->kdamond_lock);
> 	return ret;

Agreed, will change it as suggested.
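
The 'on' path will then look roughly like below (untested).  While at it, I
will also fix the error check, since kthread_run() returns an ERR_PTR() on
failure rather than NULL:

    int err = -EINVAL;

    spin_lock(&ctx->kdamond_lock);
    if (!ctx->kdamond && on) {
            ctx->kdamond = kthread_run(kdamond_fn, ctx, "kdamond");
            if (IS_ERR(ctx->kdamond)) {
                    err = PTR_ERR(ctx->kdamond);
                    ctx->kdamond = NULL;
            } else {
                    err = 0;
            }
            goto unlock;
    }
    ...
unlock:
    spin_unlock(&ctx->kdamond_lock);
    return err;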

> 
> > +	}
> > +	if (ctx->kdamond && !on) {
> > +		spin_unlock(&ctx->kdamond_lock);
> > +		while (true) {
> 
> An unbounded loop is probably a bad idea.

Will add a clear exit condition here.
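
Or rather, if I move to kthread_stop() as suggested above, this polling loop
can go away entirely.  A rough sketch (untested; the lifetime of the
task_struct would need care):

    struct task_struct *thread;

    spin_lock(&ctx->kdamond_lock);
    thread = ctx->kdamond;
    ctx->kdamond = NULL;
    spin_unlock(&ctx->kdamond_lock);

    if (!thread)
            return -EINVAL;

    /* sleeps until kdamond_fn() returns, so the wait is bounded */
    return kthread_stop(thread);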

> 
> > +			spin_lock(&ctx->kdamond_lock);
> > +			if (!ctx->kdamond)
> > +				goto success;
> > +			spin_unlock(&ctx->kdamond_lock);
> > +
> > +			usleep_range(ctx->sample_interval,
> > +					ctx->sample_interval * 2);
> > +		}
> > +	}
> > +
> > +	/* tried to turn on while turned on, or turn off while turned off */
> > +
> > +fail:
> > +	spin_unlock(&ctx->kdamond_lock);
> > +	return -EINVAL;
> > +
> > +success:
> > +	spin_unlock(&ctx->kdamond_lock);
> > +	return 0;
> > +}
> > +
> > +/*
> > + * This function should not be called while the kdamond is running.
> > + */
> > +static int damon_set_pids(struct damon_ctx *ctx,
> > +			unsigned long *pids, ssize_t nr_pids)
> > +{
> > +	ssize_t i;
> > +	struct damon_task *t, *next;
> > +
> > +	damon_for_each_task_safe(ctx, t, next)
> > +		damon_destroy_task(t);
> > +
> > +	for (i = 0; i < nr_pids; i++) {
> > +		t = damon_new_task(pids[i]);
> > +		if (!t) {
> > +			pr_err("Failed to alloc damon_task\n");
> > +			return -ENOMEM;
> > +		}
> > +		damon_add_task_tail(ctx, t);
> > +	}
> > +
> > +	return 0;
> > +}
> > +
> > +/*
> 
> This is kind of similar to kernel-doc formatting.  Might as well just make
> it kernel-doc!

Agreed, will do so!

> 
> > + * Set attributes for the monitoring
> > + *
> > + * sample_int		time interval between samplings
> > + * aggr_int		time interval between aggregations
> > + * min_nr_reg		minimal number of regions
> > + *
> > + * This function should not be called while the kdamond is running.
> > + * Every time interval is in micro-seconds.
> > + *
> > + * Returns 0 on success, negative error code otherwise.
> > + */
> > +static int damon_set_attrs(struct damon_ctx *ctx, unsigned long sample_int,
> > +		unsigned long aggr_int, unsigned long min_nr_reg)
> > +{
> > +	if (min_nr_reg < 3) {
> > +		pr_err("min_nr_regions (%lu) should be bigger than 2\n",
> > +				min_nr_reg);
> > +		return -EINVAL;
> > +	}
> > +
> > +	ctx->sample_interval = sample_int;
> > +	ctx->aggr_interval = aggr_int;
> > +	ctx->min_nr_regions = min_nr_reg;
> 
> blank line helps readability a tiny little bit.

Agreed!


Thanks,
SeongJae Park

> 
> > +	return 0;
> > +}
> > +
> >  static int __init damon_init(void)
> >  {
> >  	pr_info("init\n");
> > diff --git a/mm/page_ext.c b/mm/page_ext.c
> > index 4ade843ff588..71169b45bba9 100644
> > --- a/mm/page_ext.c
> > +++ b/mm/page_ext.c
> > @@ -131,6 +131,7 @@ struct page_ext *lookup_page_ext(const struct page *page)
> >  					MAX_ORDER_NR_PAGES);
> >  	return get_entry(base, index);
> >  }
> > +EXPORT_SYMBOL_GPL(lookup_page_ext);
> >  
> >  static int __init alloc_node_page_ext(int nid)
> >  {


Thread overview: 54+ messages
2020-02-24 12:30 [PATCH v6 00/14] Introduce Data Access MONitor (DAMON) SeongJae Park
2020-02-24 12:30 ` [PATCH v6 01/14] mm: " SeongJae Park
2020-03-10  8:54   ` Jonathan Cameron
2020-03-10 11:50     ` SeongJae Park
2020-02-24 12:30 ` [PATCH v6 02/14] mm/damon: Implement region based sampling SeongJae Park
2020-03-10  8:57   ` Jonathan Cameron
2020-03-10 11:52     ` SeongJae Park [this message]
2020-03-10 15:55       ` Jonathan Cameron
2020-03-10 16:22         ` SeongJae Park
2020-03-10 17:39           ` Jonathan Cameron
2020-03-12  9:20             ` SeongJae Park
2020-03-13 17:29   ` Jonathan Cameron
2020-03-13 20:16     ` SeongJae Park
2020-03-17 11:32       ` SeongJae Park
2020-02-24 12:30 ` [PATCH v6 03/14] mm/damon: Adaptively adjust regions SeongJae Park
2020-03-10  8:57   ` Jonathan Cameron
2020-03-10 11:53     ` SeongJae Park
2020-02-24 12:30 ` [PATCH v6 04/14] mm/damon: Apply dynamic memory mapping changes SeongJae Park
2020-03-10  9:00   ` Jonathan Cameron
2020-03-10 11:53     ` SeongJae Park
2020-02-24 12:30 ` [PATCH v6 05/14] mm/damon: Implement callbacks SeongJae Park
2020-03-10  9:01   ` Jonathan Cameron
2020-03-10 11:55     ` SeongJae Park
2020-02-24 12:30 ` [PATCH v6 06/14] mm/damon: Implement access pattern recording SeongJae Park
2020-03-10  9:01   ` Jonathan Cameron
2020-03-10 11:55     ` SeongJae Park
2020-02-24 12:30 ` [PATCH v6 07/14] mm/damon: Implement kernel space API SeongJae Park
2020-03-10  9:01   ` Jonathan Cameron
2020-03-10 11:56     ` SeongJae Park
2020-02-24 12:30 ` [PATCH v6 08/14] mm/damon: Add debugfs interface SeongJae Park
2020-03-10  9:02   ` Jonathan Cameron
2020-03-10 11:56     ` SeongJae Park
2020-02-24 12:30 ` [PATCH v6 09/14] mm/damon: Add a tracepoint for result writing SeongJae Park
2020-03-10  9:03   ` Jonathan Cameron
2020-03-10 11:57     ` SeongJae Park
2020-02-24 12:30 ` [PATCH v6 10/14] tools: Add a minimal user-space tool for DAMON SeongJae Park
2020-02-24 12:30 ` [PATCH v6 11/14] Documentation/admin-guide/mm: Add a document " SeongJae Park
2020-03-10  9:03   ` Jonathan Cameron
2020-03-10 11:57     ` SeongJae Park
2020-02-24 12:30 ` [PATCH v6 12/14] mm/damon: Add kunit tests SeongJae Park
2020-02-24 12:30 ` [PATCH v6 13/14] mm/damon: Add user selftests SeongJae Park
2020-02-24 12:30 ` [PATCH v6 14/14] MAINTAINERS: Update for DAMON SeongJae Park
2020-03-02 11:35 ` [PATCH v6 00/14] Introduce Data Access MONitor (DAMON) SeongJae Park
2020-03-09 10:23   ` SeongJae Park
2020-03-10 17:21 ` Shakeel Butt
2020-03-12 10:07   ` SeongJae Park
2020-03-12 10:43     ` SeongJae Park
2020-03-18 19:52       ` Shakeel Butt
2020-03-19  9:03         ` SeongJae Park
2020-03-23 17:29           ` Shakeel Butt
2020-03-24  8:34             ` SeongJae Park
