Linux-mm Archive on lore.kernel.org
 help / color / Atom feed
From: SeongJae Park <sjpark@amazon.com>
To: <vrd@amazon.com>
Cc: SeongJae Park <sjpark@amazon.com>, <akpm@linux-foundation.org>,
	"SeongJae Park" <sjpark@amazon.de>, <Jonathan.Cameron@Huawei.com>,
	<aarcange@redhat.com>, <acme@kernel.org>,
	<alexander.shishkin@linux.intel.com>, <amit@kernel.org>,
	<benh@kernel.crashing.org>, <brendan.d.gregg@gmail.com>,
	<brendanhiggins@google.com>, <cai@lca.pw>,
	<colin.king@canonical.com>, <corbet@lwn.net>, <dwmw@amazon.com>,
	<foersleo@amazon.de>, <irogers@google.com>, <jolsa@redhat.com>,
	<kirill@shutemov.name>, <mark.rutland@arm.com>, <mgorman@suse.de>,
	<minchan@kernel.org>, <mingo@redhat.com>, <namhyung@kernel.org>,
	<peterz@infradead.org>, <rdunlap@infradead.org>,
	<riel@surriel.com>, <rientjes@google.com>, <rostedt@goodmis.org>,
	<sblbir@amazon.com>, <shakeelb@google.com>, <shuah@kernel.org>,
	<sj38.park@gmail.com>, <snu@amazon.de>, <vbabka@suse.cz>,
	<vdavydov.dev@gmail.com>, <yang.shi@linux.alibaba.com>,
	<ying.huang@intel.com>, <david@redhat.com>,
	<linux-damon@amazon.com>, <linux-mm@kvack.org>,
	<linux-doc@vger.kernel.org>, <linux-kernel@vger.kernel.org>
Subject: Re: Re: [PATCH v15 03/14] mm/damon: Implement region based sampling
Date: Thu, 11 Jun 2020 09:21:00 +0200
Message-ID: <20200611072100.5283-1-sjpark@amazon.com> (raw)
In-Reply-To: <e9c0655b-0b6c-46b2-275d-22bdcd01c66f@amazon.com> (raw)

On Wed, 10 Jun 2020 22:36:00 +0200 <vrd@amazon.com> wrote:

> On 6/8/20 1:40 PM, SeongJae Park wrote:
> > From: SeongJae Park <sjpark@amazon.de>
> > 
> > This commit implements DAMON's basic access check and region based
> > sampling mechanisms.  This change would seems make no sense, mainly
> > because it is only a part of the DAMON's logics.  Following two commits
> > will make more sense.
> > 
[...]
> > +
> > +/*
> > + * Find three regions separated by two biggest unmapped regions
> > + *
> > + * vma		the head vma of the target address space
> > + * regions	an array of three 'struct region's that results will be saved
> > + *
> > + * This function receives an address space and finds three regions in it which
> > + * separated by the two biggest unmapped regions in the space.  Please refer to
> > + * below comments of 'damon_init_regions_of()' function to know why this is
> > + * necessary.
> > + *
> > + * Returns 0 if success, or negative error code otherwise.
> > + */
> > +static int damon_three_regions_in_vmas(struct vm_area_struct *vma,
> > +		struct region regions[3])
> > +{
> > +	struct region gap = {0}, first_gap = {0}, second_gap = {0};
> > +	struct vm_area_struct *last_vma = NULL;
> > +	unsigned long start = 0;
> > +
> > +	/* Find two biggest gaps so that first_gap > second_gap > others */
> > +	for (; vma; vma = vma->vm_next) {
> 
> Since vm_area_struct already maintains information about the largest gap below this vma
> in the mm_rb rbtree, walking the vma via mm_rb instead of the linked list, and skipping
> the ones with don't fit the gap requirement via vma->rb_subtree_gap helps avoid the
> extra comparisons in this function.

Thanks for the idea!

> 
> I measured the following implementation to be considerably faster as the number of
> vmas grows for a process damon would attach to:
> 
> -static int damon_three_regions_in_vmas(struct vm_area_struct *vma,
> +static int damon_three_regions_in_vmas(struct rb_root *root,
>  		struct region regions[3])
>  {
> +	struct rb_node *nd = NULL;
>  	struct region gap = {0}, first_gap = {0}, second_gap = {0};
> -	struct vm_area_struct *last_vma = NULL;

I like this cleanup.  I'm so wonder how I forgot using '->vm_prev'. :)

> +	struct vm_area_struct *vma = NULL;
>  	unsigned long start = 0;
>  
>  	/* Find two biggest gaps so that first_gap > second_gap > others */
> -	for (; vma; vma = vma->vm_next) {
> -		if (!last_vma) {
> -			start = vma->vm_start;
> -			last_vma = vma;
> +	for (nd = rb_first(root); nd; nd = rb_next(nd)) {
> +		vma = rb_entry(nd, struct vm_area_struct, vm_rb);

This seems meaningless to me.  This will iterate the vma tree in address order,
as same to the old code.  Moreover, 'rb_next()' and 'rb_entry()' might be
slower than the direct reference of '->vm_next'.

> +
> +		if (vma->rb_subtree_gap < sz_region(&second_gap)) {
> +			/*
> +			 * Skip this vma if the largest gap at this vma is still
> +			 * smaller than what we have encountered so far.
> +			 */
>  			continue;

This means we are skipping this node only.  It would make no big difference
from the old code, as we still iterate all nodes.

Rather than that, by the definition of the '->rb_subtree_gap', we could skip
all vmas in the subtree.  For example:

	vma = rb_entry(rb_last(vma->vm_rb), struct vm_area_struct, vm_rb);
	continue;

Nevertheless, this function is not the performance critical point, as this will
be called only once for the initial time in this patch, and the followup
patches will make this function to be called for every regions update interval,
which defaults to 1 second.  The followup patches will also allow users set the
interval large enough and even configure their own optimized version.  For the
reason, I concern simpleness ratherthan performance here.

That said, your fundamental idea obviously makes sense and the changes for that
would be subtle.  I will update this patch in abovely modified way and do some
test.

If I missed something, please let me know.


Thanks,
SeongJae Park


  reply index

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-06-08 11:40 [PATCH v15 00/14] Introduce Data Access MONitor (DAMON) SeongJae Park
2020-06-08 11:40 ` [PATCH v15 01/14] mm/page_ext: Export lookup_page_ext() to GPL modules SeongJae Park
2020-06-08 11:53   ` David Hildenbrand
2020-06-08 11:56     ` SeongJae Park
2020-06-08 15:49     ` Christoph Hellwig
2020-06-08 17:48       ` SeongJae Park
2020-06-08 18:15       ` David Hildenbrand
2020-06-10 20:13   ` vrd
2020-06-08 11:40 ` [PATCH v15 02/14] mm: Introduce Data Access MONitor (DAMON) SeongJae Park
2020-06-08 11:40 ` [PATCH v15 03/14] mm/damon: Implement region based sampling SeongJae Park
2020-06-10 20:36   ` vrd
2020-06-11  7:21     ` SeongJae Park [this message]
2020-06-08 11:40 ` [PATCH v15 04/14] mm/damon: Adaptively adjust regions SeongJae Park
2020-06-10 10:13   ` SeongJae Park
2020-06-08 11:40 ` [PATCH v15 05/14] mm/damon: Apply dynamic memory mapping changes SeongJae Park
2020-06-08 11:40 ` [PATCH v15 06/14] mm/damon: Implement callbacks SeongJae Park
2020-06-08 11:40 ` [PATCH v15 07/14] mm/damon: Implement access pattern recording SeongJae Park
2020-06-08 11:40 ` [PATCH v15 08/14] mm/damon: Add debugfs interface SeongJae Park
2020-06-08 11:40 ` [PATCH v15 09/14] mm/damon: Add tracepoints SeongJae Park
2020-06-08 11:40 ` [PATCH v15 10/14] tools: Add a minimal user-space tool for DAMON SeongJae Park
2020-06-08 11:40 ` [PATCH v15 11/14] Documentation/admin-guide/mm: Add a document " SeongJae Park
2020-06-08 11:40 ` [PATCH v15 12/14] mm/damon: Add kunit tests SeongJae Park
2020-06-08 11:40 ` [PATCH v15 13/14] mm/damon: Add user space selftests SeongJae Park
2020-06-08 11:40 ` [PATCH v15 14/14] MAINTAINERS: Update for DAMON SeongJae Park

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200611072100.5283-1-sjpark@amazon.com \
    --to=sjpark@amazon.com \
    --cc=Jonathan.Cameron@Huawei.com \
    --cc=aarcange@redhat.com \
    --cc=acme@kernel.org \
    --cc=akpm@linux-foundation.org \
    --cc=alexander.shishkin@linux.intel.com \
    --cc=amit@kernel.org \
    --cc=benh@kernel.crashing.org \
    --cc=brendan.d.gregg@gmail.com \
    --cc=brendanhiggins@google.com \
    --cc=cai@lca.pw \
    --cc=colin.king@canonical.com \
    --cc=corbet@lwn.net \
    --cc=david@redhat.com \
    --cc=dwmw@amazon.com \
    --cc=foersleo@amazon.de \
    --cc=irogers@google.com \
    --cc=jolsa@redhat.com \
    --cc=kirill@shutemov.name \
    --cc=linux-damon@amazon.com \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mark.rutland@arm.com \
    --cc=mgorman@suse.de \
    --cc=minchan@kernel.org \
    --cc=mingo@redhat.com \
    --cc=namhyung@kernel.org \
    --cc=peterz@infradead.org \
    --cc=rdunlap@infradead.org \
    --cc=riel@surriel.com \
    --cc=rientjes@google.com \
    --cc=rostedt@goodmis.org \
    --cc=sblbir@amazon.com \
    --cc=shakeelb@google.com \
    --cc=shuah@kernel.org \
    --cc=sj38.park@gmail.com \
    --cc=sjpark@amazon.de \
    --cc=snu@amazon.de \
    --cc=vbabka@suse.cz \
    --cc=vdavydov.dev@gmail.com \
    --cc=vrd@amazon.com \
    --cc=yang.shi@linux.alibaba.com \
    --cc=ying.huang@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Linux-mm Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-mm/0 linux-mm/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-mm linux-mm/ https://lore.kernel.org/linux-mm \
		linux-mm@kvack.org
	public-inbox-index linux-mm

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kvack.linux-mm


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git