Linux-mm Archive on lore.kernel.org
From: SeongJae Park <sjpark@amazon.com>
To: Jonathan Cameron <Jonathan.Cameron@Huawei.com>
Cc: SeongJae Park <sjpark@amazon.com>, <akpm@linux-foundation.org>,
	"SeongJae Park" <sjpark@amazon.de>, <aarcange@redhat.com>,
	<yang.shi@linux.alibaba.com>, <acme@kernel.org>,
	<alexander.shishkin@linux.intel.com>, <amit@kernel.org>,
	<brendan.d.gregg@gmail.com>, <brendanhiggins@google.com>,
	<cai@lca.pw>, <colin.king@canonical.com>, <corbet@lwn.net>,
	<dwmw@amazon.com>, <jolsa@redhat.com>, <kirill@shutemov.name>,
	<mark.rutland@arm.com>, <mgorman@suse.de>, <minchan@kernel.org>,
	<mingo@redhat.com>, <namhyung@kernel.org>, <peterz@infradead.org>,
	<rdunlap@infradead.org>, <rientjes@google.com>,
	<rostedt@goodmis.org>, <shuah@kernel.org>, <sj38.park@gmail.com>,
	<vbabka@suse.cz>, <vdavydov.dev@gmail.com>, <linux-mm@kvack.org>,
	<linux-doc@vger.kernel.org>, <linux-kernel@vger.kernel.org>
Subject: Re: Re: [PATCH v6 02/14] mm/damon: Implement region based sampling
Date: Thu, 12 Mar 2020 10:20:30 +0100
Message-ID: <20200312092030.347-1-sjpark@amazon.com>
In-Reply-To: <20200310173938.00002af4@Huawei.com>

On Tue, 10 Mar 2020 17:39:38 +0000 Jonathan Cameron <Jonathan.Cameron@Huawei.com> wrote:

> On Tue, 10 Mar 2020 17:22:40 +0100
> SeongJae Park <sjpark@amazon.com> wrote:
> 
> > On Tue, 10 Mar 2020 15:55:10 +0000 Jonathan Cameron <Jonathan.Cameron@Huawei.com> wrote:
> > 
> > > On Tue, 10 Mar 2020 12:52:33 +0100
> > > SeongJae Park <sjpark@amazon.com> wrote:
> > >   
> > > > Added replies to your every comment in line below.  I agree to your whole
> > > > opinions, will apply those in next spin! :)
> > > >   
> > > 
> > > One additional question inline that came to mind.  Using a single statistic
> > > to monitor huge page and normal page hits is going to give us problems
> > > I think.  
> > 
> > Ah, you're right!!!  This is indeed a critical bug!
> > 
> > > 
> > > Perhaps I'm missing something?
> > >   
> > > > > > +/*
> > > > > > + * Check whether the given region has accessed since the last check    
> > > > > 
> > > > > > Should also make clear that this sets us up for the next access check at
> > > > > > a different memory address in the region.
> > > > > 
> > > > > Given the lack of connection between activities perhaps just split this into
> > > > > two functions that are always called next to each other.    
> > > > 
> > > > > Will make the description clearer as suggested.
> > > > > 
> > > > > Also, thanks to this comment, I found that I'm not clearing *pte and *pmd
> > > > > before going to 'mkold'.  Will fix that, too.
> > > >   
> > > > >     
> > > > > > + *
> > > > > > + * mm	'mm_struct' for the given virtual address space
> > > > > > + * r	the region to be checked
> > > > > > + */
> > > > > > +static void kdamond_check_access(struct damon_ctx *ctx,
> > > > > > +			struct mm_struct *mm, struct damon_region *r)
> > > > > > +{
> > > > > > +	pte_t *pte = NULL;
> > > > > > +	pmd_t *pmd = NULL;
> > > > > > +	spinlock_t *ptl;
> > > > > > +
> > > > > > +	if (follow_pte_pmd(mm, r->sampling_addr, NULL, &pte, &pmd, &ptl))
> > > > > > +		goto mkold;
> > > > > > +
> > > > > > +	/* Read the page table access bit of the page */
> > > > > > +	if (pte && pte_young(*pte))
> > > > > > +		r->nr_accesses++;
> > > > > > +#ifdef CONFIG_TRANSPARENT_HUGEPAGE    
> > > > > 
> > > > > Is it worth having this protection?  Seems likely to have only a very small
> > > > > influence on performance and makes it a little harder to reason about the code.    
> > > > 
> > > > It was necessary for addressing 'implicit declaration' problem of 'pmd_young()'
> > > > and 'pmd_mkold()' for build of DAMON on several architectures including User
> > > > Mode Linux.
> > > > 
> > > > Will modularize the code for better readability.
> > > >   
> > > > >     
> > > > > > +	else if (pmd && pmd_young(*pmd))
> > > > > > +		r->nr_accesses++;  
> > > 
> > > So we increment a region count by one if we have an access in a huge page, or
> > > in a normal page.
> > > 
> > > If we get a region that has a mixture of the two, this seems likely to give a
> > > bad approximation.
> > > 
> > > > Assume the region is accessed 'evenly' but each "4k page" is only hit 10% of the time
> > > > (where a hit is in one check period)
> > > > 
> > > > If our address is in a 4k page, then we'll hit 10% of the time, but if it is in a 2M
> > > > huge page then we'll hit a much higher percentage of the time.
> > > 1 - (0.9^512) ~= 1
> > > 
> > > Should we look to somehow account for this?  
> > 
> > Yes, this is really critical bug and we should fix this!  Thank you so much for
> > finding this!
> > 
> > >   
> > > > > > +#endif	/* CONFIG_TRANSPARENT_HUGEPAGE */
> > > > > > +
> > > > > > +	spin_unlock(ptl);
> > > > > > +
> > > > > > +mkold:
> > > > > > +	/* mkold next target */
> > > > > > +	r->sampling_addr = damon_rand(ctx, r->vm_start, r->vm_end);
> > > > > > +
> > > > > > +	if (follow_pte_pmd(mm, r->sampling_addr, NULL, &pte, &pmd, &ptl))
> > > > > > +		return;
> > > > > > +
> > > > > > +	if (pte) {
> > > > > > +		if (pte_young(*pte)) {
> > > > > > +			clear_page_idle(pte_page(*pte));
> > > > > > +			set_page_young(pte_page(*pte));
> > > > > > +		}
> > > > > > +		*pte = pte_mkold(*pte);
> > > > > > +	}
> > > > > > +#ifdef CONFIG_TRANSPARENT_HUGEPAGE
> > > > > > +	else if (pmd) {
> > > > > > +		if (pmd_young(*pmd)) {
> > > > > > +			clear_page_idle(pmd_page(*pmd));
> > > > > > +			set_page_young(pmd_page(*pmd));
> > > > > > +		}
> > > > > > +		*pmd = pmd_mkold(*pmd);
> > > > > > +	}  
> > 
> > This is also very problematic if several regions are backed by a single huge
> > page, as only one region in the huge page will be checked as accessed.
> > 
> > Will address these problems in next spin!
> 
> Good point.  There is little point in ever having multiple regions including
> a single huge page.  Would it be possible to tweak the region splitting algorithm
> to not do this?

Yes, that would be a good solution.  However, I believe this is a problem of
the access checking mechanism, as a region is defined only as a 'memory area
having similar access frequency'.  Adding more rules, such as 'it should be
aligned to the huge page size', might make things more complex.  Also, we're
currently using the page table Accessed bits as the primitive for the access
check, but this could be extended to other primitives in the future.
Therefore, I would like to make the access checking mechanism aware of the
existence of huge pages.
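
To make the idea concrete, below is a minimal user-space sketch of one
possible way to do this.  It is not the patch code and not necessarily what
the next spin will do.  All names in it ('fake_page', 'toy_region',
'last_check', 'toy_lookup()', 'toy_check_access()') are hypothetical, the
page table is modeled as a plain array, and checking and aging are folded
into one step for brevity.  The only point it shows is that, when a region's
sampling address falls in the page that was just checked for the previous
region, the previous result is reused instead of reading an Accessed bit
that has just been cleared.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for one page table mapping (hypothetical, not a kernel type). */
struct fake_page {
	unsigned long start;	/* first address backed by this page */
	unsigned long size;	/* e.g. 4 KiB regular page or 2 MiB huge page */
	bool young;		/* models the Accessed bit */
};

/* Simplified region; only the fields this sketch needs. */
struct toy_region {
	unsigned long sampling_addr;
	unsigned int nr_accesses;
};

/* Result of the most recent check, shared by the regions of one pass. */
struct last_check {
	bool valid;
	unsigned long start;
	unsigned long size;
	bool accessed;
};

/* Find the modeled page backing 'addr', or NULL if nothing is mapped. */
static struct fake_page *toy_lookup(struct fake_page *pages, size_t nr_pages,
				    unsigned long addr)
{
	size_t i;

	for (i = 0; i < nr_pages; i++) {
		if (addr >= pages[i].start &&
		    addr - pages[i].start < pages[i].size)
			return &pages[i];
	}
	return NULL;
}

/*
 * Check one region.  If its sampling address is backed by the page that the
 * previous call already checked (and aged), reuse that result rather than
 * reading the just-cleared bit again.
 */
static void toy_check_access(struct fake_page *pages, size_t nr_pages,
			     struct toy_region *r, struct last_check *last)
{
	struct fake_page *page;

	if (last->valid && r->sampling_addr >= last->start &&
	    r->sampling_addr - last->start < last->size) {
		if (last->accessed)
			r->nr_accesses++;
		return;
	}

	page = toy_lookup(pages, nr_pages, r->sampling_addr);
	if (!page) {
		last->valid = false;
		return;
	}

	last->valid = true;
	last->start = page->start;
	last->size = page->size;
	last->accessed = page->young;
	if (page->young)
		r->nr_accesses++;
	page->young = false;	/* models pte_mkold() / pmd_mkold() */
}
```

With two regions whose sampling addresses are backed by the same accessed
2 MiB page, both regions get their 'nr_accesses' incremented in this model,
while the definition of the region itself stays untouched.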

For regions containing both regular pages and huge pages, the huge pages will
cause an erroneously high access frequency, as you noted before, but the
adaptive regions adjustment will eventually split them.
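
For reference, the arithmetic behind that overestimation can be sketched as
below.  This is only an illustration: 'toy_hit_probability()' is a
hypothetical helper, and it assumes every base page is hit independently
with the same probability in one check period, which real workloads will not
follow exactly.

```c
/*
 * Probability that a sampled page shows its Accessed bit set, when the page
 * covers 'nr_base_pages' base pages and each base page is independently hit
 * with probability 'p_base' in one check period (assumption of this sketch).
 */
static double toy_hit_probability(double p_base, unsigned int nr_base_pages)
{
	double miss = 1.0;	/* probability that no base page was hit */
	unsigned int i;

	for (i = 0; i < nr_base_pages; i++)
		miss *= 1.0 - p_base;
	return 1.0 - miss;
}
```

With 'p_base' of 0.1, a single 4k page gives 0.1, while a 2M huge page
covering 512 base pages gives 1 - (0.9^512), which is practically 1.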

If you have other concerns or opinions, please let me know.


Thanks,
SeongJae Park

> 
> Jonathan
> 
> > 
> > 
> > Thanks,
> > SeongJae Park
> > 
> > > > > > +#endif
> > > > > > +
> > > > > > +	spin_unlock(ptl);
> > > > > > +}
> > > > > > +  
> > > 
> > >   
> 
> 



