linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Vladimir Davydov <vdavydov@parallels.com>
To: Minchan Kim <minchan@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Michal Hocko <mhocko@suse.cz>, Greg Thelen <gthelen@google.com>,
	Michel Lespinasse <walken@google.com>,
	David Rientjes <rientjes@google.com>,
	Pavel Emelyanov <xemul@parallels.com>,
	Cyrill Gorcunov <gorcunov@openvz.org>,
	Jonathan Corbet <corbet@lwn.net>, <linux-api@vger.kernel.org>,
	<linux-doc@vger.kernel.org>, <linux-mm@kvack.org>,
	<cgroups@vger.kernel.org>, <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH v3 3/3] proc: add kpageidle file
Date: Wed, 29 Apr 2015 12:12:48 +0300	[thread overview]
Message-ID: <20150429091248.GD1694@esperanza> (raw)
In-Reply-To: <20150429043536.GB11486@blaptop>

On Wed, Apr 29, 2015 at 01:35:36PM +0900, Minchan Kim wrote:
> On Tue, Apr 28, 2015 at 03:24:42PM +0300, Vladimir Davydov wrote:
> > diff --git a/fs/proc/page.c b/fs/proc/page.c
> > index 70d23245dd43..cfc55ba7fee6 100644
> > --- a/fs/proc/page.c
> > +++ b/fs/proc/page.c
> > @@ -275,6 +275,156 @@ static const struct file_operations proc_kpagecgroup_operations = {
> >  };
> >  #endif /* CONFIG_MEMCG */
> >  
> > +#ifdef CONFIG_IDLE_PAGE_TRACKING
> > +static struct page *kpageidle_get_page(unsigned long pfn)
> > +{
> > +	struct page *page;
> > +
> > +	if (!pfn_valid(pfn))
> > +		return NULL;
> > +	page = pfn_to_page(pfn);
> > +	/*
> > +	 * We are only interested in user memory pages, i.e. pages that are
> > +	 * allocated and on an LRU list.
> > +	 */
> > +	if (!page || page_count(page) == 0 || !PageLRU(page))
> 
> Why do you check (page_count == 0) even if we check it with get_page_unless_zero
> below?

I intended to avoid overhead of cmpxchg in case page_count is 0, but
diving deeper into get_page_unless_zero, I see that it already handles
such a scenario, so this check is useless. I'll remove it.

> 
> > +		return NULL;
> > +	if (!get_page_unless_zero(page))
> > +		return NULL;
> > +	if (unlikely(!PageLRU(page))) {
> 
> What lock protect the check PageLRU?
> If it is racing ClearPageLRU, what happens?

If we hold a reference to a page and see that it's on an LRU list, it
will surely remain a user memory page at least until we release the
reference to it, so it must be safe to play with idle/young flags. If we
race with isolate_lru_page, or any similar function temporarily clearing
PG_lru, we will silently skip the page w/o touching its idle/young
flags. We could consider isolated pages too, but that would increase the
cost of this function.

If you find this explanation OK, I'll add it to the comment to this
function.

> 
> > +		put_page(page);
> > +		return NULL;
> > +	}
> > +	return page;
> > +}
> > +
> > +static void kpageidle_clear_refs(struct page *page)
> > +{
> > +	unsigned long dummy;
> > +
> > +	if (page_referenced(page, 0, NULL, &dummy))
> > +		/*
> > +		 * This page was referenced. To avoid interference with the
> > +		 * reclaimer, mark it young so that the next call will also
> 
>                                                         next what call?
> 
> It just works with mapped page so kpageidle_clear_pte_refs as function name
> is more clear.
> 
> One more, kpageidle_clear_refs removes PG_idle via page_referenced which
> is important feature for the function. Please document it so we could
> understand why we need double check for PG_idle after calling
> kpageidle_clear_refs for pte access bit.

Sounds reasonable, will do.

> > diff --git a/mm/rmap.c b/mm/rmap.c
> > index 24dd3f9fee27..12e73b758d9e 100644
> > --- a/mm/rmap.c
> > +++ b/mm/rmap.c
> > @@ -784,6 +784,13 @@ static int page_referenced_one(struct page *page, struct vm_area_struct *vma,
> >  	if (referenced) {
> >  		pra->referenced++;
> >  		pra->vm_flags |= vma->vm_flags;
> > +		if (page_is_idle(page))
> > +			clear_page_idle(page);
> > +	}
> > +
> > +	if (page_is_young(page)) {
> > +		clear_page_young(page);
> > +		pra->referenced++;
> 
> If a page was page_is_young and referenced recenlty,
> pra->referenced is increased doubly and it changes current
> behavior for file-backed page promotion. Look at page_check_references.

Yeah, you're quite right, I missed that. Something like this should get
rid of this extra reference:

diff --git a/mm/rmap.c b/mm/rmap.c
index 24dd3f9fee27..eca7416f55d7 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -781,6 +781,14 @@ static int page_referenced_one(struct page *page, struct vm_area_struct *vma,
 		pte_unmap_unlock(pte, ptl);
 	}
 
+	if (referenced && page_is_idle(page))
+		clear_page_idle(page);
+
+	if (page_is_young(page)) {
+		clear_page_young(page);
+		referenced++;
+	}
+
 	if (referenced) {
 		pra->referenced++;
 		pra->vm_flags |= vma->vm_flags;

Thanks,
Vladimir

  reply	other threads:[~2015-04-29  9:13 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-04-28 12:24 [PATCH v3 0/3] idle memory tracking Vladimir Davydov
2015-04-28 12:24 ` [PATCH v3 1/3] memcg: add page_cgroup_ino helper Vladimir Davydov
2015-04-28 12:24 ` [PATCH v3 2/3] proc: add kpagecgroup file Vladimir Davydov
2015-04-28 12:24 ` [PATCH v3 3/3] proc: add kpageidle file Vladimir Davydov
2015-04-29  4:35   ` Minchan Kim
2015-04-29  9:12     ` Vladimir Davydov [this message]
2015-04-30  8:25       ` Minchan Kim
2015-04-30 14:50         ` Vladimir Davydov
2015-05-04  3:17           ` Minchan Kim
2015-05-04  9:49             ` Vladimir Davydov
2015-05-04 10:54               ` Minchan Kim
2015-05-08  9:56                 ` Vladimir Davydov
2015-05-09 15:12                   ` Minchan Kim
2015-05-10 10:34                     ` Vladimir Davydov
2015-05-12  9:41                       ` Vladimir Davydov
2015-04-29  4:57   ` Minchan Kim
2015-04-29  8:31     ` Vladimir Davydov
2015-04-30  6:55       ` Minchan Kim
2015-04-29  3:57 ` [PATCH v3 0/3] idle memory tracking Minchan Kim
2015-04-29  7:58   ` Vladimir Davydov
2015-04-29  5:02 ` Minchan Kim

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150429091248.GD1694@esperanza \
    --to=vdavydov@parallels.com \
    --cc=akpm@linux-foundation.org \
    --cc=cgroups@vger.kernel.org \
    --cc=corbet@lwn.net \
    --cc=gorcunov@openvz.org \
    --cc=gthelen@google.com \
    --cc=hannes@cmpxchg.org \
    --cc=linux-api@vger.kernel.org \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@suse.cz \
    --cc=minchan@kernel.org \
    --cc=rientjes@google.com \
    --cc=walken@google.com \
    --cc=xemul@parallels.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).