linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Joel Fernandes <joel@joelfernandes.org>
To: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Cc: Minchan Kim <minchan@kernel.org>,
	linux-kernel@vger.kernel.org, vdavydov.dev@gmail.com,
	Brendan Gregg <bgregg@netflix.com>,
	kernel-team@android.com, Alexey Dobriyan <adobriyan@gmail.com>,
	Al Viro <viro@zeniv.linux.org.uk>,
	Andrew Morton <akpm@linux-foundation.org>,
	carmenjackson@google.com, Christian Hansen <chansen3@cisco.com>,
	Colin Ian King <colin.king@canonical.com>,
	dancol@google.com, David Howells <dhowells@redhat.com>,
	fmayer@google.com, joaodias@google.com,
	Jonathan Corbet <corbet@lwn.net>,
	Kees Cook <keescook@chromium.org>,
	Kirill Tkhai <ktkhai@virtuozzo.com>,
	linux-doc@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-mm@kvack.org, Michal Hocko <mhocko@suse.com>,
	Mike Rapoport <rppt@linux.ibm.com>,
	namhyung@google.com, sspatil@google.c
Subject: Re: [PATCH v1 1/2] mm/page_idle: Add support for per-pid page_idle using virtual indexing
Date: Thu, 25 Jul 2019 20:06:54 -0400	[thread overview]
Message-ID: <20190726000654.GB66718@google.com> (raw)
In-Reply-To: <c116f836-5a72-c6e6-498f-a904497ef557@yandex-team.ru>

On Thu, Jul 25, 2019 at 11:15:53AM +0300, Konstantin Khlebnikov wrote:
[snip]
> >>> Thanks for bringing up the swapping corner case..  Perhaps we can improve
> >>> the heap profiler to detect this by looking at bits 0-4 in pagemap. While it
> >>
> >> Yeb, that could work but it could add overhead again what you want to remove?
> >> Even, userspace should keep metadata to identify that page was already swapped
> >> in last period or newly swapped in new period.
> >
> > Yep.
> Between samples page could be read from swap and swapped out back multiple times.
> For tracking this swap ptes could be marked with idle bit too.
> I believe it's not so hard to find free bit for this.
> 
> Refault\swapout will automatically clear this bit in pte even if
> page goes nowhere stays if swap-cache.

Could you clarify more about your idea? Do you mean swapout will clear the new
idle swap-pte bit if the page was accessed just before the swapout?

Instead, I thought of using is_swap_pte() to detect if the PTE belong to a
page that was swapped. And if so, then assume the page was idle. Sure we
would miss data that the page was accessed before the swap out in the
sampling window, however if the page was swapped out, then it is likely idle
anyway.

My current patch was just reporting swapped out pages as non-idle (idle bit
not set) which is wrong as Minchan pointed. So I added below patch on top of
this patch (still testing..) :

thanks,

 - Joel
---8<-----------------------

diff --git a/mm/page_idle.c b/mm/page_idle.c
index 3667ed9cc904..46c2dd18cca8 100644
--- a/mm/page_idle.c
+++ b/mm/page_idle.c
@@ -271,10 +271,14 @@ struct page_idle_proc_priv {
 	struct list_head *idle_page_list;
 };
 
+/*
+ * Add a page to the idle page list.
+ * page can also be NULL if pte was not present or swapped.
+ */
 static void add_page_idle_list(struct page *page,
 			       unsigned long addr, struct mm_walk *walk)
 {
-	struct page *page_get;
+	struct page *page_get = NULL;
 	struct page_node *pn;
 	int bit;
 	unsigned long frames;
@@ -290,9 +294,11 @@ static void add_page_idle_list(struct page *page,
 			return;
 	}
 
-	page_get = page_idle_get_page(page);
-	if (!page_get)
-		return;
+	if (page) {
+		page_get = page_idle_get_page(page);
+		if (!page_get)
+			return;
+	}
 
 	pn = &(priv->page_nodes[priv->cur_page_node++]);
 	pn->page = page_get;
@@ -326,6 +332,15 @@ static int pte_page_idle_proc_range(pmd_t *pmd, unsigned long addr,
 
 	pte = pte_offset_map_lock(vma->vm_mm, pmd, addr, &ptl);
 	for (; addr != end; pte++, addr += PAGE_SIZE) {
+		/*
+		 * We add swapped pages to the idle_page_list so that we can
+		 * reported to userspace that they are idle.
+		 */
+		if (is_swap_pte(*pte)) {
+			add_page_idle_list(NULL, addr, walk);
+			continue;
+		}
+
 		if (!pte_present(*pte))
 			continue;
 
@@ -413,10 +428,12 @@ ssize_t page_idle_proc_generic(struct file *file, char __user *ubuff,
 			goto remove_page;
 
 		if (write) {
-			page_idle_clear_pte_refs(page);
-			set_page_idle(page);
+			if (page) {
+				page_idle_clear_pte_refs(page);
+				set_page_idle(page);
+			}
 		} else {
-			if (page_really_idle(page)) {
+			if (!page || page_really_idle(page)) {
 				off = ((cur->addr) >> PAGE_SHIFT) - start_frame;
 				bit = off % BITMAP_CHUNK_BITS;
 				index = off / BITMAP_CHUNK_BITS;
-- 
2.22.0.709.g102302147b-goog


  reply	other threads:[~2019-07-26  0:06 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-07-22 21:32 [PATCH v1 1/2] mm/page_idle: Add support for per-pid page_idle using virtual indexing Joel Fernandes (Google)
2019-07-22 21:32 ` [PATCH v1 2/2] doc: Update documentation for page_idle virtual address indexing Joel Fernandes (Google)
2019-07-22 22:06 ` [PATCH v1 1/2] mm/page_idle: Add support for per-pid page_idle using virtual indexing Andrew Morton
2019-07-23 14:43   ` Joel Fernandes
2019-07-24 19:33   ` Joel Fernandes
2019-07-23  6:05 ` Michal Hocko
2019-07-23 14:34   ` Joel Fernandes
2019-07-23  6:13 ` Minchan Kim
2019-07-23 14:20   ` Joel Fernandes
2019-07-24  4:28     ` Minchan Kim
2019-07-24 14:10       ` Joel Fernandes
2019-07-25  8:15         ` Konstantin Khlebnikov
2019-07-26  0:06           ` Joel Fernandes [this message]
2019-07-26 11:16             ` Konstantin Khlebnikov
2019-07-26 12:54               ` Joel Fernandes
2019-07-23  8:43 ` Konstantin Khlebnikov
2019-07-23 10:10   ` Konstantin Khlebnikov
2019-07-23 13:47     ` Joel Fernandes

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190726000654.GB66718@google.com \
    --to=joel@joelfernandes.org \
    --cc=adobriyan@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=bgregg@netflix.com \
    --cc=carmenjackson@google.com \
    --cc=chansen3@cisco.com \
    --cc=colin.king@canonical.com \
    --cc=corbet@lwn.net \
    --cc=dancol@google.com \
    --cc=dhowells@redhat.com \
    --cc=fmayer@google.com \
    --cc=joaodias@google.com \
    --cc=keescook@chromium.org \
    --cc=kernel-team@android.com \
    --cc=khlebnikov@yandex-team.ru \
    --cc=ktkhai@virtuozzo.com \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@suse.com \
    --cc=minchan@kernel.org \
    --cc=namhyung@google.com \
    --cc=rppt@linux.ibm.com \
    --cc=sspatil@google.c \
    --cc=vdavydov.dev@gmail.com \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).