linux-kernel.vger.kernel.org archive mirror
From: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
To: Minchan Kim <minchan@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>,
	Andrew Morton <akpm@linux-foundation.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"jlayton@poochiereds.net" <jlayton@poochiereds.net>,
	"bfields@fieldses.org" <bfields@fieldses.org>,
	Joonsoo Kim <iamjoonsoo.kim@lge.com>,
	"koct9i@gmail.com" <koct9i@gmail.com>,
	"aquini@redhat.com" <aquini@redhat.com>,
	"virtualization@lists.linux-foundation.org" 
	<virtualization@lists.linux-foundation.org>,
	Mel Gorman <mgorman@suse.de>, Hugh Dickins <hughd@google.com>,
	Sergey Senozhatsky <sergey.senozhatsky@gmail.com>,
	Rik van Riel <riel@redhat.com>,
	"rknize@motorola.com" <rknize@motorola.com>,
	Gioh Kim <gi-oh.kim@profitbricks.com>,
	Sangseok Lee <sangseok.lee@lge.com>,
	"Chan Gyun Jeong" <chan.jeong@lge.com>,
	Al Viro <viro@ZenIV.linux.org.uk>,
	YiPing Xu <xuyiping@hisilicon.com>
Subject: Re: [PATCH v3 01/16] mm: use put_page to free page instead of putback_lru_page
Date: Mon, 4 Apr 2016 04:45:12 +0000
Message-ID: <20160404044458.GA20250@hori1.linux.bs1.fc.nec.co.jp>
In-Reply-To: <20160404013917.GC6543@bbox>

On Mon, Apr 04, 2016 at 10:39:17AM +0900, Minchan Kim wrote:
> On Fri, Apr 01, 2016 at 02:58:21PM +0200, Vlastimil Babka wrote:
> > On 03/30/2016 09:12 AM, Minchan Kim wrote:
> > >Procedure of page migration is as follows:
> > >
> > >First of all, it should isolate a page from LRU and try to
> > >migrate the page. If it is successful, it releases the page
> > >for freeing. Otherwise, it should put the page back to LRU
> > >list.
> > >
> > >For LRU pages, we have used putback_lru_page for both freeing
> > >and putback to LRU list. It's okay because put_page is aware of
> > >LRU list so if it releases last refcount of the page, it removes
> > >the page from LRU list. However, It makes unnecessary operations
> > >(e.g., lru_cache_add, pagevec and flags operations. It would be
> > >not significant but no worth to do) and harder to support new
> > >non-lru page migration because put_page isn't aware of non-lru
> > >page's data structure.
> > >
> > >To solve the problem, we can add new hook in put_page with
> > >PageMovable flags check but it can increase overhead in
> > >hot path and needs new locking scheme to stabilize the flag check
> > >with put_page.
> > >
> > >So, this patch cleans it up to divide two semantic(ie, put and putback).
> > >If migration is successful, use put_page instead of putback_lru_page and
> > >use putback_lru_page only on failure. That makes code more readable
> > >and doesn't add overhead in put_page.
> > >
> > >Comment from Vlastimil
> > >"Yeah, and compaction (perhaps also other migration users) has to drain
> > >the lru pvec... Getting rid of this stuff is worth even by itself."
> > >
> > >Cc: Mel Gorman <mgorman@suse.de>
> > >Cc: Hugh Dickins <hughd@google.com>
> > >Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
> > >Acked-by: Vlastimil Babka <vbabka@suse.cz>
> > >Signed-off-by: Minchan Kim <minchan@kernel.org>
> > 
> > [...]
> > 
> > >@@ -974,28 +986,28 @@ static ICE_noinline int unmap_and_move(new_page_t get_new_page,
> > >  		list_del(&page->lru);
> > >  		dec_zone_page_state(page, NR_ISOLATED_ANON +
> > >  				page_is_file_cache(page));
> > >-		/* Soft-offlined page shouldn't go through lru cache list */
> > >+	}
> > >+
> > >+	/*
> > >+	 * If migration is successful, drop the reference grabbed during
> > >+	 * isolation. Otherwise, restore the page to LRU list unless we
> > >+	 * want to retry.
> > >+	 */
> > >+	if (rc == MIGRATEPAGE_SUCCESS) {
> > >+		put_page(page);
> > >  		if (reason == MR_MEMORY_FAILURE) {
> > >-			put_page(page);
> > >  			if (!test_set_page_hwpoison(page))
> > >  				num_poisoned_pages_inc();
> > >-		} else
> > >+		}
> > 
> > Hmm, I didn't notice it previously, or it's due to rebasing, but it
> > seems that you restricted the memory failure handling (i.e. setting
> > hwpoison) to MIGRATEPAGE_SUCCESS, while previously it was done for all
> > non-EAGAIN results. I think that goes against the intention of
> > hwpoison, which is IIRC to catch and kill the poor process that
> > still uses the page?
> 
> That's why I Cc'ed Naoya Horiguchi to catch things I might make
> mistake.
> 
> Thanks for catching it, Vlastimil.
> It was my mistake. But in this chance, I looked over hwpoison code and
> I saw other places which increases num_poisoned_pages are successful
> migration, already freed page and successful invalidated page.
> IOW, they are already successful isolated page so I guess it should
> increase the count when only successful migration is done?

Yes, that's right. When migration fails, we shouldn't call
test_set_page_hwpoison or num_poisoned_pages_inc, so the current code, which
checks only (rc != -EAGAIN), is simply incorrect. Your change fixes a bug in
memory error handling. Great!
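
To make the difference concrete, here is a small userspace sketch (not
kernel code) of the two conditions as I read them from the hunk above. The
enum values and the function names old_marks_hwpoison/new_marks_hwpoison
are stand-ins invented for this illustration.

```c
#include <errno.h>
#include <stdbool.h>

/* Stand-ins for the kernel constants; the values are for this sketch only. */
enum { MIGRATEPAGE_SUCCESS = 0 };
enum migrate_reason { MR_COMPACTION, MR_MEMORY_FAILURE };

/* Old flow: every result except -EAGAIN reached the hwpoison branch. */
static bool old_marks_hwpoison(int rc, enum migrate_reason reason)
{
	return rc != -EAGAIN && reason == MR_MEMORY_FAILURE;
}

/* Patched flow: only a successful migration reaches the hwpoison branch. */
static bool new_marks_hwpoison(int rc, enum migrate_reason reason)
{
	return rc == MIGRATEPAGE_SUCCESS && reason == MR_MEMORY_FAILURE;
}
```

With rc = -ENOMEM and reason = MR_MEMORY_FAILURE, the old condition is true
(the page is marked and counted even though migration failed), while the
patched condition is false.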

> And when I read memory_failure, it bails out without killing if it
> encounters HWPoisoned page so I think it's not for catching and
> kill the poor process.
>
> > 
> > Also (but not your fault) the put_page() preceding
> > test_set_page_hwpoison(page) IMHO deserves a comment saying which
> > pin we are releasing and which one we still have (hopefully? if I
> > read description of da1b13ccfbebe right) otherwise it looks like
> > doing something with a page that we just potentially freed.
>
> Yes, while I read the code, I had same question. I think the releasing
> refcount is for get_any_page.

As with the other callers of page migration, soft_offline_page expects the
migration source page to be freed at this put_page() (no pin remains).
The refcount released here is the one taken by isolate_lru_page() in
__soft_offline_page(). (The pin taken by get_any_page is released by
put_hwpoison_page just after it.)
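
The pin accounting can be walked through with a toy counter. This is only a
sketch under stated assumptions: struct toy_page and the toy_* helpers are
hypothetical, the page is assumed to start with a single owner reference,
and a successful migration is assumed to move that owner reference to the
new page.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of a page's reference count; not the kernel's struct page. */
struct toy_page {
	int refcount;
	bool freed;
};

static void toy_get(struct toy_page *p)
{
	p->refcount++;
}

static void toy_put(struct toy_page *p)
{
	assert(p->refcount > 0);
	if (--p->refcount == 0)
		p->freed = true;	/* last reference dropped: page is freed */
}

/* Replays the soft-offline success path and returns the final page state. */
static struct toy_page toy_soft_offline_success(void)
{
	/* Assumption: one reference held by the page's current owner. */
	struct toy_page p = { .refcount = 1, .freed = false };

	toy_get(&p);	/* get_any_page() pins the page */
	toy_get(&p);	/* isolate_lru_page() takes the isolation reference */
	toy_put(&p);	/* put_hwpoison_page() drops the get_any_page() pin */
	toy_put(&p);	/* assumption: migration moves the owner ref away */
	toy_put(&p);	/* put_page() in unmap_and_move(): last reference */
	return p;
}
```

In this model the final toy_put() is the one that frees the source page,
which matches "no pin remains" above.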

Yes, doing something with a page just after freeing it looks weird, but
that's how the PageHWPoison flag works. IOW, many other page flags are
maintained only during one "allocate-free" life span, but PageHWPoison
still does its job beyond it.
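
As a rough illustration of that lifetime difference, the following userspace
sketch models a flags word where freeing clears everything except a poison
bit. The TOY_PG_* bits, struct toy_fpage and the helpers are hypothetical
names for this sketch and do not mirror the kernel's actual flag layout or
free path.

```c
#include <stdbool.h>

/* Hypothetical flag bits for this sketch only. */
enum {
	TOY_PG_LRU      = 1u << 0,
	TOY_PG_DIRTY    = 1u << 1,
	TOY_PG_HWPOISON = 1u << 2,
};

struct toy_fpage {
	unsigned int flags;
};

/* Freeing ends the "allocate-free" life span: only the poison bit survives. */
static void toy_free(struct toy_fpage *p)
{
	p->flags &= TOY_PG_HWPOISON;
}

/* Sets the poison bit and reports whether it was already set. */
static bool toy_test_set_hwpoison(struct toy_fpage *p)
{
	bool was_set = p->flags & TOY_PG_HWPOISON;

	p->flags |= TOY_PG_HWPOISON;
	return was_set;
}
```

Calling toy_free() and then toy_test_set_hwpoison() follows the same order as
put_page() followed by test_set_page_hwpoison() in the hunk: the bit is set
on an already-freed page and is still there after any later toy_free().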

As for commenting, this put_page() is called in every MIGRATEPAGE_SUCCESS
case (regardless of the caller), so what we can say here is "we free the
source page here, bypassing the LRU list" or something?

Thanks,
Naoya Horiguchi

