From: Oscar Salvador <osalvador@suse.de>
To: "HORIGUCHI NAOYA(堀口 直也)" <naoya.horiguchi@nec.com>
Cc: "akpm@linux-foundation.org" <akpm@linux-foundation.org>,
	"aris@ruivo.org" <aris@ruivo.org>,
	"mhocko@kernel.org" <mhocko@kernel.org>,
	"tony.luck@intel.com" <tony.luck@intel.com>,
	"cai@lca.pw" <cai@lca.pw>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>
Subject: Re: [PATCH v4 0/7] HWpoison: further fixes and cleanups
Date: Thu, 17 Sep 2020 15:40:06 +0200
Message-ID: <20200917133959.GA2504@linux>
In-Reply-To: <20200917130948.GA1812@linux>

On Thu, Sep 17, 2020 at 03:09:52PM +0200, Oscar Salvador wrote:
> static bool page_handle_poison(struct page *page, bool hugepage_or_freepage, bool release)
> {
>         if (release) {
>                 put_page(page);
>                 drain_all_pages(page_zone(page));
>         }
> 
> 	...
>         SetPageHWPoison(page);
>         page_ref_inc(page);
> 
> 1) You are freeing the page first, which means it goes to buddy
> 2) Then you set it as poisoned and you update its refcount.
> 
> Now we have a page sitting in Buddy with a refcount = 1 and poisoned, and that is quite wrong.

Hi Naoya,

OK, I tested it, and with the following changes on top I can no longer reproduce the issue:

diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index f68cb5e3b320..4ffaaa5c2603 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -67,11 +67,6 @@ atomic_long_t num_poisoned_pages __read_mostly = ATOMIC_LONG_INIT(0);
 
 static bool page_handle_poison(struct page *page, bool hugepage_or_freepage, bool release)
 {
-	if (release) {
-		put_page(page);
-		drain_all_pages(page_zone(page));
-	}
-
 	if (hugepage_or_freepage) {
 		/*
 		 * Doing this check for free pages is also fine since dissolve_free_huge_page
@@ -89,6 +84,12 @@ static bool page_handle_poison(struct page *page, bool hugepage_or_freepage, boo
 	}
 
 	SetPageHWPoison(page);
+
+	if (release) {
+		put_page(page);
+		drain_all_pages(page_zone(page));
+	}
+
 	page_ref_inc(page);
 	num_poisoned_pages_inc();
 	return true;
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 0d9f9bd0e06c..8a6ab05f074c 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1173,6 +1173,16 @@ static __always_inline bool free_pages_prepare(struct page *page,
 
 	trace_mm_page_free(page, order);
 
+	if (unlikely(PageHWPoison(page)) && !order) {
+		/*
+		 * Untie memcg state and reset page's owner
+		 */
+		if (memcg_kmem_enabled() && PageKmemcg(page))
+			__memcg_kmem_uncharge_page(page, order);
+		reset_page_owner(page, order);
+		return false;
+	}
+
 	/*
 	 * Check tail pages before head page information is cleared to
 	 * avoid checking PageCompound for order-0 pages.


# sh tmp_run_ksm_madv.sh 
p1 0x7f6b6b08e000
p2 0x7f6b529ee000
madvise(p1) 0
madvise(p2) 0
writing p1 ... done
writing p2 ... done
soft offline
soft offline returns 0
OK


Can you try to re-run your tests and see if they come clean?
If they do, I will try to see if Andrew can squeeze the above changes into [1],
where they belong.

Otherwise I will craft a v5 containing all fixups (pretty unfortunate).

[1] https://patchwork.kernel.org/patch/11704099/

-- 
Oscar Salvador
SUSE L3


