From: Oscar Salvador <osalvador@suse.de>
To: "HORIGUCHI NAOYA(堀口 直也)" <naoya.horiguchi@nec.com>
Cc: "akpm@linux-foundation.org" <akpm@linux-foundation.org>,
"aris@ruivo.org" <aris@ruivo.org>,
"mhocko@kernel.org" <mhocko@kernel.org>,
"tony.luck@intel.com" <tony.luck@intel.com>,
"cai@lca.pw" <cai@lca.pw>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"linux-mm@kvack.org" <linux-mm@kvack.org>
Subject: Re: [PATCH v4 0/7] HWpoison: further fixes and cleanups
Date: Thu, 17 Sep 2020 15:40:06 +0200 [thread overview]
Message-ID: <20200917133959.GA2504@linux> (raw)
In-Reply-To: <20200917130948.GA1812@linux>
On Thu, Sep 17, 2020 at 03:09:52PM +0200, Oscar Salvador wrote:
> static bool page_handle_poison(struct page *page, bool hugepage_or_freepage, bool release)
> {
> if (release) {
> put_page(page);
> drain_all_pages(page_zone(page));
> }
>
> ...
> SetPageHWPoison(page);
> page_ref_inc(page);
>
> 1) You are freeing the page first, which means it goes to buddy
> 2) Then you set it as poisoned and you update its refcount.
>
> Now we have a page sitting in Buddy with a refcount = 1 and poisoned, and that is quite wrong.
Hi Naoya,
Ok, I tested it and with the following changes on top I cannot reproduce the issue:
diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index f68cb5e3b320..4ffaaa5c2603 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -67,11 +67,6 @@ atomic_long_t num_poisoned_pages __read_mostly = ATOMIC_LONG_INIT(0);
static bool page_handle_poison(struct page *page, bool hugepage_or_freepage, bool release)
{
- if (release) {
- put_page(page);
- drain_all_pages(page_zone(page));
- }
-
if (hugepage_or_freepage) {
/*
* Doing this check for free pages is also fine since dissolve_free_huge_page
@@ -89,6 +84,12 @@ static bool page_handle_poison(struct page *page, bool hugepage_or_freepage, boo
}
SetPageHWPoison(page);
+
+ if (release) {
+ put_page(page);
+ drain_all_pages(page_zone(page));
+ }
+
page_ref_inc(page);
num_poisoned_pages_inc();
return true;
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 0d9f9bd0e06c..8a6ab05f074c 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1173,6 +1173,16 @@ static __always_inline bool free_pages_prepare(struct page *page,
trace_mm_page_free(page, order);
+ if (unlikely(PageHWPoison(page)) && !order) {
+ /*
+ * Untie memcg state and reset page's owner
+ */
+ if (memcg_kmem_enabled() && PageKmemcg(page))
+ __memcg_kmem_uncharge_page(page, order);
+ reset_page_owner(page, order);
+ return false;
+ }
+
/*
* Check tail pages before head page information is cleared to
* avoid checking PageCompound for order-0 pages.
# sh tmp_run_ksm_madv.sh
p1 0x7f6b6b08e000
p2 0x7f6b529ee000
madvise(p1) 0
madvise(p2) 0
writing p1 ... done
writing p2 ... done
soft offline
soft offline returns 0
OK
Can you try to re-run your tests and see if they come clean?
If they do, I will try to see if Andrew can squezee above changes into [1],
where they belong to.
Otherwise I will craft a v5 containing all fixups (pretty unfortunate).
[1] https://patchwork.kernel.org/patch/11704099/
--
Oscar Salvador
SUSE L3
next prev parent reply other threads:[~2020-09-17 13:40 UTC|newest]
Thread overview: 18+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-09-17 8:10 [PATCH v4 0/7] HWpoison: further fixes and cleanups Oscar Salvador
2020-09-17 8:10 ` [PATCH v4 1/7] mm,hwpoison: take free pages off the buddy freelists Oscar Salvador
2020-09-25 2:22 ` HORIGUCHI NAOYA(堀口 直也)
2020-09-17 8:10 ` [PATCH v4 2/7] mm,hwpoison: Do not set hugepage_or_freepage unconditionally Oscar Salvador
2020-09-18 19:26 ` Aristeu Rozanski
2020-09-17 8:10 ` [PATCH v4 3/7] mm,hwpoison: Try to narrow window race for free pages Oscar Salvador
2020-09-18 19:27 ` Aristeu Rozanski
2020-09-17 8:10 ` [PATCH v4 4/7] mm,hwpoison: refactor madvise_inject_error Oscar Salvador
2020-09-17 8:10 ` [PATCH v4 5/7] mm,hwpoison: drain pcplists before bailing out for non-buddy zero-refcount page Oscar Salvador
2020-09-25 2:22 ` HORIGUCHI NAOYA(堀口 直也)
2020-09-17 8:10 ` [PATCH v4 6/7] mm,hwpoison: drop unneeded pcplist draining Oscar Salvador
2020-09-17 8:10 ` [PATCH v4 7/7] mm,hwpoison: remove stale code Oscar Salvador
2020-09-17 11:39 ` [PATCH v4 0/7] HWpoison: further fixes and cleanups HORIGUCHI NAOYA(堀口 直也)
2020-09-17 13:09 ` Oscar Salvador
2020-09-17 13:40 ` Oscar Salvador [this message]
2020-09-17 15:27 ` HORIGUCHI NAOYA(堀口 直也)
2020-09-18 5:49 ` osalvador
2020-09-18 19:25 ` aris
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20200917133959.GA2504@linux \
--to=osalvador@suse.de \
--cc=akpm@linux-foundation.org \
--cc=aris@ruivo.org \
--cc=cai@lca.pw \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mhocko@kernel.org \
--cc=naoya.horiguchi@nec.com \
--cc=tony.luck@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).