From: Aristeu Rozanski <aris@ruivo.org>
To: Oscar Salvador <osalvador@suse.de>, naoya.horiguchi@nec.com
Cc: akpm@linux-foundation.org, mhocko@kernel.org,
tony.luck@intel.com, cai@lca.pw, linux-kernel@vger.kernel.org,
linux-mm@kvack.org
Subject: Re: [PATCH v3 0/5] HWpoison: further fixes and cleanups
Date: Tue, 15 Sep 2020 17:22:22 -0400
Message-ID: <20200915212222.GA18315@cathedrallabs.org>
In-Reply-To: <20200914101559.17103-1-osalvador@suse.de>
Hi Oscar, Naoya,
On Mon, Sep 14, 2020 at 12:15:54PM +0200, Oscar Salvador wrote:
> The important bit of this patchset is patch#1, which is a fix to take off
> HWPoison pages off a buddy freelist since it can lead us to having HWPoison
> pages back in the game without no one noticing it.
> So fix it (we did that already for soft_offline_page [1]).
>
> The other patches are clean-ups and not that important, so if anything,
> consider patch#1 for inclusion.
>
> [1] https://patchwork.kernel.org/cover/11704083/
I found something strange with your and Naoya's hwpoison rework. We have a
customer with a testcase that basically does:
	p1 = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	p2 = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	madvise(p1, size, MADV_MERGEABLE);
	madvise(p2, size, MADV_MERGEABLE);
	memset(p1, 'a', size);
	memset(p2, 'a', size);
	madvise(p1, size, MADV_SOFT_OFFLINE);
	madvise(p1, size, MADV_UNMERGEABLE);
	madvise(p2, size, MADV_UNMERGEABLE);
where size is about 200,000 pages. It works on an x86_64 box (with and without the
hwpoison rework). On ppc64 boxes (tested 3 different ones with at least 250GB of memory)
it fails to take a page off the buddy list (page_handle_poison()/take_page_off_buddy()),
and madvise(MADV_SOFT_OFFLINE) returns -EBUSY. Without the hwpoison rework the test passes.
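For anyone who wants to try this, here is a minimal self-contained sketch of the
sequence above. It is not the customer's actual test program: the helper names
(region_bytes, map_anon, advise, ksm_soft_offline), the error reporting, and the
choice to stop at the first failing madvise() are my assumptions. The fallback
value for MADV_SOFT_OFFLINE is taken from the kernel's asm-generic mman-common.h,
for libc headers that do not export it.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#ifndef MADV_SOFT_OFFLINE
#define MADV_SOFT_OFFLINE 101	/* asm-generic value; fallback for older headers */
#endif

/* Bytes covered by npages pages of the running system's page size. */
static size_t region_bytes(size_t npages)
{
	return npages * (size_t)sysconf(_SC_PAGESIZE);
}

/* Anonymous private mapping, as in the testcase; NULL on failure. */
static void *map_anon(size_t size)
{
	void *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	return p == MAP_FAILED ? NULL : p;
}

/* madvise() wrapper that names the failing step; returns 0 or -errno. */
static int advise(void *p, size_t size, int advice, const char *step)
{
	int err;

	if (madvise(p, size, advice) == 0)
		return 0;
	err = errno;
	fprintf(stderr, "madvise(%s): %s\n", step, strerror(err));
	return -err;
}

/*
 * Same call order as the testcase. Returns 0 on success, -ENOMEM if either
 * mapping could not be created, or -errno from the first failing madvise()
 * (the failure reported in this mail would show up as -EBUSY).
 */
static int ksm_soft_offline(size_t npages)
{
	size_t size = region_bytes(npages);
	void *p1 = map_anon(size);
	void *p2 = map_anon(size);
	int ret = -ENOMEM;

	if (!p1 || !p2)
		goto out;

	ret = advise(p1, size, MADV_MERGEABLE, "MADV_MERGEABLE p1");
	if (!ret)
		ret = advise(p2, size, MADV_MERGEABLE, "MADV_MERGEABLE p2");
	if (ret)
		goto out;

	/* Identical contents in both regions so KSM has pages it can merge. */
	memset(p1, 'a', size);
	memset(p2, 'a', size);

	ret = advise(p1, size, MADV_SOFT_OFFLINE, "MADV_SOFT_OFFLINE p1");
	if (!ret)
		ret = advise(p1, size, MADV_UNMERGEABLE, "MADV_UNMERGEABLE p1");
	if (!ret)
		ret = advise(p2, size, MADV_UNMERGEABLE, "MADV_UNMERGEABLE p2");
out:
	if (p1)
		munmap(p1, size);
	if (p2)
		munmap(p2, size);
	return ret;
}
```

Calling ksm_soft_offline(200000) from a small main() matches the size used above.
MADV_SOFT_OFFLINE needs CAP_SYS_ADMIN and a kernel built with CONFIG_MEMORY_FAILURE,
and MADV_MERGEABLE needs CONFIG_KSM, so on other setups the sketch will stop earlier
with a different errno.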
Possibly related: ppc64 takes a long time to run this test and, according to
perf, it spends most of the time copying and clearing pages:
    17.15%  ksm_poison  [kernel.kallsyms]  [k] copypage_power7
    13.39%  ksm_poison  [kernel.kallsyms]  [k] clear_user_page
     8.70%  ksm_poison  libc-2.28.so       [.] __memset_power8
     8.63%  ksm_poison  [kernel.kallsyms]  [k] opal_return
     6.04%  ksm_poison  [kernel.kallsyms]  [k] __opal_call
     2.67%  ksm_poison  [kernel.kallsyms]  [k] opal_call
     1.52%  ksm_poison  [kernel.kallsyms]  [k] _raw_spin_lock
     1.45%  ksm_poison  [kernel.kallsyms]  [k] opal_flush_console
     1.43%  ksm_poison  [unknown]          [k] 0x0000000030005138
     1.43%  ksm_poison  [kernel.kallsyms]  [k] opal_console_write_buffer_space
     1.26%  ksm_poison  [kernel.kallsyms]  [k] hvc_console_print
    (...)
I've run these tests using mmotm and mmotm with this patchset on top.
Do you know what might be happening here?
--
Aristeu