From: "HORIGUCHI NAOYA(堀口 直也)" <naoya.horiguchi@nec.com>
To: Aristeu Rozanski <aris@ruivo.org>, Oscar Salvador <osalvador@suse.de>
Cc: "akpm@linux-foundation.org" <akpm@linux-foundation.org>,
	"mhocko@kernel.org" <mhocko@kernel.org>,
	"tony.luck@intel.com" <tony.luck@intel.com>,
	"cai@lca.pw" <cai@lca.pw>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>
Subject: Re: [PATCH v3 0/5] HWpoison: further fixes and cleanups
Date: Wed, 16 Sep 2020 13:42:15 +0000
Message-ID: <20200916134215.GA30407@hori.linux.bs1.fc.nec.co.jp>
In-Reply-To: <20200915212222.GA18315@cathedrallabs.org>

On Tue, Sep 15, 2020 at 05:22:22PM -0400, Aristeu Rozanski wrote:
> Hi Oscar, Naoya,
> 
> On Mon, Sep 14, 2020 at 12:15:54PM +0200, Oscar Salvador wrote:
> > The important bit of this patchset is patch#1, which is a fix to take off
> > HWPoison pages off a buddy freelist since it can lead us to having HWPoison
> > pages back in the game without no one noticing it.
> > So fix it (we did that already for soft_offline_page [1]).
> > 
> > The other patches are clean-ups and not that important, so if anything,
> > consider patch#1 for inclusion.
> > 
> > [1] https://patchwork.kernel.org/cover/11704083/
> 
> I found something strange with your and Naoya's hwpoison rework. We have a
> customer with a testcase that basically does:
> 
> 	p1 = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
> 	p2 = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
> 
> 	madvise(p1, size, MADV_MERGEABLE);
> 	madvise(p2, size, MADV_MERGEABLE);
> 
> 	memset(p1, 'a', size);
> 	memset(p2, 'a', size);
> 
> 	madvise(p1, size, MADV_SOFT_OFFLINE);
> 
> 	madvise(p1, size, MADV_UNMERGEABLE);
> 	madvise(p2, size, MADV_UNMERGEABLE);
> 
> 
> where size is about 200,000 pages. It works on a x86_64 box (with and without the
> hwpoison rework). On ppc64 boxes (tested 3 different ones with at least 250GB memory)
> it fails to take a page off the buddy list (page_handle_poison()/take_page_off_buddy())
> (madvise MADV_SOFT_OFFLINE returns -EBUSY). Without the hwpoison rework the test passes.

I reproduced a similar -EBUSY on a small, ordinary x86 VM, where it looks to me
like a race between take_page_off_buddy() and page allocation.  Oscar's debug
patch shows the following kernel messages:

    [  627.357009] Soft offlining pfn 0x235018 at process virtual address 0x7fd112140000
    [  627.358747] __get_any_page: 0x235018 free buddy page
    [  627.359875] page:00000000038b52c9 refcount:0 mapcount:-128 mapping:0000000000000000 index:0x1 pfn:0x235018
    [  627.362002] flags: 0x57ffe000000000()
    [  627.362841] raw: 0057ffe000000000 fffff84648d12688 ffff955abffd1dd0 0000000000000000
    [  627.364555] raw: 0000000000000001 0000000000000000 00000000ffffff7f 0000000000000000
    [  627.366258] page dumped because: page_handle_poison
    [  627.367357] page->mem_cgroup:ffff9559b6912000
    [  627.368342] page_handle_poison: hugepage_or_freepage failed\xb8n
    [  627.368344] soft_offline_free_page: page_handle_poison -EBUSY
    [  627.370901] page:00000000038b52c9 refcount:6 mapcount:3 mapping:000000001226bf89 index:0x2710 pfn:0x235018
    [  627.373048] aops:ext4_da_aops ino:c63f3 dentry name:"system.journal"
    [  627.374526] flags: 0x57ffe00000201c(uptodate|dirty|lru|private)
    [  627.375865] raw: 0057ffe00000201c fffff84648d300c8 ffff955ab8c3f020 ffff955aba5f4ee0
    [  627.377586] raw: 0000000000002710 ffff9559b811fc98 0000000500000002 ffff9559b6912000
    [  627.379308] page dumped because: soft_offline_free_page
    [  627.380480] page->mem_cgroup:ffff9559b6912000

    CPU 0                                CPU 1

    get_any_page // returns 0 (free buddy path)
      soft_offline_free_page
                                         the page is allocated
        page_handle_poison -> fail
          return -EBUSY
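
To make the window concrete, here is a toy user-space model of the sequence
above.  It is only a sketch under loud assumptions: the toy_* names, the page
states, and the racing_alloc flag are all made up for illustration and are not
kernel functions, and the real code paths are considerably more involved.

```c
/* Toy user-space model of the check-then-act window in the diagram.
 * Not kernel code: every identifier below is hypothetical. */
#include <errno.h>
#include <stdbool.h>

enum toy_page_state { PAGE_FREE_BUDDY, PAGE_IN_USE, PAGE_POISONED };

struct toy_page {
	enum toy_page_state state;
};

/* CPU 0, step 1: classify the page; 0 means "free buddy page". */
static int toy_get_any_page(const struct toy_page *p)
{
	return p->state == PAGE_FREE_BUDDY ? 0 : 1;
}

/* CPU 1: the allocator hands the page out if it is still free. */
static void toy_alloc_page(struct toy_page *p)
{
	if (p->state == PAGE_FREE_BUDDY)
		p->state = PAGE_IN_USE;
}

/* CPU 0, step 2: taking the page off the freelist only works
 * if the page is still free at this point. */
static bool toy_take_page_off_buddy(struct toy_page *p)
{
	if (p->state != PAGE_FREE_BUDDY)
		return false;
	p->state = PAGE_POISONED;
	return true;
}

/* racing_alloc stands in for CPU 1 running between step 1 and step 2.
 * Returns 0 on success, -EBUSY if the page was allocated in the window,
 * and 1 for the in-use path, which this model does not cover. */
int toy_soft_offline(struct toy_page *p, bool racing_alloc)
{
	if (toy_get_any_page(p) != 0)
		return 1;

	if (racing_alloc)
		toy_alloc_page(p);

	if (!toy_take_page_off_buddy(p))
		return -EBUSY;

	return 0;
}
```

In this model, calling toy_soft_offline() on a free page with racing_alloc
set to false returns 0, while the same call with racing_alloc set to true
returns -EBUSY, which mirrors the failure in the log.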

I'm still not sure why this issue was not visible before the rework patches,
but setting the migratetype to MIGRATE_ISOLATE during offlining could well
have a significant effect on the behavior.

Thanks,
Naoya Horiguchi
