From: Oscar Salvador <osalvador@suse.de>
To: "HORIGUCHI NAOYA(堀口 直也)" <naoya.horiguchi@nec.com>
Cc: Aristeu Rozanski <aris@ruivo.org>,
	"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
	"mhocko@kernel.org" <mhocko@kernel.org>,
	"tony.luck@intel.com" <tony.luck@intel.com>,
	"cai@lca.pw" <cai@lca.pw>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>
Subject: Re: [PATCH v3 0/5] HWpoison: further fixes and cleanups
Date: Wed, 16 Sep 2020 16:06:43 +0200
Message-ID: <20200916140624.GA17833@linux>
In-Reply-To: <20200916134215.GA30407@hori.linux.bs1.fc.nec.co.jp>

On Wed, Sep 16, 2020 at 01:42:15PM +0000, HORIGUCHI NAOYA(堀口 直也) wrote:
> On Tue, Sep 15, 2020 at 05:22:22PM -0400, Aristeu Rozanski wrote:
 
> I reproduced the similar -EBUSY with small average x86 VM, where it seems to me
> a race between page_take_off_buddy() and page allocation.  Oscar's debug patch
> shows the following kernel messages:
> 
>     [  627.357009] Soft offlining pfn 0x235018 at process virtual address 0x7fd112140000
>     [  627.358747] __get_any_page: 0x235018 free buddy page
>     [  627.359875] page:00000000038b52c9 refcount:0 mapcount:-128 mapping:0000000000000000 index:0x1 pfn:0x235018
>     [  627.362002] flags: 0x57ffe000000000()
>     [  627.362841] raw: 0057ffe000000000 fffff84648d12688 ffff955abffd1dd0 0000000000000000
>     [  627.364555] raw: 0000000000000001 0000000000000000 00000000ffffff7f 0000000000000000
>     [  627.366258] page dumped because: page_handle_poison
>     [  627.367357] page->mem_cgroup:ffff9559b6912000
>     [  627.368342] page_handle_poison: hugepage_or_freepage failed\xb8n
>     [  627.368344] soft_offline_free_page: page_handle_poison -EBUSY
>     [  627.370901] page:00000000038b52c9 refcount:6 mapcount:3 mapping:000000001226bf89 index:0x2710 pfn:0x235018
>     [  627.373048] aops:ext4_da_aops ino:c63f3 dentry name:"system.journal"
>     [  627.374526] flags: 0x57ffe00000201c(uptodate|dirty|lru|private)
>     [  627.375865] raw: 0057ffe00000201c fffff84648d300c8 ffff955ab8c3f020 ffff955aba5f4ee0
>     [  627.377586] raw: 0000000000002710 ffff9559b811fc98 0000000500000002 ffff9559b6912000
>     [  627.379308] page dumped because: soft_offline_free_page
>     [  627.380480] page->mem_cgroup:ffff9559b6912000
> 
>     CPU 0                                CPU 1
> 
>     get_any_page // returns 0 (free buddy path)
>       soft_offline_free_page
>                                          the page is allocated
>         page_handle_poison -> fail
>           return -EBUSY
> 
> I'm still not sure why this issue is invisible before rework patch,
> but setting migrate type to MIGRATE_ISOLATE during offlining could affect
> the behavior sensitively.

Well, this is very timing dependent.
AFAICS, before the rework patchset, we could still race with an allocation
as the page could have been allocated between the get_any_page()
and the call to set_hwpoison_free_buddy_page() which takes the zone->lock
to prevent that.
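
To illustrate why an unlocked "is it free?" check is not enough, here is a
small user-space model (not kernel code). Every toy_* name is made up for
this sketch, and the atomic_flag merely stands in for zone->lock. The point
it shows: the free check is only a snapshot, and whoever takes the lock
first wins the page.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model of one page frame; not a kernel structure. */
struct toy_page {
	bool on_free_list;
};

/* Stand-in for zone->lock, implemented as a trivial spinlock. */
static atomic_flag toy_zone_lock = ATOMIC_FLAG_INIT;

static void toy_lock(void)
{
	while (atomic_flag_test_and_set(&toy_zone_lock))
		;	/* spin until the flag was previously clear */
}

static void toy_unlock(void)
{
	atomic_flag_clear(&toy_zone_lock);
}

/* Unlocked snapshot: the answer may be stale by the time the caller acts. */
static bool toy_page_looks_free(const struct toy_page *p)
{
	return p->on_free_list;
}

/* Allocator side: claim the page under the lock if it is still free. */
static bool toy_alloc_page(struct toy_page *p)
{
	bool got = false;

	toy_lock();
	if (p->on_free_list) {
		p->on_free_list = false;
		got = true;
	}
	toy_unlock();
	return got;
}

/* Offlining side: re-check under the lock before removing the page. */
static bool toy_take_page_off_free_list(struct toy_page *p)
{
	bool taken = false;

	toy_lock();
	if (p->on_free_list) {
		p->on_free_list = false;
		taken = true;
	}
	toy_unlock();
	return taken;
}
```

If toy_alloc_page() runs between toy_page_looks_free() and
toy_take_page_off_free_list(), the latter returns false even though the
earlier check said the page was free, which is the shape of the -EBUSY
seen in the log.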

Maybe we just happen to take longer now to reach take_page_off_buddy, so the
race window is bigger.

AFAICS, this has nothing to do with MIGRATE_ISOLATE, because here we are
dealing with pages that are already free (part of the buddy system).

The only thing that comes to my mind right off the bat is to do a "retry"
in soft_offline_page in case soft_offline_free_page returns -EBUSY, so that
we call get_any_page again and handle the page according to its new type.
Something like (untested):

@@ -1923,6 +1977,7 @@ int soft_offline_page(unsigned long pfn, int flags)
 {
 	int ret;
 	struct page *page;
+	bool try_again = true;
 
 	if (!pfn_valid(pfn))
 		return -ENXIO;
@@ -1938,6 +1993,7 @@ int soft_offline_page(unsigned long pfn, int flags)
 		return 0;
 	}
 
+retry:
 	get_online_mems();
 	ret = get_any_page(page, pfn, flags);
 	put_online_mems();
@@ -1945,7 +2001,12 @@ int soft_offline_page(unsigned long pfn, int flags)
 	if (ret > 0)
 		ret = soft_offline_in_use_page(page);
-	else if (ret == 0)
-		ret = soft_offline_free_page(page);
+	else if (ret == 0) {
+		ret = soft_offline_free_page(page);
+		if (ret && try_again) {
+			try_again = false;
+			goto retry;
+		}
+	}
 
 	return ret;
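
For clarity, the retry-once flow can be modelled in user space as below.
This is only a sketch under stated assumptions: every fake_* function and
struct fake_page are hypothetical stand-ins, not the real kernel helpers,
and the "race" is simulated with a counter. The sketch keeps the result of
the free-page path in ret, so a failure on the second attempt would still
be reported to the caller.

```c
#include <errno.h>
#include <stdbool.h>

enum page_state { PAGE_FREE, PAGE_IN_USE };

/* Hypothetical model of a page; not a kernel structure. */
struct fake_page {
	enum page_state state;
	int races_left;	/* times an allocation grabs the free page under us */
};

/* Mimics the convention in the thread: 0 = free page, > 0 = in-use page. */
static int fake_get_any_page(const struct fake_page *p)
{
	return p->state == PAGE_IN_USE ? 1 : 0;
}

/* Free-page path: fails with -EBUSY if the page got allocated meanwhile. */
static int fake_soft_offline_free_page(struct fake_page *p)
{
	if (p->races_left > 0) {
		p->races_left--;
		p->state = PAGE_IN_USE;
		return -EBUSY;
	}
	return 0;
}

/* In-use path: this model simply pretends that migration succeeded. */
static int fake_soft_offline_in_use_page(struct fake_page *p)
{
	(void)p;
	return 0;
}

static int fake_soft_offline_page(struct fake_page *p)
{
	bool try_again = true;
	int ret;

retry:
	ret = fake_get_any_page(p);
	if (ret > 0) {
		ret = fake_soft_offline_in_use_page(p);
	} else if (ret == 0) {
		ret = fake_soft_offline_free_page(p);
		if (ret && try_again) {
			/* Re-classify the page once; it may be in use now. */
			try_again = false;
			goto retry;
		}
	}
	return ret;
}
```

With a page set up as { PAGE_FREE, 1 }, the first pass loses the simulated
race, and the second pass sees the page as in use and takes the in-use
path instead of returning -EBUSY.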


-- 
Oscar Salvador
SUSE L3

