linux-mm.kvack.org archive mirror
From: Michal Hocko <mhocko@kernel.org>
To: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: "linux-mm@kvack.org" <linux-mm@kvack.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	"xishi.qiuxishi@alibaba-inc.com" <xishi.qiuxishi@alibaba-inc.com>,
	"zy.zhengyi@alibaba-inc.com" <zy.zhengyi@alibaba-inc.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH v2 1/2] mm: fix race on soft-offlining free huge pages
Date: Thu, 19 Jul 2018 10:27:43 +0200
Message-ID: <20180719082743.GN7193@dhcp22.suse.cz>
In-Reply-To: <20180719080804.GA32756@hori1.linux.bs1.fc.nec.co.jp>

On Thu 19-07-18 08:08:05, Naoya Horiguchi wrote:
> On Thu, Jul 19, 2018 at 09:15:16AM +0200, Michal Hocko wrote:
> > On Thu 19-07-18 06:19:45, Naoya Horiguchi wrote:
> > > On Wed, Jul 18, 2018 at 10:50:32AM +0200, Michal Hocko wrote:
[...]
> > > > Why do we even need HWPoison flag here? Everything can be completely
> > > > transparent to the application. It shouldn't fail from what I
> > > > understood.
> > > 
> > > The PageHWPoison flag is used for the 'remove from the allocator'
> > > part, which looks like this:
> > > 
> > >   static inline
> > >   struct page *rmqueue(
> > >           ...
> > >           do {
> > >                   page = NULL;
> > >                   if (alloc_flags & ALLOC_HARDER) {
> > >                           page = __rmqueue_smallest(zone, order, MIGRATE_HIGHATOMIC);
> > >                           if (page)
> > >                                   trace_mm_page_alloc_zone_locked(page, order, migratetype);
> > >                   }
> > >                   if (!page)
> > >                           page = __rmqueue(zone, order, migratetype);
> > >           } while (page && check_new_pages(page, order));
> > > 
> > > check_new_pages() returns true if the block taken from the free list
> > > contains a hwpoison page, so the allocator goes another round to get
> > > another page.
> > > 
> > > There's no function that can be called from outside the allocator to
> > > remove a page from the allocator.  So the actual page removal is done
> > > at allocation time, not at error handling time. That's the reason why
> > > we need PageHWPoison.
> > 
> > hwpoison is an internal mm functionality so why cannot we simply add a
> > function that would do that?
> 
> That's one possible solution.

I would prefer that much more than adding overhead (albeit small) to
the page allocator directly. HWPoison should be a really rare event, so
why should everybody pay the price? I would much rather see the poison
path pay the additional price.

> I know about another downside of the current implementation.
> If a hwpoison page is found during a high-order page allocation,
> all 2^order pages (not only the hwpoison page) are removed from
> the buddy allocator because of the code quoted above. And these leaked
> pages are never returned to the free list, even with unpoison_memory().
> If we have a page removal function which properly splits high-order
> free pages into lower-order pages, this problem is avoided.

Even more reason to move to a new scheme.
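
To illustrate the splitting idea quoted above, here is a minimal
user-space sketch, not kernel code. It assumes a free block is described
only by its first pfn and its order; struct piece and
split_around_poison() are hypothetical names made up for this example and
do not exist in the kernel.

```c
#include <assert.h>

/* One free block handed back to the (simulated) free list. */
struct piece {
	unsigned long pfn;	/* first page frame number of the block */
	unsigned int order;	/* block covers 2^order pages */
};

/*
 * Split the free block [base, base + 2^order) around bad_pfn.
 * The caller must provide room for at least 'order' entries in out[].
 * Returns the number of blocks that remain free. Hypothetical helper.
 */
static int split_around_poison(unsigned long base, unsigned int order,
			       unsigned long bad_pfn, struct piece *out)
{
	int n = 0;

	/* The poisoned page must lie inside the block being split. */
	assert(bad_pfn >= base && bad_pfn < base + (1UL << order));

	while (order > 0) {
		unsigned long half;

		order--;
		half = 1UL << order;
		if (bad_pfn < base + half) {
			/* Bad page is in the lower half: upper half stays free. */
			out[n].pfn = base + half;
		} else {
			/* Bad page is in the upper half: lower half stays free. */
			out[n].pfn = base;
			base += half;
		}
		out[n].order = order;
		n++;
	}
	/* base == bad_pfn here: only that single page is withheld. */
	return n;
}
```

For a block at pfn 512 of order 3 with pfn 517 poisoned, this yields
three free blocks (pfn 512 order 2, pfn 518 order 1, pfn 516 order 0), so
7 of the 8 pages stay allocatable instead of all 8 being lost.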

> OTOH PageHWPoison still has a role in reporting errors to userspace.
> Without it unpoison_memory() doesn't work.

Sure but we do not really need a special page flag for that. We know the
page is not reachable other than via pfn walkers. If you make the page
reserved and note the fact it has been poisoned in the past then you can
emulate the missing functionality.
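
One way to picture "noting the fact" outside of a page flag is a small
side table of poisoned pfns. The following is only a user-space sketch
under that assumption; the table, its fixed capacity, and all function
names are hypothetical and say nothing about how the kernel would
actually store this.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_POISONED 64	/* arbitrary capacity chosen for the sketch */

static unsigned long poisoned_pfns[MAX_POISONED];
static size_t nr_poisoned;

/* Has this pfn been recorded as poisoned? */
static bool pfn_was_poisoned(unsigned long pfn)
{
	for (size_t i = 0; i < nr_poisoned; i++)
		if (poisoned_pfns[i] == pfn)
			return true;
	return false;
}

/* Record a pfn; returns false only if the table is full. */
static bool record_poisoned_pfn(unsigned long pfn)
{
	if (pfn_was_poisoned(pfn))
		return true;
	if (nr_poisoned == MAX_POISONED)
		return false;
	poisoned_pfns[nr_poisoned++] = pfn;
	return true;
}

/* Drop a record (the "unpoison" side); returns false if it was not there. */
static bool forget_poisoned_pfn(unsigned long pfn)
{
	for (size_t i = 0; i < nr_poisoned; i++) {
		if (poisoned_pfns[i] == pfn) {
			/* Order does not matter, so fill the hole with the last entry. */
			poisoned_pfns[i] = poisoned_pfns[--nr_poisoned];
			return true;
		}
	}
	return false;
}
```

A pfn walker would consult pfn_was_poisoned() for reserved pages, and an
unpoison request would succeed only if forget_poisoned_pfn() finds the
record.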

Btw. do we really need unpoisoning functionality? Who is really using
it, other than some tests? How does the memory become OK again? Don't we
really need to go through physical hotremove & hotadd to clean the
poison status?

Thanks!
-- 
Michal Hocko
SUSE Labs
