linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Michal Hocko <mhocko@suse.com>
To: David Hildenbrand <david@redhat.com>
Cc: Dan Williams <dan.j.williams@intel.com>,
	linux-mm@kvack.org, Andrew Morton <akpm@linux-foundation.org>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH] mm: Teach pfn_to_online_page() about ZONE_DEVICE section collisions
Date: Wed, 6 Jan 2021 11:42:55 +0100	[thread overview]
Message-ID: <20210106104255.GK13207@dhcp22.suse.cz> (raw)
In-Reply-To: <785b9095-eca4-8100-33ea-6ae84e02a92e@redhat.com>

On Wed 06-01-21 10:56:19, David Hildenbrand wrote:
[...]
> Note that this is not sufficient in the general case. I already
> mentioned that we effectively override an already initialized memmap.
> 
> ---
> 
> [        SECTION             ]
> Before:
> [ ZONE_NORMAL ][    Hole     ]
> 
> The hole has some node/zone (currently 0/0, discussions ongoing on how
> to optimize that to e.g., ZONE_NORMAL in this example) and is
> PG_reserved - looks like an ordinary memory hole.
> 
> After memremap:
> [ ZONE_NORMAL ][ ZONE_DEVICE ]
> 
> The already initialized memmap was converted to ZONE_DEVICE. Your
> slowpath will work.
> 
> After memunmap (no poisioning):
> [ ZONE_NORMAL ][ ZONE_DEVICE ]
> 
> The slow path is no longer working. pfn_to_online_page() might return
> something that is ZONE_DEVICE.
> 
> After memunmap (poisioning):
> [ ZONE_NORMAL ][ POISONED    ]
> 
> The slow path is no longer working. pfn_to_online_page() might return
> something that will BUG_ON via page_to_nid() etc.
> 
> ---
> 
> Reason is that pfn_to_online_page() does no care about sub-sections. And
> for now, it didn't had to. If there was an online section, it either was
> 
> a) Completely present. The whole memmap is initialized to sane values.
> b) Partially present. The whole memmap is initialized to sane values.
> 
> memremap/memunmap messes with case b)

I do not see we ever clear the newly added flag and my understanding is
that the subsection removed would lead to get_dev_pagemap returning a
NULL. Which would obviously need to be checked for pfn_to_online_page.
Or do I miss anything and the above is not the case and we could still
get false positives?

> Well have to further tweak pfn_to_online_page(). You'll have to also
> check pfn_section_valid() *at least* on the slow path. Less-hacky would
> be checking it also in the "somehwat-faster" path - that would cover
> silently overriding a memmap that's visible via pfn_to_online_page().
> Might slow down things a bit.
> 
> 
> Not completely opposed to this, but I would certainly still prefer just
> avoiding this corner case completely instead of patching around it. Thanks!

Well, I would love to have no surprises either. So far there was not
actual argument why the pmem reserved space cannot be fully initialized.
On the other hand making sure that pfn_to_online_page sounds like the
right thing to do. And having an explicit check for zone device there in
a slow path makes sense to me.
-- 
Michal Hocko
SUSE Labs

  reply	other threads:[~2021-01-06 10:44 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-01-06  4:07 [PATCH] mm: Teach pfn_to_online_page() about ZONE_DEVICE section collisions Dan Williams
2021-01-06  9:55 ` Michal Hocko
2021-01-12  9:15   ` Dan Williams
2021-01-06  9:56 ` David Hildenbrand
2021-01-06 10:42   ` Michal Hocko [this message]
2021-01-06 11:22     ` David Hildenbrand
2021-01-06 11:38       ` Michal Hocko
2021-01-06 20:02       ` Dan Williams
2021-01-07  9:15         ` David Hildenbrand
2021-01-12  9:18           ` Dan Williams
2021-01-12  9:44             ` David Hildenbrand
2021-01-12 20:52               ` Dan Williams
2021-01-06 10:04 ` David Hildenbrand

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210106104255.GK13207@dhcp22.suse.cz \
    --to=mhocko@suse.com \
    --cc=akpm@linux-foundation.org \
    --cc=dan.j.williams@intel.com \
    --cc=david@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).