From: David Hildenbrand <david@redhat.com>
To: Oscar Salvador <osalvador@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Michal Hocko <mhocko@kernel.org>,
Anshuman Khandual <anshuman.khandual@arm.com>,
Vlastimil Babka <vbabka@suse.cz>,
Pavel Tatashin <pasha.tatashin@soleen.com>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v4 1/5] mm,memory_hotplug: Allocate memmap from the added memory range
Date: Wed, 17 Mar 2021 15:35:41 +0100 [thread overview]
Message-ID: <f996f570-eed9-509f-553c-280a62dc6d20@redhat.com> (raw)
In-Reply-To: <20210317140847.GA20407@linux>
On 17.03.21 15:08, Oscar Salvador wrote:
> On Tue, Mar 16, 2021 at 06:45:17PM +0100, David Hildenbrand wrote:
>>> I find that cross reference to vmemmap code a little hard to digest.
>>> I would have assume that we don't have to care about PMDs in this
>>> code here at all. The vmemmap population code should handle that.
>>>
>>> I think I already mentioned that somewhere, I think it should be like this:
>>>
>>> a) vmemmap code should *never* populate more memory than requested for
>>> a single memory section when we are populating from the altmap.
>>> If that cannot be guaranteed for PMDs, then we have to fallback
>>> to populating base pages. Populating PMDs from an altmap with
>>> sizeof(struct page) == 64 is highly dangerous.
>
> I guess you meant sizeof(struct page) != 64
>
Yes!
> But other use cases of altmap (ZONE_DEVICE stuff) might not care whether
> they have sub-populated PMDs when populating sections from the altmap?
Just assume you have two ranges
[ ZONE_DEVICE 0 ][ ZONE_DEVICE 1]
If the vmemmap of ZONE_DEVICE 1 could be taken from the altmap of
ZONE_DEVICE 0, we could be in trouble, as both parts can be
removed/repurposed independently ...
>
> Current vmemmap code populates the PMD with PMD_SIZE if empty, and with base
> pages if there are still holes.
>
>>> Assume we have sizeof(struct page) == 56. A 128 MiB section
>>> spans 32768 pages - we need 32768 * sizeof(struct page)
>>> space for the vmemmap.
>>> With a 64 byte struct page we *can* use exactly one PMD. With a
>>> 56 byte struct page we need 448 individual (full!) pages for the vmemmap.
>>>
>>> IOW, we don't care how vmemmap code will do the mapping.
>>> vmemmap code has to get it right. IMHO, asserting it in
>>> this code is wrong.
>>>
>>>
>>> b) In this code, we really should only care about what
>>> memory onlining/offlining code can or can't do.
>>> We really only care that
>>>
>>> 1) size == memory_block_size_bytes()
>>> 2) remaining_size
>>> 3) IS_ALIGNED(remaining_size, pageblock_size);
>
> I agree with the above, but see below:
>
>>> Okay, please document the statement about single sections, that's
>>> important to understand what's happening.
>>>
>>> My take would be
>>>
>>> bool mhp_supports_memmap_on_memory(unsigned long size)
>>> {
>>> /*
>>> * Note: We calculate for a single memory section. The calculation
>>> */
>>> unsigned long nr_vmemmap_pages = SECTION_SIZE / PAGE_SIZE;
>>> unsigned long vmemmap_size = nr_vmemmap_pages * sizeof(struct page);
>>> unsigned long remaining_size = size - vmemmap_size;
>
> While it might be true that we need to back off from populating with altmap in
> case PMDs are not going to be fully populated because of the size of the struct
> page (I am still not sure though, as I said above; other use cases might not
> care at all), I would go __for now__ with placing vmemmap_size == PMD_SIZE in
> the check below as well.
>
> If the check comes true, we know that we fully populate PMDs when populating
> sections, so the feature can be used.
>
> Then I commit to have a look whether we need to back off in vmemmap-populating
> code in case altmap && !NOT_FULLY_POPULATED_PMDS.
>
> What do you think?
If we check for
IS_ALIGNED(vmemmap_size, PMD_SIZE), please add a proper TODO comment
that this is most probably the wrong place to take care of this.
--
Thanks,
David / dhildenb