From: Oscar Salvador <osalvador@suse.de>
To: David Hildenbrand <david@redhat.com>
Cc: mhocko@kernel.org, linux-kernel@vger.kernel.org,
linux-mm@kvack.org, vbabka@suse.cz, pasha.tatashin@soleen.com
Subject: Re: [RFC PATCH 3/3] mm,memory_hotplug: Allocate memmap from the added memory range
Date: Thu, 19 Nov 2020 11:48:47 +0100 [thread overview]
Message-ID: <20201119104847.GA5281@localhost.localdomain> (raw)
In-Reply-To: <3cc37927-538e-ae7d-27bc-45aaabe06b3a@redhat.com>
On Tue, Nov 17, 2020 at 04:38:52PM +0100, David Hildenbrand wrote:
> Sorry for the late replay, fairly busy with all kinds of things.
Heh, no worries, I appreciate the time :-)
> > diff --git a/drivers/acpi/acpi_memhotplug.c b/drivers/acpi/acpi_memhotplug.c
> > index b02fd51e5589..6b57bf90ca72 100644
> > --- a/drivers/acpi/acpi_memhotplug.c
> > +++ b/drivers/acpi/acpi_memhotplug.c
> > @@ -195,7 +195,7 @@ static int acpi_memory_enable_device(struct acpi_memory_device *mem_device)
> > node = memory_add_physaddr_to_nid(info->start_addr);
> > result = __add_memory(node, info->start_addr, info->length,
> > - MHP_NONE);
> > + MEMHP_MEMMAP_ON_MEMORY);
>
> I'd suggest moving that into a separate patch.
Fine by me
> > diff --git a/include/linux/memory.h b/include/linux/memory.h
> > index 439a89e758d8..7cc93de5856c 100644
> > --- a/include/linux/memory.h
> > +++ b/include/linux/memory.h
> > @@ -30,6 +30,7 @@ struct memory_block {
> > int phys_device; /* to which fru does this belong? */
> > struct device dev;
> > int nid; /* NID for this memory block */
> > + unsigned long nr_vmemmap_pages; /* Number for vmemmap pages */
>
> Maybe also document that these pages are directly at the beginning of the
> memory block.
Sure
> > -static inline int offline_pages(unsigned long start_pfn, unsigned long nr_pages)
> > +static inline int offline_pages(unsigned long start_pfn, unsigned long nr_pages,
> > + unsigned long nr_vmemmap_pages)
> > {
> > return -EINVAL;
> > }
> > @@ -369,10 +372,12 @@ extern int add_memory_driver_managed(int nid, u64 start, u64 size,
> > mhp_t mhp_flags);
> > extern void move_pfn_range_to_zone(struct zone *zone, unsigned long start_pfn,
> > unsigned long nr_pages,
> > + unsigned long nr_vmemmap_pages,
> > struct vmem_altmap *altmap, int migratetype);
> > extern void remove_pfn_range_from_zone(struct zone *zone,
> > unsigned long start_pfn,
> > - unsigned long nr_pages);
> > + unsigned long nr_pages,
> > + unsigned long nr_vmemmap_pages);
>
> I think we should not pass nr_vmemmap_pages down here but instead do two
> separate calls to move_pfn_range_to_zone()/remove_pfn_range_from_zone() from
> online_pages()/offline_pages()
>
> 1. for vmemmap pages, migratetype = MIGRATE_UNMOVABLE
> 2. for remaining pages, migratetype = MIGRATE_ISOLATE
Ok, that was the other option, it might be even cleaner.
> > + valid_start_pfn = pfn + nr_vmemmap_pages;
> > + valid_nr_pages = nr_pages - nr_vmemmap_pages;
>
> Hm, valid sounds strange. More like "free_start_pfn" or "buddy_start_pfn".
Agreed, I might lean towards buddy_start_pfn.
> > - move_pfn_range_to_zone(zone, pfn, nr_pages, NULL, MIGRATE_ISOLATE);
> > + move_pfn_range_to_zone(zone, pfn, nr_pages, nr_vmemmap_pages, NULL,
> > + MIGRATE_ISOLATE);
>
> As mentioned, I'd suggest properly initializing the memmap here
>
> if (nr_vmemmap_pages) {
> move_pfn_range_to_zone(zone, pfn, nr_vmemmap_pages, NULL,
> MIGRATE_UNMOVABLE);
> }
> move_pfn_range_to_zone(zone, valid_start_pfn, valid_nr_pages, NULL,
Sure, agreed.
> > + if (!support_memmap_on_memory(size))
> > + mhp_flags &= ~MEMHP_MEMMAP_ON_MEMORY;
>
> Callers (e.g., virtio-mem) might rely on this. We should reject this with
> -EINVAL and provide a way for callers to test whether this flag is possible.
Uhm, we might want to make "support_memmap_on_memory" public, and
callers who might want to it use can check its return value?
Or do you have something else in mind?
Agreed on the -EINVAIL.
> > + if (mhp_flags & MEMHP_MEMMAP_ON_MEMORY)
> > + mhp_mark_vmemmap_pages(params.altmap);
>
> Do we really still need that? Pages are offline, so we're messing with an
> invalid memmap. online_pages() should handle initializing the memmap of
> these pages.
Yeah, on a second thought we do not need this.
Since the pages are still offline, no one should be messing with that
range yet anyway.
>
> [...]
>
> > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > index e74ca22baaa1..043503fb8c6e 100644
> > --- a/mm/page_alloc.c
> > +++ b/mm/page_alloc.c
> > @@ -8761,6 +8761,13 @@ void __offline_isolated_pages(unsigned long start_pfn, unsigned long end_pfn)
> > spin_lock_irqsave(&zone->lock, flags);
> > while (pfn < end_pfn) {
> > page = pfn_to_page(pfn);
> > + /*
> > + * Skip vmemmap pages
> > + */
> > + if (PageVmemmap(page)) {
> > + pfn += vmemmap_nr_pages(page);
> > + continue;
> > + }
>
> I'd assume calling code can handle that and exclude isolating such pages.
The thing is that __offline_isolated_pages calls offline_mem_sections(),
so we really need the first pfn, and not the "pfn + nr_vmemmap_pages".
Instead of skipping it in the loop, I might just skip it before entering
the loop.
Thanks!
--
Oscar Salvador
SUSE L3
next prev parent reply other threads:[~2020-11-19 10:48 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-10-22 12:58 [RFC PATCH 0/3] Allocate memmap from hotadded memory (per device) Oscar Salvador
2020-10-22 12:58 ` [RFC PATCH 1/3] mm,memory_hotplug: Introduce MHP_MEMMAP_ON_MEMORY Oscar Salvador
2020-10-22 13:04 ` David Hildenbrand
2020-10-22 12:58 ` [RFC PATCH 2/3] mm: Introduce a new Vmemmap page-type Oscar Salvador
2020-11-20 11:20 ` David Hildenbrand
2020-10-22 12:58 ` [RFC PATCH 3/3] mm,memory_hotplug: Allocate memmap from the added memory range Oscar Salvador
2020-11-17 15:38 ` David Hildenbrand
2020-11-19 10:48 ` Oscar Salvador [this message]
2020-11-20 9:31 ` David Hildenbrand
2020-10-22 13:01 ` [RFC PATCH 0/3] Allocate memmap from hotadded memory (per device) David Hildenbrand
2020-10-27 15:40 ` Oscar Salvador
2020-10-27 15:44 ` David Hildenbrand
2020-10-27 15:58 ` Oscar Salvador
2020-10-28 18:47 ` Mike Kravetz
2020-10-29 7:49 ` David Hildenbrand
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20201119104847.GA5281@localhost.localdomain \
--to=osalvador@suse.de \
--cc=david@redhat.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mhocko@kernel.org \
--cc=pasha.tatashin@soleen.com \
--cc=vbabka@suse.cz \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).