From: David Hildenbrand <david@redhat.com>
To: Oscar Salvador <osalvador@suse.de>
Cc: akpm@linux-foundation.org, dan.j.williams@intel.com,
pasha.tatashin@soleen.com, mhocko@suse.com,
anshuman.khandual@arm.com, Jonathan.Cameron@huawei.com,
vbabka@suse.cz, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v3 2/5] mm: Introduce a new Vmemmap page-type
Date: Fri, 26 Jul 2019 11:41:46 +0200 [thread overview]
Message-ID: <dbd19aea-fe18-ec42-7932-f03109cb399e@redhat.com> (raw)
In-Reply-To: <20190726092548.GA26268@linux>
On 26.07.19 11:25, Oscar Salvador wrote:
> On Fri, Jul 26, 2019 at 10:48:54AM +0200, David Hildenbrand wrote:
>>> Signed-off-by: Oscar Salvador <osalvador@suse.de>
>>> ---
>>> include/linux/mm.h | 17 +++++++++++++++++
>>> include/linux/mm_types.h | 5 +++++
>>> include/linux/page-flags.h | 19 +++++++++++++++++++
>>> 3 files changed, 41 insertions(+)
>>>
>>> diff --git a/include/linux/mm.h b/include/linux/mm.h
>>> index 45f0ab0ed4f7..432175f8f8d2 100644
>>> --- a/include/linux/mm.h
>>> +++ b/include/linux/mm.h
>>> @@ -2904,6 +2904,23 @@ static inline bool debug_guardpage_enabled(void) { return false; }
>>> static inline bool page_is_guard(struct page *page) { return false; }
>>> #endif /* CONFIG_DEBUG_PAGEALLOC */
>>>
>>> +static __always_inline struct page *vmemmap_head(struct page *page)
>>> +{
>>> + return (struct page *)page->vmemmap_head;
>>> +}
>>> +
>>> +static __always_inline unsigned long vmemmap_nr_sections(struct page *page)
>>> +{
>>> + struct page *head = vmemmap_head(page);
>>> + return head->vmemmap_sections;
>>> +}
>>> +
>>> +static __always_inline unsigned long vmemmap_nr_pages(struct page *page)
>>> +{
>>> + struct page *head = vmemmap_head(page);
>>> + return head->vmemmap_pages - (page - head);
>>> +}
>>> +
>>> #if MAX_NUMNODES > 1
>>> void __init setup_nr_node_ids(void);
>>> #else
>>> diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
>>> index 6a7a1083b6fb..51dd227f2a6b 100644
>>> --- a/include/linux/mm_types.h
>>> +++ b/include/linux/mm_types.h
>>> @@ -170,6 +170,11 @@ struct page {
>>> * pmem backed DAX files are mapped.
>>> */
>>> };
>>> + struct { /* Vmemmap pages */
>>> + unsigned long vmemmap_head;
>>> + unsigned long vmemmap_sections; /* Number of sections */
>>> + unsigned long vmemmap_pages; /* Number of pages */
>>> + };
>>>
>>> /** @rcu_head: You can use this to free a page by RCU. */
>>> struct rcu_head rcu_head;
>>> diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
>>> index f91cb8898ff0..75f302a532f9 100644
>>> --- a/include/linux/page-flags.h
>>> +++ b/include/linux/page-flags.h
>>> @@ -708,6 +708,7 @@ PAGEFLAG_FALSE(DoubleMap)
>>> #define PG_kmemcg 0x00000200
>>> #define PG_table 0x00000400
>>> #define PG_guard 0x00000800
>>> +#define PG_vmemmap 0x00001000
>>>
>>> #define PageType(page, flag) \
>>> ((page->page_type & (PAGE_TYPE_BASE | flag)) == PAGE_TYPE_BASE)
>>> @@ -764,6 +765,24 @@ PAGE_TYPE_OPS(Table, table)
>>> */
>>> PAGE_TYPE_OPS(Guard, guard)
>>>
>>> +/*
>>> + * Vmemmap pages refer to those pages that are used to create the memmap
>>> + * array, and reside within the same memory range that was hotplugged, so
>>> + * they are self-hosted. (see include/linux/memory_hotplug.h)
>>> + */
>>> +PAGE_TYPE_OPS(Vmemmap, vmemmap)
>>> +static __always_inline void SetPageVmemmap(struct page *page)
>>> +{
>>> + __SetPageVmemmap(page);
>>> + __SetPageReserved(page);
>>
>> So, the issue with some vmemmap pages is that the "struct pages" reside
>> on the memory they manage. (it is nice, but complicated - e.g. when
>> onlining/offlining)
>
> Hi David,
>
> Not really.
> Vmemmap pages are just skipped during onlining/offlining.
> We do not need to a) send them to the buddy or b) migrate them over.
> A look at patch#4 will probably help, as the crux of the matter is there.
Right, but you have to prevent the onlining code from trying to reinitialize
the vmemmap when you online the first memory block. Will dive
into the details (patch #4) next (maybe not today, but early next week) :)
>
>>
>> I would expect that you properly initialize the struct pages for the
>> vmemmap pages (now it gets confusing :) ) when adding memory. The other
>> struct pages are initialized when onlining/offlining.
>>
>> So, at this point, the pages should already be marked reserved, no? Or
>> are the struct pages for the vmemmap never initialized?
>>
>> What zone do these vmemmap pages have? They are not assigned to any zone
>> and will never be :/
>
> This patch is only a preparation, the real "fun" is in patch#4.
>
> Vmemmap page initialization occurs in mhp_mark_vmemmap_pages, called from
> __add_pages() (patch#4).
> In there we a) mark the pages as vmemmap and b) initialize the fields we need
> to track some metadata (sections, pages and head).
>
> In __init_single_page(), when onlining, the rest of the fields will be set up
> properly (zone, refcount, etc).
>
> Chunk from patch#4:
>
> static void __meminit __init_single_page(struct page *page, unsigned long pfn,
> unsigned long zone, int nid)
> {
> if (PageVmemmap(page))
> /*
> * Vmemmap pages need to preserve their state.
> */
> goto preserve_state;
Can you be sure there are no false positives? (if I remember correctly,
this memory might be completely uninitialized - I might be wrong)
>
> mm_zero_struct_page(page);
> page_mapcount_reset(page);
> INIT_LIST_HEAD(&page->lru);
> preserve_state:
> init_page_count(page);
> set_page_links(page, zone, nid, pfn);
> page_cpupid_reset_last(page);
> page_kasan_tag_reset(page);
>
> So, vmemmap pages will fall within the same zone as the range we are adding,
> that does not change.
I wonder if that is the right thing to do, hmmm, because they are
effectively not part of that zone (not online).
Will have a look at the details :)
>
>>> +}
>>> +
>>> +static __always_inline void ClearPageVmemmap(struct page *page)
>>> +{
>>> + __ClearPageVmemmap(page);
>>> + __ClearPageReserved(page);
>>
>> You sure you want to clear the reserved flag here? Is this function
>> really needed?
>>
>> (when you add memory, you can mark all relevant pages as vmemmap pages,
>> which is valid until removing the memory)
>>
>> Let's draw a picture so I am not confused
>>
>> [ ------ added memory ------ ]
>> [ vmemmap]
>>
>> The first page of the added memory is a vmemmap page AND contains its
>> own vmemmap, right?
>
> Not only the first page.
> Depending on how large the chunk you are adding is, the number of vmemmap
> pages will vary, because we need to cover the memmaps for the range.
>
> e.g:
>
> - 128MB (1 section)  = 512 vmemmap pages at the beginning of the range
> - 256MB (2 sections) = 1024 vmemmap pages at the beginning of the range
> ...
>
Right.
>> When adding memory, you would initialize all struct pages of the
>> vmemmap (residing on itself) and mark them via SetPageVmemmap().
>>
>> When removing memory, there is nothing to do, all struct pages are
>> dropped. So why do we need the ClearPageVmemmap() ?
>
> Well, it is not really needed as we only call ClearPageVmemmap when we are
> actually removing the memory with vmemmap_free()->...
> So one could argue that since the memory is going away, there is no need
> to clear anything in there.
>
> I just made it for consistency purposes.
>
> I can drop it if you feel strongly about it.
Not strong, was just wondering why that is needed at all in the big
picture :)
--
Thanks,
David / dhildenb
Thread overview: 22+ messages
2019-07-25 16:02 [PATCH v3 0/5] Allocate memmap from hotadded memory Oscar Salvador
2019-07-25 16:02 ` [PATCH v3 1/5] mm,memory_hotplug: Introduce MHP_MEMMAP_ON_MEMORY Oscar Salvador
2019-07-26 8:34 ` David Hildenbrand
2019-07-26 9:29 ` Oscar Salvador
2019-07-26 9:37 ` David Hildenbrand
2019-07-25 16:02 ` [PATCH v3 2/5] mm: Introduce a new Vmemmap page-type Oscar Salvador
2019-07-26 8:48 ` David Hildenbrand
2019-07-26 9:25 ` Oscar Salvador
2019-07-26 9:41 ` David Hildenbrand [this message]
2019-07-26 10:11 ` Oscar Salvador
2019-07-25 16:02 ` [PATCH v3 3/5] mm,sparse: Add SECTION_USE_VMEMMAP flag Oscar Salvador
2019-08-01 14:45 ` David Hildenbrand
2019-07-25 16:02 ` [PATCH v3 4/5] mm,memory_hotplug: Allocate memmap from the added memory range for sparse-vmemmap Oscar Salvador
2019-08-01 15:04 ` David Hildenbrand
2019-07-25 16:02 ` [PATCH v3 5/5] mm,memory_hotplug: Allow userspace to enable/disable vmemmap Oscar Salvador
2019-08-01 15:07 ` David Hildenbrand
2019-07-25 16:56 ` [PATCH v3 0/5] Allocate memmap from hotadded memory David Hildenbrand
2019-08-01 7:39 ` Oscar Salvador
2019-08-01 8:17 ` David Hildenbrand
2019-08-01 8:39 ` Oscar Salvador
2019-08-01 8:44 ` David Hildenbrand
2019-08-01 18:46 ` David Hildenbrand