linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: David Hildenbrand <david@redhat.com>
To: Oscar Salvador <osalvador@suse.de>, akpm@linux-foundation.org
Cc: dan.j.williams@intel.com, pasha.tatashin@soleen.com,
	mhocko@suse.com, anshuman.khandual@arm.com,
	Jonathan.Cameron@huawei.com, vbabka@suse.cz, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v3 2/5] mm: Introduce a new Vmemmap page-type
Date: Fri, 26 Jul 2019 10:48:54 +0200	[thread overview]
Message-ID: <7e8746ac-6a66-d73c-9f2a-4fc53c7e4c04@redhat.com> (raw)
In-Reply-To: <20190725160207.19579-3-osalvador@suse.de>

On 25.07.19 18:02, Oscar Salvador wrote:
> This patch introduces a new Vmemmap page-type.
> 
> It also introduces some functions to ease the handling of vmemmap pages:
> 
> - vmemmap_nr_sections: Returns the number of sections that used vmemmap.
> 
> - vmemmap_nr_pages: Allows us to retrieve the amount of vmemmap pages
>   derivated from any vmemmap-page in the section. Useful for accounting
>   and to know how much to we have to skip in the case where vmemmap pages
>   need to be ignored.
> 
> - vmemmap_head: Returns the vmemmap head page
> 
> - SetPageVmemmap: Sets Reserved flag bit, and sets page->type to Vmemmap.
>   Setting the Reserved flag bit is just for extra protection, actually
>   we do not expect anyone to use these pages for anything.
> 
> - ClearPageVmemmap: Clears Reserved flag bit and page->type.
>   Only used when sections containing vmemmap pages are removed.
> 
> These functions will be used for the code handling Vmemmap pages.
> 

Much cleaner using the page type :)

> Signed-off-by: Oscar Salvador <osalvador@suse.de>
> ---
>  include/linux/mm.h         | 17 +++++++++++++++++
>  include/linux/mm_types.h   |  5 +++++
>  include/linux/page-flags.h | 19 +++++++++++++++++++
>  3 files changed, 41 insertions(+)
> 
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 45f0ab0ed4f7..432175f8f8d2 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -2904,6 +2904,23 @@ static inline bool debug_guardpage_enabled(void) { return false; }
>  static inline bool page_is_guard(struct page *page) { return false; }
>  #endif /* CONFIG_DEBUG_PAGEALLOC */
>  
> +static __always_inline struct page *vmemmap_head(struct page *page)
> +{
> +	return (struct page *)page->vmemmap_head;
> +}
> +
> +static __always_inline unsigned long vmemmap_nr_sections(struct page *page)
> +{
> +	struct page *head = vmemmap_head(page);
> +	return head->vmemmap_sections;
> +}
> +
> +static __always_inline unsigned long vmemmap_nr_pages(struct page *page)
> +{
> +	struct page *head = vmemmap_head(page);
> +	return head->vmemmap_pages - (page - head);
> +}
> +
>  #if MAX_NUMNODES > 1
>  void __init setup_nr_node_ids(void);
>  #else
> diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
> index 6a7a1083b6fb..51dd227f2a6b 100644
> --- a/include/linux/mm_types.h
> +++ b/include/linux/mm_types.h
> @@ -170,6 +170,11 @@ struct page {
>  			 * pmem backed DAX files are mapped.
>  			 */
>  		};
> +		struct {        /* Vmemmap pages */
> +			unsigned long vmemmap_head;
> +			unsigned long vmemmap_sections; /* Number of sections */
> +			unsigned long vmemmap_pages;    /* Number of pages */
> +		};
>  
>  		/** @rcu_head: You can use this to free a page by RCU. */
>  		struct rcu_head rcu_head;
> diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
> index f91cb8898ff0..75f302a532f9 100644
> --- a/include/linux/page-flags.h
> +++ b/include/linux/page-flags.h
> @@ -708,6 +708,7 @@ PAGEFLAG_FALSE(DoubleMap)
>  #define PG_kmemcg	0x00000200
>  #define PG_table	0x00000400
>  #define PG_guard	0x00000800
> +#define PG_vmemmap     0x00001000
>  
>  #define PageType(page, flag)						\
>  	((page->page_type & (PAGE_TYPE_BASE | flag)) == PAGE_TYPE_BASE)
> @@ -764,6 +765,24 @@ PAGE_TYPE_OPS(Table, table)
>   */
>  PAGE_TYPE_OPS(Guard, guard)
>  
> +/*
> + * Vmemmap pages refers to those pages that are used to create the memmap
> + * array, and reside within the same memory range that was hotppluged, so
> + * they are self-hosted. (see include/linux/memory_hotplug.h)
> + */
> +PAGE_TYPE_OPS(Vmemmap, vmemmap)
> +static __always_inline void SetPageVmemmap(struct page *page)
> +{
> +	__SetPageVmemmap(page);
> +	__SetPageReserved(page);

So, the issue with some vmemmap pages is that the "struct pages" reside
on the memory they manage. (it is nice, but complicated - e.g. when
onlining/offlining)

I would expect that you properly initialize the struct pages for the
vmemmap pages (now it gets confusing :) ) when adding memory. The other
struct pages are initialized when onlining/offlining.

So, at this point, the pages should already be marked reserved, no? Or
are the struct pages for the vmemmap never initialized?

What zone do these vmemmap pages have? They are not assigned to any zone
and will never be :/

> +}
> +
> +static __always_inline void ClearPageVmemmap(struct page *page)
> +{
> +	__ClearPageVmemmap(page);
> +	__ClearPageReserved(page);

You sure you want to clear the reserved flag here? Is this function
really needed?

(when you add memory, you can mark all relevant pages as vmemmap pages,
which is valid until removing the memory)

Let's draw a picture so I am not confused

[ ------ added memory ------ ]
[ vmemmap]

The first page of the added memory is a vmemmap page AND contains its
own vmemmap, right?

When adding memory, you would initialize set all struct pages of the
vmemmap (residing on itself) and set them to SetPageVmemmap().

When removing memory, there is nothing to do, all struct pages are
dropped. So why do we need the ClearPageVmemmap() ?

-- 

Thanks,

David / dhildenb

  reply	other threads:[~2019-07-26  8:48 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-07-25 16:02 [PATCH v3 0/5] Allocate memmap from hotadded memory Oscar Salvador
2019-07-25 16:02 ` [PATCH v3 1/5] mm,memory_hotplug: Introduce MHP_MEMMAP_ON_MEMORY Oscar Salvador
2019-07-26  8:34   ` David Hildenbrand
2019-07-26  9:29     ` Oscar Salvador
2019-07-26  9:37       ` David Hildenbrand
2019-07-25 16:02 ` [PATCH v3 2/5] mm: Introduce a new Vmemmap page-type Oscar Salvador
2019-07-26  8:48   ` David Hildenbrand [this message]
2019-07-26  9:25     ` Oscar Salvador
2019-07-26  9:41       ` David Hildenbrand
2019-07-26 10:11         ` Oscar Salvador
2019-07-25 16:02 ` [PATCH v3 3/5] mm,sparse: Add SECTION_USE_VMEMMAP flag Oscar Salvador
2019-08-01 14:45   ` David Hildenbrand
2019-07-25 16:02 ` [PATCH v3 4/5] mm,memory_hotplug: Allocate memmap from the added memory range for sparse-vmemmap Oscar Salvador
2019-08-01 15:04   ` David Hildenbrand
2019-07-25 16:02 ` [PATCH v3 5/5] mm,memory_hotplug: Allow userspace to enable/disable vmemmap Oscar Salvador
2019-08-01 15:07   ` David Hildenbrand
2019-07-25 16:56 ` [PATCH v3 0/5] Allocate memmap from hotadded memory David Hildenbrand
2019-08-01  7:39 ` Oscar Salvador
2019-08-01  8:17   ` David Hildenbrand
2019-08-01  8:39     ` Oscar Salvador
2019-08-01  8:44       ` David Hildenbrand
2019-08-01 18:46 ` David Hildenbrand

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=7e8746ac-6a66-d73c-9f2a-4fc53c7e4c04@redhat.com \
    --to=david@redhat.com \
    --cc=Jonathan.Cameron@huawei.com \
    --cc=akpm@linux-foundation.org \
    --cc=anshuman.khandual@arm.com \
    --cc=dan.j.williams@intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@suse.com \
    --cc=osalvador@suse.de \
    --cc=pasha.tatashin@soleen.com \
    --cc=vbabka@suse.cz \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).