From: Vlastimil Babka <vbabka@suse.cz>
To: Michal Hocko <mhocko@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>
Cc: linux-mm@kvack.org, Mel Gorman <mgorman@suse.de>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Jerome Glisse <jglisse@redhat.com>,
	Reza Arbab <arbab@linux.vnet.ibm.com>,
	Yasuaki Ishimatsu <yasu.isimatu@gmail.com>,
	qiuxishi@huawei.com, Kani Toshimitsu <toshi.kani@hpe.com>,
	slaoub@gmail.com, Joonsoo Kim <js1304@gmail.com>,
	Andi Kleen <ak@linux.intel.com>,
	David Rientjes <rientjes@google.com>,
	Daniel Kiper <daniel.kiper@oracle.com>,
	Igor Mammedov <imammedo@redhat.com>,
	Vitaly Kuznetsov <vkuznets@redhat.com>,
	LKML <linux-kernel@vger.kernel.org>,
	Michal Hocko <mhocko@suse.com>
Subject: Re: [PATCH 07/14] mm: consider zone which is not fully populated to have holes
Date: Thu, 18 May 2017 18:14:39 +0200
Message-ID: <ae859e14-bf82-ae37-9c85-d4b31ce89b0a@suse.cz>
In-Reply-To: <20170515085827.16474-8-mhocko@kernel.org>

On 05/15/2017 10:58 AM, Michal Hocko wrote:
> From: Michal Hocko <mhocko@suse.com>
> 
> __pageblock_pfn_to_page has two users currently, set_zone_contiguous
> which checks whether the given zone contains holes and
> pageblock_pfn_to_page which then carefully returns a first valid
> page from the given pfn range for the given zone. This doesn't handle
> zones which are not fully populated though. Memory pageblocks can be
> offlined or might not have been onlined yet. In such a case the zone
> should be considered to have holes otherwise pfn walkers can touch
> and play with offline pages.
> 
> Current callers of pageblock_pfn_to_page in compaction seem to work
> properly right now because they only isolate PageBuddy
> (isolate_freepages_block) or PageLRU resp. __PageMovable
> (isolate_migratepages_block) which will be always false for these pages.
> It would be safer to skip these pages altogether, though.
> 
> In order to do this patch adds a new memory section state
> (SECTION_IS_ONLINE) which is set in memory_present (during boot
> time) or in online_pages_range during the memory hotplug. Similarly
> offline_mem_sections clears the bit and it is called when the memory
> range is offlined.
> 
> pfn_to_online_page helper is then added which check the mem section and
> only returns a page if it is onlined already.
> 
> Use the new helper in __pageblock_pfn_to_page and skip the whole page
> block in such a case.
> 
> Changes since v3
> - clarify pfn_valid semantic - requested by Joonsoo
> 
> Signed-off-by: Michal Hocko <mhocko@suse.com>
> ---
>  include/linux/memory_hotplug.h | 21 ++++++++++++++++++++
>  include/linux/mmzone.h         | 35 ++++++++++++++++++++++++++------
>  mm/memory_hotplug.c            |  3 +++
>  mm/page_alloc.c                |  5 ++++-
>  mm/sparse.c                    | 45 +++++++++++++++++++++++++++++++++++++++++-
>  5 files changed, 101 insertions(+), 8 deletions(-)
> 
> diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
> index 3c8cf86201c3..fc1c873504eb 100644
> --- a/include/linux/memory_hotplug.h
> +++ b/include/linux/memory_hotplug.h
> @@ -14,6 +14,19 @@ struct memory_block;
>  struct resource;
>  
>  #ifdef CONFIG_MEMORY_HOTPLUG
> +/*
> + * Return page for the valid pfn only if the page is online. All pfn
> + * walkers which rely on the fully initialized page->flags and others
> + * should use this rather than pfn_valid && pfn_to_page
> + */
> +#define pfn_to_online_page(pfn)				\
> +({							\
> +	struct page *___page = NULL;			\
> +							\
> +	if (online_section_nr(pfn_to_section_nr(pfn)))	\
> +		___page = pfn_to_page(pfn);		\
> +	___page;					\
> +})

This seems to already assume that pfn_valid() is true. There's no
"pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS" check, and the comment
suggests as much, but...

>  /*
>   * Types for free bootmem stored in page->lru.next. These have to be in
> @@ -203,6 +216,14 @@ extern void set_zone_contiguous(struct zone *zone);
>  extern void clear_zone_contiguous(struct zone *zone);
>  
>  #else /* ! CONFIG_MEMORY_HOTPLUG */
> +#define pfn_to_online_page(pfn)			\
> +({						\
> +	struct page *___page = NULL;		\
> +	if (pfn_valid(pfn))			\
> +		___page = pfn_to_page(pfn);	\

This includes the pfn_valid() check itself. Why the discrepancy?
Somebody might develop code with !HOTPLUG and forget the check, and then
it starts breaking with HOTPLUG?

> +	___page;				\
> + })
> +
>  /*
>   * Stub functions for when hotplug is off
>   */
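
To illustrate the concern about the two variants above, here is a rough
userspace sketch (a toy model, not kernel code) of a lookup that does its
own range check before consulting the per-section online state, so that
callers get the same contract in both configurations. All toy_* names, the
section size and the section count are assumptions made up for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy layout: 8 sections of 16 pfns each. */
#define TOY_PFN_SECTION_SHIFT	4
#define TOY_NR_MEM_SECTIONS	8

struct toy_page {
	unsigned long flags;
};

static struct toy_page toy_mem_map[TOY_NR_MEM_SECTIONS << TOY_PFN_SECTION_SHIFT];
static bool toy_section_online[TOY_NR_MEM_SECTIONS];

static unsigned long toy_pfn_to_section_nr(unsigned long pfn)
{
	return pfn >> TOY_PFN_SECTION_SHIFT;
}

/* Test helper: flip the online state of one section. */
static void toy_mark_section(unsigned long nr, bool online)
{
	assert(nr < TOY_NR_MEM_SECTIONS);
	toy_section_online[nr] = online;
}

/*
 * Return the page only if the pfn is in range AND its section is online.
 * The range check stands in for the pfn_valid()-style test that the
 * hotplug variant in the patch leaves to the caller.
 */
static struct toy_page *toy_pfn_to_online_page(unsigned long pfn)
{
	unsigned long nr = toy_pfn_to_section_nr(pfn);

	if (nr >= TOY_NR_MEM_SECTIONS)
		return NULL;
	if (!toy_section_online[nr])
		return NULL;
	return &toy_mem_map[pfn];
}
```

With the range check inside the helper, a caller that forgets the validity
check gets NULL in this model instead of an out-of-bounds section lookup.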

...

> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index 05796ee974f7..c3a146028ba6 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -929,6 +929,9 @@ static int online_pages_range(unsigned long start_pfn, unsigned long nr_pages,
>  	unsigned long i;
>  	unsigned long onlined_pages = *(unsigned long *)arg;
>  	struct page *page;
> +
> +	online_mem_sections(start_pfn, start_pfn + nr_pages);

Shouldn't this be moved *below* the loop that initializes struct pages?
In the offline case you do mark sections offline before "tearing" struct
pages, so that should be symmetric.
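
A minimal userspace sketch of the symmetric ordering I mean (again a toy
model with made-up toy_* names and sizes, single-threaded, and ignoring the
locking or barriers real code would need): onlining initializes the struct
pages and only then marks the section online; offlining clears the mark
first and only then tears the pages down.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy layout: 4 sections of 4 pages each. */
#define TOY_PAGES_PER_SECTION	4
#define TOY_NR_SECTIONS		4

struct toy_page {
	bool initialized;
};

static struct toy_page toy_mem_map[TOY_NR_SECTIONS * TOY_PAGES_PER_SECTION];
static bool toy_section_online[TOY_NR_SECTIONS];

/* Pages in sections that are not marked online stay hidden from walkers. */
static struct toy_page *toy_pfn_to_online_page(unsigned long pfn)
{
	unsigned long nr = pfn / TOY_PAGES_PER_SECTION;

	if (nr >= TOY_NR_SECTIONS || !toy_section_online[nr])
		return NULL;
	return &toy_mem_map[pfn];
}

static void toy_online_section(unsigned long nr)
{
	unsigned long i;

	assert(nr < TOY_NR_SECTIONS);
	/* Initialize every page first ... */
	for (i = 0; i < TOY_PAGES_PER_SECTION; i++)
		toy_mem_map[nr * TOY_PAGES_PER_SECTION + i].initialized = true;
	/* ... and publish the section as the last step. */
	toy_section_online[nr] = true;
}

static void toy_offline_section(unsigned long nr)
{
	unsigned long i;

	assert(nr < TOY_NR_SECTIONS);
	/* Hide the section first ... */
	toy_section_online[nr] = false;
	/* ... and only then tear down the pages. */
	for (i = 0; i < TOY_PAGES_PER_SECTION; i++)
		toy_mem_map[nr * TOY_PAGES_PER_SECTION + i].initialized = false;
}
```

In this model a walker that goes through toy_pfn_to_online_page() can never
get a page that is not initialized, in either direction.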

> +
>  	if (PageReserved(pfn_to_page(start_pfn)))
>  		for (i = 0; i < nr_pages; i++) {
>  			page = pfn_to_page(start_pfn + i);
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index c1670f090107..7e5151a7dd7b 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -1353,7 +1353,9 @@ struct page *__pageblock_pfn_to_page(unsigned long start_pfn,
>  	if (!pfn_valid(start_pfn) || !pfn_valid(end_pfn))
>  		return NULL;
>  
> -	start_page = pfn_to_page(start_pfn);
> +	start_page = pfn_to_online_page(start_pfn);
> +	if (!start_page)
> +		return NULL;
>  
>  	if (page_zone(start_page) != zone)
>  		return NULL;
> @@ -7671,6 +7673,7 @@ __offline_isolated_pages(unsigned long start_pfn, unsigned long end_pfn)
>  			break;
>  	if (pfn == end_pfn)
>  		return;
> +	offline_mem_sections(pfn, end_pfn);
>  	zone = page_zone(pfn_to_page(pfn));
>  	spin_lock_irqsave(&zone->lock, flags);
>  	pfn = start_pfn;
