linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v1 0/4] mm/memory_hotplug: online_pages() cleanups
@ 2019-08-09 12:56 David Hildenbrand
  2019-08-09 12:56 ` [PATCH v1 1/4] resource: Use PFN_UP / PFN_DOWN in walk_system_ram_range() David Hildenbrand
                   ` (3 more replies)
  0 siblings, 4 replies; 16+ messages in thread
From: David Hildenbrand @ 2019-08-09 12:56 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, David Hildenbrand, Andrew Morton, Arun KS,
	Bjorn Helgaas, Borislav Petkov, Dan Williams, Dave Hansen,
	Ingo Molnar, Michal Hocko, Nadav Amit, Oscar Salvador,
	Pavel Tatashin, Wei Yang

Some cleanups (+ one fix for a special case) in the context of
online_pages(). Hope I am not missing something obvious. Did a sanity test
with DIMMs only.

David Hildenbrand (4):
  resource: Use PFN_UP / PFN_DOWN in walk_system_ram_range()
  mm/memory_hotplug: Handle unaligned start and nr_pages in
    online_pages_blocks()
  mm/memory_hotplug: Simplify online_pages_range()
  mm/memory_hotplug: online_pages cannot be 0 in online_pages()

 kernel/resource.c   |  4 +--
 mm/memory_hotplug.c | 62 ++++++++++++++++++++-------------------------
 2 files changed, 30 insertions(+), 36 deletions(-)

-- 
2.21.0


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH v1 1/4] resource: Use PFN_UP / PFN_DOWN in walk_system_ram_range()
  2019-08-09 12:56 [PATCH v1 0/4] mm/memory_hotplug: online_pages() cleanups David Hildenbrand
@ 2019-08-09 12:56 ` David Hildenbrand
  2019-08-14 14:06   ` Michal Hocko
  2019-08-09 12:56 ` [PATCH v1 2/4] mm/memory_hotplug: Handle unaligned start and nr_pages in online_pages_blocks() David Hildenbrand
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 16+ messages in thread
From: David Hildenbrand @ 2019-08-09 12:56 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, David Hildenbrand, Dan Williams, Borislav Petkov,
	Andrew Morton, Bjorn Helgaas, Ingo Molnar, Dave Hansen,
	Nadav Amit, Wei Yang, Oscar Salvador

This makes it clearer that we will never call func() with duplicate PFNs
in case we have multiple sub-page memory resources. All unaligned parts
of PFNs are completely discarded.

Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Nadav Amit <namit@vmware.com>
Cc: Wei Yang <richardw.yang@linux.intel.com>
Cc: Oscar Salvador <osalvador@suse.de>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 kernel/resource.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/resource.c b/kernel/resource.c
index 7ea4306503c5..88ee39fa9103 100644
--- a/kernel/resource.c
+++ b/kernel/resource.c
@@ -487,8 +487,8 @@ int walk_system_ram_range(unsigned long start_pfn, unsigned long nr_pages,
 	while (start < end &&
 	       !find_next_iomem_res(start, end, flags, IORES_DESC_NONE,
 				    false, &res)) {
-		pfn = (res.start + PAGE_SIZE - 1) >> PAGE_SHIFT;
-		end_pfn = (res.end + 1) >> PAGE_SHIFT;
+		pfn = PFN_UP(res.start);
+		end_pfn = PFN_DOWN(res.end + 1);
 		if (end_pfn > pfn)
 			ret = (*func)(pfn, end_pfn - pfn, arg);
 		if (ret)
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v1 2/4] mm/memory_hotplug: Handle unaligned start and nr_pages in online_pages_blocks()
  2019-08-09 12:56 [PATCH v1 0/4] mm/memory_hotplug: online_pages() cleanups David Hildenbrand
  2019-08-09 12:56 ` [PATCH v1 1/4] resource: Use PFN_UP / PFN_DOWN in walk_system_ram_range() David Hildenbrand
@ 2019-08-09 12:56 ` David Hildenbrand
  2019-08-09 21:46   ` Andrew Morton
  2019-08-14 14:08   ` Michal Hocko
  2019-08-09 12:57 ` [PATCH v1 3/4] mm/memory_hotplug: Simplify online_pages_range() David Hildenbrand
  2019-08-09 12:57 ` [PATCH v1 4/4] mm/memory_hotplug: online_pages cannot be 0 in online_pages() David Hildenbrand
  3 siblings, 2 replies; 16+ messages in thread
From: David Hildenbrand @ 2019-08-09 12:56 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, David Hildenbrand, Arun KS, Andrew Morton,
	Oscar Salvador, Michal Hocko, Pavel Tatashin, Dan Williams

Take care of nr_pages not being a power of two and start not being
properly aligned. Essentially, what walk_system_ram_range() could provide
to us. get_order() will round-up in case it's not a power of two.

This should only apply to memory blocks that contain strange memory
resources (especially with holes), not to ordinary DIMMs.

Fixes: a9cd410a3d29 ("mm/page_alloc.c: memory hotplug: free pages as higher order")
Cc: Arun KS <arunks@codeaurora.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 mm/memory_hotplug.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 3706a137d880..2abd938c8c45 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -640,6 +640,10 @@ static int online_pages_blocks(unsigned long start, unsigned long nr_pages)
 	while (start < end) {
 		order = min(MAX_ORDER - 1,
 			get_order(PFN_PHYS(end) - PFN_PHYS(start)));
+		/* make sure the PFN is aligned and we don't exceed the range */
+		while (!IS_ALIGNED(start, 1ul << order) ||
+		       (1ul << order) > end - start)
+			order--;
 		(*online_page_callback)(pfn_to_page(start), order);
 
 		onlined_pages += (1UL << order);
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v1 3/4] mm/memory_hotplug: Simplify online_pages_range()
  2019-08-09 12:56 [PATCH v1 0/4] mm/memory_hotplug: online_pages() cleanups David Hildenbrand
  2019-08-09 12:56 ` [PATCH v1 1/4] resource: Use PFN_UP / PFN_DOWN in walk_system_ram_range() David Hildenbrand
  2019-08-09 12:56 ` [PATCH v1 2/4] mm/memory_hotplug: Handle unaligned start and nr_pages in online_pages_blocks() David Hildenbrand
@ 2019-08-09 12:57 ` David Hildenbrand
  2019-08-14 14:19   ` Michal Hocko
  2019-08-09 12:57 ` [PATCH v1 4/4] mm/memory_hotplug: online_pages cannot be 0 in online_pages() David Hildenbrand
  3 siblings, 1 reply; 16+ messages in thread
From: David Hildenbrand @ 2019-08-09 12:57 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, David Hildenbrand, Andrew Morton, Oscar Salvador,
	Michal Hocko, Pavel Tatashin, Dan Williams

move_pfn_range_to_zone() will set all pages to PG_reserved via
memmap_init_zone(). The only way a page could no longer be reserved
would be if a MEM_GOING_ONLINE notifier would clear PG_reserved - which
is not done (the online_page callback is used for that purpose by
e.g., Hyper-V instead). walk_system_ram_range() will never call
online_pages_range() with duplicate PFNs, so drop the PageReserved() check.

Simplify the handling, as online_pages always corresponds to nr_pages.
There is no need for online_pages_blocks() anymore.

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 mm/memory_hotplug.c | 42 ++++++++++++++++++------------------------
 1 file changed, 18 insertions(+), 24 deletions(-)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 2abd938c8c45..87f85597a19e 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -632,37 +632,31 @@ static void generic_online_page(struct page *page, unsigned int order)
 #endif
 }
 
-static int online_pages_blocks(unsigned long start, unsigned long nr_pages)
+static int online_pages_range(unsigned long start_pfn, unsigned long nr_pages,
+			void *arg)
 {
-	unsigned long end = start + nr_pages;
-	int order, onlined_pages = 0;
+	const unsigned long end_pfn = start_pfn + nr_pages;
+	unsigned long pfn;
+	int order;
 
-	while (start < end) {
-		order = min(MAX_ORDER - 1,
-			get_order(PFN_PHYS(end) - PFN_PHYS(start)));
+	/*
+	 * Online the pages. The callback might decide to keep some pages
+	 * PG_reserved (to add them to the buddy later), but we still account
+	 * them as being online/belonging to this zone ("present").
+	 */
+	for (pfn = start_pfn; pfn < end_pfn; pfn += 1ul << order) {
+		order = min(MAX_ORDER - 1, get_order(PFN_PHYS(end_pfn - pfn)));
 		/* make sure the PFN is aligned and we don't exceed the range */
-		while (!IS_ALIGNED(start, 1ul << order) ||
-		       (1ul << order) > end - start)
+		while (!IS_ALIGNED(start_pfn, 1ul << order) ||
+		       (1ul << order) > end_pfn - pfn)
 			order--;
-		(*online_page_callback)(pfn_to_page(start), order);
-
-		onlined_pages += (1UL << order);
-		start += (1UL << order);
+		(*online_page_callback)(pfn_to_page(pfn), order);
 	}
-	return onlined_pages;
-}
-
-static int online_pages_range(unsigned long start_pfn, unsigned long nr_pages,
-			void *arg)
-{
-	unsigned long onlined_pages = *(unsigned long *)arg;
-
-	if (PageReserved(pfn_to_page(start_pfn)))
-		onlined_pages += online_pages_blocks(start_pfn, nr_pages);
 
-	online_mem_sections(start_pfn, start_pfn + nr_pages);
+	/* mark all involved sections as online */
+	online_mem_sections(start_pfn, end_pfn);
 
-	*(unsigned long *)arg = onlined_pages;
+	*(unsigned long *)arg += nr_pages;
 	return 0;
 }
 
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v1 4/4] mm/memory_hotplug: online_pages cannot be 0 in online_pages()
  2019-08-09 12:56 [PATCH v1 0/4] mm/memory_hotplug: online_pages() cleanups David Hildenbrand
                   ` (2 preceding siblings ...)
  2019-08-09 12:57 ` [PATCH v1 3/4] mm/memory_hotplug: Simplify online_pages_range() David Hildenbrand
@ 2019-08-09 12:57 ` David Hildenbrand
  2019-08-14 14:26   ` Michal Hocko
  3 siblings, 1 reply; 16+ messages in thread
From: David Hildenbrand @ 2019-08-09 12:57 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, David Hildenbrand, Andrew Morton, Oscar Salvador,
	Michal Hocko, Pavel Tatashin, Dan Williams

walk_system_ram_range() will fail with -EINVAL in case
online_pages_range() was never called (== no resource applicable in the
range). Otherwise, we will always call online_pages_range() with
nr_pages > 0 and, therefore, have online_pages > 0.

Remove that special handling.

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 mm/memory_hotplug.c | 22 +++++++++-------------
 1 file changed, 9 insertions(+), 13 deletions(-)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 87f85597a19e..07e72fe17495 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -854,6 +854,7 @@ int __ref online_pages(unsigned long pfn, unsigned long nr_pages, int online_typ
 	ret = walk_system_ram_range(pfn, nr_pages, &onlined_pages,
 		online_pages_range);
 	if (ret) {
+		/* not a single memory resource was applicable */
 		if (need_zonelists_rebuild)
 			zone_pcp_reset(zone);
 		goto failed_addition;
@@ -867,27 +868,22 @@ int __ref online_pages(unsigned long pfn, unsigned long nr_pages, int online_typ
 
 	shuffle_zone(zone);
 
-	if (onlined_pages) {
-		node_states_set_node(nid, &arg);
-		if (need_zonelists_rebuild)
-			build_all_zonelists(NULL);
-		else
-			zone_pcp_update(zone);
-	}
+	node_states_set_node(nid, &arg);
+	if (need_zonelists_rebuild)
+		build_all_zonelists(NULL);
+	else
+		zone_pcp_update(zone);
 
 	init_per_zone_wmark_min();
 
-	if (onlined_pages) {
-		kswapd_run(nid);
-		kcompactd_run(nid);
-	}
+	kswapd_run(nid);
+	kcompactd_run(nid);
 
 	vm_total_pages = nr_free_pagecache_pages();
 
 	writeback_set_ratelimit();
 
-	if (onlined_pages)
-		memory_notify(MEM_ONLINE, &arg);
+	memory_notify(MEM_ONLINE, &arg);
 	mem_hotplug_done();
 	return 0;
 
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH v1 2/4] mm/memory_hotplug: Handle unaligned start and nr_pages in online_pages_blocks()
  2019-08-09 12:56 ` [PATCH v1 2/4] mm/memory_hotplug: Handle unaligned start and nr_pages in online_pages_blocks() David Hildenbrand
@ 2019-08-09 21:46   ` Andrew Morton
  2019-08-10  9:18     ` David Hildenbrand
  2019-08-14 14:08   ` Michal Hocko
  1 sibling, 1 reply; 16+ messages in thread
From: Andrew Morton @ 2019-08-09 21:46 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, linux-mm, Arun KS, Oscar Salvador, Michal Hocko,
	Pavel Tatashin, Dan Williams, Sasha Levin

On Fri,  9 Aug 2019 14:56:59 +0200 David Hildenbrand <david@redhat.com> wrote:

> Take care of nr_pages not being a power of two and start not being
> properly aligned. Essentially, what walk_system_ram_range() could provide
> to us. get_order() will round-up in case it's not a power of two.
> 
> This should only apply to memory blocks that contain strange memory
> resources (especially with holes), not to ordinary DIMMs.

I'm assuming this doesn't fix any known runtime problem and that a
-stable backport isn't needed.

> Fixes: a9cd410a3d29 ("mm/page_alloc.c: memory hotplug: free pages as higher order")

To that end, I replaced this with my new "Fixes-no-stable" in order to
discourage -stable maintainers from overriding our decision.

> Cc: Arun KS <arunks@codeaurora.org>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Oscar Salvador <osalvador@suse.de>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Signed-off-by: David Hildenbrand <david@redhat.com>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v1 2/4] mm/memory_hotplug: Handle unaligned start and nr_pages in online_pages_blocks()
  2019-08-09 21:46   ` Andrew Morton
@ 2019-08-10  9:18     ` David Hildenbrand
  0 siblings, 0 replies; 16+ messages in thread
From: David Hildenbrand @ 2019-08-10  9:18 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-kernel, linux-mm, Arun KS, Oscar Salvador, Michal Hocko,
	Pavel Tatashin, Dan Williams, Sasha Levin

On 09.08.19 23:46, Andrew Morton wrote:
> On Fri,  9 Aug 2019 14:56:59 +0200 David Hildenbrand <david@redhat.com> wrote:
> 
>> Take care of nr_pages not being a power of two and start not being
>> properly aligned. Essentially, what walk_system_ram_range() could provide
>> to us. get_order() will round-up in case it's not a power of two.
>>
>> This should only apply to memory blocks that contain strange memory
>> resources (especially with holes), not to ordinary DIMMs.
> 
> I'm assuming this doesn't fix any known runtime problem and that a
> -stable backport isn't needed.

Yeah, my understanding is that this would only apply when offlining and
re-onlining boot memory that contains such weird memory holes. I don't
think this is stable material.

Thanks!

> 
>> Fixes: a9cd410a3d29 ("mm/page_alloc.c: memory hotplug: free pages as higher order")
> 
> To that end, I replaced this with my new "Fixes-no-stable" in order to
> discourage -stable maintainers from overriding our decision.
> 
>> Cc: Arun KS <arunks@codeaurora.org>
>> Cc: Andrew Morton <akpm@linux-foundation.org>
>> Cc: Oscar Salvador <osalvador@suse.de>
>> Cc: Michal Hocko <mhocko@suse.com>
>> Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
>> Cc: Dan Williams <dan.j.williams@intel.com>
>> Signed-off-by: David Hildenbrand <david@redhat.com>


-- 

Thanks,

David / dhildenb

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v1 1/4] resource: Use PFN_UP / PFN_DOWN in walk_system_ram_range()
  2019-08-09 12:56 ` [PATCH v1 1/4] resource: Use PFN_UP / PFN_DOWN in walk_system_ram_range() David Hildenbrand
@ 2019-08-14 14:06   ` Michal Hocko
  0 siblings, 0 replies; 16+ messages in thread
From: Michal Hocko @ 2019-08-14 14:06 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, linux-mm, Dan Williams, Borislav Petkov,
	Andrew Morton, Bjorn Helgaas, Ingo Molnar, Dave Hansen,
	Nadav Amit, Wei Yang, Oscar Salvador

On Fri 09-08-19 14:56:58, David Hildenbrand wrote:
> This makes it clearer that we will never call func() with duplicate PFNs
> in case we have multiple sub-page memory resources. All unaligned parts
> of PFNs are completely discarded.
> 
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Borislav Petkov <bp@suse.de>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Bjorn Helgaas <bhelgaas@google.com>
> Cc: Ingo Molnar <mingo@kernel.org>
> Cc: Dave Hansen <dave.hansen@linux.intel.com>
> Cc: Nadav Amit <namit@vmware.com>
> Cc: Wei Yang <richardw.yang@linux.intel.com>
> Cc: Oscar Salvador <osalvador@suse.de>
> Signed-off-by: David Hildenbrand <david@redhat.com>

Acked-by: Michal Hocko <mhocko@suse.com>

> ---
>  kernel/resource.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/kernel/resource.c b/kernel/resource.c
> index 7ea4306503c5..88ee39fa9103 100644
> --- a/kernel/resource.c
> +++ b/kernel/resource.c
> @@ -487,8 +487,8 @@ int walk_system_ram_range(unsigned long start_pfn, unsigned long nr_pages,
>  	while (start < end &&
>  	       !find_next_iomem_res(start, end, flags, IORES_DESC_NONE,
>  				    false, &res)) {
> -		pfn = (res.start + PAGE_SIZE - 1) >> PAGE_SHIFT;
> -		end_pfn = (res.end + 1) >> PAGE_SHIFT;
> +		pfn = PFN_UP(res.start);
> +		end_pfn = PFN_DOWN(res.end + 1);
>  		if (end_pfn > pfn)
>  			ret = (*func)(pfn, end_pfn - pfn, arg);
>  		if (ret)
> -- 
> 2.21.0

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v1 2/4] mm/memory_hotplug: Handle unaligned start and nr_pages in online_pages_blocks()
  2019-08-09 12:56 ` [PATCH v1 2/4] mm/memory_hotplug: Handle unaligned start and nr_pages in online_pages_blocks() David Hildenbrand
  2019-08-09 21:46   ` Andrew Morton
@ 2019-08-14 14:08   ` Michal Hocko
  2019-08-14 14:17     ` David Hildenbrand
  1 sibling, 1 reply; 16+ messages in thread
From: Michal Hocko @ 2019-08-14 14:08 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, linux-mm, Arun KS, Andrew Morton, Oscar Salvador,
	Pavel Tatashin, Dan Williams

On Fri 09-08-19 14:56:59, David Hildenbrand wrote:
> Take care of nr_pages not being a power of two and start not being
> properly aligned. Essentially, what walk_system_ram_range() could provide
> to us. get_order() will round-up in case it's not a power of two.
> 
> This should only apply to memory blocks that contain strange memory
> resources (especially with holes), not to ordinary DIMMs.

I would really like to see an example of such setup before making the
code hard to read. Because I am not really sure something like that
exists at all.

> Fixes: a9cd410a3d29 ("mm/page_alloc.c: memory hotplug: free pages as higher order")
> Cc: Arun KS <arunks@codeaurora.org>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Oscar Salvador <osalvador@suse.de>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
>  mm/memory_hotplug.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index 3706a137d880..2abd938c8c45 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -640,6 +640,10 @@ static int online_pages_blocks(unsigned long start, unsigned long nr_pages)
>  	while (start < end) {
>  		order = min(MAX_ORDER - 1,
>  			get_order(PFN_PHYS(end) - PFN_PHYS(start)));
> +		/* make sure the PFN is aligned and we don't exceed the range */
> +		while (!IS_ALIGNED(start, 1ul << order) ||
> +		       (1ul << order) > end - start)
> +			order--;
>  		(*online_page_callback)(pfn_to_page(start), order);
>  
>  		onlined_pages += (1UL << order);
> -- 
> 2.21.0

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v1 2/4] mm/memory_hotplug: Handle unaligned start and nr_pages in online_pages_blocks()
  2019-08-14 14:08   ` Michal Hocko
@ 2019-08-14 14:17     ` David Hildenbrand
  2019-08-14 14:28       ` David Hildenbrand
  2019-08-14 14:28       ` Michal Hocko
  0 siblings, 2 replies; 16+ messages in thread
From: David Hildenbrand @ 2019-08-14 14:17 UTC (permalink / raw)
  To: Michal Hocko
  Cc: linux-kernel, linux-mm, Arun KS, Andrew Morton, Oscar Salvador,
	Pavel Tatashin, Dan Williams

On 14.08.19 16:08, Michal Hocko wrote:
> On Fri 09-08-19 14:56:59, David Hildenbrand wrote:
>> Take care of nr_pages not being a power of two and start not being
>> properly aligned. Essentially, what walk_system_ram_range() could provide
>> to us. get_order() will round-up in case it's not a power of two.
>>
>> This should only apply to memory blocks that contain strange memory
>> resources (especially with holes), not to ordinary DIMMs.
> 
> I would really like to see an example of such setup before making the
> code hard to read. Because I am not really sure something like that
> exists at all.

I don't have a real-live example at hand (founds this while exploring
the code), however, the linked commit changed it without stating why it
would be safe to do so.

In case you have an idea on how to make this code easier to read, please
let me know.

> 
>> Fixes: a9cd410a3d29 ("mm/page_alloc.c: memory hotplug: free pages as higher order")
>> Cc: Arun KS <arunks@codeaurora.org>
>> Cc: Andrew Morton <akpm@linux-foundation.org>
>> Cc: Oscar Salvador <osalvador@suse.de>
>> Cc: Michal Hocko <mhocko@suse.com>
>> Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
>> Cc: Dan Williams <dan.j.williams@intel.com>
>> Signed-off-by: David Hildenbrand <david@redhat.com>
>> ---
>>  mm/memory_hotplug.c | 4 ++++
>>  1 file changed, 4 insertions(+)
>>
>> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
>> index 3706a137d880..2abd938c8c45 100644
>> --- a/mm/memory_hotplug.c
>> +++ b/mm/memory_hotplug.c
>> @@ -640,6 +640,10 @@ static int online_pages_blocks(unsigned long start, unsigned long nr_pages)
>>  	while (start < end) {
>>  		order = min(MAX_ORDER - 1,
>>  			get_order(PFN_PHYS(end) - PFN_PHYS(start)));
>> +		/* make sure the PFN is aligned and we don't exceed the range */
>> +		while (!IS_ALIGNED(start, 1ul << order) ||
>> +		       (1ul << order) > end - start)
>> +			order--;
>>  		(*online_page_callback)(pfn_to_page(start), order);
>>  
>>  		onlined_pages += (1UL << order);
>> -- 
>> 2.21.0
> 


-- 

Thanks,

David / dhildenb

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v1 3/4] mm/memory_hotplug: Simplify online_pages_range()
  2019-08-09 12:57 ` [PATCH v1 3/4] mm/memory_hotplug: Simplify online_pages_range() David Hildenbrand
@ 2019-08-14 14:19   ` Michal Hocko
  2019-08-14 14:23     ` David Hildenbrand
  0 siblings, 1 reply; 16+ messages in thread
From: Michal Hocko @ 2019-08-14 14:19 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, linux-mm, Andrew Morton, Oscar Salvador,
	Pavel Tatashin, Dan Williams

On Fri 09-08-19 14:57:00, David Hildenbrand wrote:
> move_pfn_range_to_zone() will set all pages to PG_reserved via
> memmap_init_zone(). The only way a page could no longer be reserved
> would be if a MEM_GOING_ONLINE notifier would clear PG_reserved - which
> is not done (the online_page callback is used for that purpose by
> e.g., Hyper-V instead). walk_system_ram_range() will never call
> online_pages_range() with duplicate PFNs, so drop the PageReserved() check.
> 
> Simplify the handling, as online_pages always corresponds to nr_pages.
> There is no need for online_pages_blocks() anymore.

This would be easier to review if split up into two patches. One that
only performs cleanup without any other changes and the PageReserved
check. I like the check going away and we should get rid of the
dependency on the Reserved bit completely.

Other than that I find the start_pfn and pfn being used both for
iteration each a different way really confusing and I cannot convince
myself it is even correct because I didn't bother to look deeper as
I simply think that the order manipulation from the previous is just
making things worse at this moment. If the problem is even real then it
can be done on top instead with some real example of the memory layout
that breaks.

> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Oscar Salvador <osalvador@suse.de>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
>  mm/memory_hotplug.c | 42 ++++++++++++++++++------------------------
>  1 file changed, 18 insertions(+), 24 deletions(-)
> 
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index 2abd938c8c45..87f85597a19e 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -632,37 +632,31 @@ static void generic_online_page(struct page *page, unsigned int order)
>  #endif
>  }
>  
> -static int online_pages_blocks(unsigned long start, unsigned long nr_pages)
> +static int online_pages_range(unsigned long start_pfn, unsigned long nr_pages,
> +			void *arg)
>  {
> -	unsigned long end = start + nr_pages;
> -	int order, onlined_pages = 0;
> +	const unsigned long end_pfn = start_pfn + nr_pages;
> +	unsigned long pfn;
> +	int order;
>  
> -	while (start < end) {
> -		order = min(MAX_ORDER - 1,
> -			get_order(PFN_PHYS(end) - PFN_PHYS(start)));
> +	/*
> +	 * Online the pages. The callback might decide to keep some pages
> +	 * PG_reserved (to add them to the buddy later), but we still account
> +	 * them as being online/belonging to this zone ("present").
> +	 */
> +	for (pfn = start_pfn; pfn < end_pfn; pfn += 1ul << order) {
> +		order = min(MAX_ORDER - 1, get_order(PFN_PHYS(end_pfn - pfn)));
>  		/* make sure the PFN is aligned and we don't exceed the range */
> -		while (!IS_ALIGNED(start, 1ul << order) ||
> -		       (1ul << order) > end - start)
> +		while (!IS_ALIGNED(start_pfn, 1ul << order) ||
> +		       (1ul << order) > end_pfn - pfn)
>  			order--;
> -		(*online_page_callback)(pfn_to_page(start), order);
> -
> -		onlined_pages += (1UL << order);
> -		start += (1UL << order);
> +		(*online_page_callback)(pfn_to_page(pfn), order);
>  	}
> -	return onlined_pages;
> -}
> -
> -static int online_pages_range(unsigned long start_pfn, unsigned long nr_pages,
> -			void *arg)
> -{
> -	unsigned long onlined_pages = *(unsigned long *)arg;
> -
> -	if (PageReserved(pfn_to_page(start_pfn)))
> -		onlined_pages += online_pages_blocks(start_pfn, nr_pages);
>  
> -	online_mem_sections(start_pfn, start_pfn + nr_pages);
> +	/* mark all involved sections as online */
> +	online_mem_sections(start_pfn, end_pfn);
>  
> -	*(unsigned long *)arg = onlined_pages;
> +	*(unsigned long *)arg += nr_pages;
>  	return 0;
>  }
>  
> -- 
> 2.21.0

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v1 3/4] mm/memory_hotplug: Simplify online_pages_range()
  2019-08-14 14:19   ` Michal Hocko
@ 2019-08-14 14:23     ` David Hildenbrand
  0 siblings, 0 replies; 16+ messages in thread
From: David Hildenbrand @ 2019-08-14 14:23 UTC (permalink / raw)
  To: Michal Hocko
  Cc: linux-kernel, linux-mm, Andrew Morton, Oscar Salvador,
	Pavel Tatashin, Dan Williams

On 14.08.19 16:19, Michal Hocko wrote:
> On Fri 09-08-19 14:57:00, David Hildenbrand wrote:
>> move_pfn_range_to_zone() will set all pages to PG_reserved via
>> memmap_init_zone(). The only way a page could no longer be reserved
>> would be if a MEM_GOING_ONLINE notifier would clear PG_reserved - which
>> is not done (the online_page callback is used for that purpose by
>> e.g., Hyper-V instead). walk_system_ram_range() will never call
>> online_pages_range() with duplicate PFNs, so drop the PageReserved() check.
>>
>> Simplify the handling, as online_pages always corresponds to nr_pages.
>> There is no need for online_pages_blocks() anymore.
> 
> This would be easier to review if split up into two patches. One that
> only performs cleanup without any other changes and the PageReserved
> check. I like the check going away and we should get rid of the
> dependency on the Reserved bit completely.

Yes, makes sense.

> 
> Other than that I find the start_pfn and pfn being used both for
> iteration each a different way really confusing and I cannot convince
> myself it is even correct because I didn't bother to look deeper as
> I simply think that the order manipulation from the previous is just
> making things worse at this moment. If the problem is even real then it
> can be done on top instead with some real example of the memory layout
> that breaks.

Yes, I'll rework this. Only "pfn" should be used in the loop (it seems
to be correct but confusing I agree).

-- 

Thanks,

David / dhildenb

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v1 4/4] mm/memory_hotplug: online_pages cannot be 0 in online_pages()
  2019-08-09 12:57 ` [PATCH v1 4/4] mm/memory_hotplug: online_pages cannot be 0 in online_pages() David Hildenbrand
@ 2019-08-14 14:26   ` Michal Hocko
  0 siblings, 0 replies; 16+ messages in thread
From: Michal Hocko @ 2019-08-14 14:26 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, linux-mm, Andrew Morton, Oscar Salvador,
	Pavel Tatashin, Dan Williams

On Fri 09-08-19 14:57:01, David Hildenbrand wrote:
> walk_system_ram_range() will fail with -EINVAL in case
> online_pages_range() was never called (== no resource applicable in the
> range). Otherwise, we will always call online_pages_range() with
> nr_pages > 0 and, therefore, have online_pages > 0.

I have no idea why those checks where there TBH. Tried to dig out
commits which added them but didn't help.

> Remove that special handling.
> 
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Oscar Salvador <osalvador@suse.de>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Signed-off-by: David Hildenbrand <david@redhat.com>

Acked-by: Michal Hocko <mhocko@suse.com>

> ---
>  mm/memory_hotplug.c | 22 +++++++++-------------
>  1 file changed, 9 insertions(+), 13 deletions(-)
> 
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index 87f85597a19e..07e72fe17495 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -854,6 +854,7 @@ int __ref online_pages(unsigned long pfn, unsigned long nr_pages, int online_typ
>  	ret = walk_system_ram_range(pfn, nr_pages, &onlined_pages,
>  		online_pages_range);
>  	if (ret) {
> +		/* not a single memory resource was applicable */
>  		if (need_zonelists_rebuild)
>  			zone_pcp_reset(zone);
>  		goto failed_addition;
> @@ -867,27 +868,22 @@ int __ref online_pages(unsigned long pfn, unsigned long nr_pages, int online_typ
>  
>  	shuffle_zone(zone);
>  
> -	if (onlined_pages) {
> -		node_states_set_node(nid, &arg);
> -		if (need_zonelists_rebuild)
> -			build_all_zonelists(NULL);
> -		else
> -			zone_pcp_update(zone);
> -	}
> +	node_states_set_node(nid, &arg);
> +	if (need_zonelists_rebuild)
> +		build_all_zonelists(NULL);
> +	else
> +		zone_pcp_update(zone);
>  
>  	init_per_zone_wmark_min();
>  
> -	if (onlined_pages) {
> -		kswapd_run(nid);
> -		kcompactd_run(nid);
> -	}
> +	kswapd_run(nid);
> +	kcompactd_run(nid);
>  
>  	vm_total_pages = nr_free_pagecache_pages();
>  
>  	writeback_set_ratelimit();
>  
> -	if (onlined_pages)
> -		memory_notify(MEM_ONLINE, &arg);
> +	memory_notify(MEM_ONLINE, &arg);
>  	mem_hotplug_done();
>  	return 0;
>  
> -- 
> 2.21.0

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v1 2/4] mm/memory_hotplug: Handle unaligned start and nr_pages in online_pages_blocks()
  2019-08-14 14:17     ` David Hildenbrand
@ 2019-08-14 14:28       ` David Hildenbrand
  2019-08-14 16:02         ` Michal Hocko
  2019-08-14 14:28       ` Michal Hocko
  1 sibling, 1 reply; 16+ messages in thread
From: David Hildenbrand @ 2019-08-14 14:28 UTC (permalink / raw)
  To: Michal Hocko
  Cc: linux-kernel, linux-mm, Arun KS, Andrew Morton, Oscar Salvador,
	Pavel Tatashin, Dan Williams

On 14.08.19 16:17, David Hildenbrand wrote:
> On 14.08.19 16:08, Michal Hocko wrote:
>> On Fri 09-08-19 14:56:59, David Hildenbrand wrote:
>>> Take care of nr_pages not being a power of two and start not being
>>> properly aligned. Essentially, what walk_system_ram_range() could provide
>>> to us. get_order() will round-up in case it's not a power of two.
>>>
>>> This should only apply to memory blocks that contain strange memory
>>> resources (especially with holes), not to ordinary DIMMs.
>>
>> I would really like to see an example of such setup before making the
>> code hard to read. Because I am not really sure something like that
>> exists at all.
> 
> I don't have a real-live example at hand (founds this while exploring
> the code), however, the linked commit changed it without stating why it
> would be safe to do so.

So, while I agree that "not a power of two" is rare, are you sure we
will only have holes that are aligned to 4MB (especially on x86)?

-- 

Thanks,

David / dhildenb

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v1 2/4] mm/memory_hotplug: Handle unaligned start and nr_pages in online_pages_blocks()
  2019-08-14 14:17     ` David Hildenbrand
  2019-08-14 14:28       ` David Hildenbrand
@ 2019-08-14 14:28       ` Michal Hocko
  1 sibling, 0 replies; 16+ messages in thread
From: Michal Hocko @ 2019-08-14 14:28 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, linux-mm, Arun KS, Andrew Morton, Oscar Salvador,
	Pavel Tatashin, Dan Williams

On Wed 14-08-19 16:17:03, David Hildenbrand wrote:
> On 14.08.19 16:08, Michal Hocko wrote:
> > On Fri 09-08-19 14:56:59, David Hildenbrand wrote:
> >> Take care of nr_pages not being a power of two and start not being
> >> properly aligned. Essentially, what walk_system_ram_range() could provide
> >> to us. get_order() will round-up in case it's not a power of two.
> >>
> >> This should only apply to memory blocks that contain strange memory
> >> resources (especially with holes), not to ordinary DIMMs.
> > 
> > I would really like to see an example of such setup before making the
> > code hard to read. Because I am not really sure something like that
> > exists at all.
> 
> I don't have a real-live example at hand (founds this while exploring
> the code), however, the linked commit changed it without stating why it
> would be safe to do so.

Then just drop the change. It is making the code a real head scratcher,
and more so after the next patch. I am pretty sure that things would
easily and quickly blow up if we had ranges like that.
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v1 2/4] mm/memory_hotplug: Handle unaligned start and nr_pages in online_pages_blocks()
  2019-08-14 14:28       ` David Hildenbrand
@ 2019-08-14 16:02         ` Michal Hocko
  0 siblings, 0 replies; 16+ messages in thread
From: Michal Hocko @ 2019-08-14 16:02 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, linux-mm, Arun KS, Andrew Morton, Oscar Salvador,
	Pavel Tatashin, Dan Williams

On Wed 14-08-19 16:28:21, David Hildenbrand wrote:
> On 14.08.19 16:17, David Hildenbrand wrote:
> > On 14.08.19 16:08, Michal Hocko wrote:
> >> On Fri 09-08-19 14:56:59, David Hildenbrand wrote:
> >>> Take care of nr_pages not being a power of two and start not being
> >>> properly aligned. Essentially, what walk_system_ram_range() could provide
> >>> to us. get_order() will round-up in case it's not a power of two.
> >>>
> >>> This should only apply to memory blocks that contain strange memory
> >>> resources (especially with holes), not to ordinary DIMMs.
> >>
> >> I would really like to see an example of such setup before making the
> >> code hard to read. Because I am not really sure something like that
> >> exists at all.
> > 
> > I don't have a real-live example at hand (founds this while exploring
> > the code), however, the linked commit changed it without stating why it
> > would be safe to do so.
> 
> So, while I agree that "not a power of two" is rare, are you sure we
> will only have holes that are aligned to 4MB (especially on x86)?

No I am not saying that because I do not know. I am not aware of any
such HW but that doesn't really mean anything. But I would like to see
an example and have it archived in the changelog before we make the code
more complicated.
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2019-08-14 16:02 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-08-09 12:56 [PATCH v1 0/4] mm/memory_hotplug: online_pages() cleanups David Hildenbrand
2019-08-09 12:56 ` [PATCH v1 1/4] resource: Use PFN_UP / PFN_DOWN in walk_system_ram_range() David Hildenbrand
2019-08-14 14:06   ` Michal Hocko
2019-08-09 12:56 ` [PATCH v1 2/4] mm/memory_hotplug: Handle unaligned start and nr_pages in online_pages_blocks() David Hildenbrand
2019-08-09 21:46   ` Andrew Morton
2019-08-10  9:18     ` David Hildenbrand
2019-08-14 14:08   ` Michal Hocko
2019-08-14 14:17     ` David Hildenbrand
2019-08-14 14:28       ` David Hildenbrand
2019-08-14 16:02         ` Michal Hocko
2019-08-14 14:28       ` Michal Hocko
2019-08-09 12:57 ` [PATCH v1 3/4] mm/memory_hotplug: Simplify online_pages_range() David Hildenbrand
2019-08-14 14:19   ` Michal Hocko
2019-08-14 14:23     ` David Hildenbrand
2019-08-09 12:57 ` [PATCH v1 4/4] mm/memory_hotplug: online_pages cannot be 0 in online_pages() David Hildenbrand
2019-08-14 14:26   ` Michal Hocko

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).