* [PATCH v3 0/1] Skip over regions of invalid pfns with NUMA=n && HAVE_MEMBLOCK=y
From: Eugeniu Rosca @ 2018-01-24 14:35 UTC
  To: Matthew Wilcox, Andrew Morton, Michal Hocko, Catalin Marinas,
	Ard Biesheuvel, Steven Sistare, AKASHI Takahiro, Pavel Tatashin,
	Gioh Kim, Heiko Carstens, Wei Yang, Miles Chen, Vlastimil Babka,
	Mel Gorman, Johannes Weiner, Paul Burton, James Hartley
  Cc: Eugeniu Rosca, linux-kernel, linux-mm

Hello MM/kernel experts,

This cover letter provides some background and motivation for the
patch, although the description included in the patch itself should
already be fairly complete.

The context of this change is an effort to optimize the boot time of
the Rcar Gen3 SoC family, driven at its roots by automotive
requirements such as the (well-known?) "2-seconds-to-rear-view-camera"
rule.

To fulfill those requirements, we create a defconfig based on the
vanilla arm64 defconfig, which is then tailored to Rcar Gen3 SoC needs.
This allows us to reduce the kernel binary image size by almost 50%. We
are very picky during this cleanup process, to the point that, as
showcased by this patch, we have started submitting changes to the MM
core, where (to be honest) we don't have much expertise.

As mentioned in the description of the attached patch, disabling NUMA
in a v4.15-rc9 arm64 kernel shrinks the binary Image by ~64kB but, at
the same time, increases the H3ULCB boot time by ~140ms. This is
counterintuitive: by disabling NUMA we expect to drop unused NUMA
infrastructure and skip unneeded NUMA init, not to boot more slowly.

As explained in the attached patch, the slowdown happens because the
v4.11-rc1 commit b92df1de5d28 ("mm: page_alloc: skip over regions of
invalid pfns where possible") is guarded by
CONFIG_HAVE_MEMBLOCK_NODE_MAP, which on arm64 depends on NUMA:
$> git grep HAVE_MEMBLOCK_NODE_MAP | grep arm64
arch/arm64/Kconfig:     select HAVE_MEMBLOCK_NODE_MAP if NUMA
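
For reference, this is the pre-patch shape of the guarded skip inside
memmap_init_zone() (simplified; the full hunk appears in the patch):

		if (!early_pfn_valid(pfn)) {
#ifdef CONFIG_HAVE_MEMBLOCK_NODE_MAP
			/* without NODE_MAP this skip compiles out,
			 * so pfn advances one at a time */
			pfn = memblock_next_valid_pfn(pfn, end_pfn) - 1;
#endif
			continue;
		}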

The attached patch presents some evidence that the aforementioned
commit can speed up the execution of memmap_init_zone() not only on
arm64 NUMA machines, but also on arm64 non-NUMA ones. This is achieved
by relaxing the guard of memblock_next_valid_pfn() from
CONFIG_HAVE_MEMBLOCK_NODE_MAP to the more generic
CONFIG_HAVE_MEMBLOCK.
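
For kernels built without HAVE_MEMBLOCK, the patch adds a trivial
fallback stub (quoted from the include/linux/mm.h hunk below), so the
call site stays #ifdef-free:

static inline unsigned long memblock_next_valid_pfn(unsigned long pfn,
						    unsigned long max_pfn)
{
	/* no memblock data to consult: simply advance one pfn at a time */
	return pfn + 1;
}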

If this doesn't sound right, I would appreciate your feedback. I will
gladly help test any alternative proposals. Thanks in advance!

Best regards,
Eugeniu.

Changes v2->v3:
- Address review comments from Matthew Wilcox (get rid of the
  compile-time guards in mm/page_alloc.c).
- Rebase and re-measure the boot time improvement on v4.15-rc9

Changes v1->v2:
- Fix the ARCH=tile build error reported by the kbuild test robot
- Rebase and re-measure the boot time improvement on v4.15-rc8

Eugeniu Rosca (1):
  mm: page_alloc: skip over regions of invalid pfns on UMA

 include/linux/memblock.h | 1 -
 include/linux/mm.h       | 6 ++++++
 mm/memblock.c            | 2 ++
 mm/page_alloc.c          | 2 --
 4 files changed, 8 insertions(+), 3 deletions(-)

-- 
2.15.1

* [PATCH v3 1/1] mm: page_alloc: skip over regions of invalid pfns on UMA
From: Eugeniu Rosca @ 2018-01-24 14:35 UTC
  To: Matthew Wilcox, Andrew Morton, Michal Hocko, Catalin Marinas,
	Ard Biesheuvel, Steven Sistare, AKASHI Takahiro, Pavel Tatashin,
	Gioh Kim, Heiko Carstens, Wei Yang, Miles Chen, Vlastimil Babka,
	Mel Gorman, Johannes Weiner, Paul Burton, James Hartley
  Cc: Eugeniu Rosca, linux-kernel, linux-mm

Bisecting the v4.10..v4.11 commit range determined that commits [1]
and [2] are together responsible for a ~140ms early-startup
improvement on the Rcar-H3-ES20 arm64 platform.

Since the Rcar Gen3 family is not NUMA, we don't enable CONFIG_NUMA in
the rcar3 defconfig (which also shrinks the kernel binary image by
~64KB), but dropping it also forfeits the boot-time improvement.

This patch makes optimization [2] available on UMA systems that select
CONFIG_HAVE_MEMBLOCK.
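
For context, memblock_next_valid_pfn() (introduced by [2]) binary-
searches memblock.memory for the region holding the next pfn; its
v4.15 body in mm/memblock.c looks roughly like this (condensed):

unsigned long __init_memblock memblock_next_valid_pfn(unsigned long pfn,
						      unsigned long max_pfn)
{
	struct memblock_type *type = &memblock.memory;
	unsigned int right = type->cnt;
	unsigned int mid, left = 0;
	phys_addr_t addr = PFN_PHYS(++pfn);

	do {
		mid = (right + left) / 2;
		if (addr < type->regions[mid].base)
			right = mid;
		else if (addr >= (type->regions[mid].base +
				  type->regions[mid].size))
			left = mid + 1;
		else
			return pfn;	/* pfn lies within a memory region */
	} while (left < right);

	if (right == type->cnt)		/* past the last region */
		return max_pfn;
	return min(PHYS_PFN(type->regions[right].base), max_pfn);
}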

Testing this change on the Rcar H3-ES20-ULCB with a v4.15-rc9 kernel
and the vanilla arm64 defconfig + NUMA=n shows a speed-up of ~139ms
(from ~174ms [3] to ~35ms [4]) in the execution of memmap_init_zone().

No boot-time improvement is observed on the Apollo Lake SoC.

[1] commit 0f84832fb8f9 ("arm64: defconfig: Enable NUMA and NUMA_BALANCING")
[2] commit b92df1de5d28 ("mm: page_alloc: skip over regions of invalid pfns where possible")

[3] 174ms spent in memmap_init_zone() on H3ULCB w/o this patch (NUMA=n)
[    2.643685] On node 0 totalpages: 1015808
[    2.643688]   DMA zone: 3584 pages used for memmap
[    2.643691]   DMA zone: 0 pages reserved
[    2.643693]   DMA zone: 229376 pages, LIFO batch:31
[    2.643696] > memmap_init_zone
[    2.663628] < memmap_init_zone (19.932 ms)
[    2.663632]   Normal zone: 12288 pages used for memmap
[    2.663635]   Normal zone: 786432 pages, LIFO batch:31
[    2.663637] > memmap_init_zone
[    2.818012] < memmap_init_zone (154.375 ms)
[    2.818041] psci: probing for conduit method from DT.

[4] 35ms spent in memmap_init_zone() on H3ULCB with this patch (NUMA=n)
[    2.677202] On node 0 totalpages: 1015808
[    2.677205]   DMA zone: 3584 pages used for memmap
[    2.677208]   DMA zone: 0 pages reserved
[    2.677211]   DMA zone: 229376 pages, LIFO batch:31
[    2.677213] > memmap_init_zone
[    2.684378] < memmap_init_zone (7.165 ms)
[    2.684382]   Normal zone: 12288 pages used for memmap
[    2.684385]   Normal zone: 786432 pages, LIFO batch:31
[    2.684387] > memmap_init_zone
[    2.712556] < memmap_init_zone (28.169 ms)
[    2.712584] psci: probing for conduit method from DT.
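
For the record, the ">"/"<" markers in the logs above come from local
timing instrumentation and are not part of this patch. A hypothetical
sketch, wrapped around the existing memmap_init() call in
free_area_init_core() (size, nid, j and zone_start_pfn are the locals
already present there):

	u64 t0, t1;

	pr_info("> memmap_init_zone\n");
	t0 = sched_clock();		/* timestamps in ns */
	memmap_init(size, nid, j, zone_start_pfn);
	t1 = sched_clock();
	pr_info("< memmap_init_zone (%llu.%03llu ms)\n",
		(t1 - t0) / NSEC_PER_MSEC,
		((t1 - t0) % NSEC_PER_MSEC) / NSEC_PER_USEC);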

Signed-off-by: Eugeniu Rosca <erosca@de.adit-jv.com>
Reviewed-by: Matthew Wilcox <mawilcox@microsoft.com>
---
 include/linux/memblock.h | 1 -
 include/linux/mm.h       | 6 ++++++
 mm/memblock.c            | 2 ++
 mm/page_alloc.c          | 2 --
 4 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/include/linux/memblock.h b/include/linux/memblock.h
index 7ed0f7782d16..9efd592c5da4 100644
--- a/include/linux/memblock.h
+++ b/include/linux/memblock.h
@@ -187,7 +187,6 @@ int memblock_search_pfn_nid(unsigned long pfn, unsigned long *start_pfn,
 			    unsigned long  *end_pfn);
 void __next_mem_pfn_range(int *idx, int nid, unsigned long *out_start_pfn,
 			  unsigned long *out_end_pfn, int *out_nid);
-unsigned long memblock_next_valid_pfn(unsigned long pfn, unsigned long max_pfn);
 
 /**
  * for_each_mem_pfn_range - early memory pfn range iterator
diff --git a/include/linux/mm.h b/include/linux/mm.h
index ea818ff739cd..b82b30522585 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2064,8 +2064,14 @@ extern int __meminit __early_pfn_to_nid(unsigned long pfn,
 
 #ifdef CONFIG_HAVE_MEMBLOCK
 void zero_resv_unavail(void);
+unsigned long memblock_next_valid_pfn(unsigned long pfn, unsigned long max_pfn);
 #else
 static inline void zero_resv_unavail(void) {}
+static inline unsigned long memblock_next_valid_pfn(unsigned long pfn,
+						    unsigned long max_pfn)
+{
+	return pfn + 1;
+}
 #endif
 
 extern void set_dma_reserve(unsigned long new_dma_reserve);
diff --git a/mm/memblock.c b/mm/memblock.c
index 46aacdfa4f4d..ad48cf200e3b 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -1100,6 +1100,7 @@ void __init_memblock __next_mem_pfn_range(int *idx, int nid,
 	if (out_nid)
 		*out_nid = r->nid;
 }
+#endif /* CONFIG_HAVE_MEMBLOCK_NODE_MAP */
 
 unsigned long __init_memblock memblock_next_valid_pfn(unsigned long pfn,
 						      unsigned long max_pfn)
@@ -1129,6 +1130,7 @@ unsigned long __init_memblock memblock_next_valid_pfn(unsigned long pfn,
 		return min(PHYS_PFN(type->regions[right].base), max_pfn);
 }
 
+#ifdef CONFIG_HAVE_MEMBLOCK_NODE_MAP
 /**
  * memblock_set_node - set node ID on memblock regions
  * @base: base of area to set node ID for
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 76c9688b6a0a..4a3d5936a9a0 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5344,14 +5344,12 @@ void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone,
 			goto not_early;
 
 		if (!early_pfn_valid(pfn)) {
-#ifdef CONFIG_HAVE_MEMBLOCK_NODE_MAP
 			/*
 			 * Skip to the pfn preceding the next valid one (or
 			 * end_pfn), such that we hit a valid pfn (or end_pfn)
 			 * on our next iteration of the loop.
 			 */
 			pfn = memblock_next_valid_pfn(pfn, end_pfn) - 1;
-#endif
 			continue;
 		}
 		if (!early_pfn_in_nid(pfn, nid))
-- 
2.15.1

* Re: [PATCH v3 1/1] mm: page_alloc: skip over regions of invalid pfns on UMA
From: Pavel Tatashin @ 2018-01-24 16:27 UTC
  To: Eugeniu Rosca
  Cc: Matthew Wilcox, Andrew Morton, Michal Hocko, Catalin Marinas,
	Ard Biesheuvel, Steven Sistare, AKASHI Takahiro, Gioh Kim,
	Heiko Carstens, Wei Yang, Miles Chen, Vlastimil Babka,
	Mel Gorman, Johannes Weiner, Paul Burton, James Hartley,
	Linux Kernel Mailing List, Linux Memory Management List

Looks good.

Reviewed-by: Pavel Tatashin <pasha.tatashin@oracle.com>

On Wed, Jan 24, 2018 at 9:35 AM, Eugeniu Rosca <erosca@de.adit-jv.com> wrote:
> Bisecting the v4.10..v4.11 commit range determined that commits [1]
> and [2] are together responsible for a ~140ms early-startup
> improvement on the Rcar-H3-ES20 arm64 platform.
>
> Since the Rcar Gen3 family is not NUMA, we don't enable CONFIG_NUMA in
> the rcar3 defconfig (which also shrinks the kernel binary image by
> ~64KB), but dropping it also forfeits the boot-time improvement.
>
> This patch makes optimization [2] available on UMA systems that select
> CONFIG_HAVE_MEMBLOCK.
>
> Testing this change on the Rcar H3-ES20-ULCB with a v4.15-rc9 kernel
> and the vanilla arm64 defconfig + NUMA=n shows a speed-up of ~139ms
> (from ~174ms [3] to ~35ms [4]) in the execution of memmap_init_zone().
>
> No boot-time improvement is observed on the Apollo Lake SoC.
>
> [1] commit 0f84832fb8f9 ("arm64: defconfig: Enable NUMA and NUMA_BALANCING")
> [2] commit b92df1de5d28 ("mm: page_alloc: skip over regions of invalid pfns where possible")
>
> [3] 174ms spent in memmap_init_zone() on H3ULCB w/o this patch (NUMA=n)
> [    2.643685] On node 0 totalpages: 1015808
> [    2.643688]   DMA zone: 3584 pages used for memmap
> [    2.643691]   DMA zone: 0 pages reserved
> [    2.643693]   DMA zone: 229376 pages, LIFO batch:31
> [    2.643696] > memmap_init_zone
> [    2.663628] < memmap_init_zone (19.932 ms)
> [    2.663632]   Normal zone: 12288 pages used for memmap
> [    2.663635]   Normal zone: 786432 pages, LIFO batch:31
> [    2.663637] > memmap_init_zone
> [    2.818012] < memmap_init_zone (154.375 ms)
> [    2.818041] psci: probing for conduit method from DT.
>
> [4] 35ms spent in memmap_init_zone() on H3ULCB with this patch (NUMA=n)
> [    2.677202] On node 0 totalpages: 1015808
> [    2.677205]   DMA zone: 3584 pages used for memmap
> [    2.677208]   DMA zone: 0 pages reserved
> [    2.677211]   DMA zone: 229376 pages, LIFO batch:31
> [    2.677213] > memmap_init_zone
> [    2.684378] < memmap_init_zone (7.165 ms)
> [    2.684382]   Normal zone: 12288 pages used for memmap
> [    2.684385]   Normal zone: 786432 pages, LIFO batch:31
> [    2.684387] > memmap_init_zone
> [    2.712556] < memmap_init_zone (28.169 ms)
> [    2.712584] psci: probing for conduit method from DT.
>
> Signed-off-by: Eugeniu Rosca <erosca@de.adit-jv.com>
> Reviewed-by: Matthew Wilcox <mawilcox@microsoft.com>
> ---
>  include/linux/memblock.h | 1 -
>  include/linux/mm.h       | 6 ++++++
>  mm/memblock.c            | 2 ++
>  mm/page_alloc.c          | 2 --
>  4 files changed, 8 insertions(+), 3 deletions(-)
>
> diff --git a/include/linux/memblock.h b/include/linux/memblock.h
> index 7ed0f7782d16..9efd592c5da4 100644
> --- a/include/linux/memblock.h
> +++ b/include/linux/memblock.h
> @@ -187,7 +187,6 @@ int memblock_search_pfn_nid(unsigned long pfn, unsigned long *start_pfn,
>                             unsigned long  *end_pfn);
>  void __next_mem_pfn_range(int *idx, int nid, unsigned long *out_start_pfn,
>                           unsigned long *out_end_pfn, int *out_nid);
> -unsigned long memblock_next_valid_pfn(unsigned long pfn, unsigned long max_pfn);
>
>  /**
>   * for_each_mem_pfn_range - early memory pfn range iterator
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index ea818ff739cd..b82b30522585 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -2064,8 +2064,14 @@ extern int __meminit __early_pfn_to_nid(unsigned long pfn,
>
>  #ifdef CONFIG_HAVE_MEMBLOCK
>  void zero_resv_unavail(void);
> +unsigned long memblock_next_valid_pfn(unsigned long pfn, unsigned long max_pfn);
>  #else
>  static inline void zero_resv_unavail(void) {}
> +static inline unsigned long memblock_next_valid_pfn(unsigned long pfn,
> +                                                   unsigned long max_pfn)
> +{
> +       return pfn + 1;
> +}
>  #endif
>
>  extern void set_dma_reserve(unsigned long new_dma_reserve);
> diff --git a/mm/memblock.c b/mm/memblock.c
> index 46aacdfa4f4d..ad48cf200e3b 100644
> --- a/mm/memblock.c
> +++ b/mm/memblock.c
> @@ -1100,6 +1100,7 @@ void __init_memblock __next_mem_pfn_range(int *idx, int nid,
>         if (out_nid)
>                 *out_nid = r->nid;
>  }
> +#endif /* CONFIG_HAVE_MEMBLOCK_NODE_MAP */
>
>  unsigned long __init_memblock memblock_next_valid_pfn(unsigned long pfn,
>                                                       unsigned long max_pfn)
> @@ -1129,6 +1130,7 @@ unsigned long __init_memblock memblock_next_valid_pfn(unsigned long pfn,
>                 return min(PHYS_PFN(type->regions[right].base), max_pfn);
>  }
>
> +#ifdef CONFIG_HAVE_MEMBLOCK_NODE_MAP
>  /**
>   * memblock_set_node - set node ID on memblock regions
>   * @base: base of area to set node ID for
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 76c9688b6a0a..4a3d5936a9a0 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -5344,14 +5344,12 @@ void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone,
>                         goto not_early;
>
>                 if (!early_pfn_valid(pfn)) {
> -#ifdef CONFIG_HAVE_MEMBLOCK_NODE_MAP
>                         /*
>                          * Skip to the pfn preceding the next valid one (or
>                          * end_pfn), such that we hit a valid pfn (or end_pfn)
>                          * on our next iteration of the loop.
>                          */
>                         pfn = memblock_next_valid_pfn(pfn, end_pfn) - 1;
> -#endif
>                         continue;
>                 }
>                 if (!early_pfn_in_nid(pfn, nid))
> --
> 2.15.1
>

* Re: [PATCH v3 1/1] mm: page_alloc: skip over regions of invalid pfns on UMA
From: Eugeniu Rosca @ 2018-01-29 17:06 UTC
  To: Pavel Tatashin, Matthew Wilcox
  Cc: Andrew Morton, Michal Hocko, Catalin Marinas, Ard Biesheuvel,
	Steven Sistare, AKASHI Takahiro, Gioh Kim, Heiko Carstens,
	Wei Yang, Miles Chen, Vlastimil Babka, Mel Gorman,
	Johannes Weiner, Paul Burton, James Hartley, Eugeniu Rosca,
	Linux Kernel Mailing List, Linux Memory Management List

Hello Matthew, Pavel and MM people,

I am probably too impatient, but what would be the next step after
reaching [PATCH v3] and collecting two Reviewed-by tags?

Is there a chance this patch could go into v4.16-rc1, so that it poses
less risk for the whole cycle?

Thanks,
Eugeniu.

On Wed, Jan 24, 2018 at 11:27:47AM -0500, Pavel Tatashin wrote:
> Looks good.
> 
> Reviewed-by: Pavel Tatashin <pasha.tatashin@oracle.com>
> 
> On Wed, Jan 24, 2018 at 9:35 AM, Eugeniu Rosca <erosca@de.adit-jv.com> wrote:
> > Bisecting the v4.10..v4.11 commit range determined that commits [1]
> > and [2] are together responsible for a ~140ms early-startup
> > improvement on the Rcar-H3-ES20 arm64 platform.
> >
> > Since the Rcar Gen3 family is not NUMA, we don't enable CONFIG_NUMA in
> > the rcar3 defconfig (which also shrinks the kernel binary image by
> > ~64KB), but dropping it also forfeits the boot-time improvement.
> >
> > This patch makes optimization [2] available on UMA systems that select
> > CONFIG_HAVE_MEMBLOCK.
> >
> > Testing this change on the Rcar H3-ES20-ULCB with a v4.15-rc9 kernel
> > and the vanilla arm64 defconfig + NUMA=n shows a speed-up of ~139ms
> > (from ~174ms [3] to ~35ms [4]) in the execution of memmap_init_zone().
> >
> > No boot-time improvement is observed on the Apollo Lake SoC.
> >
> > [1] commit 0f84832fb8f9 ("arm64: defconfig: Enable NUMA and NUMA_BALANCING")
> > [2] commit b92df1de5d28 ("mm: page_alloc: skip over regions of invalid pfns where possible")
> >
> > [3] 174ms spent in memmap_init_zone() on H3ULCB w/o this patch (NUMA=n)
> > [    2.643685] On node 0 totalpages: 1015808
> > [    2.643688]   DMA zone: 3584 pages used for memmap
> > [    2.643691]   DMA zone: 0 pages reserved
> > [    2.643693]   DMA zone: 229376 pages, LIFO batch:31
> > [    2.643696] > memmap_init_zone
> > [    2.663628] < memmap_init_zone (19.932 ms)
> > [    2.663632]   Normal zone: 12288 pages used for memmap
> > [    2.663635]   Normal zone: 786432 pages, LIFO batch:31
> > [    2.663637] > memmap_init_zone
> > [    2.818012] < memmap_init_zone (154.375 ms)
> > [    2.818041] psci: probing for conduit method from DT.
> >
> > [4] 35ms spent in memmap_init_zone() on H3ULCB with this patch (NUMA=n)
> > [    2.677202] On node 0 totalpages: 1015808
> > [    2.677205]   DMA zone: 3584 pages used for memmap
> > [    2.677208]   DMA zone: 0 pages reserved
> > [    2.677211]   DMA zone: 229376 pages, LIFO batch:31
> > [    2.677213] > memmap_init_zone
> > [    2.684378] < memmap_init_zone (7.165 ms)
> > [    2.684382]   Normal zone: 12288 pages used for memmap
> > [    2.684385]   Normal zone: 786432 pages, LIFO batch:31
> > [    2.684387] > memmap_init_zone
> > [    2.712556] < memmap_init_zone (28.169 ms)
> > [    2.712584] psci: probing for conduit method from DT.
> >
> > Signed-off-by: Eugeniu Rosca <erosca@de.adit-jv.com>
> > Reviewed-by: Matthew Wilcox <mawilcox@microsoft.com>
> > ---
> >  include/linux/memblock.h | 1 -
> >  include/linux/mm.h       | 6 ++++++
> >  mm/memblock.c            | 2 ++
> >  mm/page_alloc.c          | 2 --
> >  4 files changed, 8 insertions(+), 3 deletions(-)
> >
> > diff --git a/include/linux/memblock.h b/include/linux/memblock.h
> > index 7ed0f7782d16..9efd592c5da4 100644
> > --- a/include/linux/memblock.h
> > +++ b/include/linux/memblock.h
> > @@ -187,7 +187,6 @@ int memblock_search_pfn_nid(unsigned long pfn, unsigned long *start_pfn,
> >                             unsigned long  *end_pfn);
> >  void __next_mem_pfn_range(int *idx, int nid, unsigned long *out_start_pfn,
> >                           unsigned long *out_end_pfn, int *out_nid);
> > -unsigned long memblock_next_valid_pfn(unsigned long pfn, unsigned long max_pfn);
> >
> >  /**
> >   * for_each_mem_pfn_range - early memory pfn range iterator
> > diff --git a/include/linux/mm.h b/include/linux/mm.h
> > index ea818ff739cd..b82b30522585 100644
> > --- a/include/linux/mm.h
> > +++ b/include/linux/mm.h
> > @@ -2064,8 +2064,14 @@ extern int __meminit __early_pfn_to_nid(unsigned long pfn,
> >
> >  #ifdef CONFIG_HAVE_MEMBLOCK
> >  void zero_resv_unavail(void);
> > +unsigned long memblock_next_valid_pfn(unsigned long pfn, unsigned long max_pfn);
> >  #else
> >  static inline void zero_resv_unavail(void) {}
> > +static inline unsigned long memblock_next_valid_pfn(unsigned long pfn,
> > +                                                   unsigned long max_pfn)
> > +{
> > +       return pfn + 1;
> > +}
> >  #endif
> >
> >  extern void set_dma_reserve(unsigned long new_dma_reserve);
> > diff --git a/mm/memblock.c b/mm/memblock.c
> > index 46aacdfa4f4d..ad48cf200e3b 100644
> > --- a/mm/memblock.c
> > +++ b/mm/memblock.c
> > @@ -1100,6 +1100,7 @@ void __init_memblock __next_mem_pfn_range(int *idx, int nid,
> >         if (out_nid)
> >                 *out_nid = r->nid;
> >  }
> > +#endif /* CONFIG_HAVE_MEMBLOCK_NODE_MAP */
> >
> >  unsigned long __init_memblock memblock_next_valid_pfn(unsigned long pfn,
> >                                                       unsigned long max_pfn)
> > @@ -1129,6 +1130,7 @@ unsigned long __init_memblock memblock_next_valid_pfn(unsigned long pfn,
> >                 return min(PHYS_PFN(type->regions[right].base), max_pfn);
> >  }
> >
> > +#ifdef CONFIG_HAVE_MEMBLOCK_NODE_MAP
> >  /**
> >   * memblock_set_node - set node ID on memblock regions
> >   * @base: base of area to set node ID for
> > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > index 76c9688b6a0a..4a3d5936a9a0 100644
> > --- a/mm/page_alloc.c
> > +++ b/mm/page_alloc.c
> > @@ -5344,14 +5344,12 @@ void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone,
> >                         goto not_early;
> >
> >                 if (!early_pfn_valid(pfn)) {
> > -#ifdef CONFIG_HAVE_MEMBLOCK_NODE_MAP
> >                         /*
> >                          * Skip to the pfn preceding the next valid one (or
> >                          * end_pfn), such that we hit a valid pfn (or end_pfn)
> >                          * on our next iteration of the loop.
> >                          */
> >                         pfn = memblock_next_valid_pfn(pfn, end_pfn) - 1;
> > -#endif
> >                         continue;
> >                 }
> >                 if (!early_pfn_in_nid(pfn, nid))
> > --
> > 2.15.1
> >

* Re: [PATCH v3 1/1] mm: page_alloc: skip over regions of invalid pfns on UMA
From: Michal Hocko @ 2018-01-29 18:47 UTC
  To: Eugeniu Rosca
  Cc: Matthew Wilcox, Andrew Morton, Catalin Marinas, Ard Biesheuvel,
	Steven Sistare, AKASHI Takahiro, Pavel Tatashin, Gioh Kim,
	Heiko Carstens, Wei Yang, Miles Chen, Vlastimil Babka,
	Mel Gorman, Johannes Weiner, Paul Burton, James Hartley,
	linux-kernel, linux-mm

On Wed 24-01-18 15:35:45, Eugeniu Rosca wrote:
[...]
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 76c9688b6a0a..4a3d5936a9a0 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -5344,14 +5344,12 @@ void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone,
>  			goto not_early;
>  
>  		if (!early_pfn_valid(pfn)) {
> -#ifdef CONFIG_HAVE_MEMBLOCK_NODE_MAP
>  			/*
>  			 * Skip to the pfn preceding the next valid one (or
>  			 * end_pfn), such that we hit a valid pfn (or end_pfn)
>  			 * on our next iteration of the loop.
>  			 */
>  			pfn = memblock_next_valid_pfn(pfn, end_pfn) - 1;
> -#endif
>  			continue;

Wouldn't it be simpler to have an ifdef CONFIG_HAVE_MEMBLOCK rather
than define memblock_next_valid_pfn for !HAVE_MEMBLOCK and then play
the (pfn + 1) - 1 games? I am usually against ifdefs in the code, but
avoiding them here would require a larger surgery on memmap_init_zone.
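
IOW, keep the call site guarded directly, something like (untested):

	if (!early_pfn_valid(pfn)) {
#ifdef CONFIG_HAVE_MEMBLOCK
		pfn = memblock_next_valid_pfn(pfn, end_pfn) - 1;
#endif
		continue;
	}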

To be completely honest, I would like to see HAVE_MEMBLOCK_NODE_MAP
gone.

Other than that, the patch looks sane to me.

>  		}
>  		if (!early_pfn_in_nid(pfn, nid))
> -- 
> 2.15.1
> 

-- 
Michal Hocko
SUSE Labs

* Re: [PATCH v3 1/1] mm: page_alloc: skip over regions of invalid pfns on UMA
From: Eugeniu Rosca @ 2018-02-03 12:24 UTC
  To: Michal Hocko
  Cc: Matthew Wilcox, Andrew Morton, Catalin Marinas, Ard Biesheuvel,
	Steven Sistare, AKASHI Takahiro, Pavel Tatashin, Gioh Kim,
	Heiko Carstens, Wei Yang, Miles Chen, Vlastimil Babka,
	Mel Gorman, Johannes Weiner, Paul Burton, James Hartley,
	linux-kernel, linux-mm, Eugeniu Rosca

Hello Michal,

On Mon, Jan 29, 2018 at 07:47:46PM +0100, Michal Hocko wrote:
> On Wed 24-01-18 15:35:45, Eugeniu Rosca wrote:
> [...]
> > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > index 76c9688b6a0a..4a3d5936a9a0 100644
> > --- a/mm/page_alloc.c
> > +++ b/mm/page_alloc.c
> > @@ -5344,14 +5344,12 @@ void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone,
> >  			goto not_early;
> >  
> >  		if (!early_pfn_valid(pfn)) {
> > -#ifdef CONFIG_HAVE_MEMBLOCK_NODE_MAP
> >  			/*
> >  			 * Skip to the pfn preceding the next valid one (or
> >  			 * end_pfn), such that we hit a valid pfn (or end_pfn)
> >  			 * on our next iteration of the loop.
> >  			 */
> >  			pfn = memblock_next_valid_pfn(pfn, end_pfn) - 1;
> > -#endif
> >  			continue;
> 
> Wouldn't it be simpler to have an ifdef CONFIG_HAVE_MEMBLOCK rather
> than define memblock_next_valid_pfn for !HAVE_MEMBLOCK and then play
> the (pfn + 1) - 1 games?

This sounds to me like you prefer the earlier v2 of this patch.

To my understanding, the difference between v2 and v3 only matters for
architectures which don't select CONFIG_HAVE_MEMBLOCK. One of them is
ARCH=tile (for which the kbuild test robot reported a compile failure
against v1 of this patch). Out of curiosity, I compiled mm/page_alloc.o
for ARCH=tile using [PATCH v2] and [PATCH v3], then disassembled the
objects using `objdump -D` and compared the results.

The disassembled versions of memmap_init_zone() match exactly. To me,
this means the real difference between v2 and v3 is code readability.
That is an important aspect too, but I must admit I don't have a strong
opinion here; I expect this to be arbitrated by the MM developers and
maintainers.

For the record, to achieve the same boot-time improvement, the
alternatives on our side are:
- enable CONFIG_NUMA, just to benefit from the ~140ms boot-time
  speedup. Besides increasing the kernel image size, this also adds
  annoying "Numa node 0:" noise to backtrace and stackdump output,
  which is meaningless on a non-NUMA system.
- ship the patch to our customers despite it not being accepted by the
  MM community. This is a risky and unhealthy path, which we don't like.

That said, I really hope this won't be the last comment in the thread
and that suggestions on how to move forward will follow.

Thank you,
Eugeniu.

> I am usually against ifdefs in the code, but avoiding them here
> would require a larger surgery on memmap_init_zone.
> 
> To be completely honest, I would like to see HAVE_MEMBLOCK_NODE_MAP
> gone.
> 
> Other than that, the patch looks sane to me.
> 
> >  		}
> >  		if (!early_pfn_in_nid(pfn, nid))
> > -- 
> > 2.15.1
> > 
> 
> -- 
> Michal Hocko
> SUSE Labs

* Re: [PATCH v3 1/1] mm: page_alloc: skip over regions of invalid pfns on UMA
From: Eugeniu Rosca @ 2018-02-09 0:12 UTC
  To: Michal Hocko
  Cc: Matthew Wilcox, Andrew Morton, Catalin Marinas, Ard Biesheuvel,
	Steven Sistare, AKASHI Takahiro, Pavel Tatashin, Gioh Kim,
	Heiko Carstens, Wei Yang, Miles Chen, Vlastimil Babka,
	Mel Gorman, Johannes Weiner, Paul Burton, James Hartley,
	linux-kernel, linux-mm, Eugeniu Rosca

Hello linux-mm community,

Each of you probably has more important tasks queued up than the boot
optimization discussed in this thread.

Still, this patch matters to our organization, our community and our
suppliers, so I hope we can move it forward somehow, especially since
it has received positive feedback from Matthew Wilcox and Pavel
Tatashin. Michal Hocko suggested that a previous version of the patch
might be more appropriate due to its simplicity and slightly better
readability.

Many thanks for your participation so far.

Best regards,
Eugeniu.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v3 1/1] mm: page_alloc: skip over regions of invalid pfns on UMA
  2018-02-03 12:24       ` Eugeniu Rosca
@ 2018-02-12 15:03         ` Michal Hocko
  -1 siblings, 0 replies; 24+ messages in thread
From: Michal Hocko @ 2018-02-12 15:03 UTC (permalink / raw)
  To: Eugeniu Rosca
  Cc: Matthew Wilcox, Andrew Morton, Catalin Marinas, Ard Biesheuvel,
	Steven Sistare, AKASHI Takahiro, Pavel Tatashin, Gioh Kim,
	Heiko Carstens, Wei Yang, Miles Chen, Vlastimil Babka,
	Mel Gorman, Johannes Weiner, Paul Burton, James Hartley,
	linux-kernel, linux-mm

On Sat 03-02-18 13:24:22, Eugeniu Rosca wrote:
[...]
> That said, I really hope this won't be the last comment in the thread
> and appropriate suggestions will come on how to go forward.

Just to make sure we are on the same page. I was suggesting the
following. The patch is slightly larger just because I move
memblock_next_valid_pfn around which I find better than sprinkling
ifdefs around. Please note I haven't tried to compile test this.

diff --git a/include/linux/memblock.h b/include/linux/memblock.h
index 8be5077efb5f..0d3cb4c70858 100644
--- a/include/linux/memblock.h
+++ b/include/linux/memblock.h
@@ -187,7 +187,6 @@ int memblock_search_pfn_nid(unsigned long pfn, unsigned long *start_pfn,
 			    unsigned long  *end_pfn);
 void __next_mem_pfn_range(int *idx, int nid, unsigned long *out_start_pfn,
 			  unsigned long *out_end_pfn, int *out_nid);
-unsigned long memblock_next_valid_pfn(unsigned long pfn, unsigned long max_pfn);
 
 /**
  * for_each_mem_pfn_range - early memory pfn range iterator
@@ -204,6 +203,8 @@ unsigned long memblock_next_valid_pfn(unsigned long pfn, unsigned long max_pfn);
 	     i >= 0; __next_mem_pfn_range(&i, nid, p_start, p_end, p_nid))
 #endif /* CONFIG_HAVE_MEMBLOCK_NODE_MAP */
 
+unsigned long memblock_next_valid_pfn(unsigned long pfn, unsigned long max_pfn);
+
 /**
  * for_each_free_mem_range - iterate through free memblock areas
  * @i: u64 used as loop variable
diff --git a/mm/memblock.c b/mm/memblock.c
index 5a9ca2a1751b..8a627d4fa5b2 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -1101,34 +1101,6 @@ void __init_memblock __next_mem_pfn_range(int *idx, int nid,
 		*out_nid = r->nid;
 }
 
-unsigned long __init_memblock memblock_next_valid_pfn(unsigned long pfn,
-						      unsigned long max_pfn)
-{
-	struct memblock_type *type = &memblock.memory;
-	unsigned int right = type->cnt;
-	unsigned int mid, left = 0;
-	phys_addr_t addr = PFN_PHYS(pfn + 1);
-
-	do {
-		mid = (right + left) / 2;
-
-		if (addr < type->regions[mid].base)
-			right = mid;
-		else if (addr >= (type->regions[mid].base +
-				  type->regions[mid].size))
-			left = mid + 1;
-		else {
-			/* addr is within the region, so pfn + 1 is valid */
-			return min(pfn + 1, max_pfn);
-		}
-	} while (left < right);
-
-	if (right == type->cnt)
-		return max_pfn;
-	else
-		return min(PHYS_PFN(type->regions[right].base), max_pfn);
-}
-
 /**
  * memblock_set_node - set node ID on memblock regions
  * @base: base of area to set node ID for
@@ -1160,6 +1132,34 @@ int __init_memblock memblock_set_node(phys_addr_t base, phys_addr_t size,
 }
 #endif /* CONFIG_HAVE_MEMBLOCK_NODE_MAP */
 
+unsigned long __init_memblock memblock_next_valid_pfn(unsigned long pfn,
+						      unsigned long max_pfn)
+{
+	struct memblock_type *type = &memblock.memory;
+	unsigned int right = type->cnt;
+	unsigned int mid, left = 0;
+	phys_addr_t addr = PFN_PHYS(pfn + 1);
+
+	do {
+		mid = (right + left) / 2;
+
+		if (addr < type->regions[mid].base)
+			right = mid;
+		else if (addr >= (type->regions[mid].base +
+				  type->regions[mid].size))
+			left = mid + 1;
+		else {
+			/* addr is within the region, so pfn + 1 is valid */
+			return min(pfn + 1, max_pfn);
+		}
+	} while (left < right);
+
+	if (right == type->cnt)
+		return max_pfn;
+	else
+		return min(PHYS_PFN(type->regions[right].base), max_pfn);
+}
+
 static phys_addr_t __init memblock_alloc_range_nid(phys_addr_t size,
 					phys_addr_t align, phys_addr_t start,
 					phys_addr_t end, int nid, ulong flags)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index e2b42f603b1a..b7968fd5736f 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5352,14 +5352,13 @@ void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone,
 			goto not_early;
 
 		if (!early_pfn_valid(pfn)) {
-#ifdef CONFIG_HAVE_MEMBLOCK_NODE_MAP
 			/*
 			 * Skip to the pfn preceding the next valid one (or
 			 * end_pfn), such that we hit a valid pfn (or end_pfn)
 			 * on our next iteration of the loop.
 			 */
-			pfn = memblock_next_valid_pfn(pfn, end_pfn) - 1;
-#endif
+			if IS_ENABLED(HAVE_MEMBLOCK)
+				pfn = memblock_next_valid_pfn(pfn, end_pfn) - 1;
 			continue;
 		}
 		if (!early_pfn_in_nid(pfn, nid))
-- 
Michal Hocko
SUSE Labs
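
As an aside for readers, the binary search above is easy to
sanity-check in isolation. Below is a self-contained userspace toy
version of it (the region layout, the 4 KiB page size and the helper
names are hypothetical, chosen only for this example):

	#include <stdio.h>

	#define PAGE_SHIFT	12
	#define PFN_PHYS(pfn)	((unsigned long long)(pfn) << PAGE_SHIFT)
	#define PHYS_PFN(addr)	((unsigned long)((addr) >> PAGE_SHIFT))
	#define MIN(a, b)	((a) < (b) ? (a) : (b))

	struct region { unsigned long long base, size; };

	/* Same search as memblock_next_valid_pfn(), over a plain array. */
	static unsigned long next_valid_pfn(const struct region *r,
					    unsigned int cnt,
					    unsigned long pfn,
					    unsigned long max_pfn)
	{
		unsigned int mid, left = 0, right = cnt;
		unsigned long long addr = PFN_PHYS(pfn + 1);

		do {
			mid = (right + left) / 2;
			if (addr < r[mid].base)
				right = mid;
			else if (addr >= r[mid].base + r[mid].size)
				left = mid + 1;
			else	/* addr falls inside a region */
				return MIN(pfn + 1, max_pfn);
		} while (left < right);

		if (right == cnt)
			return max_pfn;
		return MIN(PHYS_PFN(r[right].base), max_pfn);
	}

	int main(void)
	{
		/* Two 2 MiB banks with a hole in between. */
		struct region mem[] = {
			{ 0x48000000ULL, 0x200000ULL },
			{ 0x60000000ULL, 0x200000ULL },
		};
		unsigned long max_pfn = PHYS_PFN(0x70000000ULL);

		/* Inside a bank: advances by exactly one pfn. */
		printf("%#lx\n", next_valid_pfn(mem, 2, 0x48000, max_pfn));
		/* In the hole: jumps straight to the next bank's first pfn. */
		printf("%#lx\n", next_valid_pfn(mem, 2, 0x48200, max_pfn));
		/* Past the last bank: clamps to max_pfn. */
		printf("%#lx\n", next_valid_pfn(mem, 2, 0x60200, max_pfn));
		return 0;
	}

On this layout the calls print 0x48001, 0x60000 and 0x70000
respectively, which is the whole point of the function: holes are
crossed in O(log n) region lookups instead of one loop iteration per
invalid pfn.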

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH v3 1/1] mm: page_alloc: skip over regions of invalid pfns on UMA
  2018-02-12 15:03         ` Michal Hocko
@ 2018-02-12 16:16           ` Eugeniu Rosca
  -1 siblings, 0 replies; 24+ messages in thread
From: Eugeniu Rosca @ 2018-02-12 16:16 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Matthew Wilcox, Andrew Morton, Catalin Marinas, Ard Biesheuvel,
	Steven Sistare, AKASHI Takahiro, Pavel Tatashin, Gioh Kim,
	Heiko Carstens, Wei Yang, Miles Chen, Vlastimil Babka,
	Mel Gorman, Johannes Weiner, Paul Burton, Eugeniu Rosca,
	James Hartley, linux-kernel, linux-mm

Hi Michal,

On Mon, Feb 12, 2018 at 04:03:14PM +0100, Michal Hocko wrote:
> On Sat 03-02-18 13:24:22, Eugeniu Rosca wrote:
> [...]
> > That said, I really hope this won't be the last comment in the thread
> > and appropriate suggestions will come on how to go forward.
> 
> Just to make sure we are on the same page. I was suggesting the
> following. The patch is slightly larger just because I move
> memblock_next_valid_pfn around which I find better than sprinkling
> ifdefs around. Please note I haven't tried to compile test this.

I got your point, so I was wrong: you are not expressing a preference
for v2 of this patch, but suggesting a new variant of it. For the
record, I've also build/boot-tested your variant with no issues. The
reason I did not make it my favorite is to let reviewers concentrate on
the actual essence of this change, i.e. relaxing the dependency of
memblock_next_valid_pfn() from HAVE_MEMBLOCK_NODE_MAP (which depends on
NUMA) to HAVE_MEMBLOCK (which doesn't).

As I've said in a previous reply, I am open-minded about which variant
the MM people select, since, from my point of view, they all do the
same thing with varying degrees of code readability.

For me it's not a problem to submit a new patch. I guess that a
prerequisite for this is to reach some agreement on what people think is
the best option, which I feel didn't occur yet.

Best regards,
Eugeniu.

> 
> diff --git a/include/linux/memblock.h b/include/linux/memblock.h
> index 8be5077efb5f..0d3cb4c70858 100644
> --- a/include/linux/memblock.h
> +++ b/include/linux/memblock.h
> @@ -187,7 +187,6 @@ int memblock_search_pfn_nid(unsigned long pfn, unsigned long *start_pfn,
>  			    unsigned long  *end_pfn);
>  void __next_mem_pfn_range(int *idx, int nid, unsigned long *out_start_pfn,
>  			  unsigned long *out_end_pfn, int *out_nid);
> -unsigned long memblock_next_valid_pfn(unsigned long pfn, unsigned long max_pfn);
>  
>  /**
>   * for_each_mem_pfn_range - early memory pfn range iterator
> @@ -204,6 +203,8 @@ unsigned long memblock_next_valid_pfn(unsigned long pfn, unsigned long max_pfn);
>  	     i >= 0; __next_mem_pfn_range(&i, nid, p_start, p_end, p_nid))
>  #endif /* CONFIG_HAVE_MEMBLOCK_NODE_MAP */
>  
> +unsigned long memblock_next_valid_pfn(unsigned long pfn, unsigned long max_pfn);
> +
>  /**
>   * for_each_free_mem_range - iterate through free memblock areas
>   * @i: u64 used as loop variable
> diff --git a/mm/memblock.c b/mm/memblock.c
> index 5a9ca2a1751b..8a627d4fa5b2 100644
> --- a/mm/memblock.c
> +++ b/mm/memblock.c
> @@ -1101,34 +1101,6 @@ void __init_memblock __next_mem_pfn_range(int *idx, int nid,
>  		*out_nid = r->nid;
>  }
>  
> -unsigned long __init_memblock memblock_next_valid_pfn(unsigned long pfn,
> -						      unsigned long max_pfn)
> -{
> -	struct memblock_type *type = &memblock.memory;
> -	unsigned int right = type->cnt;
> -	unsigned int mid, left = 0;
> -	phys_addr_t addr = PFN_PHYS(pfn + 1);
> -
> -	do {
> -		mid = (right + left) / 2;
> -
> -		if (addr < type->regions[mid].base)
> -			right = mid;
> -		else if (addr >= (type->regions[mid].base +
> -				  type->regions[mid].size))
> -			left = mid + 1;
> -		else {
> -			/* addr is within the region, so pfn + 1 is valid */
> -			return min(pfn + 1, max_pfn);
> -		}
> -	} while (left < right);
> -
> -	if (right == type->cnt)
> -		return max_pfn;
> -	else
> -		return min(PHYS_PFN(type->regions[right].base), max_pfn);
> -}
> -
>  /**
>   * memblock_set_node - set node ID on memblock regions
>   * @base: base of area to set node ID for
> @@ -1160,6 +1132,34 @@ int __init_memblock memblock_set_node(phys_addr_t base, phys_addr_t size,
>  }
>  #endif /* CONFIG_HAVE_MEMBLOCK_NODE_MAP */
>  
> +unsigned long __init_memblock memblock_next_valid_pfn(unsigned long pfn,
> +						      unsigned long max_pfn)
> +{
> +	struct memblock_type *type = &memblock.memory;
> +	unsigned int right = type->cnt;
> +	unsigned int mid, left = 0;
> +	phys_addr_t addr = PFN_PHYS(pfn + 1);
> +
> +	do {
> +		mid = (right + left) / 2;
> +
> +		if (addr < type->regions[mid].base)
> +			right = mid;
> +		else if (addr >= (type->regions[mid].base +
> +				  type->regions[mid].size))
> +			left = mid + 1;
> +		else {
> +			/* addr is within the region, so pfn + 1 is valid */
> +			return min(pfn + 1, max_pfn);
> +		}
> +	} while (left < right);
> +
> +	if (right == type->cnt)
> +		return max_pfn;
> +	else
> +		return min(PHYS_PFN(type->regions[right].base), max_pfn);
> +}
> +
>  static phys_addr_t __init memblock_alloc_range_nid(phys_addr_t size,
>  					phys_addr_t align, phys_addr_t start,
>  					phys_addr_t end, int nid, ulong flags)
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index e2b42f603b1a..b7968fd5736f 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -5352,14 +5352,13 @@ void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone,
>  			goto not_early;
>  
>  		if (!early_pfn_valid(pfn)) {
> -#ifdef CONFIG_HAVE_MEMBLOCK_NODE_MAP
>  			/*
>  			 * Skip to the pfn preceding the next valid one (or
>  			 * end_pfn), such that we hit a valid pfn (or end_pfn)
>  			 * on our next iteration of the loop.
>  			 */
> -			pfn = memblock_next_valid_pfn(pfn, end_pfn) - 1;
> -#endif
> +			if IS_ENABLED(HAVE_MEMBLOCK)
> +				pfn = memblock_next_valid_pfn(pfn, end_pfn) - 1;
>  			continue;
>  		}
>  		if (!early_pfn_in_nid(pfn, nid))
> -- 
> Michal Hocko
> SUSE Labs

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v3 1/1] mm: page_alloc: skip over regions of invalid pfns on UMA
  2018-02-12 16:16           ` Eugeniu Rosca
@ 2018-02-12 18:47             ` Michal Hocko
  -1 siblings, 0 replies; 24+ messages in thread
From: Michal Hocko @ 2018-02-12 18:47 UTC (permalink / raw)
  To: Eugeniu Rosca
  Cc: Matthew Wilcox, Andrew Morton, Catalin Marinas, Ard Biesheuvel,
	Steven Sistare, AKASHI Takahiro, Pavel Tatashin, Gioh Kim,
	Heiko Carstens, Wei Yang, Miles Chen, Vlastimil Babka,
	Mel Gorman, Johannes Weiner, Paul Burton, James Hartley,
	linux-kernel, linux-mm

On Mon 12-02-18 17:16:40, Eugeniu Rosca wrote:
> Hi Michal,
> 
> On Mon, Feb 12, 2018 at 04:03:14PM +0100, Michal Hocko wrote:
> > On Sat 03-02-18 13:24:22, Eugeniu Rosca wrote:
> > [...]
> > > That said, I really hope this won't be the last comment in the thread
> > > and appropriate suggestions will come on how to go forward.
> > 
> > Just to make sure we are on the same page. I was suggesting the
> > following. The patch is slightly larger just because I move
> > memblock_next_valid_pfn around which I find better than sprinkling
> > ifdefs around. Please note I haven't tried to compile test this.
> 
> I got your point, so I was wrong: you are not expressing a preference
> for v2 of this patch, but suggesting a new variant of it. For the
> record, I've also build/boot-tested your variant with no issues. The
> reason I did not make it my favorite is to let reviewers concentrate on
> the actual essence of this change, i.e. relaxing the dependency of
> memblock_next_valid_pfn() from HAVE_MEMBLOCK_NODE_MAP (which depends on
> NUMA) to HAVE_MEMBLOCK (which doesn't).

Yes, and that makes perfect sense.

> As I've said in a previous reply, I am open-minded about which variant
> the MM people select, since, from my point of view, they all do the
> same thing with varying degrees of code readability.

Agreed. I just wanted to avoid the need to define
memblock_next_valid_pfn for !CONFIG_HAVE_MEMBLOCK. The IS_ENABLED check
also nicely hides the ifdefery, and I prefer more compact ifdef blocks
over smaller ones split up by other functions.
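
For readers unfamiliar with IS_ENABLED(): Kconfig emits
"#define CONFIG_FOO 1" for =y (and CONFIG_FOO_MODULE for =m) and
nothing for =n, and the macro collapses that into a constant 0 or 1, so
the guarded branch is still parsed and type-checked, then trivially
eliminated by the compiler. A simplified, self-contained model follows
(the real macros live in include/linux/kconfig.h and also cover =m;
MY_IS_ENABLED is a made-up name for this sketch):

	#define __ARG_PLACEHOLDER_1 0,
	#define __take_second_arg(ignored, val, ...) val
	#define ____is_defined(arg1_or_junk) __take_second_arg(arg1_or_junk 1, 0)
	#define ___is_defined(val) ____is_defined(__ARG_PLACEHOLDER_##val)
	#define __is_defined(x) ___is_defined(x)	/* extra hop expands x first */
	#define MY_IS_ENABLED(option) __is_defined(option)

	#define CONFIG_HAVE_MEMBLOCK 1	/* what Kconfig emits for =y */

	int demo(void)
	{
		/* Folds to "if (1)"; with CONFIG_HAVE_MEMBLOCK undefined it
		 * would fold to "if (0)", yet the body would still be
		 * parsed and type-checked before being eliminated. */
		if (MY_IS_ENABLED(CONFIG_HAVE_MEMBLOCK))
			return 1;
		return 0;
	}

Note that the expansion is a bare 0 or 1, which is why the condition
still needs its own parentheses; that detail comes up again in the
fixup further down the thread.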

> For me it's not a problem to submit a new patch. I guess that a
> prerequisite for this is to reach some agreement on what people think is
> the best option, which I feel didn't occur yet.

I do not have a _strong_ preference here either, so I will leave the
decision to you.

In any case feel free to add
Acked-by: Michal Hocko <mhocko@suse.com>

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v3 1/1] mm: page_alloc: skip over regions of invalid pfns on UMA
  2018-02-12 18:47             ` Michal Hocko
@ 2018-02-17  0:43               ` Andrew Morton
  -1 siblings, 0 replies; 24+ messages in thread
From: Andrew Morton @ 2018-02-17  0:43 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Eugeniu Rosca, Matthew Wilcox, Catalin Marinas, Ard Biesheuvel,
	Steven Sistare, AKASHI Takahiro, Pavel Tatashin, Gioh Kim,
	Heiko Carstens, Wei Yang, Miles Chen, Vlastimil Babka,
	Mel Gorman, Johannes Weiner, Paul Burton, James Hartley,
	linux-kernel, linux-mm

On Mon, 12 Feb 2018 19:47:59 +0100 Michal Hocko <mhocko@kernel.org> wrote:

> > prerequisite for this is to reach some agreement on what people think is
> > the best option, which I feel didn't occur yet.
> 
> I do not have a _strong_ preference here either, so I will leave the
> decision to you.
> 
> In any case feel free to add
> Acked-by: Michal Hocko <mhocko@suse.com>

I find Michal's version to be a little tidier.

Eugeniu, please send Michal's patch to me with a fresh changelog, with
your signed-off-by and your tested-by and your reported-by and we may
as well add Michal's (thus-far-missing) signed-off-by ;)

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v3 1/1] mm: page_alloc: skip over regions of invalid pfns on UMA
  2018-02-17  0:43               ` Andrew Morton
@ 2018-02-17 22:48                 ` Eugeniu Rosca
  -1 siblings, 0 replies; 24+ messages in thread
From: Eugeniu Rosca @ 2018-02-17 22:48 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Michal Hocko, Matthew Wilcox, Catalin Marinas, Ard Biesheuvel,
	Steven Sistare, AKASHI Takahiro, Pavel Tatashin, Gioh Kim,
	Heiko Carstens, Wei Yang, Miles Chen, Vlastimil Babka,
	Mel Gorman, Johannes Weiner, Paul Burton, James Hartley,
	linux-kernel, linux-mm, Eugeniu Rosca, Eugeniu Rosca

Hello Andrew,

On Fri, Feb 16, 2018 at 04:43:28PM -0800, Andrew Morton wrote:
> On Mon, 12 Feb 2018 19:47:59 +0100 Michal Hocko <mhocko@kernel.org> wrote:
> 
> > > prerequisite for this is to reach some agreement on what people think is
> > > the best option, which I feel didn't occur yet.
> > 
> > I do not have a _strong_ preference here either, so I will leave the
> > decision to you.
> > 
> > In any case feel free to add
> > Acked-by: Michal Hocko <mhocko@suse.com>
> 
> I find Michal's version to be a little tidier.
> 
> Eugeniu, please send Michal's patch to me with a fresh changelog, with
> your signed-off-by and your tested-by and your reported-by and we may
> as well add Michal's (thus-far-missing) signed-off-by ;)

I only needed to apply the fixup below to Michal's patch, which
otherwise works fine for me. I've sent the result to you as v4. Thank
you very much for picking it up.

Best regards,
Eugeniu.

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index cb3f844092ad..66891b3fb144 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5356,7 +5356,7 @@ void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone,
                         * end_pfn), such that we hit a valid pfn (or end_pfn)
                         * on our next iteration of the loop.
                         */
-                       if IS_ENABLED(HAVE_MEMBLOCK)
+                       if (IS_ENABLED(CONFIG_HAVE_MEMBLOCK))
                                pfn = memblock_next_valid_pfn(pfn, end_pfn) - 1;
                        continue;
                }
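
For readers skimming the fixup, it corrects two independent problems
(my reading of it; the thread itself does not spell them out):

	/* 1. IS_ENABLED() must be given the full Kconfig symbol; the bare
	 *    HAVE_MEMBLOCK is never defined by Kconfig, so the check would
	 *    always evaluate to 0 and the skip-ahead would be compiled out
	 *    even on memblock kernels.
	 * 2. IS_ENABLED() expands to a bare 0 or 1, and the C grammar
	 *    requires parentheses around an if condition, so the original
	 *    "if IS_ENABLED(...)" does not even parse.
	 */
	if (IS_ENABLED(CONFIG_HAVE_MEMBLOCK))
		pfn = memblock_next_valid_pfn(pfn, end_pfn) - 1;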

^ permalink raw reply related	[flat|nested] 24+ messages in thread

end of thread

Thread overview: 12 messages
2018-01-24 14:35 [PATCH v3 0/1] Skip over regions of invalid pfns with NUMA=n && HAVE_MEMBLOCK=y Eugeniu Rosca
2018-01-24 14:35 ` [PATCH v3 1/1] mm: page_alloc: skip over regions of invalid pfns on UMA Eugeniu Rosca
2018-01-24 16:27   ` Pavel Tatashin
2018-01-29 17:06     ` Eugeniu Rosca
2018-01-29 18:47   ` Michal Hocko
2018-02-03 12:24     ` Eugeniu Rosca
2018-02-09  0:12       ` Eugeniu Rosca
2018-02-12 15:03       ` Michal Hocko
2018-02-12 16:16         ` Eugeniu Rosca
2018-02-12 18:47           ` Michal Hocko
2018-02-17  0:43             ` Andrew Morton
2018-02-17 22:48               ` Eugeniu Rosca
