* [PATCH] mm, meminit: Recalculate pcpu batch and high limits after init completes -fix
@ 2019-10-23 8:47 Mel Gorman
2019-10-23 8:57 ` David Hildenbrand
0 siblings, 1 reply; 3+ messages in thread
From: Mel Gorman @ 2019-10-23 8:47 UTC (permalink / raw)
To: Andrew Morton
Cc: Michal Hocko, Vlastimil Babka, Thomas Gleixner, Matt Fleming,
Borislav Petkov, Linux-MM, Linux Kernel Mailing List, lkp
LKP reported the following build problem, caused by two hunks that did
not survive the reordering of the series.
ld: mm/page_alloc.o: in function `page_alloc_init_late':
mm/page_alloc.c:1956: undefined reference to `zone_pcp_update'
This is a fix for the mmotm patch
mm-meminit-recalculate-pcpu-batch-and-high-limits-after-init-completes.patch
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
---
mm/page_alloc.c | 2 --
1 file changed, 2 deletions(-)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index f9488efff680..12f3ce09d33d 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -8627,7 +8627,6 @@ void free_contig_range(unsigned long pfn, unsigned int nr_pages)
WARN(count != 0, "%d pages are still in use!\n", count);
}
-#ifdef CONFIG_MEMORY_HOTPLUG
/*
* The zone indicated has a new number of managed_pages; batch sizes and percpu
* page high values need to be recalulated.
@@ -8638,7 +8637,6 @@ void __meminit zone_pcp_update(struct zone *zone)
__zone_pcp_update(zone);
mutex_unlock(&pcp_batch_high_lock);
}
-#endif
void zone_pcp_reset(struct zone *zone)
{
* Re: [PATCH] mm, meminit: Recalculate pcpu batch and high limits after init completes -fix
2019-10-23 8:47 [PATCH] mm, meminit: Recalculate pcpu batch and high limits after init completes -fix Mel Gorman
@ 2019-10-23 8:57 ` David Hildenbrand
0 siblings, 0 replies; 3+ messages in thread
From: David Hildenbrand @ 2019-10-23 8:57 UTC (permalink / raw)
To: Mel Gorman, Andrew Morton
Cc: Michal Hocko, Vlastimil Babka, Thomas Gleixner, Matt Fleming,
Borislav Petkov, Linux-MM, Linux Kernel Mailing List, lkp
On 23.10.19 10:47, Mel Gorman wrote:
> LKP reported the following build problem, caused by two hunks that did
> not survive the reordering of the series.
>
> ld: mm/page_alloc.o: in function `page_alloc_init_late':
> mm/page_alloc.c:1956: undefined reference to `zone_pcp_update'
>
> This is a fix for the mmotm patch
> mm-meminit-recalculate-pcpu-batch-and-high-limits-after-init-completes.patch
>
> Reported-by: kbuild test robot <lkp@intel.com>
> Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
> ---
> mm/page_alloc.c | 2 --
> 1 file changed, 2 deletions(-)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index f9488efff680..12f3ce09d33d 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -8627,7 +8627,6 @@ void free_contig_range(unsigned long pfn, unsigned int nr_pages)
> WARN(count != 0, "%d pages are still in use!\n", count);
> }
>
> -#ifdef CONFIG_MEMORY_HOTPLUG
> /*
> * The zone indicated has a new number of managed_pages; batch sizes and percpu
> * page high values need to be recalulated.
> @@ -8638,7 +8637,6 @@ void __meminit zone_pcp_update(struct zone *zone)
> __zone_pcp_update(zone);
> mutex_unlock(&pcp_batch_high_lock);
> }
> -#endif
>
> void zone_pcp_reset(struct zone *zone)
> {
>
Acked-by: David Hildenbrand <david@redhat.com>
--
Thanks,
David / dhildenb
* [PATCH 0/3] Recalculate per-cpu page allocator batch and high limits after deferred meminit v2
@ 2019-10-21 9:48 Mel Gorman
2019-10-21 9:48 ` [PATCH 1/3] mm, meminit: Recalculate pcpu batch and high limits after init completes Mel Gorman
0 siblings, 1 reply; 3+ messages in thread
From: Mel Gorman @ 2019-10-21 9:48 UTC (permalink / raw)
To: Andrew Morton
Cc: Michal Hocko, Vlastimil Babka, Thomas Gleixner, Matt Fleming,
Borislav Petkov, Linux-MM, Linux Kernel Mailing List, Mel Gorman
This is an updated series that addresses some review feedback and an
LKP warning. I did not preserve Michal Hocko's ack for the fix as it
has changed. This series replaces the following patches in mmotm
mm-pcp-share-common-code-between-memory-hotplug-and-percpu-sysctl-handler.patch
mm-meminit-recalculate-pcpu-batch-and-high-limits-after-init-completes.patch
mm-pcpu-make-zone-pcp-updates-and-reset-internal-to-the-mm.patch
Changelog since V1
o Fix a "might sleep" warning
o Reorder for easier backporting
A private report stated that system CPU usage was excessive on an AMD
EPYC 2 machine while building kernels with much longer build times than
expected. The issue is partially explained by high zone lock contention
due to the per-cpu page allocator batch and high limits being calculated
incorrectly. This series addresses a large chunk of the problem. Patch
1 is the real fix and the other two are cosmetic issues noticed while
implementing the fix.
include/linux/mm.h | 3 ---
mm/internal.h | 3 +++
mm/page_alloc.c | 31 ++++++++++++++++++++-----------
3 files changed, 23 insertions(+), 14 deletions(-)
--
2.16.4
* [PATCH 1/3] mm, meminit: Recalculate pcpu batch and high limits after init completes
2019-10-21 9:48 [PATCH 0/3] Recalculate per-cpu page allocator batch and high limits after deferred meminit v2 Mel Gorman
@ 2019-10-21 9:48 ` Mel Gorman
2019-10-21 19:39 ` [PATCH] mm, meminit: Recalculate pcpu batch and high limits after init completes -fix Mel Gorman
0 siblings, 1 reply; 3+ messages in thread
From: Mel Gorman @ 2019-10-21 9:48 UTC (permalink / raw)
To: Andrew Morton
Cc: Michal Hocko, Vlastimil Babka, Thomas Gleixner, Matt Fleming,
Borislav Petkov, Linux-MM, Linux Kernel Mailing List, Mel Gorman
Deferred memory initialisation updates zone->managed_pages during the
initialisation phase. Before that phase finishes, the per-cpu page
allocator (pcpu) calculates the number of pages allocated/freed in
batches as well as the maximum number of pages allowed on a per-cpu list.
As zone->managed_pages is not up to date yet, the pcpu initialisation
calculates inappropriately low batch and high values.
This increases zone lock contention quite severely in some cases with the
degree of severity depending on how many CPUs share a local zone and the
size of the zone. A private report indicated that kernel build times were
excessive with extremely high system CPU usage. A perf profile indicated
that a large chunk of time was lost on zone->lock contention.
This patch recalculates the pcpu batch and high values after deferred
initialisation completes for every populated zone in the system. It
was tested on a 2-socket AMD EPYC 2 machine using a kernel compilation
workload -- allmodconfig and all available CPUs.
mmtests configuration: config-workload-kernbench-max
Configuration was modified to build on a fresh XFS partition.
kernbench
                                5.4.0-rc3              5.4.0-rc3
                                  vanilla           resetpcpu-v2
Amean     user-256    13249.50 (   0.00%)    16401.31 * -23.79%*
Amean     syst-256    14760.30 (   0.00%)     4448.39 *  69.86%*
Amean     elsp-256      162.42 (   0.00%)      119.13 *  26.65%*
Stddev    user-256       42.97 (   0.00%)       19.15 (  55.43%)
Stddev    syst-256      336.87 (   0.00%)        6.71 (  98.01%)
Stddev    elsp-256        2.46 (   0.00%)        0.39 (  84.03%)

                     5.4.0-rc3    5.4.0-rc3
                       vanilla resetpcpu-v2
Duration User         39766.24     49221.79
Duration System       44298.10     13361.67
Duration Elapsed        519.11       388.87
The patch reduces system CPU usage by 69.86% and total build time by
26.65%. The variance of system CPU usage is also much reduced.
Before the patch, this was the breakdown of batch and high values over
all zones:

    256               batch: 1
    256               batch: 63
    512               batch: 7
    256               high:  0
    256               high:  378
    512               high:  42

512 pcpu pagesets had a batch limit of 7 and a high limit of 42. After
the patch:

    256               batch: 1
    768               batch: 63
    256               high:  0
    768               high:  378
Cc: stable@vger.kernel.org # v4.1+
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
---
mm/page_alloc.c | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index c0b2e0306720..f972076d0f6b 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1947,6 +1947,14 @@ void __init page_alloc_init_late(void)
/* Block until all are initialised */
wait_for_completion(&pgdat_init_all_done_comp);
+ /*
+ * The number of managed pages has changed due to the initialisation
+ * so the pcpu batch and high limits needs to be updated or the limits
+ * will be artificially small.
+ */
+ for_each_populated_zone(zone)
+ zone_pcp_update(zone);
+
/*
* We initialized the rest of the deferred pages. Permanently disable
* on-demand struct page initialization.
--
2.16.4
* [PATCH] mm, meminit: Recalculate pcpu batch and high limits after init completes -fix
2019-10-21 9:48 ` [PATCH 1/3] mm, meminit: Recalculate pcpu batch and high limits after init completes Mel Gorman
@ 2019-10-21 19:39 ` Mel Gorman
0 siblings, 0 replies; 3+ messages in thread
From: Mel Gorman @ 2019-10-21 19:39 UTC (permalink / raw)
To: Andrew Morton
Cc: Michal Hocko, Vlastimil Babka, Thomas Gleixner, Matt Fleming,
Borislav Petkov, Linux-MM, Linux Kernel Mailing List, lkp
LKP reported the following build problem, caused by two hunks that did
not survive the reordering of the series.
ld: mm/page_alloc.o: in function `page_alloc_init_late':
mm/page_alloc.c:1956: undefined reference to `zone_pcp_update'
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
---
mm/page_alloc.c | 2 --
1 file changed, 2 deletions(-)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 4179376bb336..e9926bf77463 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -8524,7 +8524,6 @@ void free_contig_range(unsigned long pfn, unsigned int nr_pages)
WARN(count != 0, "%d pages are still in use!\n", count);
}
-#ifdef CONFIG_MEMORY_HOTPLUG
/*
* The zone indicated has a new number of managed_pages; batch sizes and percpu
* page high values need to be recalulated.
@@ -8535,7 +8534,6 @@ void __meminit zone_pcp_update(struct zone *zone)
__zone_pcp_update(zone);
mutex_unlock(&pcp_batch_high_lock);
}
-#endif
void zone_pcp_reset(struct zone *zone)
{