* Re: [PATCH 1/3] mm, meminit: Recalculate pcpu batch and high limits after init completes
2019-10-21 9:48 ` [PATCH 1/3] mm, meminit: Recalculate pcpu batch and high limits after init completes Mel Gorman
@ 2019-10-21 10:27 ` Michal Hocko
2019-10-21 11:42 ` Vlastimil Babka
` (3 subsequent siblings)
4 siblings, 0 replies; 13+ messages in thread
From: Michal Hocko @ 2019-10-21 10:27 UTC (permalink / raw)
To: Mel Gorman
Cc: Andrew Morton, Vlastimil Babka, Thomas Gleixner, Matt Fleming,
Borislav Petkov, Linux-MM, Linux Kernel Mailing List
On Mon 21-10-19 10:48:06, Mel Gorman wrote:
> Deferred memory initialisation updates zone->managed_pages during
> the initialisation phase but before that finishes, the per-cpu page
> allocator (pcpu) calculates the number of pages allocated/freed in
> batches as well as the maximum number of pages allowed on a per-cpu list.
> As zone->managed_pages is not up to date yet, the pcpu initialisation
> calculates inappropriately low batch and high values.
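[Editor's note: as a rough illustration of where those limits come from, the batch and high values are derived in zone_batchsize() and pageset_set_batch() in mm/page_alloc.c. The following is a userspace sketch of the v5.4-era logic, not the kernel code itself; constants and helper names are simplified. It shows why a zone whose managed_pages is still small when the pcpu lists are initialised ends up with tiny limits:]

```c
#include <assert.h>

#define PAGE_SIZE 4096UL

/* Round n down to the previous power of two (stand-in for the kernel helper). */
static unsigned long rounddown_pow_of_two(unsigned long n)
{
	unsigned long p = 1;

	while (p * 2 <= n)
		p *= 2;
	return p;
}

/*
 * Userspace mirror of zone_batchsize(): roughly 0.1% of the zone, capped
 * at 1MB worth of pages, then shaped to a (2^n - 1) value.
 */
static int zone_batchsize(unsigned long managed_pages)
{
	unsigned long batch = managed_pages / 1024;

	if (batch * PAGE_SIZE > 1024 * 1024)
		batch = (1024 * 1024) / PAGE_SIZE;
	batch /= 4;		/* shaped back up by the rounding below */
	if (batch < 1)
		batch = 1;
	return rounddown_pow_of_two(batch + batch / 2) - 1;
}

/* Mirror of pageset_set_batch(): high is 6 * batch; batch has a floor of 1. */
static void pcp_limits(unsigned long managed_pages, int *high, int *batch)
{
	int b = zone_batchsize(managed_pages);

	*high = 6 * b;
	*batch = b > 1 ? b : 1;
}
```

[With managed_pages still at a small pre-deferred-init value such as 32768 pages, this yields batch 7 / high 42, matching the "before" breakdown quoted below; a fully initialised large zone gives batch 63 / high 378.]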
>
> This increases zone lock contention quite severely in some cases with the
> degree of severity depending on how many CPUs share a local zone and the
> size of the zone. A private report indicated that kernel build times were
> excessive with extremely high system CPU usage. A perf profile indicated
> that a large chunk of time was lost on zone->lock contention.
>
> This patch recalculates the pcpu batch and high values after deferred
> initialisation completes for every populated zone in the system. It
> was tested on a 2-socket AMD EPYC 2 machine using a kernel compilation
> workload -- allmodconfig and all available CPUs.
>
> mmtests configuration: config-workload-kernbench-max
> Configuration was modified to build on a fresh XFS partition.
>
> kernbench
> 5.4.0-rc3 5.4.0-rc3
> vanilla resetpcpu-v2
> Amean user-256 13249.50 ( 0.00%) 16401.31 * -23.79%*
> Amean syst-256 14760.30 ( 0.00%) 4448.39 * 69.86%*
> Amean elsp-256 162.42 ( 0.00%) 119.13 * 26.65%*
> Stddev user-256 42.97 ( 0.00%) 19.15 ( 55.43%)
> Stddev syst-256 336.87 ( 0.00%) 6.71 ( 98.01%)
> Stddev elsp-256 2.46 ( 0.00%) 0.39 ( 84.03%)
>
> 5.4.0-rc3 5.4.0-rc3
> vanilla resetpcpu-v2
> Duration User 39766.24 49221.79
> Duration System 44298.10 13361.67
> Duration Elapsed 519.11 388.87
>
> The patch reduces system CPU usage by 69.86% and total build time by
> 26.65%. The variance of system CPU usage is also much reduced.
>
> Before the patch, this was the breakdown of batch and high values over all zones:
>
> 256 batch: 1
> 256 batch: 63
> 512 batch: 7
> 256 high: 0
> 256 high: 378
> 512 high: 42
>
> i.e. 512 pcpu pagesets had a batch limit of 7 and a high limit of 42. After the patch:
>
> 256 batch: 1
> 768 batch: 63
> 256 high: 0
> 768 high: 378
>
> Cc: stable@vger.kernel.org # v4.1+
> Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Michal Hocko <mhocko@suse.com>
> ---
> mm/page_alloc.c | 8 ++++++++
> 1 file changed, 8 insertions(+)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index c0b2e0306720..f972076d0f6b 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -1947,6 +1947,14 @@ void __init page_alloc_init_late(void)
> /* Block until all are initialised */
> wait_for_completion(&pgdat_init_all_done_comp);
>
> + /*
> + * The number of managed pages has changed due to the initialisation
> + * so the pcpu batch and high limits need to be updated or the limits
> + * will be artificially small.
> + */
> + for_each_populated_zone(zone)
> + zone_pcp_update(zone);
> +
> /*
> * We initialized the rest of the deferred pages. Permanently disable
> * on-demand struct page initialization.
> --
> 2.16.4
--
Michal Hocko
SUSE Labs
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 1/3] mm, meminit: Recalculate pcpu batch and high limits after init completes
2019-10-21 9:48 ` [PATCH 1/3] mm, meminit: Recalculate pcpu batch and high limits after init completes Mel Gorman
2019-10-21 10:27 ` Michal Hocko
@ 2019-10-21 11:42 ` Vlastimil Babka
2019-10-21 14:01 ` Qian Cai
` (2 subsequent siblings)
4 siblings, 0 replies; 13+ messages in thread
From: Vlastimil Babka @ 2019-10-21 11:42 UTC (permalink / raw)
To: Mel Gorman, Andrew Morton
Cc: Michal Hocko, Thomas Gleixner, Matt Fleming, Borislav Petkov,
Linux-MM, Linux Kernel Mailing List
On 10/21/19 11:48 AM, Mel Gorman wrote:
> Deferred memory initialisation updates zone->managed_pages during
> the initialisation phase but before that finishes, the per-cpu page
> allocator (pcpu) calculates the number of pages allocated/freed in
> batches as well as the maximum number of pages allowed on a per-cpu list.
> As zone->managed_pages is not up to date yet, the pcpu initialisation
> calculates inappropriately low batch and high values.
>
> This increases zone lock contention quite severely in some cases with the
> degree of severity depending on how many CPUs share a local zone and the
> size of the zone. A private report indicated that kernel build times were
> excessive with extremely high system CPU usage. A perf profile indicated
> that a large chunk of time was lost on zone->lock contention.
>
> This patch recalculates the pcpu batch and high values after deferred
> initialisation completes for every populated zone in the system. It
> was tested on a 2-socket AMD EPYC 2 machine using a kernel compilation
> workload -- allmodconfig and all available CPUs.
>
> mmtests configuration: config-workload-kernbench-max
> Configuration was modified to build on a fresh XFS partition.
>
> kernbench
> 5.4.0-rc3 5.4.0-rc3
> vanilla resetpcpu-v2
> Amean user-256 13249.50 ( 0.00%) 16401.31 * -23.79%*
> Amean syst-256 14760.30 ( 0.00%) 4448.39 * 69.86%*
> Amean elsp-256 162.42 ( 0.00%) 119.13 * 26.65%*
> Stddev user-256 42.97 ( 0.00%) 19.15 ( 55.43%)
> Stddev syst-256 336.87 ( 0.00%) 6.71 ( 98.01%)
> Stddev elsp-256 2.46 ( 0.00%) 0.39 ( 84.03%)
>
> 5.4.0-rc3 5.4.0-rc3
> vanilla resetpcpu-v2
> Duration User 39766.24 49221.79
> Duration System 44298.10 13361.67
> Duration Elapsed 519.11 388.87
>
> The patch reduces system CPU usage by 69.86% and total build time by
> 26.65%. The variance of system CPU usage is also much reduced.
>
> Before the patch, this was the breakdown of batch and high values over all zones:
>
> 256 batch: 1
> 256 batch: 63
> 512 batch: 7
> 256 high: 0
> 256 high: 378
> 512 high: 42
>
> i.e. 512 pcpu pagesets had a batch limit of 7 and a high limit of 42. After the patch:
>
> 256 batch: 1
> 768 batch: 63
> 256 high: 0
> 768 high: 378
>
> Cc: stable@vger.kernel.org # v4.1+
> Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
> ---
> mm/page_alloc.c | 8 ++++++++
> 1 file changed, 8 insertions(+)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index c0b2e0306720..f972076d0f6b 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -1947,6 +1947,14 @@ void __init page_alloc_init_late(void)
> /* Block until all are initialised */
> wait_for_completion(&pgdat_init_all_done_comp);
>
> + /*
> + * The number of managed pages has changed due to the initialisation
> + * so the pcpu batch and high limits need to be updated or the limits
> + * will be artificially small.
> + */
> + for_each_populated_zone(zone)
> + zone_pcp_update(zone);
> +
> /*
> * We initialized the rest of the deferred pages. Permanently disable
> * on-demand struct page initialization.
>
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 1/3] mm, meminit: Recalculate pcpu batch and high limits after init completes
2019-10-21 9:48 ` [PATCH 1/3] mm, meminit: Recalculate pcpu batch and high limits after init completes Mel Gorman
2019-10-21 10:27 ` Michal Hocko
2019-10-21 11:42 ` Vlastimil Babka
@ 2019-10-21 14:01 ` Qian Cai
2019-10-21 14:12 ` Michal Hocko
2019-10-21 14:25 ` Mel Gorman
2019-10-21 19:39 ` [PATCH] mm, meminit: Recalculate pcpu batch and high limits after init completes -fix Mel Gorman
[not found] ` <20191026131036.A7A5421655@mail.kernel.org>
4 siblings, 2 replies; 13+ messages in thread
From: Qian Cai @ 2019-10-21 14:01 UTC (permalink / raw)
To: Mel Gorman
Cc: Andrew Morton, Michal Hocko, Vlastimil Babka, Thomas Gleixner,
Matt Fleming, Borislav Petkov, Linux-MM,
Linux Kernel Mailing List
> On Oct 21, 2019, at 5:48 AM, Mel Gorman <mgorman@techsingularity.net> wrote:
>
> Deferred memory initialisation updates zone->managed_pages during
> the initialisation phase but before that finishes, the per-cpu page
> allocator (pcpu) calculates the number of pages allocated/freed in
> batches as well as the maximum number of pages allowed on a per-cpu list.
> As zone->managed_pages is not up to date yet, the pcpu initialisation
> calculates inappropriately low batch and high values.
>
> This increases zone lock contention quite severely in some cases with the
> degree of severity depending on how many CPUs share a local zone and the
> size of the zone. A private report indicated that kernel build times were
> excessive with extremely high system CPU usage. A perf profile indicated
> that a large chunk of time was lost on zone->lock contention.
>
> This patch recalculates the pcpu batch and high values after deferred
> initialisation completes for every populated zone in the system. It
> was tested on a 2-socket AMD EPYC 2 machine using a kernel compilation
> workload -- allmodconfig and all available CPUs.
>
> mmtests configuration: config-workload-kernbench-max
> Configuration was modified to build on a fresh XFS partition.
>
> kernbench
> 5.4.0-rc3 5.4.0-rc3
> vanilla resetpcpu-v2
> Amean user-256 13249.50 ( 0.00%) 16401.31 * -23.79%*
> Amean syst-256 14760.30 ( 0.00%) 4448.39 * 69.86%*
> Amean elsp-256 162.42 ( 0.00%) 119.13 * 26.65%*
> Stddev user-256 42.97 ( 0.00%) 19.15 ( 55.43%)
> Stddev syst-256 336.87 ( 0.00%) 6.71 ( 98.01%)
> Stddev elsp-256 2.46 ( 0.00%) 0.39 ( 84.03%)
>
> 5.4.0-rc3 5.4.0-rc3
> vanilla resetpcpu-v2
> Duration User 39766.24 49221.79
> Duration System 44298.10 13361.67
> Duration Elapsed 519.11 388.87
>
> The patch reduces system CPU usage by 69.86% and total build time by
> 26.65%. The variance of system CPU usage is also much reduced.
>
> Before the patch, this was the breakdown of batch and high values over all zones:
>
> 256 batch: 1
> 256 batch: 63
> 512 batch: 7
> 256 high: 0
> 256 high: 378
> 512 high: 42
>
> i.e. 512 pcpu pagesets had a batch limit of 7 and a high limit of 42. After the patch:
>
> 256 batch: 1
> 768 batch: 63
> 256 high: 0
> 768 high: 378
>
> Cc: stable@vger.kernel.org # v4.1+
> Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
> ---
> mm/page_alloc.c | 8 ++++++++
> 1 file changed, 8 insertions(+)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index c0b2e0306720..f972076d0f6b 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -1947,6 +1947,14 @@ void __init page_alloc_init_late(void)
> /* Block until all are initialised */
> wait_for_completion(&pgdat_init_all_done_comp);
>
> + /*
> + * The number of managed pages has changed due to the initialisation
> + * so the pcpu batch and high limits need to be updated or the limits
> + * will be artificially small.
> + */
> + for_each_populated_zone(zone)
> + zone_pcp_update(zone);
> +
> /*
> * We initialized the rest of the deferred pages. Permanently disable
> * on-demand struct page initialization.
> --
> 2.16.4
>
>
Warnings from linux-next,
[ 14.265911][ T659] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:935
[ 14.265992][ T659] in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 659, name: pgdatinit8
[ 14.266044][ T659] 1 lock held by pgdatinit8/659:
[ 14.266075][ T659] #0: c000201ffca87b40 (&(&pgdat->node_size_lock)->rlock){....}, at: deferred_init_memmap+0xc4/0x26c
[ 14.266160][ T659] irq event stamp: 26
[ 14.266194][ T659] hardirqs last enabled at (25): [<c000000000950584>] _raw_spin_unlock_irq+0x44/0x80
[ 14.266246][ T659] hardirqs last disabled at (26): [<c0000000009502ec>] _raw_spin_lock_irqsave+0x3c/0xa0
[ 14.266299][ T659] softirqs last enabled at (0): [<c0000000000ff8d0>] copy_process+0x720/0x19b0
[ 14.266339][ T659] softirqs last disabled at (0): [<0000000000000000>] 0x0
[ 14.266400][ T659] CPU: 64 PID: 659 Comm: pgdatinit8 Not tainted 5.4.0-rc4-next-20191021 #1
[ 14.266462][ T659] Call Trace:
[ 14.266494][ T659] [c00000003d8efae0] [c000000000921cf4] dump_stack+0xe8/0x164 (unreliable)
[ 14.266538][ T659] [c00000003d8efb30] [c000000000157c54] ___might_sleep+0x334/0x370
[ 14.266577][ T659] [c00000003d8efbb0] [c00000000094a784] __mutex_lock+0x84/0xb20
[ 14.266627][ T659] [c00000003d8efcc0] [c000000000954038] zone_pcp_update+0x34/0x64
[ 14.266677][ T659] [c00000003d8efcf0] [c000000000b9e6bc] deferred_init_memmap+0x1b8/0x26c
[ 14.266740][ T659] [c00000003d8efdb0] [c000000000149528] kthread+0x1a8/0x1b0
[ 14.266792][ T659] [c00000003d8efe20] [c00000000000b748] ret_from_kernel_thread+0x5c/0x74
[ 14.268288][ T659] node 8 initialised, 1879186 pages in 12200ms
[ 14.268527][ T659] pgdatinit8 (659) used greatest stack depth: 27984 bytes left
[ 15.589983][ T658] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:935
[ 15.590041][ T658] in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 658, name: pgdatinit0
[ 15.590078][ T658] 1 lock held by pgdatinit0/658:
[ 15.590108][ T658] #0: c000001fff5c7b40 (&(&pgdat->node_size_lock)->rlock){....}, at: deferred_init_memmap+0xc4/0x26c
[ 15.590192][ T658] irq event stamp: 18
[ 15.590224][ T658] hardirqs last enabled at (17): [<c000000000950654>] _raw_spin_unlock_irqrestore+0x94/0xd0
[ 15.590283][ T658] hardirqs last disabled at (18): [<c0000000009502ec>] _raw_spin_lock_irqsave+0x3c/0xa0
[ 15.590332][ T658] softirqs last enabled at (0): [<c0000000000ff8d0>] copy_process+0x720/0x19b0
[ 15.590379][ T658] softirqs last disabled at (0): [<0000000000000000>] 0x0
[ 15.590414][ T658] CPU: 8 PID: 658 Comm: pgdatinit0 Tainted: G W 5.4.0-rc4-next-20191021 #1
[ 15.590460][ T658] Call Trace:
[ 15.590491][ T658] [c00000003d8cfae0] [c000000000921cf4] dump_stack+0xe8/0x164 (unreliable)
[ 15.590541][ T658] [c00000003d8cfb30] [c000000000157c54] ___might_sleep+0x334/0x370
[ 15.590588][ T658] [c00000003d8cfbb0] [c00000000094a784] __mutex_lock+0x84/0xb20
[ 15.590643][ T658] [c00000003d8cfcc0] [c000000000954038] zone_pcp_update+0x34/0x64
[ 15.590689][ T658] [c00000003d8cfcf0] [c000000000b9e6bc] deferred_init_memmap+0x1b8/0x26c
[ 15.590739][ T658] [c00000003d8cfdb0] [c000000000149528] kthread+0x1a8/0x1b0
[ 15.590790][ T658] [c00000003d8cfe20] [c00000000000b748] ret_from_kernel_thread+0x5c/0x74
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 1/3] mm, meminit: Recalculate pcpu batch and high limits after init completes
2019-10-21 14:01 ` Qian Cai
@ 2019-10-21 14:12 ` Michal Hocko
2019-10-21 14:25 ` Mel Gorman
1 sibling, 0 replies; 13+ messages in thread
From: Michal Hocko @ 2019-10-21 14:12 UTC (permalink / raw)
To: Qian Cai
Cc: Mel Gorman, Andrew Morton, Vlastimil Babka, Thomas Gleixner,
Matt Fleming, Borislav Petkov, Linux-MM,
Linux Kernel Mailing List
On Mon 21-10-19 10:01:24, Qian Cai wrote:
>
>
> > On Oct 21, 2019, at 5:48 AM, Mel Gorman <mgorman@techsingularity.net> wrote:
[...]
> > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > index c0b2e0306720..f972076d0f6b 100644
> > --- a/mm/page_alloc.c
> > +++ b/mm/page_alloc.c
> > @@ -1947,6 +1947,14 @@ void __init page_alloc_init_late(void)
> > /* Block until all are initialised */
> > wait_for_completion(&pgdat_init_all_done_comp);
> >
> > + /*
> > + * The number of managed pages has changed due to the initialisation
> > + * so the pcpu batch and high limits need to be updated or the limits
> > + * will be artificially small.
> > + */
> > + for_each_populated_zone(zone)
> > + zone_pcp_update(zone);
> > +
> > /*
> > * We initialized the rest of the deferred pages. Permanently disable
> > * on-demand struct page initialization.
> > --
> > 2.16.4
> >
> >
>
> Warnings from linux-next,
>
> [ 14.265911][ T659] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:935
> [ 14.265992][ T659] in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 659, name: pgdatinit8
> [ 14.266044][ T659] 1 lock held by pgdatinit8/659:
> [ 14.266075][ T659] #0: c000201ffca87b40 (&(&pgdat->node_size_lock)->rlock){....}, at: deferred_init_memmap+0xc4/0x26c
This is really surprising to say the least. I do not see any spinlock
held here. Besides that we do sleep in wait_for_completion already.
Is it possible that the patch has been misplaced? zone_pcp_update is
called from page_alloc_init_late which is a different context than
deferred_init_memmap which runs in a separate kthread.
> [ 14.266160][ T659] irq event stamp: 26
> [ 14.266194][ T659] hardirqs last enabled at (25): [<c000000000950584>] _raw_spin_unlock_irq+0x44/0x80
> [ 14.266246][ T659] hardirqs last disabled at (26): [<c0000000009502ec>] _raw_spin_lock_irqsave+0x3c/0xa0
> [ 14.266299][ T659] softirqs last enabled at (0): [<c0000000000ff8d0>] copy_process+0x720/0x19b0
> [ 14.266339][ T659] softirqs last disabled at (0): [<0000000000000000>] 0x0
> [ 14.266400][ T659] CPU: 64 PID: 659 Comm: pgdatinit8 Not tainted 5.4.0-rc4-next-20191021 #1
> [ 14.266462][ T659] Call Trace:
> [ 14.266494][ T659] [c00000003d8efae0] [c000000000921cf4] dump_stack+0xe8/0x164 (unreliable)
> [ 14.266538][ T659] [c00000003d8efb30] [c000000000157c54] ___might_sleep+0x334/0x370
> [ 14.266577][ T659] [c00000003d8efbb0] [c00000000094a784] __mutex_lock+0x84/0xb20
> [ 14.266627][ T659] [c00000003d8efcc0] [c000000000954038] zone_pcp_update+0x34/0x64
> [ 14.266677][ T659] [c00000003d8efcf0] [c000000000b9e6bc] deferred_init_memmap+0x1b8/0x26c
> [ 14.266740][ T659] [c00000003d8efdb0] [c000000000149528] kthread+0x1a8/0x1b0
> [ 14.266792][ T659] [c00000003d8efe20] [c00000000000b748] ret_from_kernel_thread+0x5c/0x74
> [ 14.268288][ T659] node 8 initialised, 1879186 pages in 12200ms
> [ 14.268527][ T659] pgdatinit8 (659) used greatest stack depth: 27984 bytes left
> [ 15.589983][ T658] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:935
> [ 15.590041][ T658] in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 658, name: pgdatinit0
> [ 15.590078][ T658] 1 lock held by pgdatinit0/658:
> [ 15.590108][ T658] #0: c000001fff5c7b40 (&(&pgdat->node_size_lock)->rlock){....}, at: deferred_init_memmap+0xc4/0x26c
> [ 15.590192][ T658] irq event stamp: 18
> [ 15.590224][ T658] hardirqs last enabled at (17): [<c000000000950654>] _raw_spin_unlock_irqrestore+0x94/0xd0
> [ 15.590283][ T658] hardirqs last disabled at (18): [<c0000000009502ec>] _raw_spin_lock_irqsave+0x3c/0xa0
> [ 15.590332][ T658] softirqs last enabled at (0): [<c0000000000ff8d0>] copy_process+0x720/0x19b0
> [ 15.590379][ T658] softirqs last disabled at (0): [<0000000000000000>] 0x0
> [ 15.590414][ T658] CPU: 8 PID: 658 Comm: pgdatinit0 Tainted: G W 5.4.0-rc4-next-20191021 #1
> [ 15.590460][ T658] Call Trace:
> [ 15.590491][ T658] [c00000003d8cfae0] [c000000000921cf4] dump_stack+0xe8/0x164 (unreliable)
> [ 15.590541][ T658] [c00000003d8cfb30] [c000000000157c54] ___might_sleep+0x334/0x370
> [ 15.590588][ T658] [c00000003d8cfbb0] [c00000000094a784] __mutex_lock+0x84/0xb20
> [ 15.590643][ T658] [c00000003d8cfcc0] [c000000000954038] zone_pcp_update+0x34/0x64
> [ 15.590689][ T658] [c00000003d8cfcf0] [c000000000b9e6bc] deferred_init_memmap+0x1b8/0x26c
> [ 15.590739][ T658] [c00000003d8cfdb0] [c000000000149528] kthread+0x1a8/0x1b0
> [ 15.590790][ T658] [c00000003d8cfe20] [c00000000000b748] ret_from_kernel_thread+0x5c/0x74
--
Michal Hocko
SUSE Labs
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 1/3] mm, meminit: Recalculate pcpu batch and high limits after init completes
2019-10-21 14:01 ` Qian Cai
2019-10-21 14:12 ` Michal Hocko
@ 2019-10-21 14:25 ` Mel Gorman
1 sibling, 0 replies; 13+ messages in thread
From: Mel Gorman @ 2019-10-21 14:25 UTC (permalink / raw)
To: Qian Cai
Cc: Andrew Morton, Michal Hocko, Vlastimil Babka, Thomas Gleixner,
Matt Fleming, Borislav Petkov, Linux-MM,
Linux Kernel Mailing List
On Mon, Oct 21, 2019 at 10:01:24AM -0400, Qian Cai wrote:
> Warnings from linux-next,
>
> [ 14.265911][ T659] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:935
> [ 14.265992][ T659] in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 659, name: pgdatinit8
> [ 14.266044][ T659] 1 lock held by pgdatinit8/659:
Fixed in v2 posted this morning. It should hit linux-next eventually.
--
Mel Gorman
SUSE Labs
^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH] mm, meminit: Recalculate pcpu batch and high limits after init completes -fix
2019-10-21 9:48 ` [PATCH 1/3] mm, meminit: Recalculate pcpu batch and high limits after init completes Mel Gorman
` (2 preceding siblings ...)
2019-10-21 14:01 ` Qian Cai
@ 2019-10-21 19:39 ` Mel Gorman
[not found] ` <20191026131036.A7A5421655@mail.kernel.org>
4 siblings, 0 replies; 13+ messages in thread
From: Mel Gorman @ 2019-10-21 19:39 UTC (permalink / raw)
To: Andrew Morton
Cc: Michal Hocko, Vlastimil Babka, Thomas Gleixner, Matt Fleming,
Borislav Petkov, Linux-MM, Linux Kernel Mailing List, lkp
LKP reported the following build problem from two hunks that did not
survive the reordering of the series.
ld: mm/page_alloc.o: in function `page_alloc_init_late':
mm/page_alloc.c:1956: undefined reference to `zone_pcp_update'
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
---
mm/page_alloc.c | 2 --
1 file changed, 2 deletions(-)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 4179376bb336..e9926bf77463 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -8524,7 +8524,6 @@ void free_contig_range(unsigned long pfn, unsigned int nr_pages)
WARN(count != 0, "%d pages are still in use!\n", count);
}
-#ifdef CONFIG_MEMORY_HOTPLUG
/*
* The zone indicated has a new number of managed_pages; batch sizes and percpu
* page high values need to be recalculated.
@@ -8535,7 +8534,6 @@ void __meminit zone_pcp_update(struct zone *zone)
__zone_pcp_update(zone);
mutex_unlock(&pcp_batch_high_lock);
}
-#endif
void zone_pcp_reset(struct zone *zone)
{
^ permalink raw reply related [flat|nested] 13+ messages in thread
[parent not found: <20191026131036.A7A5421655@mail.kernel.org>]
* Re: [PATCH 1/3] mm, meminit: Recalculate pcpu batch and high limits after init completes
[not found] ` <20191026131036.A7A5421655@mail.kernel.org>
@ 2019-10-27 20:43 ` Mel Gorman
0 siblings, 0 replies; 13+ messages in thread
From: Mel Gorman @ 2019-10-27 20:43 UTC (permalink / raw)
To: Sasha Levin; +Cc: Andrew Morton, Michal Hocko, stable
On Sat, Oct 26, 2019 at 01:10:35PM +0000, Sasha Levin wrote:
> Hi,
>
> [This is an automated email]
>
> This commit has been processed because it contains a -stable tag.
> The stable tag indicates that it's relevant for the following trees: 4.1+
>
> The bot has tested the following trees: v5.3.7, v4.19.80, v4.14.150, v4.9.197, v4.4.197.
>
> v5.3.7: Build OK!
> v4.19.80: Build failed! Errors:
>
> v4.14.150: Failed to apply! Possible dependencies:
> 3c2c648842843 ("mm/page_alloc.c: fix typos in comments")
> 66e8b438bd5c7 ("mm/memblock.c: make the index explicit argument of for_each_memblock_type")
> c9e97a1997fbf ("mm: initialize pages on demand during boot")
>
> v4.9.197: Failed to apply! Possible dependencies:
> 3c2c648842843 ("mm/page_alloc.c: fix typos in comments")
> 66e8b438bd5c7 ("mm/memblock.c: make the index explicit argument of for_each_memblock_type")
> c9e97a1997fbf ("mm: initialize pages on demand during boot")
>
> v4.4.197: Failed to apply! Possible dependencies:
> 0a687aace3b8e ("mm,oom: do not loop !__GFP_FS allocation if the OOM killer is disabled")
> 0caeef63e6d2f ("libnvdimm: Add a poison list and export badblocks")
> 0e749e54244ee ("dax: increase granularity of dax_clear_blocks() operations")
> 34c0fd540e79f ("mm, dax, pmem: introduce pfn_t")
> 3c2c648842843 ("mm/page_alloc.c: fix typos in comments")
> 3da88fb3bacfa ("mm, oom: move GFP_NOFS check to out_of_memory")
> 4b94ffdc4163b ("x86, mm: introduce vmem_altmap to augment vmemmap_populate()")
> 5020e285856cb ("mm, oom: give __GFP_NOFAIL allocations access to memory reserves")
> 52db400fcd502 ("pmem, dax: clean up clear_pmem()")
> 66e8b438bd5c7 ("mm/memblock.c: make the index explicit argument of for_each_memblock_type")
> 7cf91a98e607c ("mm/compaction: speed up pageblock_pfn_to_page() when zone is contiguous")
> 87ba05dff3510 ("libnvdimm: don't fail init for full badblocks list")
> 8c9c1701c7c23 ("mm/memblock: introduce for_each_memblock_type()")
> 9476df7d80dfc ("mm: introduce find_dev_pagemap()")
> ad9a8bde2cb19 ("libnvdimm, pmem: move definition of nvdimm_namespace_add_poison to nd.h")
> b2e0d1625e193 ("dax: fix lifetime of in-kernel dax mappings with dax_map_atomic()")
> b95f5f4391fad ("libnvdimm: convert to statically allocated badblocks")
> ba6c19fd113a3 ("include/linux/memblock.h: Clean up code for several trivial details")
> c9e97a1997fbf ("mm: initialize pages on demand during boot")
>
>
> NOTE: The patch will not be queued to stable trees until it is upstream.
>
> How should we proceed with this patch?
>
What were the 4.19.80 build errors?
For the older kernels, it would have to be confirmed that those kernels
are definitely affected. The test machines I tried fail to even boot on
those kernels, so I need to find a NUMA machine that is old enough to
boot them and is confirmed affected by the bug before determining what
the backport needs to look like.
--
Mel Gorman
SUSE Labs
^ permalink raw reply [flat|nested] 13+ messages in thread