* [PATCH] mm/cgroup: avoid panic when init with low memory
@ 2017-02-15 10:36 ` Laurent Dufour
  0 siblings, 0 replies; 12+ messages in thread
From: Laurent Dufour @ 2017-02-15 10:36 UTC (permalink / raw)
  To: Johannes Weiner, Michal Hocko, Vladimir Davydov
  Cc: cgroups, linux-mm, linux-kernel

The system may panic during initialisation when almost all of the
memory is assigned to huge pages using the kernel command line
parameter hugepages=xxxx. The panic may look like this:

[    0.082289] Unable to handle kernel paging request for data at address 0x00000000
[    0.082338] Faulting instruction address: 0xc000000000302b88
[    0.082377] Oops: Kernel access of bad area, sig: 11 [#1]
[    0.082408] SMP NR_CPUS=2048 [    0.082424] NUMA
[    0.082440] pSeries
[    0.082457] Modules linked in:
[    0.082490] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.9.0-15-generic #16-Ubuntu
[    0.082536] task: c00000021ed01600 task.stack: c00000010d108000
[    0.082575] NIP: c000000000302b88 LR: c000000000270e04 CTR: c00000000016cfd0
[    0.082621] REGS: c00000010d10b2c0 TRAP: 0300   Not tainted (4.9.0-15-generic)
[    0.082666] MSR: 8000000002009033 <SF,VEC,EE,ME,IR,DR,RI,LE>[ 0.082770]   CR: 28424422  XER: 00000000
[    0.082793] CFAR: c0000000003d28b8 DAR: 0000000000000000 DSISR: 40000000 SOFTE: 1
GPR00: c000000000270e04 c00000010d10b540 c00000000141a300 c00000010fff6300
GPR04: 0000000000000000 00000000026012c0 c00000010d10b630 0000000487ab0000
GPR08: 000000010ee90000 c000000001454fd8 0000000000000000 0000000000000000
GPR12: 0000000000004400 c00000000fb80000 00000000026012c0 00000000026012c0
GPR16: 00000000026012c0 0000000000000000 0000000000000000 0000000000000002
GPR20: 000000000000000c 0000000000000000 0000000000000000 00000000024200c0
GPR24: c0000000016eef48 0000000000000000 c00000010fff7d00 00000000026012c0
GPR28: 0000000000000000 c00000010fff7d00 c00000010fff6300 c00000010d10b6d0
NIP [c000000000302b88] mem_cgroup_soft_limit_reclaim+0xf8/0x4f0
[    0.083456] LR [c000000000270e04] do_try_to_free_pages+0x1b4/0x450
[    0.083494] Call Trace:
[    0.083511] [c00000010d10b540] [c00000010d10b640] 0xc00000010d10b640 (unreliable)
[    0.083567] [c00000010d10b610] [c000000000270e04] do_try_to_free_pages+0x1b4/0x450
[    0.083622] [c00000010d10b6b0] [c000000000271198] try_to_free_pages+0xf8/0x270
[    0.083676] [c00000010d10b740] [c000000000259dd8] __alloc_pages_nodemask+0x7a8/0xff0
[    0.083729] [c00000010d10b960] [c0000000002dd274] new_slab+0x104/0x8e0
[    0.083776] [c00000010d10ba40] [c0000000002e03d0] ___slab_alloc+0x620/0x700
[    0.083822] [c00000010d10bb70] [c0000000002e04e4] __slab_alloc+0x34/0x60
[    0.083868] [c00000010d10bba0] [c0000000002e101c] kmem_cache_alloc_node_trace+0xdc/0x310
[    0.083947] [c00000010d10bc00] [c000000000eb8120] mem_cgroup_init+0x158/0x1c8
[    0.083994] [c00000010d10bc40] [c00000000000dde8] do_one_initcall+0x68/0x1d0
[    0.084041] [c00000010d10bd00] [c000000000e84184] kernel_init_freeable+0x278/0x360
[    0.084094] [c00000010d10bdc0] [c00000000000e714] kernel_init+0x24/0x170
[    0.084143] [c00000010d10be30] [c00000000000c0e8] ret_from_kernel_thread+0x5c/0x74
[    0.084195] Instruction dump:
[    0.084220] eb81ffe0 eba1ffe8 ebc1fff0 ebe1fff8 4e800020 3d230001 e9499a42 3d220004
[    0.084300] 3929acd8 794a1f24 7d295214 eac90100 <e9360000> 2fa90000 419eff74 3b200000
[    0.084382] ---[ end trace 342f5208b00d01b6 ]---

This is a chicken-and-egg issue: the kernel tries to free memory
while allocating the per-node data in mem_cgroup_init(), but that
reclaim path calls mem_cgroup_soft_limit_reclaim(), which assumes that
this data is already allocated.

As mem_cgroup_soft_limit_reclaim() is best effort, it should simply
return when the data is not yet allocated.

Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
---
 mm/memcontrol.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 1fd6affcdde7..213f96b2f601 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2556,7 +2556,7 @@ unsigned long mem_cgroup_soft_limit_reclaim(pg_data_t *pgdat, int order,
 	 * is empty. Do it lockless to prevent lock bouncing. Races
 	 * are acceptable as soft limit is best effort anyway.
 	 */
-	if (RB_EMPTY_ROOT(&mctz->rb_root))
+	if (!mctz || RB_EMPTY_ROOT(&mctz->rb_root))
 		return 0;
 
 	/*
-- 
2.7.4
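
For illustration only, the guard added by the patch can be modelled in
user space. This is a minimal sketch, not the kernel code: the names
model_trees, model_tree_node(), model_init_node() and
model_soft_limit_reclaim() are hypothetical stand-ins, and the rb-tree
is reduced to a single pointer. It assumes the per-node lookup returns
NULL until initialisation has allocated the data, which is consistent
with the fault at address 0x00000000 in the oops above.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the per-node soft limit tree. */
struct soft_tree_model {
	void *rb_root;	/* NULL models an empty tree */
};

#define MODEL_MAX_NODES 4

/* Per-node pointers stay NULL until model_init_node() fills them in. */
static struct soft_tree_model *model_trees[MODEL_MAX_NODES];

/* Models the per-node lookup: NULL before the data has been allocated. */
static struct soft_tree_model *model_tree_node(int nid)
{
	return model_trees[nid];
}

/* Models the patched check: bail out on a missing or an empty tree. */
static unsigned long model_soft_limit_reclaim(int nid)
{
	struct soft_tree_model *mctz = model_tree_node(nid);

	if (!mctz || !mctz->rb_root)
		return 0;

	return 1;	/* pretend that one page was reclaimed */
}

/* Models the allocation step that the reclaim path could run before. */
static int model_init_node(int nid)
{
	model_trees[nid] = calloc(1, sizeof(*model_trees[nid]));
	return model_trees[nid] ? 0 : -1;
}
```

Without the !mctz test, calling model_soft_limit_reclaim() before
model_init_node() would dereference a NULL pointer, which is the
situation the patch guards against.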

* Re: [PATCH] mm/cgroup: avoid panic when init with low memory
  2017-02-15 10:36 ` Laurent Dufour
  (?)
@ 2017-02-20 13:01   ` Michal Hocko
  -1 siblings, 0 replies; 12+ messages in thread
From: Michal Hocko @ 2017-02-20 13:01 UTC (permalink / raw)
  To: Laurent Dufour
  Cc: Johannes Weiner, Vladimir Davydov, cgroups, linux-mm, linux-kernel

On Wed 15-02-17 11:36:09, Laurent Dufour wrote:
> The system may panic when initialisation is done when almost all the
> memory is assigned to the huge pages using the kernel command line
> parameter hugepage=xxxx. Panic may occur like this:

I am pretty sure the system might blow up in many other ways when you
misconfigure it and pull basically all the memory out. Anyway...

[...]
 
> This is a chicken and egg issue where the kernel try to get free
> memory when allocating per node data in mem_cgroup_init(), but in that
> path mem_cgroup_soft_limit_reclaim() is called which assumes that
> these data are allocated.
> 
> As mem_cgroup_soft_limit_reclaim() is best effort, it should return
> when these data are not yet allocated.

... this makes some sense, especially when there is no soft limit
configured. So this is a good step. I would just like to ask you to go
one step further. Can we leave the whole soft reclaim thing uninitialized
until the soft limit is actually set? Soft limit is not used in cgroup
v2 at all and I would strongly discourage it in v1 as well. We will save
a few bytes as a bonus.
 
> Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
> ---
>  mm/memcontrol.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 1fd6affcdde7..213f96b2f601 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -2556,7 +2556,7 @@ unsigned long mem_cgroup_soft_limit_reclaim(pg_data_t *pgdat, int order,
>  	 * is empty. Do it lockless to prevent lock bouncing. Races
>  	 * are acceptable as soft limit is best effort anyway.
>  	 */
> -	if (RB_EMPTY_ROOT(&mctz->rb_root))
> +	if (!mctz || RB_EMPTY_ROOT(&mctz->rb_root))
>  		return 0;
>  
>  	/*
> -- 
> 2.7.4

-- 
Michal Hocko
SUSE Labs

* Re: [PATCH] mm/cgroup: avoid panic when init with low memory
  2017-02-20 13:01   ` Michal Hocko
@ 2017-02-20 17:09     ` Laurent Dufour
  -1 siblings, 0 replies; 12+ messages in thread
From: Laurent Dufour @ 2017-02-20 17:09 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Johannes Weiner, Vladimir Davydov, cgroups, linux-mm, linux-kernel

On 20/02/2017 14:01, Michal Hocko wrote:
> On Wed 15-02-17 11:36:09, Laurent Dufour wrote:
>> The system may panic when initialisation is done when almost all the
>> memory is assigned to the huge pages using the kernel command line
>> parameter hugepage=xxxx. Panic may occur like this:
> 
> I am pretty sure the system might blow up in many other ways when you
> misconfigure it and pull basically all the memory out. Anyway...
> 
> [...]
> 
>> This is a chicken and egg issue where the kernel try to get free
>> memory when allocating per node data in mem_cgroup_init(), but in that
>> path mem_cgroup_soft_limit_reclaim() is called which assumes that
>> these data are allocated.
>>
>> As mem_cgroup_soft_limit_reclaim() is best effort, it should return
>> when these data are not yet allocated.
> 
> ... this makes some sense. Especially when there is no soft limit
> configured. So this is a good step. I would just like to ask you to go
> one step further. Can we make the whole soft reclaim thing uninitialized
> until the soft limit is actually set? Soft limit is not used in cgroup
> v2 at all and I would strongly discourage it in v1 as well. We will save
> few bytes as a bonus.

Hi Michal, and thanks for the review.

I'm not familiar with that part of the kernel, so to be sure we are on
the same page: are you suggesting that soft_limit_tree be set up the first
time mem_cgroup_write() is called to set a soft limit?

Obviously, all callers of soft_limit_tree_node() and
soft_limit_tree_from_page() will then have to check whether the returned
pointer is NULL.

Cheers,
Laurent.


>> Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
>> ---
>>  mm/memcontrol.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
>> index 1fd6affcdde7..213f96b2f601 100644
>> --- a/mm/memcontrol.c
>> +++ b/mm/memcontrol.c
>> @@ -2556,7 +2556,7 @@ unsigned long mem_cgroup_soft_limit_reclaim(pg_data_t *pgdat, int order,
>>  	 * is empty. Do it lockless to prevent lock bouncing. Races
>>  	 * are acceptable as soft limit is best effort anyway.
>>  	 */
>> -	if (RB_EMPTY_ROOT(&mctz->rb_root))
>> +	if (!mctz || RB_EMPTY_ROOT(&mctz->rb_root))
>>  		return 0;
>>  
>>  	/*
>> -- 
>> 2.7.4
> 

* Re: [PATCH] mm/cgroup: avoid panic when init with low memory
  2017-02-20 17:09     ` Laurent Dufour
@ 2017-02-20 17:42       ` Michal Hocko
  -1 siblings, 0 replies; 12+ messages in thread
From: Michal Hocko @ 2017-02-20 17:42 UTC (permalink / raw)
  To: Laurent Dufour
  Cc: Johannes Weiner, Vladimir Davydov, cgroups, linux-mm, linux-kernel

On Mon 20-02-17 18:09:43, Laurent Dufour wrote:
> On 20/02/2017 14:01, Michal Hocko wrote:
> > On Wed 15-02-17 11:36:09, Laurent Dufour wrote:
> >> The system may panic when initialisation is done when almost all the
> >> memory is assigned to the huge pages using the kernel command line
> >> parameter hugepage=xxxx. Panic may occur like this:
> > 
> > I am pretty sure the system might blow up in many other ways when you
> > misconfigure it and pull basically all the memory out. Anyway...
> > 
> > [...]
> > 
> >> This is a chicken and egg issue where the kernel try to get free
> >> memory when allocating per node data in mem_cgroup_init(), but in that
> >> path mem_cgroup_soft_limit_reclaim() is called which assumes that
> >> these data are allocated.
> >>
> >> As mem_cgroup_soft_limit_reclaim() is best effort, it should return
> >> when these data are not yet allocated.
> > 
> > ... this makes some sense. Especially when there is no soft limit
> > configured. So this is a good step. I would just like to ask you to go
> > one step further. Can we make the whole soft reclaim thing uninitialized
> > until the soft limit is actually set? Soft limit is not used in cgroup
> > v2 at all and I would strongly discourage it in v1 as well. We will save
> > few bytes as a bonus.
> 
> Hi Michal, and thanks for the review.
> 
> I'm not familiar with that part of the kernel, so to be sure we are on
> the same line, are you suggesting to set soft_limit_tree at the first
> time mem_cgroup_write() is called to set a soft_limit field ?

yes

> Obviously, all callers to soft_limit_tree_node() and
> soft_limit_tree_from_page() will have to check for the return pointer to
> be NULL.

All callers that need to access the tree unconditionally, yes. Which is
the case anyway, right? I haven't checked whether the check you have added
is sufficient, but we shouldn't have that many of them because some code
paths are called only when the soft limit is enabled.
-- 
Michal Hocko
SUSE Labs
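
For illustration only, the deferred set-up discussed above might look
roughly like the following user-space sketch. It is an assumption about
the shape of the idea, not the follow-up patch: lazy_trees,
lazy_tree_node(), lazy_set_soft_limit() and lazy_reclaim() are
hypothetical names, and the tree is reduced to a counter.

```c
#include <stddef.h>
#include <stdlib.h>

#define LAZY_NODES 2

/* Hypothetical per-node tree, reduced to a count of tracked groups. */
struct lazy_tree {
	unsigned long nr_entries;
};

/* Nothing is allocated until a soft limit is first written. */
static struct lazy_tree *lazy_trees[LAZY_NODES];
static unsigned long lazy_soft_limit;

/* May return NULL: callers that reach it unconditionally must check. */
static struct lazy_tree *lazy_tree_node(int nid)
{
	return lazy_trees[nid];
}

/* Models the write path: allocate the per-node trees on first use. */
static int lazy_set_soft_limit(unsigned long limit)
{
	int nid;

	for (nid = 0; nid < LAZY_NODES; nid++) {
		if (lazy_trees[nid])
			continue;
		lazy_trees[nid] = calloc(1, sizeof(*lazy_trees[nid]));
		if (!lazy_trees[nid])
			return -1;	/* models an allocation failure */
	}
	lazy_soft_limit = limit;
	return 0;
}

/* Models a reclaim caller: a missing or an empty tree means no work. */
static unsigned long lazy_reclaim(int nid)
{
	struct lazy_tree *tree = lazy_tree_node(nid);

	if (!tree || !tree->nr_entries)
		return 0;

	return tree->nr_entries;
}
```

In this model a system that never sets a soft limit never allocates the
trees, and only the paths that can run before the first write need the
NULL check.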

* Re: [PATCH] mm/cgroup: avoid panic when init with low memory
  2017-02-20 17:42       ` Michal Hocko
@ 2017-02-22 14:02         ` Laurent Dufour
  -1 siblings, 0 replies; 12+ messages in thread
From: Laurent Dufour @ 2017-02-22 14:02 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Johannes Weiner, Vladimir Davydov, cgroups, linux-mm, linux-kernel

On 20/02/2017 18:42, Michal Hocko wrote:
> On Mon 20-02-17 18:09:43, Laurent Dufour wrote:
>> On 20/02/2017 14:01, Michal Hocko wrote:
>>> On Wed 15-02-17 11:36:09, Laurent Dufour wrote:
>>>> The system may panic when initialisation is done when almost all the
>>>> memory is assigned to the huge pages using the kernel command line
>>>> parameter hugepage=xxxx. Panic may occur like this:
>>>
>>> I am pretty sure the system might blow up in many other ways when you
>>> misconfigure it and pull basically all the memory out. Anyway...
>>>
>>> [...]
>>>
>>>> This is a chicken and egg issue where the kernel tries to get free
>>>> memory when allocating per node data in mem_cgroup_init(), but in that
>>>> path mem_cgroup_soft_limit_reclaim() is called, which assumes that
>>>> these data are allocated.
>>>>
>>>> As mem_cgroup_soft_limit_reclaim() is best effort, it should return
>>>> when these data are not yet allocated.
>>>
>>> ... this makes some sense. Especially when there is no soft limit
>>> configured. So this is a good step. I would just like to ask you to go
>>> one step further. Can we make the whole soft reclaim thing uninitialized
>>> until the soft limit is actually set? Soft limit is not used in cgroup
>>> v2 at all and I would strongly discourage it in v1 as well. We will save
>>> a few bytes as a bonus.
>>
>> Hi Michal, and thanks for the review.
>>
>> I'm not familiar with that part of the kernel, so to be sure we are on
>> the same page, are you suggesting to set up soft_limit_tree the first
>> time mem_cgroup_write() is called to set a soft_limit field?
> 
> yes
> 
>> Obviously, all callers to soft_limit_tree_node() and
>> soft_limit_tree_from_page() will have to check for the return pointer to
>> be NULL.
> 
> All callers that need to access the tree unconditionally, yes. Which is
> the case anyway, right? I haven't checked whether the check you have added
> is sufficient, but we shouldn't have that many of them because some code
> paths are called only when the soft limit is enabled.

You're right, there are not that many callers to fix.
I'll send a new series containing the previous patch, which fixes the
initial panic, and another one delaying the data allocation.
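
As a rough illustration of what delaying the data allocation could look
like, here is a userspace sketch, not the actual patch. The names
soft_limit_tree_init_once(), set_soft_limit(), struct tree_per_node and
MAX_NODES are hypothetical, calloc() stands in for the kernel allocator,
and the sketch is single-threaded (concurrent writers would need some
form of serialisation). The per-node data is allocated only when a soft
limit is first written, and the write fails cleanly if the allocation
does.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

#define MAX_NODES 4 /* hypothetical node count for this sketch */

/* Hypothetical stand-in for the per-node soft limit tree. */
struct tree_per_node {
	unsigned long nr_over_limit;
};

static struct tree_per_node *soft_limit_tree[MAX_NODES];
static int soft_limit_tree_ready;

/* Allocate all per-node trees on first use; all or nothing. */
static int soft_limit_tree_init_once(void)
{
	struct tree_per_node *tmp[MAX_NODES];
	int nid;

	if (soft_limit_tree_ready)
		return 0;

	for (nid = 0; nid < MAX_NODES; nid++) {
		tmp[nid] = calloc(1, sizeof(*tmp[nid]));
		if (!tmp[nid]) {
			/* Undo the partial allocation before failing. */
			while (nid-- > 0)
				free(tmp[nid]);
			return -ENOMEM;
		}
	}

	/* Publish the trees only once every node has one. */
	for (nid = 0; nid < MAX_NODES; nid++)
		soft_limit_tree[nid] = tmp[nid];
	soft_limit_tree_ready = 1;
	return 0;
}

/* Hypothetical write handler: the first soft limit write sets up the trees. */
static int set_soft_limit(unsigned long *soft_limit, unsigned long value)
{
	int ret;

	assert(soft_limit != NULL);

	ret = soft_limit_tree_init_once();
	if (ret)
		return ret;

	*soft_limit = value;
	return 0;
}
```

Until set_soft_limit() has been called once, every soft_limit_tree[]
entry stays NULL, so a setup that never configures a soft limit pays
nothing for the trees.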

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2017-02-22 14:03 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-02-15 10:36 [PATCH] mm/cgroup: avoid panic when init with low memory Laurent Dufour
2017-02-15 10:36 ` Laurent Dufour
2017-02-15 10:36 ` Laurent Dufour
2017-02-20 13:01 ` Michal Hocko
2017-02-20 13:01   ` Michal Hocko
2017-02-20 13:01   ` Michal Hocko
2017-02-20 17:09   ` Laurent Dufour
2017-02-20 17:09     ` Laurent Dufour
2017-02-20 17:42     ` Michal Hocko
2017-02-20 17:42       ` Michal Hocko
2017-02-22 14:02       ` Laurent Dufour
2017-02-22 14:02         ` Laurent Dufour
