* Frequent oom introduced in mainline when migrate_highatomic replace migrate_reserve
@ 2019-06-24  4:43 zhong jiang
  2019-06-24  8:10 ` Michal Hocko
  0 siblings, 1 reply; 9+ messages in thread
From: zhong jiang @ 2019-06-24  4:43 UTC (permalink / raw)
  To: Michal Hocko, Andrea Arcangeli, Hugh Dickins, Minchan Kim,
	Vlastimil Babka
  Cc: Linux Memory Management List, Wangkefeng (Kevin)

Recently, I hit a frequent OOM issue in linux-4.4 stable with less than 4MB of free memory after
the machine boots up.

When a process is created, the kernel stack is allocated as a higher-order block of contiguous
memory. Due to fragmentation, we fail to allocate it, and with so little free memory compaction
can hardly make progress, so the OOM is easy to reproduce.
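
As a quick way to see this on a running system, a small userspace sketch (for illustration
only, not something from our setup) can parse /proc/buddyinfo and print how many free blocks of
order 3 or higher each zone still has:

/*
 * Print, per zone, how many free blocks of order >= 3 are left.
 * /proc/buddyinfo lists, for each zone, the number of free blocks at
 * every order.  If the Normal zone shows zero here, an order-3 kernel
 * stack cannot be carved out without reclaim or compaction.
 */
#include <stdio.h>

int main(void)
{
        char line[512];
        FILE *f = fopen("/proc/buddyinfo", "r");

        if (!f) {
                perror("/proc/buddyinfo");
                return 1;
        }

        while (fgets(line, sizeof(line), f)) {
                char zone[32];
                int node, n, order = 0;
                long count, order3_plus = 0;
                char *p = line;

                if (sscanf(p, "Node %d, zone %31s%n", &node, zone, &n) != 2)
                        continue;
                p += n;
                while (sscanf(p, "%ld%n", &count, &n) == 1) {
                        if (order >= 3)
                                order3_plus += count;
                        order++;
                        p += n;
                }
                printf("node %d, zone %-8s: %ld free blocks of order >= 3\n",
                       node, zone, order3_plus);
        }

        fclose(f);
        return 0;
}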

But if we use migrate_reserve to reserve at least one pageblock at boot, we can use the
reserved memory to allocate contiguous memory for processes when the system is severely
fragmented.

In my opinion, reserved memory relieves the pressure effectively, at least on small-memory machines.

Any ideas? Thanks

 


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Frequent oom introduced in mainline when migrate_highatomic replace migrate_reserve
  2019-06-24  4:43 Frequent oom introduced in mainline when migrate_highatomic replace migrate_reserve zhong jiang
@ 2019-06-24  8:10 ` Michal Hocko
  2019-06-24 13:11   ` zhong jiang
  0 siblings, 1 reply; 9+ messages in thread
From: Michal Hocko @ 2019-06-24  8:10 UTC (permalink / raw)
  To: zhong jiang
  Cc: Andrea Arcangeli, Hugh Dickins, Minchan Kim, Vlastimil Babka,
	Linux Memory Management List, Wangkefeng (Kevin)

On Mon 24-06-19 12:43:26, zhong jiang wrote:
> Recently, I hit a frequent OOM issue in linux-4.4 stable with less than 4MB of free memory after
> the machine boots up.

Is this a regression? Could you share the oom report?

> When a process is created, the kernel stack is allocated as a higher-order block of contiguous
> memory. Due to fragmentation, we fail to allocate it, and with so little free memory compaction
> can hardly make progress, so the OOM is easy to reproduce.

How did you get such large fragmentation that you cannot allocate
order-1 pages and compaction is not making any progress?

> But if we use migrate_reserve to reserve at least one pageblock at boot, we can use the
> reserved memory to allocate contiguous memory for processes when the system is severely
> fragmented.

Well, any reservation is a finite resource so I am not sure how that can
help universally. But your description is quite vague. Could you be more
specific about that workload? Also do you see the same with the current
upstream kernel as well?
-- 
Michal Hocko
SUSE Labs


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Frequent oom introduced in mainline when migrate_highatomic replace migrate_reserve
  2019-06-24  8:10 ` Michal Hocko
@ 2019-06-24 13:11   ` zhong jiang
  2019-06-24 14:01     ` Michal Hocko
  0 siblings, 1 reply; 9+ messages in thread
From: zhong jiang @ 2019-06-24 13:11 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Andrea Arcangeli, Hugh Dickins, Minchan Kim, Vlastimil Babka,
	Linux Memory Management List, Wangkefeng (Kevin)

On 2019/6/24 16:10, Michal Hocko wrote:
> On Mon 24-06-19 12:43:26, zhong jiang wrote:
>> Recently, I hit a frequent OOM issue in linux-4.4 stable with less than 4MB of free memory after
>> the machine boots up.
> Is this a regression? Could you share the oom report?
Yep, at least on the small-memory machine. I have tested that reverting migrate_highatomic works well.
The OOM report is as follows.

[  652.272622] sh invoked oom-killer: gfp_mask=0x26080c0, order=3, oom_score_adj=0
[  652.272683] CPU: 0 PID: 1748 Comm: sh Tainted: P           O    4.4.171 #8
[  652.345605] Hardware name: Qualcomm (Flattened Device Tree)
[  652.428968] [<c02149d0>] (unwind_backtrace) from [<c02125a4>] (show_stack+0x10/0x14)
[  652.494604] [<c02125a4>] (show_stack) from [<c037fb08>] (dump_stack+0xa0/0xd8)
[  652.590432] [<c037fb08>] (dump_stack) from [<c02bdf58>] (dump_header.constprop.6+0x40/0x15c)
[  652.674793] [<c02bdf58>] (dump_header.constprop.6) from [<c0287504>] (oom_kill_process+0xc4/0x434)
[  652.777934] [<c0287504>] (oom_kill_process) from [<c0287b5c>] (out_of_memory+0x284/0x318)
[  652.883120] [<c0287b5c>] (out_of_memory) from [<c028b990>] (__alloc_pages_nodemask+0x90c/0x9b4)
[  652.982080] [<c028b990>] (__alloc_pages_nodemask) from [<c021aa94>] (copy_process.part.2+0xe4/0x12f0)
[  653.084160] [<c021aa94>] (copy_process.part.2) from [<c021bdf8>] (_do_fork+0xb8/0x2d4)
[  653.196678] [<c021bdf8>] (_do_fork) from [<c021c0d0>] (SyS_clone+0x1c/0x24)
[  653.290424] [<c021c0d0>] (SyS_clone) from [<c020f480>] (ret_fast_syscall+0x0/0x4c)
[  653.452827] Mem-Info:
[  653.466390] active_anon:20377 inactive_anon:187 isolated_anon:0
[  653.466390]  active_file:5087 inactive_file:4825 isolated_file:0
[  653.466390]  unevictable:12 dirty:0 writeback:32 unstable:0
[  653.466390]  slab_reclaimable:636 slab_unreclaimable:1754
[  653.466390]  mapped:5338 shmem:194 pagetables:231 bounce:0
[  653.466390]  free:1086 free_pcp:85 free_cma:0
[  653.625286] Normal free:4248kB min:1696kB low:2120kB high:2544kB active_anon:81508kB inactive_anon:748kB active_file:20348kB inactive_file:19300kB unevictable:48kB isolated(anon):0kB isolated(file):0kB present:252928kB managed:180496kB mlocked:0kB dirty:0kB writeback:128kB mapped:21352kB shmem:776kB slab_reclaimable:2544kB slab_unreclaimable:7016kB kernel_stack:9856kB pagetables:924kB unstable:0kB bounce:0kB free_pcp:392kB local_pcp:392kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
[  654.177121] lowmem_reserve[]: 0 0 0
[  654.462015] Normal: 752*4kB (UME) 128*8kB (UM) 21*16kB (M) 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 4368kB
[  654.601093] 10132 total pagecache pages
[  654.606655] 63232 pages RAM
[  654.656658] 0 pages HighMem/MovableOnly
[  654.686821] 18108 pages reserved
[  654.731549] [ pid ]   uid  tgid total_vm      rss nr_ptes nr_pmds swapents oom_score_adj name
[  654.775019] [  108]     0   108      814       11       6       0        0             0 rcS
[  654.877730] [  113]     0   113      814      125       6       0        0             0 sh
[  654.978510] [  116]     0   116      485       16       5       0        0             0 fprefetch
[  655.083679] [  272]     0   272     1579       83       7       0        0         -1000 sshd
[  655.187516] [  276]     0   276     7591      518      10       0        0             0 redis-server
[  655.287893] [  282]     0   282     3495      364       8       0        0             0 callhome
[  655.406249] [  284]     0   284     1166      322       6       0        0             0 remote_plugin
[  655.524594] [  292]     0   292      523      113       6       0        0           -17 monitor
[  655.616477] [  293]     0   293    17958      609      39       0        0             0 cap32
[  655.724315] [  296]     0   296     2757     1106      10       0        0             0 confd
[  655.823061] [  297]     0   297    60183    20757     112       0        0             0 vos.o
[  655.952344] [ 1748]     0  1748      814       92       6       0        0             0 sh
......[  656.241027] *****************Start oom extend info.*****************
>> When a process is created, the kernel stack is allocated as a higher-order block of contiguous
>> memory. Due to fragmentation, we fail to allocate it, and with so little free memory compaction
>> can hardly make progress, so the OOM is easy to reproduce.
> How did you get such large fragmentation that you cannot allocate
> order-1 pages and compaction is not making any progress?
From the above OOM report, we can see that there are no free pages above order-2. It is hardly
possible to allocate the kernel stack when creating a process, and we can easily reproduce the
situation by running some userspace programs.

But it rarely triggers the OOM killer when migrate_highatomic is not present; we tested that on kernel 3.10.
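
Just to make that concrete, the counts from the quoted buddy list can be summed mechanically
(the numbers below are copied from the report above, and the program is only doing arithmetic,
not modelling the allocator):

/*
 * Per-order free block counts from the "Normal:" line of the OOM report:
 * 752*4kB + 128*8kB + 21*16kB and nothing at order 3 or above.
 */
#include <stdio.h>

int main(void)
{
        long count[11] = { 752, 128, 21, 0, 0, 0, 0, 0, 0, 0, 0 };
        long total_kb = 0, order3_plus = 0;

        for (int order = 0; order < 11; order++) {
                total_kb += count[order] * (4L << order);
                if (order >= 3)
                        order3_plus += count[order];
        }

        /* prints: free: 4368kB, blocks usable for an order-3 stack: 0 */
        printf("free: %ldkB, blocks usable for an order-3 stack: %ld\n",
               total_kb, order3_plus);
        return 0;
}
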
>> But if we use migrate_reserve to reserve at least one pageblock at boot, we can use the
>> reserved memory to allocate contiguous memory for processes when the system is severely
>> fragmented.
> Well, any reservation is a finite resource so I am not sure how that can
> help universally. But your description is quite vague. Could you be more
> specific about that workload? Also do you see the same with the current
> upstream kernel as well?
I just compared kernel 3.10 with kernel 4.4 in the same situation; I have not tested the issue on the upstream kernel.
I also checked /proc/pagetypeinfo after the system boots up.

migrate_highatomic is always zero, while migrate_reserve has a pageblock that it can use for high-order allocations.
Even though going back to migrate_reserve cannot fully solve the issue, in fact it did relieve the OOM situation.
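
In case someone wants to repeat the check, a minimal sketch (assuming the usual migratetype
names printed by /proc/pagetypeinfo) that dumps only the HighAtomic rows, plus the old Reserve
rows on a 3.10 kernel:

/*
 * Show only the HighAtomic (4.4+) or Reserve (3.10-era) lines of
 * /proc/pagetypeinfo, i.e. the per-order free page counts for that
 * migratetype.
 */
#include <stdio.h>
#include <string.h>

int main(void)
{
        char line[1024];
        FILE *f = fopen("/proc/pagetypeinfo", "r");

        if (!f) {
                perror("/proc/pagetypeinfo");
                return 1;
        }

        while (fgets(line, sizeof(line), f))
                if (strstr(line, "HighAtomic") || strstr(line, "Reserve"))
                        fputs(line, stdout);

        fclose(f);
        return 0;
}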

Thanks,
zhong jiang




^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Frequent oom introduced in mainline when migrate_highatomic replace migrate_reserve
  2019-06-24 13:11   ` zhong jiang
@ 2019-06-24 14:01     ` Michal Hocko
  2019-06-24 15:28       ` zhong jiang
  2019-06-24 16:47       ` zhong jiang
  0 siblings, 2 replies; 9+ messages in thread
From: Michal Hocko @ 2019-06-24 14:01 UTC (permalink / raw)
  To: zhong jiang
  Cc: Andrea Arcangeli, Hugh Dickins, Minchan Kim, Vlastimil Babka,
	Linux Memory Management List, Wangkefeng (Kevin)

On Mon 24-06-19 21:11:55, zhong jiang wrote:
> [  652.272622] sh invoked oom-killer: gfp_mask=0x26080c0, order=3, oom_score_adj=0
> [  652.272683] CPU: 0 PID: 1748 Comm: sh Tainted: P           O    4.4.171 #8
> [  653.452827] Mem-Info:
> [  653.466390] active_anon:20377 inactive_anon:187 isolated_anon:0
> [  653.466390]  active_file:5087 inactive_file:4825 isolated_file:0
> [  653.466390]  unevictable:12 dirty:0 writeback:32 unstable:0
> [  653.466390]  slab_reclaimable:636 slab_unreclaimable:1754
> [  653.466390]  mapped:5338 shmem:194 pagetables:231 bounce:0
> [  653.466390]  free:1086 free_pcp:85 free_cma:0
> [  653.625286] Normal free:4248kB min:1696kB low:2120kB high:2544kB active_anon:81508kB inactive_anon:748kB active_file:20348kB inactive_file:19300kB unevictable:48kB isolated(anon):0kB isolated(file):0kB present:252928kB managed:180496kB mlocked:0kB dirty:0kB writeback:128kB mapped:21352kB shmem:776kB slab_reclaimable:2544kB slab_unreclaimable:7016kB kernel_stack:9856kB pagetables:924kB unstable:0kB bounce:0kB free_pcp:392kB local_pcp:392kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
> [  654.177121] lowmem_reserve[]: 0 0 0
> [  654.462015] Normal: 752*4kB (UME) 128*8kB (UM) 21*16kB (M) 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 4368kB
> [  654.601093] 10132 total pagecache pages
> [  654.606655] 63232 pages RAM
[...]
> >> When a process is created, the kernel stack is allocated as a higher-order block of contiguous
> >> memory. Due to fragmentation, we fail to allocate it, and with so little free memory compaction
> >> can hardly make progress, so the OOM is easy to reproduce.
> > How did you get such large fragmentation that you cannot allocate
> > order-1 pages and compaction is not making any progress?
> From the above OOM report, we can see that there are no free pages above order-2. It is hardly
> possible to allocate the kernel stack when creating a process, and we can easily reproduce the
> situation by running some userspace programs.
> 
> But it rarely triggers the OOM killer when migrate_highatomic is not present; we tested that on kernel 3.10.

I do not really see how highatomic reserves could make any difference.
We do drain them before OOM killer is invoked. The above oom report
confirms that there is indeed no order-3+ free page to be used.

It is hard to tell whether compaction has done all it could, but there have been many changes
in this area since 4.4, so I would be really curious about the current upstream kernel
behavior. I would also note that relying on order-3 allocations is far from optimal. I am not
sure what exactly copy_process.part.2+0xe4 refers to, but if this is really a stack allocation
then I would consider such a large stack really dangerous for a small system.
-- 
Michal Hocko
SUSE Labs


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Frequent oom introduced in mainline when migrate_highatomic replace migrate_reserve
  2019-06-24 14:01     ` Michal Hocko
@ 2019-06-24 15:28       ` zhong jiang
  2019-06-24 16:47       ` zhong jiang
  1 sibling, 0 replies; 9+ messages in thread
From: zhong jiang @ 2019-06-24 15:28 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Andrea Arcangeli, Hugh Dickins, Minchan Kim, Vlastimil Babka,
	Linux Memory Management List, Wangkefeng (Kevin)

On 2019/6/24 22:01, Michal Hocko wrote:
> On Mon 24-06-19 21:11:55, zhong jiang wrote:
>> [  652.272622] sh invoked oom-killer: gfp_mask=0x26080c0, order=3, oom_score_adj=0
>> [  652.272683] CPU: 0 PID: 1748 Comm: sh Tainted: P           O    4.4.171 #8
>> [  653.452827] Mem-Info:
>> [  653.466390] active_anon:20377 inactive_anon:187 isolated_anon:0
>> [  653.466390]  active_file:5087 inactive_file:4825 isolated_file:0
>> [  653.466390]  unevictable:12 dirty:0 writeback:32 unstable:0
>> [  653.466390]  slab_reclaimable:636 slab_unreclaimable:1754
>> [  653.466390]  mapped:5338 shmem:194 pagetables:231 bounce:0
>> [  653.466390]  free:1086 free_pcp:85 free_cma:0
>> [  653.625286] Normal free:4248kB min:1696kB low:2120kB high:2544kB active_anon:81508kB inactive_anon:748kB active_file:20348kB inactive_file:19300kB unevictable:48kB isolated(anon):0kB isolated(file):0kB present:252928kB managed:180496kB mlocked:0kB dirty:0kB writeback:128kB mapped:21352kB shmem:776kB slab_reclaimable:2544kB slab_unreclaimable:7016kB kernel_stack:9856kB pagetables:924kB unstable:0kB bounce:0kB free_pcp:392kB local_pcp:392kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
>> [  654.177121] lowmem_reserve[]: 0 0 0
>> [  654.462015] Normal: 752*4kB (UME) 128*8kB (UM) 21*16kB (M) 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 4368kB
>> [  654.601093] 10132 total pagecache pages
>> [  654.606655] 63232 pages RAM
> [...]
>>>> When a process is created, the kernel stack is allocated as a higher-order block of contiguous
>>>> memory. Due to fragmentation, we fail to allocate it, and with so little free memory compaction
>>>> can hardly make progress, so the OOM is easy to reproduce.
>>> How did you get such large fragmentation that you cannot allocate
>>> order-1 pages and compaction is not making any progress?
>> From the above OOM report, we can see that there are no free pages above order-2. It is hardly
>> possible to allocate the kernel stack when creating a process, and we can easily reproduce the
>> situation by running some userspace programs.
>>
>> But it rarely triggers the OOM killer when migrate_highatomic is not present; we tested that on kernel 3.10.
> I do not really see how highatomic reserves could make any difference.
> We do drain them before OOM killer is invoked. The above oom report
> confirms that there is indeed no order-3+ free page to be used.
Unfortunately, migrate_highatomic is always zero, hence it will not
work in this situation.

Thanks,
zhongjiang
> It is hard to tell whether compaction has done all it could, but there have been many changes
> in this area since 4.4, so I would be really curious about the current upstream kernel
> behavior. I would also note that relying on order-3 allocations is far from optimal. I am not
> sure what exactly copy_process.part.2+0xe4 refers to, but if this is really a stack allocation
> then I would consider such a large stack really dangerous for a small system.



^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Frequent oom introduced in mainline when migrate_highatomic replace migrate_reserve
  2019-06-24 14:01     ` Michal Hocko
  2019-06-24 15:28       ` zhong jiang
@ 2019-06-24 16:47       ` zhong jiang
  2019-06-24 17:54         ` Michal Hocko
  1 sibling, 1 reply; 9+ messages in thread
From: zhong jiang @ 2019-06-24 16:47 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Andrea Arcangeli, Hugh Dickins, Minchan Kim, Vlastimil Babka,
	Linux Memory Management List, Wangkefeng (Kevin)

On 2019/6/24 22:01, Michal Hocko wrote:
> On Mon 24-06-19 21:11:55, zhong jiang wrote:
>> [  652.272622] sh invoked oom-killer: gfp_mask=0x26080c0, order=3, oom_score_adj=0
>> [  652.272683] CPU: 0 PID: 1748 Comm: sh Tainted: P           O    4.4.171 #8
>> [  653.452827] Mem-Info:
>> [  653.466390] active_anon:20377 inactive_anon:187 isolated_anon:0
>> [  653.466390]  active_file:5087 inactive_file:4825 isolated_file:0
>> [  653.466390]  unevictable:12 dirty:0 writeback:32 unstable:0
>> [  653.466390]  slab_reclaimable:636 slab_unreclaimable:1754
>> [  653.466390]  mapped:5338 shmem:194 pagetables:231 bounce:0
>> [  653.466390]  free:1086 free_pcp:85 free_cma:0
>> [  653.625286] Normal free:4248kB min:1696kB low:2120kB high:2544kB active_anon:81508kB inactive_anon:748kB active_file:20348kB inactive_file:19300kB unevictable:48kB isolated(anon):0kB isolated(file):0kB present:252928kB managed:180496kB mlocked:0kB dirty:0kB writeback:128kB mapped:21352kB shmem:776kB slab_reclaimable:2544kB slab_unreclaimable:7016kB kernel_stack:9856kB pagetables:924kB unstable:0kB bounce:0kB free_pcp:392kB local_pcp:392kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
>> [  654.177121] lowmem_reserve[]: 0 0 0
>> [  654.462015] Normal: 752*4kB (UME) 128*8kB (UM) 21*16kB (M) 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 4368kB
>> [  654.601093] 10132 total pagecache pages
>> [  654.606655] 63232 pages RAM
> [...]
>>>> When a process is created, the kernel stack is allocated as a higher-order block of contiguous
>>>> memory. Due to fragmentation, we fail to allocate it, and with so little free memory compaction
>>>> can hardly make progress, so the OOM is easy to reproduce.
>>> How did you get such large fragmentation that you cannot allocate
>>> order-1 pages and compaction is not making any progress?
>> From the above OOM report, we can see that there are no free pages above order-2. It is hardly
>> possible to allocate the kernel stack when creating a process, and we can easily reproduce the
>> situation by running some userspace programs.
>>
>> But it rarely triggers the OOM killer when migrate_highatomic is not present; we tested that on kernel 3.10.
> I do not really see how highatomic reserves could make any difference.
> We do drain them before OOM killer is invoked. The above oom report
> confirms that there is indeed no order-3+ free page to be used.
I mean that migrate_highatomic is always zero at every order; it may well be that
we cannot reserve high-order memory if we never allocate with GFP_ATOMIC.

Thoughts?

Thanks,
zhong jiang
>
> It is hard to tell whether compaction has done all it could, but there have been many changes
> in this area since 4.4, so I would be really curious about the current upstream kernel
> behavior. I would also note that relying on order-3 allocations is far from optimal. I am not
> sure what exactly copy_process.part.2+0xe4 refers to, but if this is really a stack allocation
> then I would consider such a large stack really dangerous for a small system.



^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Frequent oom introduced in mainline when migrate_highatomic replace migrate_reserve
  2019-06-24 16:47       ` zhong jiang
@ 2019-06-24 17:54         ` Michal Hocko
  2019-06-25  2:52           ` zhong jiang
  0 siblings, 1 reply; 9+ messages in thread
From: Michal Hocko @ 2019-06-24 17:54 UTC (permalink / raw)
  To: zhong jiang
  Cc: Andrea Arcangeli, Hugh Dickins, Minchan Kim, Vlastimil Babka,
	Linux Memory Management List, Wangkefeng (Kevin)

On Tue 25-06-19 00:47:11, zhong jiang wrote:
> On 2019/6/24 22:01, Michal Hocko wrote:
> > On Mon 24-06-19 21:11:55, zhong jiang wrote:
> >> [  652.272622] sh invoked oom-killer: gfp_mask=0x26080c0, order=3, oom_score_adj=0
> >> [  652.272683] CPU: 0 PID: 1748 Comm: sh Tainted: P           O    4.4.171 #8
> >> [  653.452827] Mem-Info:
> >> [  653.466390] active_anon:20377 inactive_anon:187 isolated_anon:0
> >> [  653.466390]  active_file:5087 inactive_file:4825 isolated_file:0
> >> [  653.466390]  unevictable:12 dirty:0 writeback:32 unstable:0
> >> [  653.466390]  slab_reclaimable:636 slab_unreclaimable:1754
> >> [  653.466390]  mapped:5338 shmem:194 pagetables:231 bounce:0
> >> [  653.466390]  free:1086 free_pcp:85 free_cma:0
> >> [  653.625286] Normal free:4248kB min:1696kB low:2120kB high:2544kB active_anon:81508kB inactive_anon:748kB active_file:20348kB inactive_file:19300kB unevictable:48kB isolated(anon):0kB isolated(file):0kB present:252928kB managed:180496kB mlocked:0kB dirty:0kB writeback:128kB mapped:21352kB shmem:776kB slab_reclaimable:2544kB slab_unreclaimable:7016kB kernel_stack:9856kB pagetables:924kB unstable:0kB bounce:0kB free_pcp:392kB local_pcp:392kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
> >> [  654.177121] lowmem_reserve[]: 0 0 0
> >> [  654.462015] Normal: 752*4kB (UME) 128*8kB (UM) 21*16kB (M) 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 4368kB
> >> [  654.601093] 10132 total pagecache pages
> >> [  654.606655] 63232 pages RAM
> > [...]
> >>>> When a process is created, the kernel stack is allocated as a higher-order block of contiguous
> >>>> memory. Due to fragmentation, we fail to allocate it, and with so little free memory compaction
> >>>> can hardly make progress, so the OOM is easy to reproduce.
> >>> How did you get such large fragmentation that you cannot allocate
> >>> order-1 pages and compaction is not making any progress?
> >> From the above OOM report, we can see that there are no free pages above order-2. It is hardly
> >> possible to allocate the kernel stack when creating a process, and we can easily reproduce the
> >> situation by running some userspace programs.
> >>
> >> But it rarely triggers the OOM killer when migrate_highatomic is not present; we tested that on kernel 3.10.
> > I do not really see how highatomic reserves could make any difference.
> > We do drain them before OOM killer is invoked. The above oom report
> > confirms that there is indeed no order-3+ free page to be used.
> I mean that migrate_highatomic is always zero at every order; it may well be that

Yes, highatomic is meant to be used for higher order allocations which
already do have access to memory reserves. E.g. via __GFP_ATOMIC.
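
Roughly, the distinction looks like this (a sketch based on my reading of the 4.4-era
allocator, with module boilerplate only to make it self-contained):

/*
 * Sketch only: contrast the two kinds of order-3 request discussed here.
 * A GFP_KERNEL caller (like the fork-time stack allocation) may sleep,
 * reclaim and compact, but it never dips into the MIGRATE_HIGHATOMIC
 * reserve.  A high-order __GFP_ATOMIC caller may be served from that
 * reserve, and, as far as I read the 4.4 code, a successful high-order
 * atomic allocation is also what causes a pageblock to be reserved as
 * highatomic in the first place.
 */
#include <linux/module.h>
#include <linux/gfp.h>

static int __init highatomic_demo_init(void)
{
        struct page *atomic_page, *kernel_page;

        /* may use, and may trigger reservation of, a highatomic pageblock */
        atomic_page = alloc_pages(GFP_ATOMIC, 3);

        /* can reclaim/compact, but the highatomic reserve is off limits */
        kernel_page = alloc_pages(GFP_KERNEL, 3);

        if (atomic_page)
                __free_pages(atomic_page, 3);
        if (kernel_page)
                __free_pages(kernel_page, 3);

        return 0;
}

static void __exit highatomic_demo_exit(void)
{
}

module_init(highatomic_demo_init);
module_exit(highatomic_demo_exit);
MODULE_LICENSE("GPL");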
-- 
Michal Hocko
SUSE Labs


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Frequent oom introduced in mainline when migrate_highatomic replace migrate_reserve
  2019-06-24 17:54         ` Michal Hocko
@ 2019-06-25  2:52           ` zhong jiang
  2019-06-25 10:36             ` Michal Hocko
  0 siblings, 1 reply; 9+ messages in thread
From: zhong jiang @ 2019-06-25  2:52 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Andrea Arcangeli, Hugh Dickins, Minchan Kim, Vlastimil Babka,
	Linux Memory Management List, Wangkefeng (Kevin)

On 2019/6/25 1:54, Michal Hocko wrote:
> On Tue 25-06-19 00:47:11, zhong jiang wrote:
>> On 2019/6/24 22:01, Michal Hocko wrote:
>>> On Mon 24-06-19 21:11:55, zhong jiang wrote:
>>>> [  652.272622] sh invoked oom-killer: gfp_mask=0x26080c0, order=3, oom_score_adj=0
>>>> [  652.272683] CPU: 0 PID: 1748 Comm: sh Tainted: P           O    4.4.171 #8
>>>> [  653.452827] Mem-Info:
>>>> [  653.466390] active_anon:20377 inactive_anon:187 isolated_anon:0
>>>> [  653.466390]  active_file:5087 inactive_file:4825 isolated_file:0
>>>> [  653.466390]  unevictable:12 dirty:0 writeback:32 unstable:0
>>>> [  653.466390]  slab_reclaimable:636 slab_unreclaimable:1754
>>>> [  653.466390]  mapped:5338 shmem:194 pagetables:231 bounce:0
>>>> [  653.466390]  free:1086 free_pcp:85 free_cma:0
>>>> [  653.625286] Normal free:4248kB min:1696kB low:2120kB high:2544kB active_anon:81508kB inactive_anon:748kB active_file:20348kB inactive_file:19300kB unevictable:48kB isolated(anon):0kB isolated(file):0kB present:252928kB managed:180496kB mlocked:0kB dirty:0kB writeback:128kB mapped:21352kB shmem:776kB slab_reclaimable:2544kB slab_unreclaimable:7016kB kernel_stack:9856kB pagetables:924kB unstable:0kB bounce:0kB free_pcp:392kB local_pcp:392kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
>>>> [  654.177121] lowmem_reserve[]: 0 0 0
>>>> [  654.462015] Normal: 752*4kB (UME) 128*8kB (UM) 21*16kB (M) 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 4368kB
>>>> [  654.601093] 10132 total pagecache pages
>>>> [  654.606655] 63232 pages RAM
>>> [...]
>>>>>> When a process is created, the kernel stack is allocated as a higher-order block of contiguous
>>>>>> memory. Due to fragmentation, we fail to allocate it, and with so little free memory compaction
>>>>>> can hardly make progress, so the OOM is easy to reproduce.
>>>>> How did you get such large fragmentation that you cannot allocate
>>>>> order-1 pages and compaction is not making any progress?
>>>> From the above OOM report, we can see that there are no free pages above order-2. It is hardly
>>>> possible to allocate the kernel stack when creating a process, and we can easily reproduce the
>>>> situation by running some userspace programs.
>>>>
>>>> But it rarely triggers the OOM killer when migrate_highatomic is not present; we tested that on kernel 3.10.
>>> I do not really see how highatomic reserves could make any difference.
>>> We do drain them before OOM killer is invoked. The above oom report
>>> confirms that there is indeed no order-3+ free page to be used.
>> I mean that migrate_highatomic is always zero at every order; it may well be that
> Yes, highatomic is meant to be used for higher order allocations which
> already do have access to memory reserves. E.g. via __GFP_ATOMIC.
If the current kernel never allocates with __GFP_ATOMIC, highatomic will have no higher-order pages available.
And we do have an order-3 kernel stack allocation requirement in the system.

There is no memory reserve for us to use in this emergency situation, which is different from migrate_reserve.
Maybe we could change the reservation behaviour so that higher orders are not only reserved for GFP_ATOMIC allocations.

Thanks,
zhong jiang





^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Frequent oom introduced in mainline when migrate_highatomic replace migrate_reserve
  2019-06-25  2:52           ` zhong jiang
@ 2019-06-25 10:36             ` Michal Hocko
  0 siblings, 0 replies; 9+ messages in thread
From: Michal Hocko @ 2019-06-25 10:36 UTC (permalink / raw)
  To: zhong jiang
  Cc: Andrea Arcangeli, Hugh Dickins, Minchan Kim, Vlastimil Babka,
	Linux Memory Management List, Wangkefeng (Kevin)

On Tue 25-06-19 10:52:17, zhong jiang wrote:
> On 2019/6/25 1:54, Michal Hocko wrote:
> > On Tue 25-06-19 00:47:11, zhong jiang wrote:
> >> On 2019/6/24 22:01, Michal Hocko wrote:
> >>> On Mon 24-06-19 21:11:55, zhong jiang wrote:
> >>>> [  652.272622] sh invoked oom-killer: gfp_mask=0x26080c0, order=3, oom_score_adj=0
> >>>> [  652.272683] CPU: 0 PID: 1748 Comm: sh Tainted: P           O    4.4.171 #8
> >>>> [  653.452827] Mem-Info:
> >>>> [  653.466390] active_anon:20377 inactive_anon:187 isolated_anon:0
> >>>> [  653.466390]  active_file:5087 inactive_file:4825 isolated_file:0
> >>>> [  653.466390]  unevictable:12 dirty:0 writeback:32 unstable:0
> >>>> [  653.466390]  slab_reclaimable:636 slab_unreclaimable:1754
> >>>> [  653.466390]  mapped:5338 shmem:194 pagetables:231 bounce:0
> >>>> [  653.466390]  free:1086 free_pcp:85 free_cma:0
> >>>> [  653.625286] Normal free:4248kB min:1696kB low:2120kB high:2544kB active_anon:81508kB inactive_anon:748kB active_file:20348kB inactive_file:19300kB unevictable:48kB isolated(anon):0kB isolated(file):0kB present:252928kB managed:180496kB mlocked:0kB dirty:0kB writeback:128kB mapped:21352kB shmem:776kB slab_reclaimable:2544kB slab_unreclaimable:7016kB kernel_stack:9856kB pagetables:924kB unstable:0kB bounce:0kB free_pcp:392kB local_pcp:392kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
> >>>> [  654.177121] lowmem_reserve[]: 0 0 0
> >>>> [  654.462015] Normal: 752*4kB (UME) 128*8kB (UM) 21*16kB (M) 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 4368kB
> >>>> [  654.601093] 10132 total pagecache pages
> >>>> [  654.606655] 63232 pages RAM
> >>> [...]
> >>>>>> When a process is created, the kernel stack is allocated as a higher-order block of contiguous
> >>>>>> memory. Due to fragmentation, we fail to allocate it, and with so little free memory compaction
> >>>>>> can hardly make progress, so the OOM is easy to reproduce.
> >>>>> How did you get such large fragmentation that you cannot allocate
> >>>>> order-1 pages and compaction is not making any progress?
> >>>> From the above OOM report, we can see that there are no free pages above order-2. It is hardly
> >>>> possible to allocate the kernel stack when creating a process, and we can easily reproduce the
> >>>> situation by running some userspace programs.
> >>>>
> >>>> But it rarely triggers the OOM killer when migrate_highatomic is not present; we tested that on kernel 3.10.
> >>> I do not really see how highatomic reserves could make any difference.
> >>> We do drain them before OOM killer is invoked. The above oom report
> >>> confirms that there is indeed no order-3+ free page to be used.
> >> I mean that migrate_highatomic is always zero at every order; it may well be that
> > Yes, highatomic is meant to be used for higher order allocations which
> > already do have access to memory reserves. E.g. via __GFP_ATOMIC.
> If the current kernel never allocates with __GFP_ATOMIC, highatomic will have no higher-order pages available.
> And we do have an order-3 kernel stack allocation requirement in the system.
> 
> There is no memory reserve for us to use in this emergency situation, which is different from migrate_reserve.
> Maybe we could change the reservation behaviour so that higher orders are not only reserved for GFP_ATOMIC allocations.

Let me repeat: this is unlikely to help for something like the fork code path, which can be
triggered by userspace, and no matter how much you reserve it can get depleted easily. Your
real problem is requiring an order-3 allocation for this particular path.
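
To spell the depletion argument out, a userspace sketch is enough (the child count below is
arbitrary): on your config every live child pins one order-3 kernel stack, so a loop like this
will eat through any fixed reserve:

/*
 * Illustration of why a fixed reserve cannot save a userspace-triggerable
 * path: each live child pins one kernel stack (order-3, 32KB, on the
 * reporter's config), and userspace can create as many as it likes.
 * 200 children = 6400kB of order-3 blocks; the number is arbitrary.
 */
#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

int main(void)
{
        const int nproc = 200;

        for (int i = 0; i < nproc; i++) {
                pid_t pid = fork();

                if (pid < 0) {
                        /* the fork-time stack allocation failed */
                        perror("fork");
                        break;
                }
                if (pid == 0) {
                        sleep(30);      /* keep the kernel stack pinned */
                        _exit(0);
                }
        }

        while (wait(NULL) > 0)
                ;                       /* reap all children */

        return 0;
}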
-- 
Michal Hocko
SUSE Labs


^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2019-06-25 10:36 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-06-24  4:43 Frequent oom introduced in mainline when migrate_highatomic replace migrate_reserve zhong jiang
2019-06-24  8:10 ` Michal Hocko
2019-06-24 13:11   ` zhong jiang
2019-06-24 14:01     ` Michal Hocko
2019-06-24 15:28       ` zhong jiang
2019-06-24 16:47       ` zhong jiang
2019-06-24 17:54         ` Michal Hocko
2019-06-25  2:52           ` zhong jiang
2019-06-25 10:36             ` Michal Hocko
