* linux kernel page allocation failure and tuning of page cache
@ 2019-05-31 15:07 Nagal, Amit UTC CCS
  2019-05-31 19:30   ` Matthew Wilcox
  2019-05-31 21:27   ` Alexander Duyck
  0 siblings, 2 replies; 17+ messages in thread
From: Nagal, Amit UTC CCS @ 2019-05-31 15:07 UTC
  To: linux-kernel, linux-mm; +Cc: CHAWLA, RITU UTC CCS

Hi 

We are using a custom target board based on the Renesas RZ/A1 processor. The Linux kernel version is 4.9.123.

1) The platform is a low-memory platform with 64MB of RAM.

2) We transfer around 45MB of data over TCP from a PC to the target using the netcat utility. On the target, a process receives the data over a socket and writes it to the flash disk.

3) At the start of the data transfer, we explicitly drop the Linux kernel caches by running echo 3 > /proc/sys/vm/drop_caches.

4) During the TCP data transfer, free -m shows "free" dropping to almost 1MB, with most of the memory appearing as "cached":

# free -m
                     total   used   free   shared   buffers   cached
Mem:                    57     56      1        0         2       42
-/+ buffers/cache:             12     45
Swap:                    0      0      0

5) Sometimes kernel memory appears to be exhausted: a page allocation failure occurs in the kernel, with the backtrace printed below:
# [  775.947949] nc.traditional: page allocation failure: order:0, mode:0x2080020(GFP_ATOMIC)
[  775.956362] CPU: 0 PID: 1288 Comm: nc.traditional Tainted: G           O    4.9.123-pic6-g31a13de-dirty #19
[  775.966085] Hardware name: Generic R7S72100 (Flattened Device Tree)
[  775.972501] [<c0109829>] (unwind_backtrace) from [<c010796f>] (show_stack+0xb/0xc)
[  775.980118] [<c010796f>] (show_stack) from [<c0151de3>] (warn_alloc+0x89/0xba)
[  775.987361] [<c0151de3>] (warn_alloc) from [<c0152043>] (__alloc_pages_nodemask+0x1eb/0x634)
[  775.995790] [<c0152043>] (__alloc_pages_nodemask) from [<c0152523>] (__alloc_page_frag+0x39/0xde)
[  776.004685] [<c0152523>] (__alloc_page_frag) from [<c03190f1>] (__netdev_alloc_skb+0x51/0xb0)
[  776.013217] [<c03190f1>] (__netdev_alloc_skb) from [<c02c1b6f>] (sh_eth_poll+0xbf/0x3c0)
[  776.021342] [<c02c1b6f>] (sh_eth_poll) from [<c031fd8f>] (net_rx_action+0x77/0x170)
[  776.029051] [<c031fd8f>] (net_rx_action) from [<c011238f>] (__do_softirq+0x107/0x160)
[  776.036896] [<c011238f>] (__do_softirq) from [<c0112589>] (irq_exit+0x5d/0x80)
[  776.044165] [<c0112589>] (irq_exit) from [<c012f4db>] (__handle_domain_irq+0x57/0x8c)
[  776.052007] [<c012f4db>] (__handle_domain_irq) from [<c01012e1>] (gic_handle_irq+0x31/0x48)
[  776.060362] [<c01012e1>] (gic_handle_irq) from [<c0108025>] (__irq_svc+0x65/0xac)
[  776.067835] Exception stack(0xc1cafd70 to 0xc1cafdb8)
[  776.072876] fd60:                                     0002751c c1dec6a0 0000000c 521c3be5
[  776.081042] fd80: 56feb08e f64823a6 ffb35f7b feab513d f9cb0643 0000056c c1caff10 ffffe000
[  776.089204] fda0: b1f49160 c1cafdc4 c180c677 c0234ace 200e0033 ffffffff
[  776.095816] [<c0108025>] (__irq_svc) from [<c0234ace>] (__copy_to_user_std+0x7e/0x430)
[  776.103796] [<c0234ace>] (__copy_to_user_std) from [<c0241715>] (copy_page_to_iter+0x105/0x250)
[  776.112503] [<c0241715>] (copy_page_to_iter) from [<c0319aeb>] (skb_copy_datagram_iter+0xa3/0x108)
[  776.121469] [<c0319aeb>] (skb_copy_datagram_iter) from [<c03443a7>] (tcp_recvmsg+0x3ab/0x5f4)
[  776.130045] [<c03443a7>] (tcp_recvmsg) from [<c035e249>] (inet_recvmsg+0x21/0x2c)
[  776.137576] [<c035e249>] (inet_recvmsg) from [<c031009f>] (sock_read_iter+0x51/0x6e)
[  776.145384] [<c031009f>] (sock_read_iter) from [<c017795d>] (__vfs_read+0x97/0xb0)
[  776.152967] [<c017795d>] (__vfs_read) from [<c01781d9>] (vfs_read+0x51/0xb0)
[  776.159983] [<c01781d9>] (vfs_read) from [<c0178aab>] (SyS_read+0x27/0x52)
[  776.166837] [<c0178aab>] (SyS_read) from [<c0105261>] (ret_fast_syscall+0x1/0x54)
[  776.174308] Mem-Info:
[  776.176650] active_anon:2037 inactive_anon:23 isolated_anon:0
[  776.176650]  active_file:2636 inactive_file:7391 isolated_file:32
[  776.176650]  unevictable:0 dirty:1366 writeback:1281 unstable:0
[  776.176650]  slab_reclaimable:719 slab_unreclaimable:724
[  776.176650]  mapped:1990 shmem:26 pagetables:159 bounce:0
[  776.176650]  free:373 free_pcp:6 free_cma:0
[  776.209062] Node 0 active_anon:8148kB inactive_anon:92kB active_file:10544kB inactive_file:29564kB unevictable:0kB isolated(anon):0kB isolated(file):128kB mapped:7960kB dirty:5464kB writeback:5124kB shmem:104kB writeback_tmp:0kB unstable:0kB pages_scanned:0 all_unreclaimable? no
[  776.233602] Normal free:1492kB min:964kB low:1204kB high:1444kB active_anon:8148kB inactive_anon:92kB active_file:10544kB inactive_file:29564kB unevictable:0kB writepending:10588kB present:65536kB managed:59304kB mlocked:0kB slab_reclaimable:2876kB slab_unreclaimable:2896kB kernel_stack:1152kB pagetables:636kB bounce:0kB free_pcp:24kB local_pcp:24kB free_cma:0kB
[  776.265406] lowmem_reserve[]: 0 0
[  776.268761] Normal: 7*4kB (H) 5*8kB (H) 7*16kB (H) 5*32kB (H) 6*64kB (H) 2*128kB (H) 2*256kB (H) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 1492kB
10071 total pagecache pages
[  776.284124] 0 pages in swap cache
[  776.287446] Swap cache stats: add 0, delete 0, find 0/0
[  776.292645] Free swap  = 0kB
[  776.295532] Total swap = 0kB
[  776.298421] 16384 pages RAM
[  776.301224] 0 pages HighMem/MovableOnly
[  776.305052] 1558 pages reserved

6) We have the following questions:
a) How did kernel memory get exhausted? Under low-memory conditions, are the kernel flusher threads, which should write dirty pages from the page cache to the flash disk, not running at the right time? Is the kernel page reclaim mechanism not running at the right time?

b) Are there any parameters in the Linux memory subsystem with which the reclaim procedure can be monitored and fine-tuned?

c) Can some amount of free memory be reserved so that the kernel does not use it for caching and can instead use it for other page allocations (particularly GFP_ATOMIC), as needed above on behalf of the netcat (nc) process? Can some tuning be done in the Linux memory subsystem, e.g. via /proc/sys/vm/min_free_kbytes, to achieve this?

d) Can you give us further pointers on how to debug this out-of-memory condition in the kernel?
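
For question (c), it may help to see how the watermarks in the dump above relate to one another. The sketch below is an illustration only. It assumes 4 kB pages, and it assumes the 4.9-era rule that the low and high watermarks sit one quarter and one half of min above the min watermark (the watermark_scale_factor term is assumed too small to matter at this memory size). The function name is hypothetical.

```python
PAGE_KB = 4  # assumption: 4 kB pages, consistent with "16384 pages RAM" on a 64 MB board


def watermarks_kb(min_kb):
    """Derive (low, high) watermarks in kB from the min watermark.

    Assumed rule: low = min + min/4 and high = min + 2 * (min/4),
    computed in whole pages.
    """
    min_pages = min_kb // PAGE_KB
    step = min_pages // 4          # a quarter of min, rounded down to whole pages
    low_pages = min_pages + step
    high_pages = min_pages + 2 * step
    return low_pages * PAGE_KB, high_pages * PAGE_KB


# The zone line in the dump reports min:964kB low:1204kB high:1444kB.
print(watermarks_kb(964))
```

Under these assumptions the result agrees with the low and high values in the "Normal" zone line. Raising /proc/sys/vm/min_free_kbytes would be expected to move all three watermarks up together, at the cost of memory that ordinary allocations can no longer use.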


Regards
Amit

* Re: linux kernel page allocation failure and tuning of page cache
  2019-05-31 15:07 linux kernel page allocation failure and tuning of page cache Nagal, Amit UTC CCS
@ 2019-05-31 19:30   ` Matthew Wilcox
  2019-05-31 21:27   ` Alexander Duyck
  1 sibling, 0 replies; 17+ messages in thread
From: Matthew Wilcox @ 2019-05-31 19:30 UTC
  To: Nagal, Amit UTC CCS
  Cc: linux-kernel, linux-mm, CHAWLA, RITU UTC CCS, netdev

> 1) the platform is low memory platform having memory 64MB.
> 
> 2)  we are doing around 45MB TCP data transfer from PC to target using netcat utility .On Target , a process receives data over socket and writes the data to flash disk .

I think your network is faster than your disk ...

> 5) sometimes , we observed kernel memory getting exhausted as page allocation failure happens in kernel  with the backtrace is printed below :
> # [  775.947949] nc.traditional: page allocation failure: order:0, mode:0x2080020(GFP_ATOMIC)

We're in the soft interrupt handler at this point, so we have very
few options for freeing memory; we can't wait for I/O to complete,
for example.

That said, this is a TCP connection.  We could drop the packet silently
without such a noisy warning.  Perhaps just collect statistics on how
many packets we dropped due to a low memory situation.

> [  775.956362] CPU: 0 PID: 1288 Comm: nc.traditional Tainted: G           O    4.9.123-pic6-g31a13de-dirty #19
> [  775.966085] Hardware name: Generic R7S72100 (Flattened Device Tree)
> [  775.972501] [<c0109829>] (unwind_backtrace) from [<c010796f>] (show_stack+0xb/0xc)
> [  775.980118] [<c010796f>] (show_stack) from [<c0151de3>] (warn_alloc+0x89/0xba)
> [  775.987361] [<c0151de3>] (warn_alloc) from [<c0152043>] (__alloc_pages_nodemask+0x1eb/0x634)
> [  775.995790] [<c0152043>] (__alloc_pages_nodemask) from [<c0152523>] (__alloc_page_frag+0x39/0xde)
> [  776.004685] [<c0152523>] (__alloc_page_frag) from [<c03190f1>] (__netdev_alloc_skb+0x51/0xb0)
> [  776.013217] [<c03190f1>] (__netdev_alloc_skb) from [<c02c1b6f>] (sh_eth_poll+0xbf/0x3c0)
> [  776.021342] [<c02c1b6f>] (sh_eth_poll) from [<c031fd8f>] (net_rx_action+0x77/0x170)
> [  776.029051] [<c031fd8f>] (net_rx_action) from [<c011238f>] (__do_softirq+0x107/0x160)
> [  776.036896] [<c011238f>] (__do_softirq) from [<c0112589>] (irq_exit+0x5d/0x80)
> [  776.044165] [<c0112589>] (irq_exit) from [<c012f4db>] (__handle_domain_irq+0x57/0x8c)
> [  776.052007] [<c012f4db>] (__handle_domain_irq) from [<c01012e1>] (gic_handle_irq+0x31/0x48)
> [  776.060362] [<c01012e1>] (gic_handle_irq) from [<c0108025>] (__irq_svc+0x65/0xac)
> [  776.067835] Exception stack(0xc1cafd70 to 0xc1cafdb8)
> [  776.072876] fd60:                                     0002751c c1dec6a0 0000000c 521c3be5
> [  776.081042] fd80: 56feb08e f64823a6 ffb35f7b feab513d f9cb0643 0000056c c1caff10 ffffe000
> [  776.089204] fda0: b1f49160 c1cafdc4 c180c677 c0234ace 200e0033 ffffffff
> [  776.095816] [<c0108025>] (__irq_svc) from [<c0234ace>] (__copy_to_user_std+0x7e/0x430)
> [  776.103796] [<c0234ace>] (__copy_to_user_std) from [<c0241715>] (copy_page_to_iter+0x105/0x250)
> [  776.112503] [<c0241715>] (copy_page_to_iter) from [<c0319aeb>] (skb_copy_datagram_iter+0xa3/0x108)
> [  776.121469] [<c0319aeb>] (skb_copy_datagram_iter) from [<c03443a7>] (tcp_recvmsg+0x3ab/0x5f4)
> [  776.130045] [<c03443a7>] (tcp_recvmsg) from [<c035e249>] (inet_recvmsg+0x21/0x2c)
> [  776.137576] [<c035e249>] (inet_recvmsg) from [<c031009f>] (sock_read_iter+0x51/0x6e)
> [  776.145384] [<c031009f>] (sock_read_iter) from [<c017795d>] (__vfs_read+0x97/0xb0)
> [  776.152967] [<c017795d>] (__vfs_read) from [<c01781d9>] (vfs_read+0x51/0xb0)
> [  776.159983] [<c01781d9>] (vfs_read) from [<c0178aab>] (SyS_read+0x27/0x52)
> [  776.166837] [<c0178aab>] (SyS_read) from [<c0105261>] (ret_fast_syscall+0x1/0x54)
> [  776.174308] Mem-Info:
> [  776.176650] active_anon:2037 inactive_anon:23 isolated_anon:0
> [  776.176650]  active_file:2636 inactive_file:7391 isolated_file:32
> [  776.176650]  unevictable:0 dirty:1366 writeback:1281 unstable:0

Almost all the dirty pages are under writeback at this point.
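
As a cross-check of the counters quoted above, the following sketch converts the page counts to kB and adds them up. It assumes 4 kB pages; the helper name is illustrative and not part of any kernel interface.

```python
PAGE_KB = 4  # assumption: 4 kB pages (16384 pages RAM for a 64 MB board)


def pages_to_kb(pages):
    """Convert a page count from the Mem-Info summary to kB."""
    return pages * PAGE_KB


dirty_kb = pages_to_kb(1366)      # "dirty:1366" in the summary
writeback_kb = pages_to_kb(1281)  # "writeback:1281" in the summary

# Prints the two values and their sum for comparison with the dump.
print(dirty_kb, writeback_kb, dirty_kb + writeback_kb)
```

The two values match the per-node figures later in the same dump (dirty:5464kB, writeback:5124kB), and their sum matches the writepending:10588kB figure on the zone line.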

> [  776.176650]  slab_reclaimable:719 slab_unreclaimable:724
> [  776.176650]  mapped:1990 shmem:26 pagetables:159 bounce:0
> [  776.176650]  free:373 free_pcp:6 free_cma:0

We have 373 free pages, but refused to allocate one of them to GFP_ATOMIC?
I don't understand why that failed.  We also didn't try to steal an
inactive_file or inactive_anon page, which seems like an obvious thing
we might want to do.
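
The free count can be cross-checked against the per-order "Normal:" line of the same dump. This is a minimal sketch, assuming 4 kB pages; the helper name is illustrative and the regular expression only covers the line format shown in this report.

```python
import re

NORMAL_LINE = (
    "Normal: 7*4kB (H) 5*8kB (H) 7*16kB (H) 5*32kB (H) 6*64kB (H) "
    "2*128kB (H) 2*256kB (H) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 1492kB"
)


def parse_free_areas(line):
    """Return (count, block_kb, tag) for each order listed in the line."""
    # Keep only the part between the zone name and the "= total" suffix.
    body = line.split(":", 1)[1].rsplit("=", 1)[0]
    pattern = r"(\d+)\*(\d+)kB(?: \((\w+)\))?"
    return [(int(c), int(kb), tag) for c, kb, tag in re.findall(pattern, body)]


areas = parse_free_areas(NORMAL_LINE)
total_kb = sum(count * kb for count, kb, _ in areas)
free_pages = total_kb // 4                      # assumption: 4 kB pages
tags = {tag for count, _, tag in areas if count > 0}

print(total_kb, free_pages, tags)
```

The total comes to 1492 kB, i.e. the 373 free pages reported in the summary, and every non-empty order carries the (H) tag. If H marks the high-atomic reserve (an assumption about this kernel's dump format, to be checked against the 4.9 source), that may be worth looking at when asking why an order-0 GFP_ATOMIC request failed.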

> [  776.209062] Node 0 active_anon:8148kB inactive_anon:92kB active_file:10544kB inactive_file:29564kB unevictable:0kB isolated(anon):0kB isolated(file):128kB mapped:7960kB dirty:5464kB writeback:5124kB shmem:104kB writeback_tmp:0kB unstable:0kB pages_scanned:0 all_unreclaimable? no
> [  776.233602] Normal free:1492kB min:964kB low:1204kB high:1444kB active_anon:8148kB inactive_anon:92kB active_file:10544kB inactive_file:29564kB unevictable:0kB writepending:10588kB present:65536kB managed:59304kB mlocked:0kB slab_reclaimable:2876kB slab_unreclaimable:2896kB kernel_stack:1152kB pagetables:636kB bounce:0kB free_pcp:24kB local_pcp:24kB free_cma:0kB
> [  776.265406] lowmem_reserve[]: 0 0
> [  776.268761] Normal: 7*4kB (H) 5*8kB (H) 7*16kB (H) 5*32kB (H) 6*64kB (H) 2*128kB (H) 2*256kB (H) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 1492kB
> 10071 total pagecache pages
> [  776.284124] 0 pages in swap cache
> [  776.287446] Swap cache stats: add 0, delete 0, find 0/0
> [  776.292645] Free swap  = 0kB
> [  776.295532] Total swap = 0kB
> [  776.298421] 16384 pages RAM
> [  776.301224] 0 pages HighMem/MovableOnly
> [  776.305052] 1558 pages reserved


* Re: linux kernel page allocation failure and tuning of page cache
  2019-05-31 15:07 linux kernel page allocation failure and tuning of page cache Nagal, Amit UTC CCS
@ 2019-05-31 21:27   ` Alexander Duyck
  2019-05-31 21:27   ` Alexander Duyck
  1 sibling, 0 replies; 17+ messages in thread
From: Alexander Duyck @ 2019-05-31 21:27 UTC
  To: Nagal, Amit UTC CCS; +Cc: linux-kernel, linux-mm, CHAWLA, RITU UTC CCS

On Fri, May 31, 2019 at 8:07 AM Nagal, Amit UTC CCS <Amit.Nagal@utc.com> wrote:
>
> Hi
>
> We are using Renesas RZ/A1 processor based custom target board . linux kernel version is 4.9.123.
>
> 1) the platform is low memory platform having memory 64MB.
>
> 2)  we are doing around 45MB TCP data transfer from PC to target using netcat utility .On Target , a process receives data over socket and writes the data to flash disk .
>
> 3) At the start of data transfer , we explicitly clear linux kernel cached memory by  calling echo 3 > /proc/sys/vm/drop_caches .
>
> 4) during TCP data transfer , we could see free -m showing "free" getting dropped to almost 1MB and most of the memory appearing as "cached"
>
> # free -m
>                                             total         used   free     shared   buffers   cached
> Mem:                                  57            56         1                 0            2           42
> -/+ buffers/cache:                          12        45
> Swap:                                   0              0           0
>
> 5) sometimes , we observed kernel memory getting exhausted as page allocation failure happens in kernel  with the backtrace is printed below :
> # [  775.947949] nc.traditional: page allocation failure: order:0, mode:0x2080020(GFP_ATOMIC)
> [  775.956362] CPU: 0 PID: 1288 Comm: nc.traditional Tainted: G           O    4.9.123-pic6-g31a13de-dirty #19
> [  775.966085] Hardware name: Generic R7S72100 (Flattened Device Tree)
> [  775.972501] [<c0109829>] (unwind_backtrace) from [<c010796f>] (show_stack+0xb/0xc)
> [  775.980118] [<c010796f>] (show_stack) from [<c0151de3>] (warn_alloc+0x89/0xba)
> [  775.987361] [<c0151de3>] (warn_alloc) from [<c0152043>] (__alloc_pages_nodemask+0x1eb/0x634)
> [  775.995790] [<c0152043>] (__alloc_pages_nodemask) from [<c0152523>] (__alloc_page_frag+0x39/0xde)
> [  776.004685] [<c0152523>] (__alloc_page_frag) from [<c03190f1>] (__netdev_alloc_skb+0x51/0xb0)
> [  776.013217] [<c03190f1>] (__netdev_alloc_skb) from [<c02c1b6f>] (sh_eth_poll+0xbf/0x3c0)
> [  776.021342] [<c02c1b6f>] (sh_eth_poll) from [<c031fd8f>] (net_rx_action+0x77/0x170)
> [  776.029051] [<c031fd8f>] (net_rx_action) from [<c011238f>] (__do_softirq+0x107/0x160)
> [  776.036896] [<c011238f>] (__do_softirq) from [<c0112589>] (irq_exit+0x5d/0x80)
> [  776.044165] [<c0112589>] (irq_exit) from [<c012f4db>] (__handle_domain_irq+0x57/0x8c)
> [  776.052007] [<c012f4db>] (__handle_domain_irq) from [<c01012e1>] (gic_handle_irq+0x31/0x48)
> [  776.060362] [<c01012e1>] (gic_handle_irq) from [<c0108025>] (__irq_svc+0x65/0xac)
> [  776.067835] Exception stack(0xc1cafd70 to 0xc1cafdb8)
> [  776.072876] fd60:                                     0002751c c1dec6a0 0000000c 521c3be5
> [  776.081042] fd80: 56feb08e f64823a6 ffb35f7b feab513d f9cb0643 0000056c c1caff10 ffffe000
> [  776.089204] fda0: b1f49160 c1cafdc4 c180c677 c0234ace 200e0033 ffffffff
> [  776.095816] [<c0108025>] (__irq_svc) from [<c0234ace>] (__copy_to_user_std+0x7e/0x430)
> [  776.103796] [<c0234ace>] (__copy_to_user_std) from [<c0241715>] (copy_page_to_iter+0x105/0x250)
> [  776.112503] [<c0241715>] (copy_page_to_iter) from [<c0319aeb>] (skb_copy_datagram_iter+0xa3/0x108)
> [  776.121469] [<c0319aeb>] (skb_copy_datagram_iter) from [<c03443a7>] (tcp_recvmsg+0x3ab/0x5f4)
> [  776.130045] [<c03443a7>] (tcp_recvmsg) from [<c035e249>] (inet_recvmsg+0x21/0x2c)
> [  776.137576] [<c035e249>] (inet_recvmsg) from [<c031009f>] (sock_read_iter+0x51/0x6e)
> [  776.145384] [<c031009f>] (sock_read_iter) from [<c017795d>] (__vfs_read+0x97/0xb0)
> [  776.152967] [<c017795d>] (__vfs_read) from [<c01781d9>] (vfs_read+0x51/0xb0)
> [  776.159983] [<c01781d9>] (vfs_read) from [<c0178aab>] (SyS_read+0x27/0x52)
> [  776.166837] [<c0178aab>] (SyS_read) from [<c0105261>] (ret_fast_syscall+0x1/0x54)

So it looks like you are interrupting the process that is draining the
socket to service the interrupt that is filling it. I am curious what
your tcp_rmem value is. If this is occurring often then you will
likely build up a backlog of packets in the receive buffer for the
socket and that may be where all your memory is going.
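
To answer the tcp_rmem question on the target, the three values can be read from /proc/sys/net/ipv4/tcp_rmem. The sketch below is only an illustration: the function names are hypothetical, and the sample string is a made-up example, not a value reported anywhere in this thread.

```python
def parse_tcp_rmem(text):
    """Split the tcp_rmem sysctl into its min, default and max sizes in bytes."""
    min_b, default_b, max_b = (int(field) for field in text.split())
    return {"min": min_b, "default": default_b, "max": max_b}


def read_tcp_rmem(path="/proc/sys/net/ipv4/tcp_rmem"):
    """Read and parse the live sysctl value (Linux only)."""
    with open(path) as f:
        return parse_tcp_rmem(f.read())


# Illustrative input only; run read_tcp_rmem() on the board for the real values.
sample = parse_tcp_rmem("4096 87380 6291456")
print(sample["max"] // 1024, "kB maximum receive buffer per socket")
```

On a 64MB board the "max" field is the one to compare against the amount of memory that can realistically be spared for a single connection.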

> [  776.174308] Mem-Info:
> [  776.176650] active_anon:2037 inactive_anon:23 isolated_anon:0
> [  776.176650]  active_file:2636 inactive_file:7391 isolated_file:32
> [  776.176650]  unevictable:0 dirty:1366 writeback:1281 unstable:0
> [  776.176650]  slab_reclaimable:719 slab_unreclaimable:724
> [  776.176650]  mapped:1990 shmem:26 pagetables:159 bounce:0
> [  776.176650]  free:373 free_pcp:6 free_cma:0
> [  776.209062] Node 0 active_anon:8148kB inactive_anon:92kB active_file:10544kB inactive_file:29564kB unevictable:0kB isolated(anon):0kB isolated(file):128kB mapped:7960kB dirty:5464kB writeback:5124kB shmem:104kB writeback_tmp:0kB unstable:0kB pages_scanned:0 all_unreclaimable? no
> [  776.233602] Normal free:1492kB min:964kB low:1204kB high:1444kB active_anon:8148kB inactive_anon:92kB active_file:10544kB inactive_file:29564kB unevictable:0kB writepending:10588kB present:65536kB managed:59304kB mlocked:0kB slab_reclaimable:2876kB slab_unreclaimable:2896kB kernel_stack:1152kB pagetables:636kB bounce:0kB free_pcp:24kB local_pcp:24kB free_cma:0kB
> [  776.265406] lowmem_reserve[]: 0 0
> [  776.268761] Normal: 7*4kB (H) 5*8kB (H) 7*16kB (H) 5*32kB (H) 6*64kB (H) 2*128kB (H) 2*256kB (H) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 1492kB
> 10071 total pagecache pages
> [  776.284124] 0 pages in swap cache
> [  776.287446] Swap cache stats: add 0, delete 0, find 0/0
> [  776.292645] Free swap  = 0kB
> [  776.295532] Total swap = 0kB
> [  776.298421] 16384 pages RAM
> [  776.301224] 0 pages HighMem/MovableOnly
> [  776.305052] 1558 pages reserved
>
> 6) we have certain questions as below :
> a) how the kernel memory got exhausted ? at the time of low memory conditions in kernel , are the kernel page flusher threads , which should have written dirty pages from page cache to flash disk , not executing at right time ? is the kernel page reclaim mechanism not executing at right time ?

I suspect the pages are likely stuck in a state of buffering. In the
case of sockets the packets will get queued up until either they can
be serviced or the maximum size of the receive buffer has been exceeded
and they are dropped.

> b) are there any parameters available within the linux memory subsystem with which the reclaim procedure can be monitored and  fine tuned ?

I don't think freeing up more memory will solve the issue. I really
think you probably should look at tuning the network settings. I
suspect the socket itself is likely the thing holding all of the
memory.

> c) can  some amount of free memory be reserved so that linux kernel does not caches it and kernel can use it for its other required page allocation ( particularly gfp_atomic ) as needed above on behalf of netcat nc process ? can some tuning be done in linux memory subsystem eg by using /proc/sys/vm/min_free_kbytes  to achieve this objective .

Within the kernel we already have some emergency reserves that get
dipped into if the PF_MEMALLOC flag is set. However that is usually
reserved for the cases where you are booting off of something like
iSCSI or NVMe over TCP.

> d) can we be provided with further clues on how to debug this issue further for out of memory condition in kernel  ?

My advice would be to look at tuning your TCP socket values in sysctl. I
suspect you are likely using a larger window than your system can
currently handle given the memory constraints and that what you are
seeing is that all the memory is being consumed by buffering for the
TCP socket.
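
Besides the system-wide sysctl, the receiving application can cap its own buffer with the SO_RCVBUF socket option. The sketch below is a hedged illustration, not what nc.traditional does: the function name and the 32 kB figure are assumptions chosen for the example.

```python
import socket


def make_capped_socket(rcvbuf_bytes):
    """Create a TCP socket whose receive buffer is capped via SO_RCVBUF.

    For a listening socket this would be set before listen(), so that
    accepted connections start out with the smaller buffer.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, rcvbuf_bytes)
    return sock


sock = make_capped_socket(32 * 1024)  # illustrative size, not a recommendation
# On Linux the value read back is typically larger than the one requested,
# because the kernel adds room for bookkeeping overhead.
effective = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
print("effective receive buffer:", effective, "bytes")
sock.close()
```

A smaller receive buffer makes the receiver advertise a smaller window, so the sender slows down instead of filling the target's memory. The trade-off is lower throughput, which seems acceptable here if the flash disk is the bottleneck anyway.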


* Re: linux kernel page allocation failure and tuning of page cache
@ 2019-05-31 21:27   ` Alexander Duyck
  0 siblings, 0 replies; 17+ messages in thread
From: Alexander Duyck @ 2019-05-31 21:27 UTC (permalink / raw)
  To: Nagal, Amit UTC CCS; +Cc: linux-kernel, linux-mm, CHAWLA, RITU UTC CCS

On Fri, May 31, 2019 at 8:07 AM Nagal, Amit UTC CCS <Amit.Nagal@utc.com> wrote:
>
> Hi
>
> We are using Renesas RZ/A1 processor based custom target board . linux kernel version is 4.9.123.
>
> 1) the platform is low memory platform having memory 64MB.
>
> 2)  we are doing around 45MB TCP data transfer from PC to target using netcat utility .On Target , a process receives data over socket and writes the data to flash disk .
>
> 3) At the start of data transfer , we explicitly clear linux kernel cached memory by  calling echo 3 > /proc/sys/vm/drop_caches .
>
> 4) during TCP data transfer , we could see free -m showing "free" getting dropped to almost 1MB and most of the memory appearing as "cached"
>
> # free -m
>                                             total         used   free     shared   buffers   cached
> Mem:                                  57            56         1                 0            2           42
> -/+ buffers/cache:                          12        45
> Swap:                                   0              0           0
>
> 5) sometimes , we observed kernel memory getting exhausted as page allocation failure happens in kernel  with the backtrace is printed below :
> # [  775.947949] nc.traditional: page allocation failure: order:0, mode:0x2080020(GFP_ATOMIC)
> [  775.956362] CPU: 0 PID: 1288 Comm: nc.traditional Tainted: G           O    4.9.123-pic6-g31a13de-dirty #19
> [  775.966085] Hardware name: Generic R7S72100 (Flattened Device Tree)
> [  775.972501] [<c0109829>] (unwind_backtrace) from [<c010796f>] (show_stack+0xb/0xc)
> [  775.980118] [<c010796f>] (show_stack) from [<c0151de3>] (warn_alloc+0x89/0xba)
> [  775.987361] [<c0151de3>] (warn_alloc) from [<c0152043>] (__alloc_pages_nodemask+0x1eb/0x634)
> [  775.995790] [<c0152043>] (__alloc_pages_nodemask) from [<c0152523>] (__alloc_page_frag+0x39/0xde)
> [  776.004685] [<c0152523>] (__alloc_page_frag) from [<c03190f1>] (__netdev_alloc_skb+0x51/0xb0)
> [  776.013217] [<c03190f1>] (__netdev_alloc_skb) from [<c02c1b6f>] (sh_eth_poll+0xbf/0x3c0)
> [  776.021342] [<c02c1b6f>] (sh_eth_poll) from [<c031fd8f>] (net_rx_action+0x77/0x170)
> [  776.029051] [<c031fd8f>] (net_rx_action) from [<c011238f>] (__do_softirq+0x107/0x160)
> [  776.036896] [<c011238f>] (__do_softirq) from [<c0112589>] (irq_exit+0x5d/0x80)
> [  776.044165] [<c0112589>] (irq_exit) from [<c012f4db>] (__handle_domain_irq+0x57/0x8c)
> [  776.052007] [<c012f4db>] (__handle_domain_irq) from [<c01012e1>] (gic_handle_irq+0x31/0x48)
> [  776.060362] [<c01012e1>] (gic_handle_irq) from [<c0108025>] (__irq_svc+0x65/0xac)
> [  776.067835] Exception stack(0xc1cafd70 to 0xc1cafdb8)
> [  776.072876] fd60:                                     0002751c c1dec6a0 0000000c 521c3be5
> [  776.081042] fd80: 56feb08e f64823a6 ffb35f7b feab513d f9cb0643 0000056c c1caff10 ffffe000
> [  776.089204] fda0: b1f49160 c1cafdc4 c180c677 c0234ace 200e0033 ffffffff
> [  776.095816] [<c0108025>] (__irq_svc) from [<c0234ace>] (__copy_to_user_std+0x7e/0x430)
> [  776.103796] [<c0234ace>] (__copy_to_user_std) from [<c0241715>] (copy_page_to_iter+0x105/0x250)
> [  776.112503] [<c0241715>] (copy_page_to_iter) from [<c0319aeb>] (skb_copy_datagram_iter+0xa3/0x108)
> [  776.121469] [<c0319aeb>] (skb_copy_datagram_iter) from [<c03443a7>] (tcp_recvmsg+0x3ab/0x5f4)
> [  776.130045] [<c03443a7>] (tcp_recvmsg) from [<c035e249>] (inet_recvmsg+0x21/0x2c)
> [  776.137576] [<c035e249>] (inet_recvmsg) from [<c031009f>] (sock_read_iter+0x51/0x6e)
> [  776.145384] [<c031009f>] (sock_read_iter) from [<c017795d>] (__vfs_read+0x97/0xb0)
> [  776.152967] [<c017795d>] (__vfs_read) from [<c01781d9>] (vfs_read+0x51/0xb0)
> [  776.159983] [<c01781d9>] (vfs_read) from [<c0178aab>] (SyS_read+0x27/0x52)
> [  776.166837] [<c0178aab>] (SyS_read) from [<c0105261>] (ret_fast_syscall+0x1/0x54)

So it looks like you are interrupting the process that is draining the
socket to service the interrupt that is filling it. I am curious what
your tcp_rmem value is. If this is occurring often then you will
likely build up a backlog of packets in the receive buffer for the
socket and that may be where all your memory is going.

> [  776.174308] Mem-Info:
> [  776.176650] active_anon:2037 inactive_anon:23 isolated_anon:0
> [  776.176650]  active_file:2636 inactive_file:7391 isolated_file:32
> [  776.176650]  unevictable:0 dirty:1366 writeback:1281 unstable:0
> [  776.176650]  slab_reclaimable:719 slab_unreclaimable:724
> [  776.176650]  mapped:1990 shmem:26 pagetables:159 bounce:0
> [  776.176650]  free:373 free_pcp:6 free_cma:0
> [  776.209062] Node 0 active_anon:8148kB inactive_anon:92kB active_file:10544kB inactive_file:29564kB unevictable:0kB isolated(anon):0kB isolated(file):128kB mapped:7960kB dirty:5464kB writeback:5124kB shmem:104kB writeback_tmp:0kB unstable:0kB pages_scanned:0 all_unreclaimable? no
> [  776.233602] Normal free:1492kB min:964kB low:1204kB high:1444kB active_anon:8148kB inactive_anon:92kB active_file:10544kB inactive_file:29564kB unevictable:0kB writepending:10588kB present:65536kB managed:59304kB mlocked:0kB slab_reclaimable:2876kB slab_unreclaimable:2896kB kernel_stack:1152kB pagetables:636kB bounce:0kB free_pcp:24kB local_pcp:24kB free_cma:0kB
> [  776.265406] lowmem_reserve[]: 0 0
> [  776.268761] Normal: 7*4kB (H) 5*8kB (H) 7*16kB (H) 5*32kB (H) 6*64kB (H) 2*128kB (H) 2*256kB (H) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 1492kB
> 10071 total pagecache pages
> [  776.284124] 0 pages in swap cache
> [  776.287446] Swap cache stats: add 0, delete 0, find 0/0
> [  776.292645] Free swap  = 0kB
> [  776.295532] Total swap = 0kB
> [  776.298421] 16384 pages RAM
> [  776.301224] 0 pages HighMem/MovableOnly
> [  776.305052] 1558 pages reserved
>
> 6) we have certain questions as below :
> a) how the kernel memory got exhausted ? at the time of low memory conditions in kernel , are the kernel page flusher threads , which should have written dirty pages from page cache to flash disk , not executing at right time ? is the kernel page reclaim mechanism not executing at right time ?

I suspect the pages are likely stuck in a state of buffering. In the
case of sockets the packets will get queued up until either they can
be serviced or the maximum size of the receive buffer has been exceeded
and they are dropped.

> b) are there any parameters available within the linux memory subsystem with which the reclaim procedure can be monitored and  fine tuned ?

I don't think freeing up more memory will solve the issue. I really
think you probably should look at tuning the network settings. I
suspect the socket itself is likely the thing holding all of the
memory.

> c) can  some amount of free memory be reserved so that linux kernel does not caches it and kernel can use it for its other required page allocation ( particularly gfp_atomic ) as needed above on behalf of netcat nc process ? can some tuning be done in linux memory subsystem eg by using /proc/sys/vm/min_free_kbytes  to achieve this objective .

Within the kernel we already have some emergency reserves that get
dipped into if the PF_MEMALLOC flag is set. However that is usually
reserved for the cases where you are booting off of something like
iSCSI or NVMe over TCP.

> d) can we be provided with further clues on how to debug this issue further for out of memory condition in kernel  ?

My advice would be to look at tuning your TCP socket values in sysctl. I
suspect you are likely using a larger window than your system can
currently handle given the memory constraints and that what you are
seeing is that all the memory is being consumed by buffering for the
TCP socket.
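
As a starting point for inspecting those values, here is a minimal Python
sketch that reads the three-field TCP memory sysctls from procfs. The helper
name read_triple and its base parameter are made up for illustration; the
files themselves (tcp_rmem and tcp_wmem in bytes, tcp_mem in pages) are the
standard ones under /proc/sys/net/ipv4. It only reports what is configured
and does not suggest what the values should be.

```python
from pathlib import Path


def read_triple(name, base="/proc/sys/net/ipv4"):
    """Read a three-field sysctl such as tcp_rmem; return None if unavailable."""
    path = Path(base) / name
    try:
        return tuple(int(value) for value in path.read_text().split())
    except (OSError, ValueError):
        # File missing (non-Linux host, option not built in) or not numeric.
        return None


# tcp_rmem / tcp_wmem: min, default, max in bytes per socket.
# tcp_mem: low, pressure, high thresholds in pages for all TCP sockets.
for name in ("tcp_rmem", "tcp_wmem", "tcp_mem"):
    print(name, read_triple(name))
```

Running this on the target before and during the transfer would show whether
the configured limits are plausible for a 64MB system.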


^ permalink raw reply	[flat|nested] 17+ messages in thread

* RE: [External] Re: linux kernel page allocation failure and tuning of page cache
  2019-05-31 21:27   ` Alexander Duyck
@ 2019-06-03  5:30     ` Nagal, Amit               UTC CCS
  -1 siblings, 0 replies; 17+ messages in thread
From: Nagal, Amit               UTC CCS @ 2019-06-03  5:30 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: linux-kernel, linux-mm, CHAWLA, RITU UTC CCS, Netter,
	Christian M UTC CCS

-----Original Message-----
From: Alexander Duyck [mailto:alexander.duyck@gmail.com] 
Sent: Saturday, June 1, 2019 2:57 AM
To: Nagal, Amit UTC CCS <Amit.Nagal@utc.com>
Cc: linux-kernel@vger.kernel.org; linux-mm@kvack.org; CHAWLA, RITU UTC CCS <RITU.CHAWLA@utc.com>
Subject: [External] Re: linux kernel page allocation failure and tuning of page cache

On Fri, May 31, 2019 at 8:07 AM Nagal, Amit UTC CCS <Amit.Nagal@utc.com> wrote:
>
> Hi
>
> We are using Renesas RZ/A1 processor based custom target board . linux kernel version is 4.9.123.
>
> 1) the platform is low memory platform having memory 64MB.
>
> 2)  we are doing around 45MB TCP data transfer from PC to target using netcat utility .On Target , a process receives data over socket and writes the data to flash disk .
>
> 3) At the start of data transfer , we explicitly clear linux kernel cached memory by  calling echo 3 > /proc/sys/vm/drop_caches .
>
> 4) during TCP data transfer , we could see free -m showing "free" getting dropped to almost 1MB and most of the memory appearing as "cached"
>
> # free -m
>                                             total         used   free     shared   buffers   cached
> Mem:                                  57            56         1                 0            2           42
> -/+ buffers/cache:                          12        45
> Swap:                                   0              0           0
>
> 5) sometimes , we observed kernel memory getting exhausted as page allocation failure happens in kernel  with the backtrace is printed below :
> # [  775.947949] nc.traditional: page allocation failure: order:0, mode:0x2080020(GFP_ATOMIC)
> [  775.956362] CPU: 0 PID: 1288 Comm: nc.traditional Tainted: G           O    4.9.123-pic6-g31a13de-dirty #19
> [  775.966085] Hardware name: Generic R7S72100 (Flattened Device Tree) 
> [  775.972501] [<c0109829>] (unwind_backtrace) from [<c010796f>] (show_stack+0xb/0xc)
> [  775.980118] [<c010796f>] (show_stack) from [<c0151de3>] (warn_alloc+0x89/0xba)
> [  775.987361] [<c0151de3>] (warn_alloc) from [<c0152043>] (__alloc_pages_nodemask+0x1eb/0x634)
> [  775.995790] [<c0152043>] (__alloc_pages_nodemask) from [<c0152523>] (__alloc_page_frag+0x39/0xde)
> [  776.004685] [<c0152523>] (__alloc_page_frag) from [<c03190f1>] (__netdev_alloc_skb+0x51/0xb0)
> [  776.013217] [<c03190f1>] (__netdev_alloc_skb) from [<c02c1b6f>] (sh_eth_poll+0xbf/0x3c0)
> [  776.021342] [<c02c1b6f>] (sh_eth_poll) from [<c031fd8f>] (net_rx_action+0x77/0x170)
> [  776.029051] [<c031fd8f>] (net_rx_action) from [<c011238f>] (__do_softirq+0x107/0x160)
> [  776.036896] [<c011238f>] (__do_softirq) from [<c0112589>] (irq_exit+0x5d/0x80)
> [  776.044165] [<c0112589>] (irq_exit) from [<c012f4db>] (__handle_domain_irq+0x57/0x8c)
> [  776.052007] [<c012f4db>] (__handle_domain_irq) from [<c01012e1>] (gic_handle_irq+0x31/0x48)
> [  776.060362] [<c01012e1>] (gic_handle_irq) from [<c0108025>] (__irq_svc+0x65/0xac)
> [  776.067835] Exception stack(0xc1cafd70 to 0xc1cafdb8)
> [  776.072876] fd60:                                     0002751c c1dec6a0 0000000c 521c3be5
> [  776.081042] fd80: 56feb08e f64823a6 ffb35f7b feab513d f9cb0643 0000056c c1caff10 ffffe000
> [  776.089204] fda0: b1f49160 c1cafdc4 c180c677 c0234ace 200e0033 ffffffff
> [  776.095816] [<c0108025>] (__irq_svc) from [<c0234ace>] (__copy_to_user_std+0x7e/0x430)
> [  776.103796] [<c0234ace>] (__copy_to_user_std) from [<c0241715>] (copy_page_to_iter+0x105/0x250)
> [  776.112503] [<c0241715>] (copy_page_to_iter) from [<c0319aeb>] (skb_copy_datagram_iter+0xa3/0x108)
> [  776.121469] [<c0319aeb>] (skb_copy_datagram_iter) from [<c03443a7>] (tcp_recvmsg+0x3ab/0x5f4)
> [  776.130045] [<c03443a7>] (tcp_recvmsg) from [<c035e249>] (inet_recvmsg+0x21/0x2c)
> [  776.137576] [<c035e249>] (inet_recvmsg) from [<c031009f>] (sock_read_iter+0x51/0x6e)
> [  776.145384] [<c031009f>] (sock_read_iter) from [<c017795d>] (__vfs_read+0x97/0xb0)
> [  776.152967] [<c017795d>] (__vfs_read) from [<c01781d9>] (vfs_read+0x51/0xb0)
> [  776.159983] [<c01781d9>] (vfs_read) from [<c0178aab>] (SyS_read+0x27/0x52)
> [  776.166837] [<c0178aab>] (SyS_read) from [<c0105261>] (ret_fast_syscall+0x1/0x54)

>So it looks like you are interrupting the process that is draining the socket to service the interrupt that is filling it. I am curious what your tcp_rmem value is. If this is occurring often then you will likely build up a backlog of packets in the receive buffer for the socket and that may be where all your memory is going.

Thanks for the reply.
# cat /proc/sys/net/ipv4/tcp_rmem
4096    87380   454688

The maximum value here is less than 1MB, which means the socket buffer is not consuming all of the memory, right?
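
To put those numbers next to the memory figures from the dump, here is a
minimal Python sketch. It parses the tcp_rmem line shown above (min, default,
max, all in bytes) and compares the per-socket maximum with the
"managed:59304kB" value from the Mem-Info output. The function name
parse_tcp_rmem is hypothetical and exists only for this illustration.

```python
def parse_tcp_rmem(text):
    """Split a tcp_rmem line into (min, default, max) byte counts."""
    fields = text.split()
    if len(fields) != 3:
        raise ValueError("expected three fields: min default max")
    return tuple(int(field) for field in fields)


# Values quoted in this thread.
rmem_min, rmem_default, rmem_max = parse_tcp_rmem("4096    87380   454688")
managed_bytes = 59304 * 1024  # "managed:59304kB" from the Mem-Info dump

print(f"max receive buffer per socket: {rmem_max / 1024:.0f} KiB")
print(f"share of managed memory: {100 * rmem_max / managed_bytes:.2f}%")
```

This only bounds the receive buffer of a single socket. It says nothing about
memory held elsewhere, for example in the page cache.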
 
> [  776.174308] Mem-Info:
> [  776.176650] active_anon:2037 inactive_anon:23 isolated_anon:0
> [  776.176650]  active_file:2636 inactive_file:7391 isolated_file:32
> [  776.176650]  unevictable:0 dirty:1366 writeback:1281 unstable:0
> [  776.176650]  slab_reclaimable:719 slab_unreclaimable:724
> [  776.176650]  mapped:1990 shmem:26 pagetables:159 bounce:0
> [  776.176650]  free:373 free_pcp:6 free_cma:0
> [  776.209062] Node 0 active_anon:8148kB inactive_anon:92kB active_file:10544kB inactive_file:29564kB unevictable:0kB isolated(anon):0kB isolated(file):128kB mapped:7960kB dirty:5464kB writeback:5124kB shmem:104kB writeback_tmp:0kB unstable:0kB pages_scanned:0 all_unreclaimable? no
> [  776.233602] Normal free:1492kB min:964kB low:1204kB high:1444kB active_anon:8148kB inactive_anon:92kB active_file:10544kB inactive_file:29564kB unevictable:0kB writepending:10588kB present:65536kB managed:59304kB mlocked:0kB slab_reclaimable:2876kB slab_unreclaimable:2896kB kernel_stack:1152kB pagetables:636kB bounce:0kB free_pcp:24kB local_pcp:24kB free_cma:0kB
> [  776.265406] lowmem_reserve[]: 0 0
> [  776.268761] Normal: 7*4kB (H) 5*8kB (H) 7*16kB (H) 5*32kB (H) 6*64kB (H) 2*128kB (H) 2*256kB (H) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 1492kB
> 10071 total pagecache pages
> [  776.284124] 0 pages in swap cache
> [  776.287446] Swap cache stats: add 0, delete 0, find 0/0
> [  776.292645] Free swap  = 0kB
> [  776.295532] Total swap = 0kB
> [  776.298421] 16384 pages RAM
> [  776.301224] 0 pages HighMem/MovableOnly
> [  776.305052] 1558 pages reserved
>
> 6) we have certain questions as below :
> a) how the kernel memory got exhausted ? at the time of low memory conditions in kernel , are the kernel page flusher threads , which should have written dirty pages from page cache to flash disk , not executing at right time ? is the kernel page reclaim mechanism not executing at right time ?

>I suspect the pages are likely stuck in a state of buffering. In the case of sockets the packets will get queued up until either they can be serviced or the maximum size of the receive buffer has been exceeded and they are dropped.

My concern here is why the reclaim procedure was not triggered.
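
As a rough aid for reading the zone line in the dump, the Python sketch below
compares the reported free memory with the min/low/high watermarks. The
function name and band labels are assumptions for illustration. They follow
the general rule that kswapd is woken when free memory falls below the low
watermark and stops once it is back above high. This is not the kernel's
actual watermark check, which also takes the allocation order and flags into
account.

```python
def watermark_band(free_kb, min_kb, low_kb, high_kb):
    """Return which watermark band the zone's free memory falls in."""
    if free_kb < min_kb:
        return "below min"
    if free_kb < low_kb:
        return "between min and low"
    if free_kb < high_kb:
        return "between low and high"
    return "above high"


# Values from "Normal free:1492kB min:964kB low:1204kB high:1444kB".
band = watermark_band(free_kb=1492, min_kb=964, low_kb=1204, high_kb=1444)
print(band)
```

With the numbers from the dump, free memory sits just above the high
watermark, which may be part of why background reclaim was not active at that
instant.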

> b) are there any parameters available within the linux memory subsystem with which the reclaim procedure can be monitored and  fine tuned ?

>I don't think freeing up more memory will solve the issue. I really think you probably should look at tuning the network settings. I suspect the socket itself is likely the thing holding all of the memory.

> c) can  some amount of free memory be reserved so that linux kernel does not caches it and kernel can use it for its other required page allocation ( particularly gfp_atomic ) as needed above on behalf of netcat nc process ? can some tuning be done in linux memory subsystem eg by using /proc/sys/vm/min_free_kbytes  to achieve this objective .

>Within the kernel we already have some emergency reserves that get dipped into if the PF_MEMALLOC flag is set. However that is usually reserved for the cases where you are booting off of something like iSCSI or NVMe over TCP.

> d) can we be provided with further clues on how to debug this issue further for out of memory condition in kernel  ?

>My advice would be to look at tuning your TCP socket values in sysctl. I suspect you are likely using a larger window than your system can currently handle given the memory constraints and that what you are seeing is that all the memory is being consumed by buffering for the TCP socket.

Any suggestions on which TCP socket values I should look into, and what values to tune them to?




^ permalink raw reply	[flat|nested] 17+ messages in thread

* RE: [External] Re: linux kernel page allocation failure and tuning of page cache
  2019-05-31 19:30   ` Matthew Wilcox
@ 2019-06-03  5:37     ` Nagal, Amit               UTC CCS
  -1 siblings, 0 replies; 17+ messages in thread
From: Nagal, Amit               UTC CCS @ 2019-06-03  5:37 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: linux-kernel, linux-mm, CHAWLA, RITU UTC CCS, netdev, Netter,
	Christian M UTC CCS



-----Original Message-----
From: Matthew Wilcox [mailto:willy@infradead.org] 
Sent: Saturday, June 1, 2019 1:01 AM
To: Nagal, Amit UTC CCS <Amit.Nagal@utc.com>
Cc: linux-kernel@vger.kernel.org; linux-mm@kvack.org; CHAWLA, RITU UTC CCS <RITU.CHAWLA@utc.com>; netdev@vger.kernel.org
Subject: [External] Re: linux kernel page allocation failure and tuning of page cache

> 1) the platform is low memory platform having memory 64MB.
> 
> 2)  we are doing around 45MB TCP data transfer from PC to target using netcat utility .On Target , a process receives data over socket and writes the data to flash disk .

>I think your network is faster than your disk ...

OK, I need to check that. But how does this affect the page reclaim procedure?

> 5) sometimes , we observed kernel memory getting exhausted as page allocation failure happens in kernel  with the backtrace is printed below :
> # [  775.947949] nc.traditional: page allocation failure: order:0, mode:0x2080020(GFP_ATOMIC)

>We're in the soft interrupt handler at this point, so we have very few options for freeing memory; we can't wait for I/O to complete, for example.

>That said, this is a TCP connection.  We could drop the packet silently without such a noisy warning.  Perhaps just collect statistics on how many packets we dropped due to a low memory situation.

I will collect statistics for it.

> [  775.956362] CPU: 0 PID: 1288 Comm: nc.traditional Tainted: G           O    4.9.123-pic6-g31a13de-dirty #19
> [  775.966085] Hardware name: Generic R7S72100 (Flattened Device Tree) 
> [  775.972501] [<c0109829>] (unwind_backtrace) from [<c010796f>] (show_stack+0xb/0xc)
> [  775.980118] [<c010796f>] (show_stack) from [<c0151de3>] (warn_alloc+0x89/0xba)
> [  775.987361] [<c0151de3>] (warn_alloc) from [<c0152043>] (__alloc_pages_nodemask+0x1eb/0x634)
> [  775.995790] [<c0152043>] (__alloc_pages_nodemask) from [<c0152523>] (__alloc_page_frag+0x39/0xde)
> [  776.004685] [<c0152523>] (__alloc_page_frag) from [<c03190f1>] (__netdev_alloc_skb+0x51/0xb0)
> [  776.013217] [<c03190f1>] (__netdev_alloc_skb) from [<c02c1b6f>] (sh_eth_poll+0xbf/0x3c0)
> [  776.021342] [<c02c1b6f>] (sh_eth_poll) from [<c031fd8f>] (net_rx_action+0x77/0x170)
> [  776.029051] [<c031fd8f>] (net_rx_action) from [<c011238f>] (__do_softirq+0x107/0x160)
> [  776.036896] [<c011238f>] (__do_softirq) from [<c0112589>] (irq_exit+0x5d/0x80)
> [  776.044165] [<c0112589>] (irq_exit) from [<c012f4db>] (__handle_domain_irq+0x57/0x8c)
> [  776.052007] [<c012f4db>] (__handle_domain_irq) from [<c01012e1>] (gic_handle_irq+0x31/0x48)
> [  776.060362] [<c01012e1>] (gic_handle_irq) from [<c0108025>] (__irq_svc+0x65/0xac)
> [  776.067835] Exception stack(0xc1cafd70 to 0xc1cafdb8)
> [  776.072876] fd60:                                     0002751c c1dec6a0 0000000c 521c3be5
> [  776.081042] fd80: 56feb08e f64823a6 ffb35f7b feab513d f9cb0643 0000056c c1caff10 ffffe000
> [  776.089204] fda0: b1f49160 c1cafdc4 c180c677 c0234ace 200e0033 ffffffff
> [  776.095816] [<c0108025>] (__irq_svc) from [<c0234ace>] (__copy_to_user_std+0x7e/0x430)
> [  776.103796] [<c0234ace>] (__copy_to_user_std) from [<c0241715>] (copy_page_to_iter+0x105/0x250)
> [  776.112503] [<c0241715>] (copy_page_to_iter) from [<c0319aeb>] (skb_copy_datagram_iter+0xa3/0x108)
> [  776.121469] [<c0319aeb>] (skb_copy_datagram_iter) from [<c03443a7>] (tcp_recvmsg+0x3ab/0x5f4)
> [  776.130045] [<c03443a7>] (tcp_recvmsg) from [<c035e249>] (inet_recvmsg+0x21/0x2c)
> [  776.137576] [<c035e249>] (inet_recvmsg) from [<c031009f>] (sock_read_iter+0x51/0x6e)
> [  776.145384] [<c031009f>] (sock_read_iter) from [<c017795d>] (__vfs_read+0x97/0xb0)
> [  776.152967] [<c017795d>] (__vfs_read) from [<c01781d9>] (vfs_read+0x51/0xb0)
> [  776.159983] [<c01781d9>] (vfs_read) from [<c0178aab>] (SyS_read+0x27/0x52)
> [  776.166837] [<c0178aab>] (SyS_read) from [<c0105261>] (ret_fast_syscall+0x1/0x54)
> [  776.174308] Mem-Info:
> [  776.176650] active_anon:2037 inactive_anon:23 isolated_anon:0
> [  776.176650]  active_file:2636 inactive_file:7391 isolated_file:32
> [  776.176650]  unevictable:0 dirty:1366 writeback:1281 unstable:0

>Almost all the dirty pages are under writeback at this point.

> [  776.176650]  slab_reclaimable:719 slab_unreclaimable:724
> [  776.176650]  mapped:1990 shmem:26 pagetables:159 bounce:0
> [  776.176650]  free:373 free_pcp:6 free_cma:0

>We have 373 free pages, but refused to allocate one of them to GFP_ATOMIC?
>I don't understand why that failed.  We also didn't try to steal an inactive_file or inactive_anon page, which seems like an obvious thing we might want to do.

Yes, that is my concern. We do not have a swap device, so I assume that is why inactive_anon pages were not stolen, but inactive_file pages could have been used.

> [  776.209062] Node 0 active_anon:8148kB inactive_anon:92kB active_file:10544kB inactive_file:29564kB unevictable:0kB isolated(anon):0kB isolated(file):128kB mapped:7960kB dirty:5464kB writeback:5124kB shmem:104kB writeback_tmp:0kB unstable:0kB pages_scanned:0 all_unreclaimable? no
> [  776.233602] Normal free:1492kB min:964kB low:1204kB high:1444kB active_anon:8148kB inactive_anon:92kB active_file:10544kB inactive_file:29564kB unevictable:0kB writepending:10588kB present:65536kB managed:59304kB mlocked:0kB slab_reclaimable:2876kB slab_unreclaimable:2896kB kernel_stack:1152kB pagetables:636kB bounce:0kB free_pcp:24kB local_pcp:24kB free_cma:0kB
> [  776.265406] lowmem_reserve[]: 0 0
> [  776.268761] Normal: 7*4kB (H) 5*8kB (H) 7*16kB (H) 5*32kB (H) 6*64kB (H) 2*128kB (H) 2*256kB (H) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 1492kB
> 10071 total pagecache pages
> [  776.284124] 0 pages in swap cache
> [  776.287446] Swap cache stats: add 0, delete 0, find 0/0
> [  776.292645] Free swap  = 0kB
> [  776.295532] Total swap = 0kB
> [  776.298421] 16384 pages RAM
> [  776.301224] 0 pages HighMem/MovableOnly
> [  776.305052] 1558 pages reserved
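
The page counts and kB figures in the dump can be cross-checked with a few
lines of Python. This sketch assumes a 4 kB page size (consistent with
free:373 pages being reported as 1492kB) and simply re-adds the per-order free
list from the "Normal:" line. The variable names are illustrative only.

```python
PAGE_KB = 4  # assumed page size in kB

# (count, block size in kB) from "Normal: 7*4kB (H) 5*8kB (H) ... 0*4096kB"
free_list = [(7, 4), (5, 8), (7, 16), (5, 32), (6, 64), (2, 128), (2, 256),
             (0, 512), (0, 1024), (0, 2048), (0, 4096)]

free_kb_from_list = sum(count * size_kb for count, size_kb in free_list)
free_kb_from_pages = 373 * PAGE_KB   # "free:373"
dirty_kb = 1366 * PAGE_KB            # "dirty:1366"
writeback_kb = 1281 * PAGE_KB        # "writeback:1281"

print("free (from free list):", free_kb_from_list, "kB")
print("free (from page count):", free_kb_from_pages, "kB")
print("dirty:", dirty_kb, "kB", "writeback:", writeback_kb, "kB")
print(f"writeback/dirty ratio: {writeback_kb / dirty_kb:.2f}")
```

Both routes should agree with the "Normal free:1492kB" figure, and the dirty
and writeback totals should match the "dirty:5464kB writeback:5124kB" fields
on the Node 0 line.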

^ permalink raw reply	[flat|nested] 17+ messages in thread

* RE: [External] Re: linux kernel page allocation failure and tuning of page cache
  2019-05-31 19:30   ` Matthew Wilcox
@ 2019-06-03 10:32     ` Nagal, Amit               UTC CCS
  -1 siblings, 0 replies; 17+ messages in thread
From: Nagal, Amit               UTC CCS @ 2019-06-03 10:32 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: linux-kernel, linux-mm, CHAWLA, RITU UTC CCS, netdev, Netter,
	Christian M UTC CCS



-----Original Message-----
From: Nagal, Amit UTC CCS 
Sent: Monday, June 3, 2019 11:08 AM
To: 'Matthew Wilcox' <willy@infradead.org>
Cc: linux-kernel@vger.kernel.org; linux-mm@kvack.org; CHAWLA, RITU UTC CCS <RITU.CHAWLA@utc.com>; netdev@vger.kernel.org; Netter, Christian M UTC CCS <christian.Netter@fs.UTC.COM>
Subject: RE: [External] Re: linux kernel page allocation failure and tuning of page cache



-----Original Message-----
From: Matthew Wilcox [mailto:willy@infradead.org]
Sent: Saturday, June 1, 2019 1:01 AM
To: Nagal, Amit UTC CCS <Amit.Nagal@utc.com>
Cc: linux-kernel@vger.kernel.org; linux-mm@kvack.org; CHAWLA, RITU UTC CCS <RITU.CHAWLA@utc.com>; netdev@vger.kernel.org
Subject: [External] Re: linux kernel page allocation failure and tuning of page cache

> 1) The platform is a low-memory platform with 64MB of memory.
> 
> 2) We are transferring around 45MB of TCP data from a PC to the target using the netcat utility. On the target, a process receives the data over a socket and writes it to flash disk.

>I think your network is faster than your disk ...

>OK, I need to check that. But how does this affect the page reclaim procedure?

> 5) Sometimes we observe kernel memory being exhausted: a page allocation failure occurs in the kernel, and the backtrace below is printed:
> # [  775.947949] nc.traditional: page allocation failure: order:0,
> mode:0x2080020(GFP_ATOMIC)

>We're in the soft interrupt handler at this point, so we have very few options for freeing memory; we can't wait for I/O to complete, for example.

>That said, this is a TCP connection.  We could drop the packet silently without such a noisy warning.  Perhaps just collect statistics on how many packets we dropped due to a low memory situation.

                        total       used       free     shared    buffers     cached
Mem:                       57         56          1          0          0         20
-/+ buffers/cache:                    35         22
Swap:                       0          0          0
eth0      Link encap:Ethernet  HWaddr D6:8C:FD:9D:35:AC
          inet addr:169.254.1.1  Bcast:0.0.0.0  Mask:255.255.255.0
          inet6 addr: fe80::d48c:fdff:fe9d:35ac/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:8466 errors:1 dropped:0 overruns:3 frame:5
          TX packets:1085 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:12230677 (11.6 MiB)  TX bytes:107425 (104.9 KiB)
          Interrupt:77 Base address:0x3000

Not many packets seem to be dropped.


> [  775.956362] CPU: 0 PID: 1288 Comm: nc.traditional Tainted: G           O    4.9.123-pic6-g31a13de-dirty #19
> [  775.966085] Hardware name: Generic R7S72100 (Flattened Device Tree) 
> [  775.972501] [<c0109829>] (unwind_backtrace) from [<c010796f>] (show_stack+0xb/0xc)
> [  775.980118] [<c010796f>] (show_stack) from [<c0151de3>] (warn_alloc+0x89/0xba)
> [  775.987361] [<c0151de3>] (warn_alloc) from [<c0152043>] (__alloc_pages_nodemask+0x1eb/0x634)
> [  775.995790] [<c0152043>] (__alloc_pages_nodemask) from [<c0152523>] (__alloc_page_frag+0x39/0xde)
> [  776.004685] [<c0152523>] (__alloc_page_frag) from [<c03190f1>] (__netdev_alloc_skb+0x51/0xb0)
> [  776.013217] [<c03190f1>] (__netdev_alloc_skb) from [<c02c1b6f>] (sh_eth_poll+0xbf/0x3c0)
> [  776.021342] [<c02c1b6f>] (sh_eth_poll) from [<c031fd8f>] (net_rx_action+0x77/0x170)
> [  776.029051] [<c031fd8f>] (net_rx_action) from [<c011238f>] (__do_softirq+0x107/0x160)
> [  776.036896] [<c011238f>] (__do_softirq) from [<c0112589>] (irq_exit+0x5d/0x80)
> [  776.044165] [<c0112589>] (irq_exit) from [<c012f4db>] (__handle_domain_irq+0x57/0x8c)
> [  776.052007] [<c012f4db>] (__handle_domain_irq) from [<c01012e1>] (gic_handle_irq+0x31/0x48)
> [  776.060362] [<c01012e1>] (gic_handle_irq) from [<c0108025>] (__irq_svc+0x65/0xac)
> [  776.067835] Exception stack(0xc1cafd70 to 0xc1cafdb8)
> [  776.072876] fd60:                                     0002751c c1dec6a0 0000000c 521c3be5
> [  776.081042] fd80: 56feb08e f64823a6 ffb35f7b feab513d f9cb0643 0000056c c1caff10 ffffe000
> [  776.089204] fda0: b1f49160 c1cafdc4 c180c677 c0234ace 200e0033 ffffffff
> [  776.095816] [<c0108025>] (__irq_svc) from [<c0234ace>] (__copy_to_user_std+0x7e/0x430)
> [  776.103796] [<c0234ace>] (__copy_to_user_std) from [<c0241715>] (copy_page_to_iter+0x105/0x250)
> [  776.112503] [<c0241715>] (copy_page_to_iter) from [<c0319aeb>] (skb_copy_datagram_iter+0xa3/0x108)
> [  776.121469] [<c0319aeb>] (skb_copy_datagram_iter) from [<c03443a7>] (tcp_recvmsg+0x3ab/0x5f4)
> [  776.130045] [<c03443a7>] (tcp_recvmsg) from [<c035e249>] (inet_recvmsg+0x21/0x2c)
> [  776.137576] [<c035e249>] (inet_recvmsg) from [<c031009f>] (sock_read_iter+0x51/0x6e)
> [  776.145384] [<c031009f>] (sock_read_iter) from [<c017795d>] (__vfs_read+0x97/0xb0)
> [  776.152967] [<c017795d>] (__vfs_read) from [<c01781d9>] (vfs_read+0x51/0xb0)
> [  776.159983] [<c01781d9>] (vfs_read) from [<c0178aab>] (SyS_read+0x27/0x52)
> [  776.166837] [<c0178aab>] (SyS_read) from [<c0105261>] (ret_fast_syscall+0x1/0x54)
> [  776.174308] Mem-Info:
> [  776.176650] active_anon:2037 inactive_anon:23 isolated_anon:0
> [  776.176650]  active_file:2636 inactive_file:7391 isolated_file:32
> [  776.176650]  unevictable:0 dirty:1366 writeback:1281 unstable:0

>Almost all the dirty pages are under writeback at this point.

> [  776.176650]  slab_reclaimable:719 slab_unreclaimable:724
> [  776.176650]  mapped:1990 shmem:26 pagetables:159 bounce:0
> [  776.176650]  free:373 free_pcp:6 free_cma:0

>We have 373 free pages, but refused to allocate one of them to GFP_ATOMIC?
>I don't understand why that failed.  We also didn't try to steal an inactive_file or inactive_anon page, which seems like an obvious thing we might want to do.

>Yes, that is my concern. We do not have a swap device, so I assume inactive_anon pages cannot be stolen, but inactive_file pages could have been used.

> [  776.209062] Node 0 active_anon:8148kB inactive_anon:92kB active_file:10544kB inactive_file:29564kB unevictable:0kB isolated(anon):0kB isolated(file):128kB mapped:7960kB dirty:5464kB writeback:5124kB shmem:104kB writeback_tmp:0kB unstable:0kB pages_scanned:0 all_unreclaimable? no
> [  776.233602] Normal free:1492kB min:964kB low:1204kB high:1444kB active_anon:8148kB inactive_anon:92kB active_file:10544kB inactive_file:29564kB unevictable:0kB writepending:10588kB present:65536kB managed:59304kB mlocked:0kB slab_reclaimable:2876kB slab_unreclaimable:2896kB kernel_stack:1152kB pagetables:636kB bounce:0kB free_pcp:24kB local_pcp:24kB free_cma:0kB
> [  776.265406] lowmem_reserve[]: 0 0
> [  776.268761] Normal: 7*4kB (H) 5*8kB (H) 7*16kB (H) 5*32kB (H) 6*64kB (H) 2*128kB (H) 2*256kB (H) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 1492kB
> 10071 total pagecache pages
> [  776.284124] 0 pages in swap cache
> [  776.287446] Swap cache stats: add 0, delete 0, find 0/0
> [  776.292645] Free swap  = 0kB
> [  776.295532] Total swap = 0kB
> [  776.298421] 16384 pages RAM
> [  776.301224] 0 pages HighMem/MovableOnly
> [  776.305052] 1558 pages reserved
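
As a rough cross-check of the dump quoted above, the following sketch (Python, purely illustrative) sums the per-order free lists from the "Normal:" line and compares the result with the zone watermarks. The 4 kB page size is an assumption, though it is consistent with "16384 pages RAM" on a 64MB board; the names free_lists and buddy_total_kb are made up for this example and are not kernel interfaces.

```python
PAGE_KB = 4  # assumption: 4 kB pages (16384 pages RAM * 4 kB = 64 MB)

# "Normal: 7*4kB (H) 5*8kB (H) ... 0*4096kB = 1492kB" as (count, block size in kB)
free_lists = [
    (7, 4), (5, 8), (7, 16), (5, 32), (6, 64), (2, 128), (2, 256),
    (0, 512), (0, 1024), (0, 2048), (0, 4096),
]

# Watermarks copied from the "Normal free:1492kB min:964kB ..." line
watermarks_kb = {"min": 964, "low": 1204, "high": 1444}


def buddy_total_kb(lists):
    """Sum count * block size over all orders of the free lists."""
    return sum(count * size_kb for count, size_kb in lists)


free_kb = buddy_total_kb(free_lists)
free_pages = free_kb // PAGE_KB

print("free:", free_kb, "kB =", free_pages, "pages")
for name, kb in watermarks_kb.items():
    print(name, kb, "kB, free above it:", free_kb > kb)
```

The total matches the free:1492kB and free:373 figures in the dump, and it sits above all three watermarks, which is why the failure looks surprising. The sketch only checks the arithmetic; it does not model the (H) tag on every free block or how the allocator treats GFP_ATOMIC requests.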

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [External] Re: linux kernel page allocation failure and tuning of page cache
  2019-06-03  5:30     ` Nagal, Amit               UTC CCS
@ 2019-06-03 12:11       ` Matthew Wilcox
  -1 siblings, 0 replies; 17+ messages in thread
From: Matthew Wilcox @ 2019-06-03 12:11 UTC (permalink / raw)
  To: Nagal, Amit               UTC CCS
  Cc: Alexander Duyck, linux-kernel, linux-mm, CHAWLA, RITU UTC CCS,
	Netter, Christian M UTC CCS

On Mon, Jun 03, 2019 at 05:30:57AM +0000, Nagal, Amit               UTC CCS wrote:
> > [  776.174308] Mem-Info:
> > [  776.176650] active_anon:2037 inactive_anon:23 isolated_anon:0 [  
> > 776.176650]  active_file:2636 inactive_file:7391 isolated_file:32 [  
> > 776.176650]  unevictable:0 dirty:1366 writeback:1281 unstable:0 [  
> > 776.176650]  slab_reclaimable:719 slab_unreclaimable:724 [  
> > 776.176650]  mapped:1990 shmem:26 pagetables:159 bounce:0 [  
> > 776.176650]  free:373 free_pcp:6 free_cma:0 [  776.209062] Node 0 
> > active_anon:8148kB inactive_anon:92kB active_file:10544kB 
> > inactive_file:29564kB unevictable:0kB isolated(anon):0kB 
> > isolated(file):128kB mapped:7960kB dirty:5464kB writeback:5124kB 
> > shmem:104kB writeback_tmp:0kB unstable:0kB pages_scanned:0 
> > all_unreclaimable? no [  776.233602] Normal free:1492kB min:964kB 
> > low:1204kB high:1444kB active_anon:8148kB inactive_anon:92kB 
> > active_file:10544kB inactive_file:29564kB unevictable:0kB 
> > writepending:10588kB present:65536kB managed:59304kB mlocked:0kB 
> > slab_reclaimable:2876kB slab_unreclaimable:2896kB kernel_stack:1152kB 
> > pagetables:636kB bounce:0kB free_pcp:24kB local_pcp:24kB free_cma:0kB 
> > [  776.265406] lowmem_reserve[]: 0 0 [  776.268761] Normal: 7*4kB (H) 
> > 5*8kB (H) 7*16kB (H) 5*32kB (H) 6*64kB (H) 2*128kB (H) 2*256kB (H) 
> > 0*512kB 0*1024kB 0*2048kB 0*4096kB = 1492kB
> > 10071 total pagecache pages
> > [  776.284124] 0 pages in swap cache
> > [  776.287446] Swap cache stats: add 0, delete 0, find 0/0 [  
> > 776.292645] Free swap  = 0kB [  776.295532] Total swap = 0kB [  
> > 776.298421] 16384 pages RAM [  776.301224] 0 pages HighMem/MovableOnly 
> > [  776.305052] 1558 pages reserved
> >
> > 6) We have certain questions, as below:
> > a) How did the kernel memory get exhausted? Under low-memory conditions, are the kernel page flusher threads, which should have written dirty pages from the page cache to flash disk, not executing at the right time? Is the kernel page reclaim mechanism not executing at the right time?
> 
> >I suspect the pages are likely stuck in a state of buffering. In the case of sockets the packets will get queued up until either they can be serviced or the maximum size of the receive buffer has been exceeded and they are dropped.
> 
> My concern here is why the reclaim procedure has not been triggered.

It has triggered.  1281 pages are under writeback.
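
To put a number on that, here is a small illustrative sketch in Python that parses the page counters from the Mem-Info lines quoted above and converts them to kB. It assumes 4 kB pages; MEM_INFO and parse_counters are names invented for this example.

```python
import re

PAGE_KB = 4  # assumption: 4 kB pages

# Counter lines copied from the Mem-Info dump quoted in this thread
MEM_INFO = (
    "active_anon:2037 inactive_anon:23 isolated_anon:0 "
    "active_file:2636 inactive_file:7391 isolated_file:32 "
    "unevictable:0 dirty:1366 writeback:1281 unstable:0 "
    "slab_reclaimable:719 slab_unreclaimable:724 "
    "mapped:1990 shmem:26 pagetables:159 bounce:0 "
    "free:373 free_pcp:6 free_cma:0"
)


def parse_counters(text):
    """Turn 'name:value' pairs into a dict of page counts."""
    return {name: int(value) for name, value in re.findall(r"(\w+):(\d+)", text)}


counters = parse_counters(MEM_INFO)
dirty_kb = counters["dirty"] * PAGE_KB
writeback_kb = counters["writeback"] * PAGE_KB

print("dirty:", dirty_kb, "kB")
print("writeback:", writeback_kb, "kB")
print("dirty + writeback:", dirty_kb + writeback_kb, "kB")
```

With these inputs the results agree with dirty:5464kB and writeback:5124kB in the node line, and the two together equal the writepending:10588kB value in the zone line.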

^ permalink raw reply	[flat|nested] 17+ messages in thread

* RE: [External] Re: linux kernel page allocation failure and tuning of page cache
  2019-06-03 12:11       ` Matthew Wilcox
@ 2019-06-03 14:49         ` Nagal, Amit               UTC CCS
  -1 siblings, 0 replies; 17+ messages in thread
From: Nagal, Amit               UTC CCS @ 2019-06-03 14:49 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Alexander Duyck, linux-kernel, linux-mm, CHAWLA, RITU UTC CCS,
	Netter, Christian M UTC CCS


From: Matthew Wilcox [mailto:willy@infradead.org] 
Sent: Monday, June 3, 2019 5:42 PM
To: Nagal, Amit UTC CCS <Amit.Nagal@utc.com>
On Mon, Jun 03, 2019 at 05:30:57AM +0000, Nagal, Amit               UTC CCS wrote:
> > [  776.174308] Mem-Info:
> > [  776.176650] active_anon:2037 inactive_anon:23 isolated_anon:0 [ 
> > 776.176650]  active_file:2636 inactive_file:7391 isolated_file:32 [ 
> > 776.176650]  unevictable:0 dirty:1366 writeback:1281 unstable:0 [ 
> > 776.176650]  slab_reclaimable:719 slab_unreclaimable:724 [ 
> > 776.176650]  mapped:1990 shmem:26 pagetables:159 bounce:0 [ 
> > 776.176650]  free:373 free_pcp:6 free_cma:0 [  776.209062] Node 0 
> > active_anon:8148kB inactive_anon:92kB active_file:10544kB 
> > inactive_file:29564kB unevictable:0kB isolated(anon):0kB 
> > isolated(file):128kB mapped:7960kB dirty:5464kB writeback:5124kB 
> > shmem:104kB writeback_tmp:0kB unstable:0kB pages_scanned:0 
> > all_unreclaimable? no [  776.233602] Normal free:1492kB min:964kB 
> > low:1204kB high:1444kB active_anon:8148kB inactive_anon:92kB 
> > active_file:10544kB inactive_file:29564kB unevictable:0kB 
> > writepending:10588kB present:65536kB managed:59304kB mlocked:0kB 
> > slab_reclaimable:2876kB slab_unreclaimable:2896kB 
> > kernel_stack:1152kB pagetables:636kB bounce:0kB free_pcp:24kB 
> > local_pcp:24kB free_cma:0kB [  776.265406] lowmem_reserve[]: 0 0 [  
> > 776.268761] Normal: 7*4kB (H) 5*8kB (H) 7*16kB (H) 5*32kB (H) 6*64kB 
> > (H) 2*128kB (H) 2*256kB (H) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 
> > 1492kB
> > 10071 total pagecache pages
> > [  776.284124] 0 pages in swap cache [  776.287446] Swap cache 
> > stats: add 0, delete 0, find 0/0 [ 776.292645] Free swap  = 0kB [  
> > 776.295532] Total swap = 0kB [ 776.298421] 16384 pages RAM [  
> > 776.301224] 0 pages HighMem/MovableOnly [  776.305052] 1558 pages 
> > reserved
> >
> > 6) We have certain questions, as below:
> > a) How did the kernel memory get exhausted? Under low-memory conditions, are the kernel page flusher threads, which should have written dirty pages from the page cache to flash disk, not executing at the right time? Is the kernel page reclaim mechanism not executing at the right time?
> 
> >I suspect the pages are likely stuck in a state of buffering. In the case of sockets the packets will get queued up until either they can be serviced or the maximum size of the receive buffer has been exceeded and they are dropped.
> 
> My concern here is why the reclaim procedure has not been triggered.

>It has triggered.  1281 pages are under writeback.
Thanks for the reply.

Also, on the target, cat /proc/sys/vm/min_free_kbytes reports 965. As per https://www.kernel.org/doc/Documentation/sysctl/vm.txt,
min_free_kbytes should not be set lower than 1024.
Is this min_free_kbytes setting causing the problem?

The target has 64MB of memory. What value is recommended for min_free_kbytes?

Also, is it a problem if the process receiving the socket data runs at elevated priority? (We first set it with chrt -r 20 and later changed it to renice -n -20.)
I observed that the lru-add-drain and writeback threads were executing at normal priority.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* RE: [External] Re: linux kernel page allocation failure and tuning of page cache
  2019-06-03 12:11       ` Matthew Wilcox
@ 2019-06-03 15:12         ` Nagal, Amit               UTC CCS
  -1 siblings, 0 replies; 17+ messages in thread
From: Nagal, Amit               UTC CCS @ 2019-06-03 15:12 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Alexander Duyck, linux-kernel, linux-mm, CHAWLA, RITU UTC CCS,
	Netter, Christian M UTC CCS



From: Matthew Wilcox [mailto:willy@infradead.org]
Sent: Monday, June 3, 2019 5:42 PM
To: Nagal, Amit UTC CCS <Amit.Nagal@utc.com>
On Mon, Jun 03, 2019 at 05:30:57AM +0000, Nagal, Amit               UTC CCS wrote:
> > [  776.174308] Mem-Info:
> > [  776.176650] active_anon:2037 inactive_anon:23 isolated_anon:0 [ 
> > 776.176650]  active_file:2636 inactive_file:7391 isolated_file:32 [ 
> > 776.176650]  unevictable:0 dirty:1366 writeback:1281 unstable:0 [ 
> > 776.176650]  slab_reclaimable:719 slab_unreclaimable:724 [ 
> > 776.176650]  mapped:1990 shmem:26 pagetables:159 bounce:0 [ 
> > 776.176650]  free:373 free_pcp:6 free_cma:0 [  776.209062] Node 0 
> > active_anon:8148kB inactive_anon:92kB active_file:10544kB 
> > inactive_file:29564kB unevictable:0kB isolated(anon):0kB 
> > isolated(file):128kB mapped:7960kB dirty:5464kB writeback:5124kB 
> > shmem:104kB writeback_tmp:0kB unstable:0kB pages_scanned:0 
> > all_unreclaimable? no [  776.233602] Normal free:1492kB min:964kB 
> > low:1204kB high:1444kB active_anon:8148kB inactive_anon:92kB 
> > active_file:10544kB inactive_file:29564kB unevictable:0kB 
> > writepending:10588kB present:65536kB managed:59304kB mlocked:0kB 
> > slab_reclaimable:2876kB slab_unreclaimable:2896kB 
> > kernel_stack:1152kB pagetables:636kB bounce:0kB free_pcp:24kB 
> > local_pcp:24kB free_cma:0kB [  776.265406] lowmem_reserve[]: 0 0 [ 
> > 776.268761] Normal: 7*4kB (H) 5*8kB (H) 7*16kB (H) 5*32kB (H) 6*64kB
> > (H) 2*128kB (H) 2*256kB (H) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 
> > 1492kB
> > 10071 total pagecache pages
> > [  776.284124] 0 pages in swap cache [  776.287446] Swap cache
> > stats: add 0, delete 0, find 0/0 [ 776.292645] Free swap  = 0kB [ 
> > 776.295532] Total swap = 0kB [ 776.298421] 16384 pages RAM [ 
> > 776.301224] 0 pages HighMem/MovableOnly [  776.305052] 1558 pages 
> > reserved
> >
> > 6) We have certain questions, as below:
> > a) How did the kernel memory get exhausted? Under low-memory conditions, are the kernel page flusher threads, which should have written dirty pages from the page cache to flash disk, not executing at the right time? Is the kernel page reclaim mechanism not executing at the right time?
> 
> >I suspect the pages are likely stuck in a state of buffering. In the case of sockets the packets will get queued up until either they can be serviced or the maximum size of the receive buffer has been exceeded and they are dropped.
> 
> My concern here is why the reclaim procedure has not been triggered.

>It has triggered.  1281 pages are under writeback.
Thanks for the reply .
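
For reference, the page counters in the dump can be cross-checked against its kB figures. The sketch below assumes 4 kB pages (consistent with "16384 pages RAM" and "present:65536kB" in the dump); the variable and function names are only illustrative.

```python
# Cross-check figures from the Mem-Info dump, assuming 4 kB pages.
PAGE_KB = 4

def pages_to_kb(pages):
    """Convert a page count to kB."""
    return pages * PAGE_KB

# Counters copied from the dump (in pages).
dirty, writeback, free = 1366, 1281, 373

# Free-list line "Normal: 7*4kB ... 2*256kB" as (count, block size in kB) pairs.
buddy = [(7, 4), (5, 8), (7, 16), (5, 32), (6, 64), (2, 128), (2, 256)]
buddy_free_kb = sum(count * size for count, size in buddy)

# Dirty plus writeback pages should add up to the "writepending" figure.
writepending_kb = pages_to_kb(dirty) + pages_to_kb(writeback)

print("writeback kB:", pages_to_kb(writeback))
print("writepending kB:", writepending_kb)
print("free kB:", pages_to_kb(free), "buddy total kB:", buddy_free_kb)
```

The results agree with the dump: 5124 kB under writeback, 10588 kB write-pending, and 1492 kB free, which is only 528 kB above the min watermark of 964 kB.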

Also, on the target, cat /proc/sys/vm/min_free_kbytes reports 965. As per https://www.kernel.org/doc/Documentation/sysctl/vm.txt, min_free_kbytes should not be set lower than 1024.
Is this min_free_kbytes setting creating the problem?

The target has 64MB of memory. What value is recommended for min_free_kbytes?
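
As background for the 965 figure: in kernels of this era the boot-time default is derived in init_per_zone_wmark_min() (mm/page_alloc.c) as roughly the integer square root of lowmem_kbytes * 16, clamped to the range 128..65536. The Python sketch below only mirrors that arithmetic; default_min_free_kbytes is a hypothetical name, and the "managed" figure from the dump is used as an approximation of the kernel's actual input (nr_free_buffer_pages() at init time), which may differ somewhat.

```python
import math

def default_min_free_kbytes(lowmem_kbytes):
    """Approximate the kernel's boot-time default for min_free_kbytes.

    Mirrors int_sqrt(lowmem_kbytes * 16) with the 128..65536 clamp
    (an assumption based on 4.9-era mm/page_alloc.c).
    """
    value = math.isqrt(lowmem_kbytes * 16)
    return max(128, min(value, 65536))

# A full 64 MB of low memory would give exactly 1024 kB.
print(default_min_free_kbytes(65536))

# The dump reports managed:59304kB, which gives a value below 1024 kB.
print(default_min_free_kbytes(59304))
```

With 59304 kB the sketch yields 974, close to the 965 observed on the target, so a default slightly under 1024 is what this formula would be expected to produce on a 64MB board where part of the memory is reserved.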

Also, is it a problem if the process receiving socket data runs at elevated priority (we first set it with chrt -r 20 and later changed it to renice -n -20)? I observed that the lru-add-drain and writeback threads were executing at normal priority.

What I mean above is two separate iterations of process priority settings (1st iteration: chrt -r 20; 2nd iteration: renice -n -20; there was no iteration in which chrt and renice were used together).
In both priority settings, we got the page allocation failure.
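
The two settings differ in kind: chrt -r 20 places the process in the real-time SCHED_RR class at priority 20, whereas renice -n -20 only changes the nice value within the normal SCHED_OTHER class. A minimal, Linux-only sketch for checking what a given PID actually ended up with (describe_sched is a hypothetical helper name, not part of any tool mentioned in this thread):

```python
import os

# Map the policy constants exposed by the os module to readable names.
POLICY_NAMES = {
    os.SCHED_OTHER: "SCHED_OTHER",
    os.SCHED_FIFO: "SCHED_FIFO",
    os.SCHED_RR: "SCHED_RR",
}

def describe_sched(pid=0):
    """Return scheduling policy, real-time priority and nice value of a PID.

    pid=0 means the calling process.
    """
    policy = os.sched_getscheduler(pid)
    return {
        "policy": POLICY_NAMES.get(policy, str(policy)),
        "rt_priority": os.sched_getparam(pid).sched_priority,
        "nice": os.getpriority(os.PRIO_PROCESS, pid),
    }

print(describe_sched())
```

Running it against the receiver's PID and against the writeback kernel threads would show whether they are in the same scheduling class in each iteration.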


* RE: [External] Re: linux kernel page allocation failure and tuning of page cache
@ 2019-06-03 15:12         ` Nagal, Amit               UTC CCS
  0 siblings, 0 replies; 17+ messages in thread
From: Nagal, Amit               UTC CCS @ 2019-06-03 15:12 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Alexander Duyck, linux-kernel, linux-mm, CHAWLA,
	RITU              UTC CCS, Netter, Christian M       UTC CCS



From: Matthew Wilcox [mailto:willy@infradead.org]
Sent: Monday, June 3, 2019 5:42 PM
To: Nagal, Amit UTC CCS <Amit.Nagal@utc.com>
On Mon, Jun 03, 2019 at 05:30:57AM +0000, Nagal, Amit               UTC CCS wrote:
> > [  776.174308] Mem-Info:
> > [  776.176650] active_anon:2037 inactive_anon:23 isolated_anon:0
> > [  776.176650]  active_file:2636 inactive_file:7391 isolated_file:32
> > [  776.176650]  unevictable:0 dirty:1366 writeback:1281 unstable:0
> > [  776.176650]  slab_reclaimable:719 slab_unreclaimable:724
> > [  776.176650]  mapped:1990 shmem:26 pagetables:159 bounce:0
> > [  776.176650]  free:373 free_pcp:6 free_cma:0
> > [  776.209062] Node 0 active_anon:8148kB inactive_anon:92kB active_file:10544kB inactive_file:29564kB unevictable:0kB isolated(anon):0kB isolated(file):128kB mapped:7960kB dirty:5464kB writeback:5124kB shmem:104kB writeback_tmp:0kB unstable:0kB pages_scanned:0 all_unreclaimable? no
> > [  776.233602] Normal free:1492kB min:964kB low:1204kB high:1444kB active_anon:8148kB inactive_anon:92kB active_file:10544kB inactive_file:29564kB unevictable:0kB writepending:10588kB present:65536kB managed:59304kB mlocked:0kB slab_reclaimable:2876kB slab_unreclaimable:2896kB kernel_stack:1152kB pagetables:636kB bounce:0kB free_pcp:24kB local_pcp:24kB free_cma:0kB
> > [  776.265406] lowmem_reserve[]: 0 0
> > [  776.268761] Normal: 7*4kB (H) 5*8kB (H) 7*16kB (H) 5*32kB (H) 6*64kB (H) 2*128kB (H) 2*256kB (H) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 1492kB
> > 10071 total pagecache pages
> > [  776.284124] 0 pages in swap cache
> > [  776.287446] Swap cache stats: add 0, delete 0, find 0/0
> > [  776.292645] Free swap  = 0kB
> > [  776.295532] Total swap = 0kB
> > [  776.298421] 16384 pages RAM
> > [  776.301224] 0 pages HighMem/MovableOnly
> > [  776.305052] 1558 pages reserved
> >
> > 6) we have certain questions as below :
> > a) How did the kernel memory get exhausted? Under low-memory conditions, are the kernel page flusher threads, which should have written dirty pages from the page cache to the flash disk, not executing at the right time? Is the kernel page reclaim mechanism not executing at the right time?
> 
> >I suspect the pages are likely stuck in a state of buffering. In the case of sockets the packets will get queued up until either they can be serviced or the maximum size of the receive buffer has been exceeded and they are dropped.
> 
> My concern here is why the reclaim procedure has not been triggered.

>It has triggered.  1281 pages are under writeback.
Thanks for the reply .

Also, on the target, cat /proc/sys/vm/min_free_kbytes reports 965. As per https://www.kernel.org/doc/Documentation/sysctl/vm.txt, min_free_kbytes should not be set lower than 1024.
Is this min_free_kbytes setting creating the problem?

The target has 64MB of memory. What value is recommended for min_free_kbytes?

Also, is it a problem if the process receiving socket data runs at elevated priority (we first set it with chrt -r 20 and later changed it to renice -n -20)? I observed that the lru-add-drain and writeback threads were executing at normal priority.

What I mean above is two separate iterations of process priority settings (1st iteration: chrt -r 20; 2nd iteration: renice -n -20; there was no iteration in which chrt and renice were used together).
In both priority settings, we got the page allocation failure.


Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
2019-05-31 15:07 linux kernel page allocation failure and tuning of page cache Nagal, Amit               UTC CCS
2019-05-31 19:30 ` Matthew Wilcox
2019-05-31 19:30   ` Matthew Wilcox
2019-06-03  5:37   ` [External] " Nagal, Amit               UTC CCS
2019-06-03  5:37     ` Nagal, Amit               UTC CCS
2019-06-03 10:32   ` Nagal, Amit               UTC CCS
2019-06-03 10:32     ` Nagal, Amit               UTC CCS
2019-05-31 21:27 ` Alexander Duyck
2019-05-31 21:27   ` Alexander Duyck
2019-06-03  5:30   ` [External] " Nagal, Amit               UTC CCS
2019-06-03  5:30     ` Nagal, Amit               UTC CCS
2019-06-03 12:11     ` Matthew Wilcox
2019-06-03 12:11       ` Matthew Wilcox
2019-06-03 14:49       ` Nagal, Amit               UTC CCS
2019-06-03 14:49         ` Nagal, Amit               UTC CCS
2019-06-03 15:12       ` Nagal, Amit               UTC CCS
2019-06-03 15:12         ` Nagal, Amit               UTC CCS