"cat /proc/sys/kernel/random/entropy_avail" sometimes causes page allocation failure

* "cat /proc/sys/kernel/random/entropy_avail" sometimes causes page allocation failure
@ 2021-02-01 11:26 Steven Noonan
  2021-02-01 12:20 ` Vlastimil Babka
  0 siblings, 1 reply; 4+ messages in thread
From: Steven Noonan @ 2021-02-01 11:26 UTC (permalink / raw)
  To: linux-mm

[-- Attachment #1.1: Type: text/plain, Size: 6323 bytes --]

(Please CC me on replies, I'm not subscribed to the list.)

I started seeing this problem several months ago, but after some Google searches I concluded that someone had already root caused and fixed it, and I just needed to wait for a newer kernel to come along:

https://www.spinics.net/lists/linux-mm/msg226311.html

But it turns out this has not yet been fixed, even in 5.10.10. Also, while it used to be logged as an order:5 allocation it's now an order:6 allocation for some reason. It only happens rarely (anywhere from once per day to once per week), but it's kind of scary to check the kernel log and see this:

[569537.153097] cat: page allocation failure: order:6, mode:0x40dc0(GFP_KERNEL|__GFP_COMP|__GFP_ZERO), nodemask=(null),cpuset=/,mems_allowed=0
[569537.153104] CPU: 1 PID: 3153417 Comm: cat Tainted: P S        IO      5.10.10-1-hsw #1
[569537.153105] Hardware name: ASUSTeK COMPUTER INC. P10S-M WS Series/P10S-M WS Series, BIOS 4503 07/23/2019
[569537.153106] Call Trace:
[569537.153111]  dump_stack+0x58/0x6b
[569537.153113]  warn_alloc.cold+0x78/0xdc
[569537.153116]  ? __alloc_pages_direct_compact+0x15f/0x170
[569537.153117]  __alloc_pages_slowpath.constprop.0+0xc4e/0xc70
[569537.153120]  __alloc_pages_nodemask+0x2de/0x310
[569537.153121]  kmalloc_order+0x25/0x80
[569537.153122]  kmalloc_order_trace+0x17/0x80
[569537.153125]  proc_sys_call_handler+0xb8/0x250
[569537.153127]  new_sync_read+0x10f/0x190
[569537.153128]  vfs_read+0x147/0x1a0
[569537.153130]  ksys_read+0x5f/0xe0
[569537.153131]  do_syscall_64+0x2d/0x70
[569537.153134]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[569537.153135] RIP: 0033:0x7fcc21ef92b2
[569537.153137] Code: c0 e9 b2 fe ff ff 50 48 8d 3d d2 2a 0a 00 e8 e5 ec 01 00 0f 1f 44 00 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 0f 05 &lt;48&gt; 3d 00 f0 ff ff 77 56 c3 0f 1f 44 00 00 48 83 ec 28 48 89 54 24
[569537.153138] RSP: 002b:00007ffcfbcc5c88 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
[569537.153140] RAX: ffffffffffffffda RBX: 0000000000020000 RCX: 00007fcc21ef92b2
[569537.153140] RDX: 0000000000020000 RSI: 00007fcc21ff8000 RDI: 0000000000000003
[569537.153141] RBP: 00007fcc21ff8000 R08: 00007fcc21ff7010 R09: 0000000000000000
[569537.153142] R10: 0000000000000022 R11: 0000000000000246 R12: 0000556e99b70c00
[569537.153142] R13: 0000000000000003 R14: 0000000000020000 R15: 0000000000020000
[569537.153144] Mem-Info:
[569537.153146] active_anon:50924 inactive_anon:1715988 isolated_anon:0
                 active_file:163427 inactive_file:104041 isolated_file:0
                 unevictable:3247 dirty:5825 writeback:0
                 slab_reclaimable:39566 slab_unreclaimable:1760175
                 mapped:49406 shmem:17850 pagetables:10611 bounce:0
                 free:1060578 free_pcp:0 free_cma:0
[569537.153149] Node 0 active_anon:203696kB inactive_anon:6863952kB active_file:653708kB inactive_file:416164kB unevictable:12988kB isolated(anon):0kB isolated(file):0kB mapped:197624kB dirty:23300kB writeback:0kB shmem:71400kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 0kB writeback_tmp:0kB kernel_stack:18112kB all_unreclaimable? no
[569537.153151] Node 0 DMA free:11800kB min:24kB low:36kB high:48kB reserved_highatomic:0KB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15984kB managed:15896kB mlocked:0kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[569537.153153] lowmem_reserve[]: 0 1810 64139 64139 64139
[569537.153156] Node 0 DMA32 free:252476kB min:3208kB low:5076kB high:6944kB reserved_highatomic:0KB active_anon:756kB inactive_anon:124844kB active_file:2120kB inactive_file:180kB unevictable:0kB writepending:0kB present:1937436kB managed:1871900kB mlocked:0kB pagetables:8kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[569537.153158] lowmem_reserve[]: 0 0 62329 62329 62329
[569537.153160] Node 0 Normal free:3978036kB min:109404kB low:173232kB high:237060kB reserved_highatomic:2048KB active_anon:202940kB inactive_anon:6739108kB active_file:651588kB inactive_file:415984kB unevictable:12988kB writepending:23076kB present:65011712kB managed:63831896kB mlocked:11176kB pagetables:42436kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[569537.153162] lowmem_reserve[]: 0 0 0 0 0
[569537.153164] Node 0 DMA: 2*4kB (U) 2*8kB (U) 2*16kB (U) 3*32kB (U) 2*64kB (U) 2*128kB (U) 0*256kB 0*512kB 1*1024kB (U) 1*2048kB (M) 2*4096kB (M) = 11800kB
[569537.153171] Node 0 DMA32: 3738*4kB (UME) 2341*8kB (UME) 1719*16kB (UME) 943*32kB (UME) 443*64kB (UME) 19*128kB (UME) 108*256kB (UME) 169*512kB (UM) 8*1024kB (UM) 4*2048kB (UM) 0*4096kB = 252704kB
[569537.153179] Node 0 Normal: 81071*4kB (UMEH) 112101*8kB (UMEH) 101108*16kB (UMEH) 34297*32kB (UMEH) 660*64kB (UME) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 3978564kB
[569537.153185] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[569537.153186] Node 0 hugepages_total=64 hugepages_free=51 hugepages_surp=64 hugepages_size=2048kB
[569537.153186] 298118 total pagecache pages
[569537.153197] 14242 pages in swap cache
[569537.153198] Swap cache stats: add 165298, delete 151079, find 10932118/10944726
[569537.153199] Free swap  = 100129528kB
[569537.153199] Total swap = 100663288kB
[569537.153200] 16741283 pages RAM
[569537.153201] 0 pages HighMem/MovableOnly
[569537.153201] 311360 pages reserved
[569537.153202] 0 pages hwpoisoned

I used sysdig to figure out what 'cat' was trying to read when it failed, via:

# sysdig -p "%12user.name %6proc.pid %12proc.name %fd.name %evt.args" 'evt.type=read and evt.failed=true and proc.name=cat'

The relevant one was this:

nobody       315341 cat          /proc/sys/kernel/random/entropy_avail res=-12(ENOMEM) data=

Which I believe is a munin script gathering a bunch of data for graphs (munin-node runs as "nobody", uses shell scripts for its data gathering, and one of the plugins tracks entropy_avail, so that would seem to be the guilty party).

Any idea why it's trying to do an order:6 allocation for just a simple read of entropy_avail? Based on the stack trace I'm guessing this isn't owned by the "random" driver maintainers, it seems to be a more general issue with memory allocations while reading from procfs.

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 509 bytes --]

^ permalink raw reply	[flat|nested] 4+ messages in thread