linux-mm.kvack.org archive mirror
From: Vlastimil Babka <vbabka@suse.cz>
To: Steven Noonan <steven@uplinklabs.net>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>
Cc: Christoph Hellwig <hch@infradead.org>,
	Josef Bacik <josef@toxicpanda.com>,
	Matthew Wilcox <willy@infradead.org>
Subject: Re: "cat /proc/sys/kernel/random/entropy_avail" sometimes causes page allocation failure
Date: Mon, 1 Feb 2021 13:20:04 +0100	[thread overview]
Message-ID: <592ccd73-9152-0748-a7ce-be663593f8da@suse.cz> (raw)
In-Reply-To: <mlkR23JXTHsWUCZo0s8vcnO5Ge2ooq9s5rdDc26ASgFrgXnI5-Tt-m-EC8MgxzDrjJMFm0uDEtWfPO6bnF-E9R73YZrA8b6MmHbq-m8YzTY=@uplinklabs.net>

On 2/1/21 12:26 PM, Steven Noonan wrote:
> (Please CC me on replies, I'm not subscribed to the list.)
> 
> I started seeing this problem several months ago, but after some Google searches I concluded that someone had already root caused and fixed it, and I just needed to wait for a newer kernel to come along:
> 
> https://www.spinics.net/lists/linux-mm/msg226311.html

Yeah, that thread mentions Josef's series [1], but I don't see that it ever made
it into the git log. What happened to it? Meanwhile Matthew refactored the code
in commit 4bd6a7353ee1 ("sysctl: Convert to iter interfaces"), which kept the
kzalloc() but changed the allocation size from count to count+1, which would
explain the change from order-5 to order-6. In my quick test, strace shows that
cat uses 128k-sized reads, and 128k is exactly order-5.

So minimally that should be a kvzalloc(), but it's still somewhat unfortunate to
allocate that much just to read a few bytes.
Also, the check against KMALLOC_MAX_SIZE happens before the +1, so a read of
exactly KMALLOC_MAX_SIZE passes the check and then allocates one byte over the
limit. Meh. Matthew?

[1]
https://lore.kernel.org/linux-fsdevel/20200813210411.905010-2-josef@toxicpanda.com/

> But it turns out this has not yet been fixed, even in 5.10.10. Also, while it used to be logged as an order:5 allocation it's now an order:6 allocation for some reason. It only happens rarely (anywhere from once per day to once per week), but it's kind of scary to check the kernel log and see this:
> 
> [569537.153097] cat: page allocation failure: order:6, mode:0x40dc0(GFP_KERNEL|__GFP_COMP|__GFP_ZERO), nodemask=(null),cpuset=/,mems_allowed=0
> [569537.153104] CPU: 1 PID: 3153417 Comm: cat Tainted: P S        IO      5.10.10-1-hsw #1
> [569537.153105] Hardware name: ASUSTeK COMPUTER INC. P10S-M WS Series/P10S-M WS Series, BIOS 4503 07/23/2019
> [569537.153106] Call Trace:
> [569537.153111]  dump_stack+0x58/0x6b
> [569537.153113]  warn_alloc.cold+0x78/0xdc
> [569537.153116]  ? __alloc_pages_direct_compact+0x15f/0x170
> [569537.153117]  __alloc_pages_slowpath.constprop.0+0xc4e/0xc70
> [569537.153120]  __alloc_pages_nodemask+0x2de/0x310
> [569537.153121]  kmalloc_order+0x25/0x80
> [569537.153122]  kmalloc_order_trace+0x17/0x80
> [569537.153125]  proc_sys_call_handler+0xb8/0x250
> [569537.153127]  new_sync_read+0x10f/0x190
> [569537.153128]  vfs_read+0x147/0x1a0
> [569537.153130]  ksys_read+0x5f/0xe0
> [569537.153131]  do_syscall_64+0x2d/0x70
> [569537.153134]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> [569537.153135] RIP: 0033:0x7fcc21ef92b2
> [569537.153137] Code: c0 e9 b2 fe ff ff 50 48 8d 3d d2 2a 0a 00 e8 e5 ec 01 00 0f 1f 44 00 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 0f 05 <48> 3d 00 f0 ff ff 77 56 c3 0f 1f 44 00 00 48 83 ec 28 48 89 54 24
> [569537.153138] RSP: 002b:00007ffcfbcc5c88 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
> [569537.153140] RAX: ffffffffffffffda RBX: 0000000000020000 RCX: 00007fcc21ef92b2
> [569537.153140] RDX: 0000000000020000 RSI: 00007fcc21ff8000 RDI: 0000000000000003
> [569537.153141] RBP: 00007fcc21ff8000 R08: 00007fcc21ff7010 R09: 0000000000000000
> [569537.153142] R10: 0000000000000022 R11: 0000000000000246 R12: 0000556e99b70c00
> [569537.153142] R13: 0000000000000003 R14: 0000000000020000 R15: 0000000000020000
> [569537.153144] Mem-Info:
> [569537.153146] active_anon:50924 inactive_anon:1715988 isolated_anon:0
>                  active_file:163427 inactive_file:104041 isolated_file:0
>                  unevictable:3247 dirty:5825 writeback:0
>                  slab_reclaimable:39566 slab_unreclaimable:1760175
>                  mapped:49406 shmem:17850 pagetables:10611 bounce:0
>                  free:1060578 free_pcp:0 free_cma:0
> [569537.153149] Node 0 active_anon:203696kB inactive_anon:6863952kB active_file:653708kB inactive_file:416164kB unevictable:12988kB isolated(anon):0kB isolated(file):0kB mapped:197624kB dirty:23300kB writeback:0kB shmem:71400kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 0kB writeback_tmp:0kB kernel_stack:18112kB all_unreclaimable? no
> [569537.153151] Node 0 DMA free:11800kB min:24kB low:36kB high:48kB reserved_highatomic:0KB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15984kB managed:15896kB mlocked:0kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
> [569537.153153] lowmem_reserve[]: 0 1810 64139 64139 64139
> [569537.153156] Node 0 DMA32 free:252476kB min:3208kB low:5076kB high:6944kB reserved_highatomic:0KB active_anon:756kB inactive_anon:124844kB active_file:2120kB inactive_file:180kB unevictable:0kB writepending:0kB present:1937436kB managed:1871900kB mlocked:0kB pagetables:8kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
> [569537.153158] lowmem_reserve[]: 0 0 62329 62329 62329
> [569537.153160] Node 0 Normal free:3978036kB min:109404kB low:173232kB high:237060kB reserved_highatomic:2048KB active_anon:202940kB inactive_anon:6739108kB active_file:651588kB inactive_file:415984kB unevictable:12988kB writepending:23076kB present:65011712kB managed:63831896kB mlocked:11176kB pagetables:42436kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
> [569537.153162] lowmem_reserve[]: 0 0 0 0 0
> [569537.153164] Node 0 DMA: 2*4kB (U) 2*8kB (U) 2*16kB (U) 3*32kB (U) 2*64kB (U) 2*128kB (U) 0*256kB 0*512kB 1*1024kB (U) 1*2048kB (M) 2*4096kB (M) = 11800kB
> [569537.153171] Node 0 DMA32: 3738*4kB (UME) 2341*8kB (UME) 1719*16kB (UME) 943*32kB (UME) 443*64kB (UME) 19*128kB (UME) 108*256kB (UME) 169*512kB (UM) 8*1024kB (UM) 4*2048kB (UM) 0*4096kB = 252704kB
> [569537.153179] Node 0 Normal: 81071*4kB (UMEH) 112101*8kB (UMEH) 101108*16kB (UMEH) 34297*32kB (UMEH) 660*64kB (UME) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 3978564kB
> [569537.153185] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
> [569537.153186] Node 0 hugepages_total=64 hugepages_free=51 hugepages_surp=64 hugepages_size=2048kB
> [569537.153186] 298118 total pagecache pages
> [569537.153197] 14242 pages in swap cache
> [569537.153198] Swap cache stats: add 165298, delete 151079, find 10932118/10944726
> [569537.153199] Free swap  = 100129528kB
> [569537.153199] Total swap = 100663288kB
> [569537.153200] 16741283 pages RAM
> [569537.153201] 0 pages HighMem/MovableOnly
> [569537.153201] 311360 pages reserved
> [569537.153202] 0 pages hwpoisoned
> 
> I used sysdig to figure out what 'cat' was trying to read when it failed, via:
> 
> # sysdig -p "%12user.name %6proc.pid %12proc.name %fd.name %evt.args" 'evt.type=read and evt.failed=true and proc.name=cat'
> 
> The relevant one was this:
> 
> nobody       315341 cat          /proc/sys/kernel/random/entropy_avail res=-12(ENOMEM) data=
> 
> Which I believe is a munin script gathering a bunch of data for graphs (munin-node runs as "nobody", uses shell scripts for its data gathering, and one of the plugins tracks entropy_avail, so that would seem to be the guilty party).
> 
> Any idea why it's trying to do an order:6 allocation for just a simple read of entropy_avail? Based on the stack trace I'm guessing this isn't owned by the "random" driver maintainers, it seems to be a more general issue with memory allocations while reading from procfs.
> 



  reply	other threads:[~2021-02-01 12:20 UTC|newest]

Thread overview: 4+ messages
2021-02-01 11:26 "cat /proc/sys/kernel/random/entropy_avail" sometimes causes page allocation failure Steven Noonan
2021-02-01 12:20 ` Vlastimil Babka [this message]
2021-02-01 13:52   ` Matthew Wilcox
2021-02-09 11:17     ` Vlastimil Babka
