/proc/stat vs. failed order-4 allocation

* /proc/stat vs. failed order-4 allocation
@ 2014-05-21 12:25 Heiko Carstens
  2014-05-21 14:32 ` Christoph Hellwig
  2014-05-22 11:29 ` /proc/stat vs. failed order-4 allocation Ian Kent
  0 siblings, 2 replies; 21+ messages in thread
From: Heiko Carstens @ 2014-05-21 12:25 UTC (permalink / raw)
  To: Andrew Morton, KAMEZAWA Hiroyuki, Andrea Righi, Eric Dumazet
  Cc: linux-kernel, linux-fsdevel, Hendrik Brueckner, Thorsten Diehl

Hi all,

I'm just wondering why /proc/stat is a single_open() seq_file and not a
regular seq_file with an iterator (say 48 online cpus for each iteration
or something similar).

Of course, in theory, the "intr" line may be very long as well...

With the current implementation everything must fit into a single buffer.
So if memory is highly fragmented we run into failing higher order
allocations (like below), which effectively means reading /proc/stat
doesn't work anymore.

>From stat_open:

        size_t size = 1024 + 128 * num_possible_cpus();
	[...]
        /* minimum size to display an interrupt count : 2 bytes */
        size += 2 * nr_irqs;
	[...]
        buf = kmalloc(size, GFP_KERNEL);
        if (!buf)
                return -ENOMEM;

With num_possible_cpus() = 256 we end up with an order 4 allocation.

So, would there be any objections, adding a cpu iterator to /proc/stat?

62129.701569] sadc: page allocation failure: order:4, mode:0x1040d0
[62129.701573] CPU: 1 PID: 192063 Comm: sadc Not tainted 3.10.0-123.el7.s390x #1
[62129.701574] 00000000edf27840 00000000edf27850 0000000000000002 0000000000000000 
00000000edf278e0 00000000edf27858 00000000edf27858 00000000001120c0 
0000000000000000 000000000072c7c0 0000000000711836 000000000000000b 
00000000edf278a0 00000000edf27840 0000000000000000 0000000000000000 
00000000001040d0 00000000001120c0 00000000edf27840 00000000edf278a0 
[62129.701586] Call Trace:
[62129.701588] ([<0000000000111fbe>] show_trace+0xe6/0x130)
[62129.701591] [<0000000000112074>] show_stack+0x6c/0xe8
[62129.701593] [<000000000020d356>] warn_alloc_failed+0xd6/0x138
[62129.701596] [<00000000002114d2>] __alloc_pages_nodemask+0x9da/0xb68
[62129.701598] [<000000000021168e>] __get_free_pages+0x2e/0x58
[62129.701599] [<000000000025a05c>] kmalloc_order_trace+0x44/0xc0
[62129.701602] [<00000000002f3ffa>] stat_open+0x5a/0xd8
[62129.701604] [<00000000002e9aaa>] proc_reg_open+0x8a/0x140
[62129.701606] [<0000000000273b64>] do_dentry_open+0x1bc/0x2c8
[62129.701608] [<000000000027411e>] finish_open+0x46/0x60
[62129.701610] [<000000000028675a>] do_last+0x382/0x10d0
[62129.701612] [<0000000000287570>] path_openat+0xc8/0x4f8
[62129.701614] [<0000000000288bde>] do_filp_open+0x46/0xa8
[62129.701616] [<000000000027541c>] do_sys_open+0x114/0x1f0
[62129.701618] [<00000000005b1c1c>] sysc_tracego+0x14/0x1a
[62129.701620] [<000003fffd0040a0>] 0x3fffd0040a0
[62129.701624] Mem-Info:
[62129.701625] DMA per-cpu:
[62129.701627] CPU 0: hi: 186, btch: 31 usd: 0
[62129.701628] CPU 1: hi: 186, btch: 31 usd: 0
[62129.701630] CPU 2: hi: 186, btch: 31 usd: 51
[62129.701631] Normal per-cpu:
[62129.701632] CPU 0: hi: 186, btch: 31 usd: 30
[62129.701634] CPU 1: hi: 186, btch: 31 usd: 0
[62129.701635] CPU 2: hi: 186, btch: 31 usd: 0
[62129.701639] active_anon:5416 inactive_anon:5571 isolated_anon:0
active_file:440513 inactive_file:406221 isolated_file:27
unevictable:1741 dirty:35305 writeback:0 unstable:0
free:40319 slab_reclaimable:41921 slab_unreclaimable:34553
mapped:3921 shmem:1351 pagetables:296 bounce:0
free_cma:0
[62129.701648] DMA free:25192kB min:11800kB low:14748kB high:17700kB active_anon:11032kB inactive_anon:11320kB active_file:1002092kB inactive_file:904260kB unevictable:3772kB isolated(anon):0kB isolated(file):4kB present:2097152kB managed:2070452kB mlocked:3772kB dirty:55072kB writeback:0kB mapped:6316kB shmem:1152kB slab_reclaimable:61192kB slab_unreclaimable:50108kB kernel_stack:2368kB pagetables:532kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:133 all_unreclaimable? no
[62129.701652] lowmem_reserve[]: 0 1837 1837
[62129.701658] Normal free:136084kB min:10724kB low:13404kB high:16084kB active_anon:10632kB inactive_anon:10964kB active_file:759960kB inactive_file:720624kB unevictable:3192kB isolated(anon):0kB isolated(file):4kB present:1966080kB managed:1881776kB mlocked:3192kB dirty:86148kB writeback:0kB mapped:9368kB shmem:4252kB slab_reclaimable:106492kB slab_unreclaimable:88104kB kernel_stack:5808kB pagetables:652kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
[62129.701661] lowmem_reserve[]: 0 0 0
[62129.701664] DMA: 1540*4kB (UEM) 2217*8kB (UEM) 9*16kB (UEM) 1*32kB (R) 5*64kB (R) 1*128kB (R) 1*256kB (R) 1*512kB (R) 0*1024kB = 25288kB
[62129.701673] Normal: 21631*4kB (UEM) 5755*8kB (UEM) 145*16kB (UEM) 8*32kB (ER) 4*64kB (R) 2*128kB (R) 0*256kB 1*512kB (R) 0*1024kB = 136164kB
[62129.701682] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1024kB
[62129.701684] 849331 total pagecache pages
[62129.701685] 131 pages in swap cache
[62129.701687] Swap cache stats: add 9956, delete 9825, find 1049/1416
[62129.701688] Free swap = 7186784kB
[62129.701689] Total swap = 7212140kB
[62129.710679] 1015808 pages RAM
[62129.710681] 23437 pages reserved
[62129.710682] 1360146 pages shared
[62129.710683] 384507 pages non-shared

^ permalink raw reply	[flat|nested] 21+ messages in thread