From mboxrd@z Thu Jan 1 00:00:00 1970
From: Xuzhichuang
To: David Rientjes
CC: "linux-mm@kvack.org", "linux-kernel@vger.kernel.org", "Songjiangtao (mygirlsjt)", "Zhangwei (FF)", Qiuxishi
Subject: Re: [BUG REPORT] OOM Killer is invoked while the system still has much memory
Date: Wed, 15 Jul 2015 01:46:51 +0000
Message-ID: <6D317A699782EA4DB9A0E6266C9219696CA2B45B@SZXEMA501-MBX.china.huawei.com>
References: <6D317A699782EA4DB9A0E6266C9219696CA2B3BC@SZXEMA501-MBX.china.huawei.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Hi,

Thanks for your reply.

According to the OOM message, the OOM killer is invoked from seq_read(). I found two patches in the latest kernel that avoid this problem:

https://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/commit/fs/seq_file.c?id=058504edd02667eef8fac9be27ab3ea74332e9b4
https://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/commit/fs/seq_file.c?id=5cec38ac866bfb8775638e71a86e4d8cac30caae

As the patch descriptions say, the seq_file code now falls back to a vmalloc() allocation when kmalloc() fails, instead of OOM-killing processes.
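If I read the two commits correctly, their combined effect is roughly the helper below (my paraphrase for illustration, not the exact upstream diff; the name and flags are from my reading of the commit messages):

#include <linux/slab.h>
#include <linux/vmalloc.h>

/*
 * Sketch of the fallback the two commits introduce: try a physically
 * contiguous kmalloc() first, but tell the page allocator not to retry
 * hard or invoke the OOM killer; if that fails and the buffer is larger
 * than a page, fall back to vmalloc(), which only needs order-0 pages.
 */
static void *seq_buf_alloc(unsigned long size)
{
	void *buf;

	buf = kmalloc(size, GFP_KERNEL | __GFP_NORETRY | __GFP_NOWARN);
	if (!buf && size > PAGE_SIZE)
		buf = vmalloc(size);
	return buf;
}

With that in place, a failing order-2 kmalloc() in seq_read() no longer triggers the OOM killer, which is exactly the path in our trace.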
-----Original Message-----
From: David Rientjes [mailto:rientjes@google.com]
Sent: 15 July 2015 8:10
To: Xuzhichuang
Cc: linux-mm@kvack.org; linux-kernel@vger.kernel.org; Songjiangtao (mygirlsjt); Zhangwei (FF); Qiuxishi
Subject: Re: [BUG REPORT] OOM Killer is invoked while the system still has much memory

On Tue, 14 Jul 2015, Xuzhichuang wrote:

> Jul 10 12:33:03 BMS_CNA04 kernel: [18136514.138764] iostat invoked oom-killer: gfp_mask=0xd0, order=2, oom_adj=0, oom_score_adj=0
> Jul 10 12:33:03 BMS_CNA04 kernel: [18136514.138769] iostat cpuset=/ mems_allowed=0
> Jul 10 12:33:03 BMS_CNA04 kernel: [18136514.138773] Pid: 18117, comm: iostat Tainted: P        W  NX 3.0.58-0.6.6-xen #1
> Jul 10 12:33:03 BMS_CNA04 kernel: [18136514.138775] Call Trace:
> Jul 10 12:33:03 BMS_CNA04 kernel: [18136514.138800]  [] dump_trace+0x6e/0x1a0
> Jul 10 12:33:03 BMS_CNA04 kernel: [18136514.138810]  [] dump_stack+0x69/0x6f
> Jul 10 12:33:03 BMS_CNA04 kernel: [18136514.138821]  [] dump_header+0x9d/0x120
> Jul 10 12:33:03 BMS_CNA04 kernel: [18136514.138826]  [] oom_kill_process+0x95/0x1a0
> Jul 10 12:33:03 BMS_CNA04 kernel: [18136514.138830]  [] out_of_memory+0x136/0x220
> Jul 10 12:33:03 BMS_CNA04 kernel: [18136514.138834]  [] __alloc_pages_slowpath+0x7ba/0x810
> Jul 10 12:33:03 BMS_CNA04 kernel: [18136514.138838]  [] __alloc_pages_nodemask+0x1e9/0x200
> Jul 10 12:33:03 BMS_CNA04 kernel: [18136514.138845]  [] cache_grow+0x348/0x450
> Jul 10 12:33:03 BMS_CNA04 kernel: [18136514.138850]  [] cache_alloc_refill+0x303/0x4d0
> Jul 10 12:33:03 BMS_CNA04 kernel: [18136514.138854]  [] __kmalloc+0x1b0/0x290
> Jul 10 12:33:03 BMS_CNA04 kernel: [18136514.138862]  [] seq_read+0x13a/0x3b0
> Jul 10 12:33:03 BMS_CNA04 kernel: [18136514.138869]  [] proc_reg_read+0x92/0xe0
> Jul 10 12:33:03 BMS_CNA04 kernel: [18136514.138877]  [] vfs_read+0xc7/0x130
> Jul 10 12:33:03 BMS_CNA04 kernel: [18136514.138881]  [] sys_read+0x53/0xa0
> Jul 10 12:33:03 BMS_CNA04 kernel: [18136514.138887]  [] system_call_fastpath+0x16/0x1b
> Jul 10 12:33:03 BMS_CNA04 kernel: [18136514.138922]  [<00007f935f57f4c0>] 0x7f935f57f4bf
> Jul 10 12:33:03 BMS_CNA04 kernel: [18136514.138923] Mem-Info:
> Jul 10 12:33:03 BMS_CNA04 kernel: [18136514.138925] DMA per-cpu:
> Jul 10 12:33:03 BMS_CNA04 kernel: [18136514.138927] CPU    0: hi:    0, btch:   1 usd:   0
> Jul 10 12:33:03 BMS_CNA04 kernel: [18136514.138929] CPU    1: hi:    0, btch:   1 usd:   0
> Jul 10 12:33:03 BMS_CNA04 kernel: [18136514.138930] DMA32 per-cpu:
> Jul 10 12:33:03 BMS_CNA04 kernel: [18136514.138932] CPU    0: hi:  155, btch:  38 usd:  11
> Jul 10 12:33:03 BMS_CNA04 kernel: [18136514.138933] CPU    1: hi:  155, btch:  38 usd:   0
> Jul 10 12:33:03 BMS_CNA04 kernel: [18136514.138936] active_anon:227111 inactive_anon:10382 isolated_anon:0
> Jul 10 12:33:03 BMS_CNA04 kernel: [18136514.138937]  active_file:203 inactive_file:189 isolated_file:47
> Jul 10 12:33:03 BMS_CNA04 kernel: [18136514.138938]  unevictable:95395 dirty:0 writeback:0 unstable:0
> Jul 10 12:33:03 BMS_CNA04 kernel: [18136514.138939]  free:247834 slab_reclaimable:18187 slab_unreclaimable:53853
> Jul 10 12:33:03 BMS_CNA04 kernel: [18136514.138940]  mapped:11485 shmem:11167 pagetables:0 bounce:0
> Jul 10 12:33:03 BMS_CNA04 kernel: [18136514.138945] DMA free:984kB min:36kB low:44kB high:52kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:16160kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
> Jul 10 12:33:03 BMS_CNA04 kernel: [18136514.138949] lowmem_reserve[]: 0 3014 3014 3014
> Jul 10 12:33:03 BMS_CNA04 kernel: [18136514.138955] DMA32 free:990352kB min:7004kB low:8752kB high:10504kB active_anon:908444kB inactive_anon:41528kB active_file:812kB inactive_file:756kB unevictable:381580kB isolated(anon):0kB isolated(file):188kB present:3025264kB mlocked:381580kB dirty:0kB writeback:0kB mapped:45940kB shmem:44668kB slab_reclaimable:72748kB slab_unreclaimable:215412kB kernel_stack:12456kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:192 all_unreclaimable? no
> Jul 10 12:33:03 BMS_CNA04 kernel: [18136514.138960] lowmem_reserve[]: 0 0 0 0
> Jul 10 12:33:03 BMS_CNA04 kernel: [18136514.138962] DMA: 2*4kB 4*8kB 3*16kB 4*32kB 2*64kB 1*128kB 2*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 984kB
> Jul 10 12:33:03 BMS_CNA04 kernel: [18136514.138968] DMA32: 188513*4kB 29459*8kB 2*16kB 2*32kB 1*64kB 0*128kB 0*256kB 1*512kB 0*1024kB 0*2048kB 0*4096kB = 990396kB

The problem is that most of your memory for ZONE_DMA32 is available only in sizes of order-0 and order-1, and the slab allocator is trying to allocate order-2 memory with no possibility of falling back to a smaller order.

You're running on a 3.0.58 kernel, but the watermark calculation should be the same in recent kernels. If you follow the logic of __zone_watermark_ok(), which uses the same watermarks as printed above, the min watermark for this zone is 1751 pages and the total zone free pages is 247588. Discounting order-0 memory, there are only 59075 pages free with a min watermark of 875 pages. Discounting order-1 memory, there are 157 pages free with a min watermark of 437 pages. This is where your allocation fails. Even though the zone has 672KB of memory available, the per-order watermark check fails.

The only option you have to avoid this, other than changing your workload, is to alter lowmem_reserve_ratio; see Documentation/sysctl/vm.txt. You have 916KB of memory in ZONE_DMA that could be used for this allocation if it weren't reserved for DMA allocations.
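To make sure I follow the __zone_watermark_ok() walkthrough above, I replayed the per-order check in a small userspace program using the DMA32 numbers from our log (a simplified illustration of the check, not the kernel code itself):

#include <stdio.h>

int main(void)
{
	/* Numbers taken from the DMA32 lines quoted above. */
	long free_pages = 247588;		/* free:990352kB / 4kB per page */
	long min = 1751;			/* min:7004kB / 4kB per page */
	long nr_free[] = { 188513, 29459 };	/* 188513*4kB order-0, 29459*8kB order-1 */
	int order = 2;				/* order of the failing kmalloc() */
	int o;

	printf("order 0+: %ld pages free, watermark %ld\n", free_pages, min);

	for (o = 0; o < order; o++) {
		/* Blocks of this order cannot satisfy a larger request. */
		free_pages -= nr_free[o] << o;
		/* __zone_watermark_ok() halves the watermark per order step. */
		min >>= 1;
		printf("order %d+: %ld pages free, watermark %ld -> %s\n",
		       o + 1, free_pages, min,
		       free_pages <= min ? "FAIL" : "ok");
	}
	return 0;
}

It prints 59075 free pages against a watermark of 875 after discounting order-0, and 157 against 437 after discounting order-1, matching your walkthrough: the order-2 request fails even though the zone reports roughly 990MB free. We will also look at tuning vm.lowmem_reserve_ratio (the current values can be read from /proc/sys/vm/lowmem_reserve_ratio) as you suggest, in addition to backporting the seq_file patches.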