* mm, vmscan: commit makes PAE kernel crash nightly (bisected)
@ 2017-01-11 10:32 Trevor Cordes
  2017-01-11 12:11 ` Mel Gorman
  0 siblings, 1 reply; 40+ messages in thread

From: Trevor Cordes @ 2017-01-11 10:32 UTC (permalink / raw)
To: linux-kernel
Cc: Mel Gorman, Joonsoo Kim, Michal Hocko, Minchan Kim, Rik van Riel,
    Srikar Dronamraju

Hi!  I have bisected a nightly oom-killer flood and crash/hang on one of
the boxes I admin.  It doesn't crash on Fedora 23/24 4.7.10 kernels but
does on any 4.8 Fedora kernel.  I did a vanilla bisect and the bug is
here:

commit b2e18757f2c9d1cdd746a882e9878852fdec9501
Author: Mel Gorman <mgorman@techsingularity.net>
Date:   Thu Jul 28 15:45:37 2016 -0700

    mm, vmscan: begin reclaiming pages on a per-node basis

I bisected between:
# bad: [69973b830859bc6529a7a0468ba0d80ee5117826] Linux 4.9
# good: [523d939ef98fd712632d93a5a2b588e477a7565e] Linux 4.7

I have not tried anything newer than the 4.8.13 Fedora kernel, but if
someone thinks this bug is already fixed in HEAD I could try that next.

It took 3 weeks to bisect because the crash only seems to happen in the
middle of the night, and not every night, but most.  It does not occur
on most of my other boxes, just this one.  The box is a bit unique in
that it's running 32-bit PAE on a 64-bit-capable CPU, and I have the
memory tuned down to mem=6G on the kernel command line (I think it has
16GB actual).  I tuned the RAM down because around 8GB the PAE kernel
has massive I/O speed issues.  It is a relatively new Intel(R) Xeon(R)
CPU E3-1230 V2 @ 3.30GHz on an Intel S1200BTL board.

I will eventually change it to 64-bit Fedora, which I'm sure will solve
this bug, but since there's no easy upgrade path, that's on the
backburner on this production box.  I'm sure this will be another "PAE
sucks, don't use it" issue, but like I said, I'm currently stuck with
it, and in theory the kernel shouldn't crash like this (I'm
guessing/hoping).
I think I pinned the trigger down to either (or both) big dir scans
(like "find /bigdir-foo") running at around 3am.  It's either a remote
box doing indexing via smbd, and/or rsync or rdiff-backup also doing
big dir scans.  But when I do "find /" manually I can't trigger the
bug.  Very weird.

The commit notes make it sound like the author thought perhaps there
could be a problem in some scenarios?  I guess I found the scenario.
The only discussion I found on the net regarding this commit is
https://lkml.org/lkml/2016/8/29/154 and perhaps it's somewhat relevant;
it's a bit over my head.

I'm available for testing, etc, and can usually rule out a bad kernel
within 24 hours by just waiting for 3am to roll around.  I also have
copious logs I can provide, and screenshots of the crashes.

The box is extremely lightly loaded, and RAM use is almost always under
1GB; swap is 0-20k used most of the time with GBs free.  Everything
looks great until all of a sudden oom-killer starts running and goes
through 10-260 iterations before the system just dies.  I wrote a
script to watch for oom-killer and issue "reboot" immediately, but 80%
of the time the box will hang before the reboot actually manages to
shut down.

Any information/help I can provide, please just holler.  Thanks!

^ permalink raw reply	[flat|nested] 40+ messages in thread
* Re: mm, vmscan: commit makes PAE kernel crash nightly (bisected)
  2017-01-11 10:32 mm, vmscan: commit makes PAE kernel crash nightly (bisected) Trevor Cordes
@ 2017-01-11 12:11 ` Mel Gorman
  2017-01-11 12:14   ` Mel Gorman
  0 siblings, 1 reply; 40+ messages in thread

From: Mel Gorman @ 2017-01-11 12:11 UTC (permalink / raw)
To: Trevor Cordes
Cc: linux-kernel, Joonsoo Kim, Michal Hocko, Minchan Kim, Rik van Riel,
    Srikar Dronamraju

On Wed, Jan 11, 2017 at 04:32:43AM -0600, Trevor Cordes wrote:
> Hi!  I have bisected a nightly oom-killer flood and crash/hang on one of
> the boxes I admin.  It doesn't crash on Fedora 23/24 4.7.10 kernel but
> does on any 4.8 Fedora kernel.  I did a vanilla bisect and the bug is
> here:
>
> commit b2e18757f2c9d1cdd746a882e9878852fdec9501
> Author: Mel Gorman <mgorman@techsingularity.net>
> Date:   Thu Jul 28 15:45:37 2016 -0700
>
>     mm, vmscan: begin reclaiming pages on a per-node basis
>

Michal Hocko recently worked on a bug similar to this.  Can you test
the following patch that is currently queued in Andrew Morton's tree?
It applies cleanly to 4.9.

Thanks.

From: Michal Hocko <mhocko@suse.com>
Subject: mm, memcg: fix the active list aging for lowmem requests when memcg is enabled

Nils Holland and Klaus Ethgen have reported unexpected OOM killer
invocations with 32b kernel starting with 4.8 kernels

	kworker/u4:5 invoked oom-killer: gfp_mask=0x2400840(GFP_NOFS|__GFP_NOFAIL), nodemask=0, order=0, oom_score_adj=0
	kworker/u4:5 cpuset=/ mems_allowed=0
	CPU: 1 PID: 2603 Comm: kworker/u4:5 Not tainted 4.9.0-gentoo #2
	[...]
	Mem-Info:
	active_anon:58685 inactive_anon:90 isolated_anon:0
	 active_file:274324 inactive_file:281962 isolated_file:0
	 unevictable:0 dirty:649 writeback:0 unstable:0
	 slab_reclaimable:40662 slab_unreclaimable:17754
	 mapped:7382 shmem:202 pagetables:351 bounce:0
	 free:206736 free_pcp:332 free_cma:0
	Node 0 active_anon:234740kB inactive_anon:360kB active_file:1097296kB inactive_file:1127848kB unevictable:0kB isolated(anon):0kB isolated(file):0kB mapped:29528kB dirty:2596kB writeback:0kB shmem:0kB shmem_thp: 0kB shmem_pmdmapped: 184320kB anon_thp: 808kB writeback_tmp:0kB unstable:0kB pages_scanned:0 all_unreclaimable? no
	DMA free:3952kB min:788kB low:984kB high:1180kB active_anon:0kB inactive_anon:0kB active_file:7316kB inactive_file:0kB unevictable:0kB writepending:96kB present:15992kB managed:15916kB mlocked:0kB slab_reclaimable:3200kB slab_unreclaimable:1408kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
	lowmem_reserve[]: 0 813 3474 3474
	Normal free:41332kB min:41368kB low:51708kB high:62048kB active_anon:0kB inactive_anon:0kB active_file:532748kB inactive_file:44kB unevictable:0kB writepending:24kB present:897016kB managed:836248kB mlocked:0kB slab_reclaimable:159448kB slab_unreclaimable:69608kB kernel_stack:1112kB pagetables:1404kB bounce:0kB free_pcp:528kB local_pcp:340kB free_cma:0kB
	lowmem_reserve[]: 0 0 21292 21292
	HighMem free:781660kB min:512kB low:34356kB high:68200kB active_anon:234740kB inactive_anon:360kB active_file:557232kB inactive_file:1127804kB unevictable:0kB writepending:2592kB present:2725384kB managed:2725384kB mlocked:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:800kB local_pcp:608kB free_cma:0kB

The oom killer is clearly premature because there is still a lot of
page cache in the zone Normal which should satisfy this lowmem request.
Further debugging has shown that the reclaim cannot make any forward
progress because the page cache is hidden in the active list, which
doesn't get rotated because inactive_list_is_low is not memcg aware.
It simply subtracts per-zone highmem counters from the respective
memcg's lru sizes, which doesn't make any sense.  We can simply end up
always seeing the resulting active and inactive counts as 0 and
returning false.  This issue is not limited to 32b kernels, but in
practice the effect on systems without CONFIG_HIGHMEM would be much
harder to notice because we do not invoke the OOM killer for allocation
requests targeting < ZONE_NORMAL.

Fix the issue by tracking per-zone lru page counts in
mem_cgroup_per_node and subtracting per-memcg highmem counts when memcg
is enabled.  Introduce the helper lruvec_zone_lru_size, which redirects
to either the zone counters or mem_cgroup_get_zone_lru_size when
appropriate.

We are losing the empty-LRU-but-non-zero-lru_size detection introduced
by ca707239e8a7 ("mm: update_lru_size warn and reset bad lru_size")
because of the inherent zone vs. node discrepancy.
Fixes: f8d1a31163fc ("mm: consider whether to decivate based on eligible zones inactive ratio")
Link: http://lkml.kernel.org/r/20170104100825.3729-1-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Reported-by: Nils Holland <nholland@tisys.org>
Tested-by: Nils Holland <nholland@tisys.org>
Reported-by: Klaus Ethgen <Klaus@Ethgen.de>
Acked-by: Minchan Kim <minchan@kernel.org>
Acked-by: Mel Gorman <mgorman@suse.de>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: <stable@vger.kernel.org>	[4.8+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/memcontrol.h |   26 +++++++++++++++++++++++---
 include/linux/mm_inline.h  |    2 +-
 mm/memcontrol.c            |   18 ++++++++----------
 mm/vmscan.c                |   27 +++++++++++++++++----------
 4 files changed, 49 insertions(+), 24 deletions(-)

diff -puN include/linux/memcontrol.h~mm-memcg-fix-the-active-list-aging-for-lowmem-requests-when-memcg-is-enabled include/linux/memcontrol.h
--- a/include/linux/memcontrol.h~mm-memcg-fix-the-active-list-aging-for-lowmem-requests-when-memcg-is-enabled
+++ a/include/linux/memcontrol.h
@@ -120,7 +120,7 @@ struct mem_cgroup_reclaim_iter {
  */
 struct mem_cgroup_per_node {
 	struct lruvec lruvec;
-	unsigned long lru_size[NR_LRU_LISTS];
+	unsigned long lru_zone_size[MAX_NR_ZONES][NR_LRU_LISTS];
 
 	struct mem_cgroup_reclaim_iter iter[DEF_PRIORITY + 1];
 
@@ -432,7 +432,7 @@ static inline bool mem_cgroup_online(str
 int mem_cgroup_select_victim_node(struct mem_cgroup *memcg);
 
 void mem_cgroup_update_lru_size(struct lruvec *lruvec, enum lru_list lru,
-		int nr_pages);
+		int zid, int nr_pages);
 
 unsigned long mem_cgroup_node_nr_lru_pages(struct mem_cgroup *memcg,
 					   int nid, unsigned int lru_mask);
@@ -441,9 +441,23 @@ static inline
 unsigned long mem_cgroup_get_lru_size(struct lruvec *lruvec, enum lru_list lru)
 {
 	struct mem_cgroup_per_node *mz;
+	unsigned long nr_pages = 0;
+	int zid;
 
 	mz = container_of(lruvec, struct mem_cgroup_per_node, lruvec);
-	return mz->lru_size[lru];
+	for (zid = 0; zid < MAX_NR_ZONES; zid++)
+		nr_pages += mz->lru_zone_size[zid][lru];
+	return nr_pages;
+}
+
+static inline
+unsigned long mem_cgroup_get_zone_lru_size(struct lruvec *lruvec,
+		enum lru_list lru, int zone_idx)
+{
+	struct mem_cgroup_per_node *mz;
+
+	mz = container_of(lruvec, struct mem_cgroup_per_node, lruvec);
+	return mz->lru_zone_size[zone_idx][lru];
 }
 
 void mem_cgroup_handle_over_high(void);
@@ -671,6 +685,12 @@ mem_cgroup_get_lru_size(struct lruvec *l
 {
 	return 0;
 }
+static inline
+unsigned long mem_cgroup_get_zone_lru_size(struct lruvec *lruvec,
+		enum lru_list lru, int zone_idx)
+{
+	return 0;
+}
 
 static inline unsigned long
 mem_cgroup_node_nr_lru_pages(struct mem_cgroup *memcg,
diff -puN include/linux/mm_inline.h~mm-memcg-fix-the-active-list-aging-for-lowmem-requests-when-memcg-is-enabled include/linux/mm_inline.h
--- a/include/linux/mm_inline.h~mm-memcg-fix-the-active-list-aging-for-lowmem-requests-when-memcg-is-enabled
+++ a/include/linux/mm_inline.h
@@ -39,7 +39,7 @@ static __always_inline void update_lru_s
 {
 	__update_lru_size(lruvec, lru, zid, nr_pages);
 #ifdef CONFIG_MEMCG
-	mem_cgroup_update_lru_size(lruvec, lru, nr_pages);
+	mem_cgroup_update_lru_size(lruvec, lru, zid, nr_pages);
 #endif
 }
 
diff -puN mm/memcontrol.c~mm-memcg-fix-the-active-list-aging-for-lowmem-requests-when-memcg-is-enabled mm/memcontrol.c
--- a/mm/memcontrol.c~mm-memcg-fix-the-active-list-aging-for-lowmem-requests-when-memcg-is-enabled
+++ a/mm/memcontrol.c
@@ -625,8 +625,8 @@ static void mem_cgroup_charge_statistics
 unsigned long mem_cgroup_node_nr_lru_pages(struct mem_cgroup *memcg,
 					   int nid, unsigned int lru_mask)
 {
+	struct lruvec *lruvec = mem_cgroup_lruvec(NODE_DATA(nid), memcg);
 	unsigned long nr = 0;
-	struct mem_cgroup_per_node *mz;
 	enum lru_list lru;
 
 	VM_BUG_ON((unsigned)nid >= nr_node_ids);
@@ -634,8 +634,7 @@ unsigned long mem_cgroup_node_nr_lru_pag
 	for_each_lru(lru) {
 		if (!(BIT(lru) & lru_mask))
 			continue;
-		mz = mem_cgroup_nodeinfo(memcg, nid);
-		nr += mz->lru_size[lru];
+		nr += mem_cgroup_get_lru_size(lruvec, lru);
 	}
 	return nr;
 }
@@ -1002,6 +1001,7 @@ out:
  * mem_cgroup_update_lru_size - account for adding or removing an lru page
  * @lruvec: mem_cgroup per zone lru vector
  * @lru: index of lru list the page is sitting on
+ * @zid: zone id of the accounted pages
  * @nr_pages: positive when adding or negative when removing
  *
  * This function must be called under lru_lock, just before a page is added
@@ -1009,27 +1009,25 @@ out:
  * so as to allow it to check that lru_size 0 is consistent with list_empty).
  */
 void mem_cgroup_update_lru_size(struct lruvec *lruvec, enum lru_list lru,
-				int nr_pages)
+				int zid, int nr_pages)
 {
 	struct mem_cgroup_per_node *mz;
 	unsigned long *lru_size;
 	long size;
-	bool empty;
 
 	if (mem_cgroup_disabled())
 		return;
 
 	mz = container_of(lruvec, struct mem_cgroup_per_node, lruvec);
-	lru_size = mz->lru_size + lru;
-	empty = list_empty(lruvec->lists + lru);
+	lru_size = &mz->lru_zone_size[zid][lru];
 
 	if (nr_pages < 0)
 		*lru_size += nr_pages;
 
 	size = *lru_size;
-	if (WARN_ONCE(size < 0 || empty != !size,
-		"%s(%p, %d, %d): lru_size %ld but %sempty\n",
-		__func__, lruvec, lru, nr_pages, size, empty ? "" : "not ")) {
+	if (WARN_ONCE(size < 0,
+		"%s(%p, %d, %d): lru_size %ld\n",
+		__func__, lruvec, lru, nr_pages, size)) {
 		VM_BUG_ON(1);
 		*lru_size = 0;
 	}
diff -puN mm/vmscan.c~mm-memcg-fix-the-active-list-aging-for-lowmem-requests-when-memcg-is-enabled mm/vmscan.c
--- a/mm/vmscan.c~mm-memcg-fix-the-active-list-aging-for-lowmem-requests-when-memcg-is-enabled
+++ a/mm/vmscan.c
@@ -242,6 +242,16 @@ unsigned long lruvec_lru_size(struct lru
 	return node_page_state(lruvec_pgdat(lruvec), NR_LRU_BASE + lru);
 }
 
+unsigned long lruvec_zone_lru_size(struct lruvec *lruvec, enum lru_list lru,
+				   int zone_idx)
+{
+	if (!mem_cgroup_disabled())
+		return mem_cgroup_get_zone_lru_size(lruvec, lru, zone_idx);
+
+	return zone_page_state(&lruvec_pgdat(lruvec)->node_zones[zone_idx],
+			       NR_ZONE_LRU_BASE + lru);
+}
+
 /*
  * Add a shrinker callback to be called from the vm.
  */
@@ -1382,8 +1392,7 @@ int __isolate_lru_page(struct page *page
  * be complete before mem_cgroup_update_lru_size due to a santity check.
  */
 static __always_inline void update_lru_sizes(struct lruvec *lruvec,
-			enum lru_list lru, unsigned long *nr_zone_taken,
-			unsigned long nr_taken)
+			enum lru_list lru, unsigned long *nr_zone_taken)
 {
 	int zid;
 
@@ -1392,11 +1401,11 @@ static __always_inline void update_lru_s
 			continue;
 
 		__update_lru_size(lruvec, lru, zid, -nr_zone_taken[zid]);
-	}
-
 #ifdef CONFIG_MEMCG
-	mem_cgroup_update_lru_size(lruvec, lru, -nr_taken);
+		mem_cgroup_update_lru_size(lruvec, lru, zid, -nr_zone_taken[zid]);
 #endif
+	}
+
 }
 
 /*
@@ -1501,7 +1510,7 @@ static unsigned long isolate_lru_pages(u
 	*nr_scanned = scan;
 	trace_mm_vmscan_lru_isolate(sc->reclaim_idx, sc->order, nr_to_scan, scan,
 				    nr_taken, mode, is_file_lru(lru));
-	update_lru_sizes(lruvec, lru, nr_zone_taken, nr_taken);
+	update_lru_sizes(lruvec, lru, nr_zone_taken);
 	return nr_taken;
 }
 
@@ -2047,10 +2056,8 @@ static bool inactive_list_is_low(struct
 		if (!managed_zone(zone))
 			continue;
 
-		inactive_zone = zone_page_state(zone,
-				NR_ZONE_LRU_BASE + (file * LRU_FILE));
-		active_zone = zone_page_state(zone,
-				NR_ZONE_LRU_BASE + (file * LRU_FILE) + LRU_ACTIVE);
+		inactive_zone = lruvec_zone_lru_size(lruvec, file * LRU_FILE, zid);
+		active_zone = lruvec_zone_lru_size(lruvec, (file * LRU_FILE) + LRU_ACTIVE, zid);
 
 		inactive -= min(inactive, inactive_zone);
 		active -= min(active, active_zone);
_

^ permalink raw reply	[flat|nested] 40+ messages in thread
* Re: mm, vmscan: commit makes PAE kernel crash nightly (bisected)
  2017-01-11 12:11 ` Mel Gorman
@ 2017-01-11 12:14   ` Mel Gorman
  2017-01-11 22:52     ` Trevor Cordes
  0 siblings, 1 reply; 40+ messages in thread

From: Mel Gorman @ 2017-01-11 12:14 UTC (permalink / raw)
To: Trevor Cordes
Cc: linux-kernel, Joonsoo Kim, Michal Hocko, Minchan Kim, Rik van Riel,
    Srikar Dronamraju

On Wed, Jan 11, 2017 at 12:11:46PM +0000, Mel Gorman wrote:
> On Wed, Jan 11, 2017 at 04:32:43AM -0600, Trevor Cordes wrote:
> > Hi!  I have bisected a nightly oom-killer flood and crash/hang on one of
> > the boxes I admin.  It doesn't crash on Fedora 23/24 4.7.10 kernel but
> > does on any 4.8 Fedora kernel.  I did a vanilla bisect and the bug is
> > here:
> >
> > commit b2e18757f2c9d1cdd746a882e9878852fdec9501
> > Author: Mel Gorman <mgorman@techsingularity.net>
> > Date:   Thu Jul 28 15:45:37 2016 -0700
> >
> >     mm, vmscan: begin reclaiming pages on a per-node basis
> >
>
> Michal Hocko recently worked on a bug similar to this. Can you test the
> following patch that is currently queued in Andrew Morton's tree? It
> applies cleanly to 4.9
>

I should have pointed out that this patch primarily affects memcg, but
the bug report did not include an OOM report and did not describe
whether memcgs could be involved or not.  If memcgs are not involved
then please post the first full OOM kill.

> Thanks.
>
> From: Michal Hocko <mhocko@suse.com>
> Subject: mm, memcg: fix the active list aging for lowmem requests when memcg is enabled
>
> [ rest of the patch quoted in full above, snipped ]

-- 
Mel Gorman
SUSE Labs

^ permalink raw reply	[flat|nested] 40+ messages in thread
* Re: mm, vmscan: commit makes PAE kernel crash nightly (bisected)
  2017-01-11 12:14 ` Mel Gorman
@ 2017-01-11 22:52   ` Trevor Cordes
  2017-01-12  9:36     ` Michal Hocko
  0 siblings, 1 reply; 40+ messages in thread

From: Trevor Cordes @ 2017-01-11 22:52 UTC (permalink / raw)
To: Mel Gorman
Cc: linux-kernel, Joonsoo Kim, Michal Hocko, Minchan Kim, Rik van Riel,
    Srikar Dronamraju

[-- Attachment #1: Type: text/plain, Size: 1359 bytes --]

On 2017-01-11 Mel Gorman wrote:
> On Wed, Jan 11, 2017 at 12:11:46PM +0000, Mel Gorman wrote:
> > On Wed, Jan 11, 2017 at 04:32:43AM -0600, Trevor Cordes wrote:
> > > Hi!  I have bisected a nightly oom-killer flood and crash/hang on
> > > one of the boxes I admin.  It doesn't crash on Fedora 23/24
> > > 4.7.10 kernel but does on any 4.8 Fedora kernel.  I did a vanilla
> > > bisect and the bug is here:
> > >
> > > commit b2e18757f2c9d1cdd746a882e9878852fdec9501
> > > Author: Mel Gorman <mgorman@techsingularity.net>
> > > Date:   Thu Jul 28 15:45:37 2016 -0700
> > >
> > >     mm, vmscan: begin reclaiming pages on a per-node basis
> > >
> >
> > Michal Hocko recently worked on a bug similar to this. Can you test
> > the following patch that is currently queued in Andrew Morton's
> > tree? It applies cleanly to 4.9
> >
>
> I should have pointed out that this patch primarily affects memcg but
> the bug report did not include an OOM report and did not describe
> whether memcgs could be involved or not. If memcgs are not involved
> then please post the first full OOM kill.

I will apply your patch tonight; it will take 48 hours to confirm that
it is "good" (<24 hours if it's bad), and I will reply back.  I'm not
sure how I can tell if my bug is because of memcgs, so here is a full
first oom example (attached).  Thanks for the help!
[-- Attachment #2: oom-example --]
[-- Type: application/octet-stream, Size: 20850 bytes --]

Jan 9 03:20:46 firewallfsi kernel: [25593.036636] reboot-when-oom invoked oom-killer: gfp_mask=0x27000c0(GFP_KERNEL_ACCOUNT|__GFP_NOTRACK), order=1, oom_score_adj=0
Jan 9 03:20:46 firewallfsi kernel: [25593.039254] reboot-when-oom cpuset=/ mems_allowed=0
Jan 9 03:20:46 firewallfsi kernel: [25593.040573] CPU: 3 PID: 29355 Comm: reboot-when-oom Not tainted 4.7.0+ #14
Jan 9 03:20:46 firewallfsi kernel: [25593.041839] Hardware name: Intel Corporation S1200BTL/S1200BTL, BIOS S1200BT.86B.02.00.0035.030220120927 03/02/2012
Jan 9 03:20:46 firewallfsi kernel: [25593.043074]  db22c967 56e6d67f 00000286 ec8ddd30 dab59977 ec8dde74 f03a3540 ec8ddd78
Jan 9 03:20:46 firewallfsi kernel: [25593.044308]  da9dc66c db147878 dbbfb9b4 027000c0 ec8dde80 00000001 00000000 ec8ddd58
Jan 9 03:20:46 firewallfsi kernel: [25593.045528]  daf4e7fd ec8ddd78 dab5f71f 00000000 f61ed280 f07dd200 f03a3540 0000000a
Jan 9 03:20:46 firewallfsi kernel: [25593.046762] Call Trace:
Jan 9 03:20:46 firewallfsi kernel: [25593.047928]  [<dab59977>] dump_stack+0x58/0x81
Jan 9 03:20:46 firewallfsi kernel: [25593.049063]  [<da9dc66c>] dump_header+0x4a/0x18f
Jan 9 03:20:46 firewallfsi kernel: [25593.050179]  [<daf4e7fd>] ? _raw_spin_unlock_irqrestore+0xd/0x10
Jan 9 03:20:46 firewallfsi kernel: [25593.051321]  [<dab5f71f>] ? ___ratelimit+0x9f/0xf0
Jan 9 03:20:46 firewallfsi kernel: [25593.052405]  [<da9763ea>] oom_kill_process+0x1ea/0x3b0
Jan 9 03:20:46 firewallfsi kernel: [25593.053473]  [<da87613a>] ? has_capability_noaudit+0x1a/0x30
Jan 9 03:20:46 firewallfsi kernel: [25593.054525]  [<da975b8b>] ? oom_badness.part.12+0xcb/0x140
Jan 9 03:20:46 firewallfsi kernel: [25593.055560]  [<da976809>] out_of_memory+0x1f9/0x230
Jan 9 03:20:46 firewallfsi kernel: [25593.056614]  [<da97b3bd>] __alloc_pages_nodemask+0xd9d/0xdc0
Jan 9 03:20:46 firewallfsi kernel: [25593.057644]  [<da86a078>] copy_process.part.41+0x108/0x14d0
Jan 9 03:20:46 firewallfsi kernel: [25593.058657]  [<da9ca0e7>] ? kmem_cache_alloc+0xf7/0x1c0
Jan 9 03:20:46 firewallfsi kernel: [25593.059652]  [<da9ca0e7>] ? kmem_cache_alloc+0xf7/0x1c0
Jan 9 03:20:46 firewallfsi kernel: [25593.060643]  [<da90fb4f>] ? __audit_syscall_entry+0xaf/0x110
Jan 9 03:20:46 firewallfsi kernel: [25593.061626]  [<da86b604>] _do_fork+0xd4/0x370
Jan 9 03:20:46 firewallfsi kernel: [25593.062563]  [<da90fd7e>] ? __audit_syscall_exit+0x1ce/0x260
Jan 9 03:20:46 firewallfsi kernel: [25593.063477]  [<da86b98c>] SyS_clone+0x2c/0x30
Jan 9 03:20:46 firewallfsi kernel: [25593.064373]  [<da80386d>] do_fast_syscall_32+0x8d/0x140
Jan 9 03:20:46 firewallfsi kernel: [25593.065253]  [<daf4ecf2>] sysenter_past_esp+0x47/0x75
Jan 9 03:20:46 firewallfsi kernel: [25593.066141] Mem-Info:
Jan 9 03:20:46 firewallfsi kernel: [25593.067019] active_anon:78255 inactive_anon:79332 isolated_anon:0
Jan 9 03:20:46 firewallfsi kernel: [25593.067019]  active_file:87903 inactive_file:52541 isolated_file:32
Jan 9 03:20:46 firewallfsi kernel: [25593.067019]  unevictable:0 dirty:4 writeback:0 unstable:0
Jan 9 03:20:46 firewallfsi kernel: [25593.067019]  slab_reclaimable:182463 slab_unreclaimable:11478
Jan 9 03:20:46 firewallfsi kernel: [25593.067019]  mapped:24016 shmem:1043 pagetables:1466 bounce:17
Jan 9 03:20:46 firewallfsi kernel: [25593.067019]  free:713297 free_pcp:139 free_cma:0
Jan 9 03:20:46 firewallfsi kernel: [25593.071846] Node 0 active_anon:313020kB inactive_anon:317328kB active_file:351612kB inactive_file:210164kB unevictable:0kB isolated(anon):0kB isolated(file):128kB all_unreclaimable? no
Jan 9 03:20:46 firewallfsi kernel: [25593.073395] DMA free:3140kB min:68kB low:84kB high:100kB present:15976kB managed:15900kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:4968kB slab_unreclaimable:1176kB kernel_stack:64kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB node_pages_scanned:540732
Jan 9 03:20:46 firewallfsi kernel: [25593.075719] lowmem_reserve[]: 0 776 4733 4733
Jan 9 03:20:46 firewallfsi kernel: [25593.076490] Normal free:5088kB min:3528kB low:4408kB high:5288kB present:892920kB managed:814908kB mlocked:0kB dirty:0kB writeback:0kB mapped:4kB shmem:0kB slab_reclaimable:724884kB slab_unreclaimable:44736kB kernel_stack:2672kB pagetables:452kB unstable:0kB bounce:0kB free_pcp:452kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB node_pages_scanned:570144
Jan 9 03:20:46 firewallfsi kernel: [25593.078806] lowmem_reserve[]: 0 0 31652 31652
Jan 9 03:20:46 firewallfsi kernel: [25593.079564] HighMem free:2844960kB min:512kB low:5004kB high:9496kB present:4051548kB managed:4051548kB mlocked:0kB dirty:24kB writeback:0kB mapped:96060kB shmem:4172kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:5412kB unstable:0kB bounce:0kB free_pcp:120kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB node_pages_scanned:599556
Jan 9 03:20:46 firewallfsi kernel: [25593.081880] lowmem_reserve[]: 0 0 0 0
Jan 9 03:20:46 firewallfsi kernel: [25593.082646] DMA: 13*4kB (UME) 10*8kB (UME) 8*16kB (UME) 2*32kB (UE) 4*64kB (UME) 2*128kB (UM) 1*256kB (M) 2*512kB (UM) 1*1024kB (E) 0*2048kB 0*4096kB = 3140kB
Jan 9 03:20:46 firewallfsi kernel: [25593.084186] Normal: 791*4kB (MH) 196*8kB (UMEH) 2*16kB (H) 2*32kB (H) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 4828kB
Jan 9 03:20:46 firewallfsi kernel: [25593.085770] HighMem: 259*4kB (UM) 173*8kB (UM) 69*16kB (UM) 137*32kB (UM) 135*64kB (UM) 66*128kB (UM) 37*256kB (UM) 11*512kB (U) 5*1024kB (UM) 9*2048kB (UM) 679*4096kB (UM) = 2844836kB
Jan 9 03:20:46 firewallfsi kernel: [25593.087356] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
Jan 9 03:20:46 firewallfsi kernel: [25593.088172] 142162 total pagecache pages
Jan 9 03:20:46 firewallfsi kernel: [25593.088975] 569 pages in swap cache
Jan 9 03:20:46 firewallfsi kernel: [25593.089777] Swap cache stats: add 24710, delete 24141, find 1073/1632
Jan 9 03:20:46 firewallfsi kernel: [25593.090588] Free swap  = 33691592kB
Jan 9 03:20:46 firewallfsi kernel: [25593.091378] Total swap = 33784572kB
Jan 9 03:20:46 firewallfsi kernel: [25593.092182] 1240111 pages RAM
Jan 9 03:20:46 firewallfsi kernel: [25593.093004] 1012887 pages HighMem/MovableOnly
Jan 9 03:20:46 firewallfsi kernel: [25593.093786] 19522 pages reserved
Jan 9 03:20:46 firewallfsi kernel: [25593.094579] 0 pages hwpoisoned
Jan 9 03:20:46 firewallfsi kernel: [25593.095372] [ pid ] uid tgid total_vm rss nr_ptes nr_pmds swapents oom_score_adj name
Jan 9 03:20:46 firewallfsi kernel: [25593.096185] [  611]     0   611     5229     4071      14       3        1             0 systemd-journal
Jan 9 03:20:46 firewallfsi kernel: [25593.097029] [  648]     0   648     3585     1140       9       3       43         -1000 systemd-udevd
Jan 9 03:20:46 firewallfsi kernel: [25593.097854] [  773]     0   773     1704      644       6       3        0             0 sh
Jan 9 03:20:46 firewallfsi kernel: [25593.098702] [  778]     0   778     1704      623       7       3        0             0 sh
Jan 9 03:20:46 firewallfsi kernel: [25593.099521] [  788]     0   788     1704      627       6       3        0             0 sh
Jan 9 03:20:46 firewallfsi kernel: [25593.100328] [  796]   288   796    14423     1642      14       3        1             0 milter-greylist
Jan 9 03:20:46 firewallfsi kernel: [25593.101157] [  799]     0   799     1704      669       5       3        0             0 sh
Jan 9 03:20:46 firewallfsi kernel: [25593.101954] [  800]     0   800      988      682       5       3        6             0 systemd-logind
Jan 9 03:20:46 firewallfsi kernel: [25593.102770] [  804]     0   804     1704      703       7       3        0             0 sh
Jan 9 03:20:46 firewallfsi kernel: [25593.103561] [  805]     0   805     3234     2132       9       3       18             0 dynamic-ip-upda
Jan 9 03:20:46 firewallfsi kernel: [25593.104356] [  806]     0   806     1797      853       7       3        0             0 watch-services
Jan 9 03:20:46 firewallfsi
kernel: [25593.105166] [ 807] 0 807 3165 2028 9 3 50 0 mailwarnings Jan 9 03:20:46 firewallfsi kernel: [25593.105970] [ 809] 0 809 1736 740 7 3 0 0 tickle-pog Jan 9 03:20:46 firewallfsi kernel: [25593.106790] [ 810] 0 810 8633 1046 11 3 3 0 rsyslogd Jan 9 03:20:46 firewallfsi kernel: [25593.107566] [ 811] 0 811 1704 683 6 3 0 0 sh Jan 9 03:20:46 firewallfsi kernel: [25593.108359] [ 813] 0 813 3695 1781 11 3 40 0 fetchmail Jan 9 03:20:46 firewallfsi kernel: [25593.109171] [ 814] 0 814 2710 1617 10 3 50 0 udp-sgr Jan 9 03:20:46 firewallfsi kernel: [25593.109944] [ 821] 0 821 800 492 5 3 0 0 mdadm Jan 9 03:20:46 firewallfsi kernel: [25593.110697] [ 835] 0 835 1472 922 6 3 41 0 smartd Jan 9 03:20:46 firewallfsi kernel: [25593.111426] [ 836] 0 836 1704 648 6 3 0 0 sh Jan 9 03:20:46 firewallfsi kernel: [25593.112119] [ 837] 0 837 3266 2092 10 3 42 0 restarter Jan 9 03:20:46 firewallfsi kernel: [25593.112813] [ 838] 0 838 1033 607 6 3 0 0 irqbalance Jan 9 03:20:46 firewallfsi kernel: [25593.113476] [ 858] 0 858 1704 616 7 3 0 0 sh Jan 9 03:20:46 firewallfsi kernel: [25593.114140] [ 859] 0 859 3311 2200 10 3 48 0 watch-ip Jan 9 03:20:46 firewallfsi kernel: [25593.114787] [ 860] 81 860 1700 1052 7 3 0 -900 dbus-daemon Jan 9 03:20:46 firewallfsi kernel: [25593.115395] [ 987] 0 987 2499 365 9 3 16 0 saslauthd Jan 9 03:20:46 firewallfsi kernel: [25593.116015] [ 988] 0 988 2499 111 9 3 15 0 saslauthd Jan 9 03:20:46 firewallfsi kernel: [25593.116608] [ 989] 0 989 2499 113 9 3 13 0 saslauthd Jan 9 03:20:46 firewallfsi kernel: [25593.117163] [ 990] 0 990 2499 113 9 3 13 0 saslauthd Jan 9 03:20:46 firewallfsi kernel: [25593.117732] [ 991] 0 991 2499 112 9 3 14 0 saslauthd Jan 9 03:20:46 firewallfsi kernel: [25593.118260] [ 1042] 0 1042 583 439 5 3 0 0 acpid Jan 9 03:20:46 firewallfsi kernel: [25593.118755] [ 1066] 0 1066 7287 1088 11 3 14 0 apcupsd Jan 9 03:20:46 firewallfsi kernel: [25593.119228] [ 1068] 0 1068 2769 514 8 3 23 -1000 sshd Jan 9 03:20:46 firewallfsi kernel: 
[25593.119685] [ 1174] 0 1174 860 467 5 3 0 0 atd Jan 9 03:20:46 firewallfsi kernel: [25593.120139] [ 1214] 27 1214 1707 727 7 3 0 0 mysqld_safe Jan 9 03:20:46 firewallfsi kernel: [25593.120557] [ 1323] 25 1323 79966 41422 111 3 4283 0 named Jan 9 03:20:46 firewallfsi kernel: [25593.120961] [ 1384] 27 1384 124505 11410 60 3 1233 0 mysqld Jan 9 03:20:46 firewallfsi kernel: [25593.121350] [ 1388] 0 1388 1116 500 6 3 0 0 agetty Jan 9 03:20:46 firewallfsi kernel: [25593.121732] [ 1389] 0 1389 1116 489 6 3 0 0 agetty Jan 9 03:20:46 firewallfsi kernel: [25593.122125] [ 1390] 0 1390 1116 514 6 3 0 0 agetty Jan 9 03:20:46 firewallfsi kernel: [25593.122489] [ 1391] 0 1391 1116 518 6 3 0 0 agetty Jan 9 03:20:46 firewallfsi kernel: [25593.122867] [ 1392] 0 1392 1116 488 6 3 0 0 agetty Jan 9 03:20:46 firewallfsi kernel: [25593.123238] [ 1393] 0 1393 1116 528 6 3 0 0 agetty Jan 9 03:20:46 firewallfsi kernel: [25593.123586] [ 1395] 276 1395 113030 86070 218 3 15425 0 clamd Jan 9 03:20:46 firewallfsi kernel: [25593.123938] [ 1396] 290 1396 10838 1113 11 3 0 0 clamav-milter Jan 9 03:20:46 firewallfsi kernel: [25593.124309] [ 1406] 0 1406 12050 6009 26 3 136 0 /usr/sbin/httpd Jan 9 03:20:46 firewallfsi kernel: [25593.124669] [ 1442] 0 1442 7049 2384 16 3 2 0 nmbd Jan 9 03:20:46 firewallfsi kernel: [25593.125034] [ 1443] 0 1443 6840 2239 16 3 7 0 nmbd Jan 9 03:20:46 firewallfsi kernel: [25593.125366] [ 1489] 0 1489 3829 1506 11 3 106 0 sendmail Jan 9 03:20:46 firewallfsi kernel: [25593.125730] [ 1549] 23 1549 5440 700 13 3 21 0 squid Jan 9 03:20:46 firewallfsi kernel: [25593.126070] [ 1551] 23 1551 9371 6015 20 3 79 0 squid Jan 9 03:20:46 firewallfsi kernel: [25593.126401] [ 1552] 51 1552 3502 704 10 3 57 0 sendmail Jan 9 03:20:46 firewallfsi kernel: [25593.126759] [ 1622] 23 1622 1179 413 6 3 0 0 unlinkd Jan 9 03:20:46 firewallfsi kernel: [25593.127116] [ 1767] 48 1767 49254 4004 43 3 185 0 /usr/sbin/httpd Jan 9 03:20:46 firewallfsi kernel: [25593.127452] [ 1768] 48 1768 16217 4391 
27 3 116 0 /usr/sbin/httpd Jan 9 03:20:46 firewallfsi kernel: [25593.127789] [ 1771] 48 1771 16384 4377 28 3 135 0 /usr/sbin/httpd Jan 9 03:20:46 firewallfsi kernel: [25593.128139] [ 1787] 48 1787 16218 4047 27 3 138 0 /usr/sbin/httpd Jan 9 03:20:46 firewallfsi kernel: [25593.128466] [ 1793] 48 1793 16384 4173 28 3 258 0 /usr/sbin/httpd Jan 9 03:20:46 firewallfsi kernel: [25593.128793] [ 1807] 0 1807 1704 632 6 3 0 0 sh Jan 9 03:20:46 firewallfsi kernel: [25593.129136] [ 1808] 0 1808 2741 1347 9 3 294 0 udp-sgs Jan 9 03:20:46 firewallfsi kernel: [25593.129462] [ 1863] 0 1863 9561 3486 22 3 7 0 smbd Jan 9 03:20:46 firewallfsi kernel: [25593.129808] [ 1871] 0 1871 9197 1083 22 3 9 0 smbd Jan 9 03:20:46 firewallfsi kernel: [25593.130148] [ 1874] 0 1874 9446 1178 22 3 7 0 smbd Jan 9 03:20:46 firewallfsi kernel: [25593.130466] [ 1953] 0 1953 5116 1937 13 3 185 0 dhclient Jan 9 03:20:46 firewallfsi kernel: [25593.130789] [ 2025] 0 2025 594 379 5 3 0 0 pptpd Jan 9 03:20:46 firewallfsi kernel: [25593.131128] [ 2032] 0 2032 954 662 6 3 9 0 dovecot Jan 9 03:20:46 firewallfsi kernel: [25593.131449] [ 2033] 97 2033 904 545 5 3 0 0 anvil Jan 9 03:20:46 firewallfsi kernel: [25593.131782] [ 2034] 0 2034 937 560 5 3 51 0 log Jan 9 03:20:46 firewallfsi kernel: [25593.132111] [ 2191] 0 2191 1899 716 7 3 45 0 crond Jan 9 03:20:46 firewallfsi kernel: [25593.132429] [ 2192] 38 2192 1536 995 6 3 0 0 ntpd Jan 9 03:20:46 firewallfsi kernel: [25593.132770] [ 2195] 48 2195 16218 4065 27 3 119 0 /usr/sbin/httpd Jan 9 03:20:46 firewallfsi kernel: [25593.133111] [ 2198] 48 2198 16218 4066 27 3 117 0 /usr/sbin/httpd Jan 9 03:20:46 firewallfsi kernel: [25593.133427] [ 2199] 48 2199 16218 4084 27 3 115 0 /usr/sbin/httpd Jan 9 03:20:46 firewallfsi kernel: [25593.133759] [ 2204] 301 2204 11214 4061 26 3 6 0 smbd Jan 9 03:20:46 firewallfsi kernel: [25593.134091] [ 2211] 177 2211 6833 4305 17 3 316 0 dhcpd Jan 9 03:20:46 firewallfsi kernel: [25593.134408] [ 2212] 0 2212 4181 1843 11 3 0 0 sshd Jan 9 
03:20:46 firewallfsi kernel: [25593.134753] [ 2214] 0 2214 1584 1115 6 3 0 0 systemd Jan 9 03:20:46 firewallfsi kernel: [25593.135098] [ 2225] 0 2225 2455 326 8 3 62 0 (sd-pam) Jan 9 03:20:46 firewallfsi kernel: [25593.135423] [ 2240] 0 2240 4214 1021 11 3 1 0 sshd Jan 9 03:20:46 firewallfsi kernel: [25593.135771] [ 2247] 0 2247 1266 1006 6 3 0 0 tcsh Jan 9 03:20:46 firewallfsi kernel: [25593.136112] [ 2532] 301 2532 1590 1065 6 3 17 0 imap Jan 9 03:20:46 firewallfsi kernel: [25593.136433] [ 3299] 0 3299 1135 819 6 3 3 0 config Jan 9 03:20:46 firewallfsi kernel: [25593.136777] [ 3501] 304 3501 1399 988 6 3 1 0 imap Jan 9 03:20:46 firewallfsi kernel: [25593.137115] [ 3505] 273 3505 2208 1308 8 3 3 0 imap-login Jan 9 03:20:46 firewallfsi kernel: [25593.137433] [ 3506] 301 3506 1589 1021 7 3 83 0 imap Jan 9 03:20:46 firewallfsi kernel: [25593.137772] [ 4460] 302 4460 1431 1108 6 3 0 0 imap Jan 9 03:20:46 firewallfsi kernel: [25593.138109] [ 4672] 303 4672 1402 932 6 3 0 0 imap Jan 9 03:20:46 firewallfsi kernel: [25593.138429] [ 6971] 273 6971 2209 1269 8 3 0 0 imap-login Jan 9 03:20:46 firewallfsi kernel: [25593.138779] [ 6977] 302 6977 1427 1081 6 3 1 0 imap Jan 9 03:20:46 firewallfsi kernel: [25593.139127] [ 6984] 0 6984 9675 3727 22 3 7 0 smbd Jan 9 03:20:46 firewallfsi kernel: [25593.139460] [18001] 273 18001 2208 1380 8 3 0 0 imap-login Jan 9 03:20:46 firewallfsi kernel: [25593.139818] [18005] 301 18005 1396 958 6 3 0 0 imap Jan 9 03:20:46 firewallfsi kernel: [25593.140169] [29355] 0 29355 1307 915 6 3 0 0 reboot-when-oom Jan 9 03:20:46 firewallfsi kernel: [25593.140501] [29356] 0 29356 1307 919 6 3 0 0 watch-ps Jan 9 03:20:46 firewallfsi kernel: [25593.140852] [ 3232] 0 3232 9656 3456 22 3 36 0 smbd Jan 9 03:20:46 firewallfsi kernel: [25593.141204] [ 9393] 0 9393 1900 464 7 3 56 0 crond Jan 9 03:20:46 firewallfsi kernel: [25593.141537] [ 9394] 0 9394 3210 2041 10 3 20 0 freshclam-tec-w Jan 9 03:20:46 firewallfsi kernel: [25593.141904] [ 9801] 0 9801 1654 419 7 3 
9 0 anacron Jan 9 03:20:46 firewallfsi kernel: [25593.142251] [13714] 0 13714 1396 162 6 3 0 0 sleep Jan 9 03:20:46 firewallfsi kernel: [25593.142588] [14533] 97 14533 1464 869 6 3 0 0 auth Jan 9 03:20:46 firewallfsi kernel: [25593.142927] [14534] 0 14534 903 251 5 3 0 0 ssl-params Jan 9 03:20:46 firewallfsi kernel: [25593.143280] [14944] 0 14944 1396 141 7 3 0 0 sleep Jan 9 03:20:46 firewallfsi kernel: [25593.143613] Out of memory: Kill process 1395 (clamd) score 10 or sacrifice child Jan 9 03:20:46 firewallfsi kernel: [25593.143975] Killed process 1395 (clamd) total-vm:452120kB, anon-rss:329076kB, file-rss:15204kB, shmem-rss:0kB Jan 9 03:20:46 firewallfsi kernel: [25593.158502] oom_reaper: reaped process 1395 (clamd), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: mm, vmscan: commit makes PAE kernel crash nightly (bisected)
  2017-01-11 22:52 ` Trevor Cordes
@ 2017-01-12 9:36 ` Michal Hocko
  2017-01-15 6:27 ` Trevor Cordes
  0 siblings, 1 reply; 40+ messages in thread
From: Michal Hocko @ 2017-01-12 9:36 UTC (permalink / raw)
To: Trevor Cordes
Cc: Mel Gorman, linux-kernel, Joonsoo Kim, Minchan Kim, Rik van Riel, Srikar Dronamraju

On Wed 11-01-17 16:52:32, Trevor Cordes wrote:
[...]
> I'm not sure how I can tell if my bug is because of memcgs so here is
> a full first oom example (attached).

4.7 kernel doesn't contain 71c799f4982d ("mm: add per-zone lru list
stat") so the OOM report will not tell us whether the Normal zone
doesn't age active lists, unfortunately.

You can easily check whether this is memcg related by trying to run
the same workload with cgroup_disable=memory kernel command line
parameter. This will put all the memcg specifics out of the way.
--
Michal Hocko
SUSE Labs
^ permalink raw reply [flat|nested] 40+ messages in thread
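The suggested cgroup_disable=memory check has to go on the kernel command line and survive a reboot. A minimal sketch for a Fedora box like the one in this thread; the grubby invocation and the sample command line below are assumptions for illustration, not taken from the thread:

```shell
#!/bin/sh
# Sketch: run Michal's memcg-bypass test. grubby is the Fedora/RHEL tool
# for editing boot entries; other distros edit /etc/default/grub and
# regenerate the grub config instead. The actual edit (commented out so
# this sketch is harmless to run):
#
#   grubby --update-kernel=DEFAULT --args="cgroup_disable=memory"
#   reboot
#
# and to undo the test afterwards:
#
#   grubby --update-kernel=DEFAULT --remove-args="cgroup_disable=memory"

# cmdline_has: succeed if a kernel command line (on stdin) carries the flag.
cmdline_has() {
    grep -qw "$1"
}

# After the reboot one would verify with:
#   cmdline_has cgroup_disable=memory < /proc/cmdline
# Demonstration against a hypothetical command line:
echo 'BOOT_IMAGE=/vmlinuz ro mem=6G cgroup_disable=memory' \
    | cmdline_has cgroup_disable=memory && echo 'memcg disabled'
```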
* Re: mm, vmscan: commit makes PAE kernel crash nightly (bisected)
  2017-01-12 9:36 ` Michal Hocko
@ 2017-01-15 6:27 ` Trevor Cordes
  2017-01-16 11:09 ` Mel Gorman
  2017-01-17 13:45 ` Michal Hocko
  0 siblings, 2 replies; 40+ messages in thread
From: Trevor Cordes @ 2017-01-15 6:27 UTC (permalink / raw)
To: Michal Hocko
Cc: Mel Gorman, linux-kernel, Joonsoo Kim, Minchan Kim, Rik van Riel, Srikar Dronamraju

[-- Attachment #1: Type: text/plain, Size: 1437 bytes --]

On 2017-01-12 Michal Hocko wrote:
> On Wed 11-01-17 16:52:32, Trevor Cordes wrote:
> [...]
> > I'm not sure how I can tell if my bug is because of memcgs so here
> > is a full first oom example (attached).
>
> 4.7 kernel doesn't contain 71c799f4982d ("mm: add per-zone lru list
> stat") so the OOM report will not tell us whether the Normal zone
> doesn't age active lists, unfortunately.

I compiled the patch Mel provided into the stock F23 kernel
4.8.13-100.fc23.i686+PAE and it ran for 2 nights. It didn't oom the
first night, but did the second night. So the bug persists even with
that patch. However, it does *seem* a bit "better" since it took 2
nights (usually takes only one, but maybe 10% of the time it does take
two) before oom'ing, *and* it allowed my reboot script to reboot it
cleanly when it saw the oom (which happens only 25% of the time).

I'm attaching the 4.8.13 oom message, which should have the memcg info
(71c799f4982d) you are asking for above? Hopefully?

> You can easily check whether this is memcg related by trying to run
> the same workload with cgroup_disable=memory kernel command line
> parameter. This will put all the memcg specifics out of the way.

I will try booting now into cgroup_disable=memory to see if that helps
at all. I'll reply back in 48 hours, or when it oom's, whichever comes
first. Also, should I bother trying the latest git HEAD to see if that
solves anything? Thanks!
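The "reboot-when-oom" watcher Trevor mentions (visible in the process lists of both oom reports) is never posted in the thread. A hypothetical minimal sketch of such a script; the name, poll interval, and match string are all assumptions:

```shell
#!/bin/sh
# Hypothetical sketch of a reboot-when-oom watcher; the real script is
# not shown in the thread.

# oom_seen: succeed if the log text on stdin contains an OOM-kill line.
oom_seen() {
    grep -q 'invoked oom-killer'
}

# Demonstration against a line shaped like the attached oom report:
echo 'kernel: nmbd invoked oom-killer: gfp_mask=0x27000c0' | oom_seen \
    && echo 'oom detected'

# The watcher loop itself (commented out here because it reboots the box):
# while sleep 10; do
#     dmesg | oom_seen && { logger 'oom seen, rebooting'; reboot; }
# done
```

As the thread notes, the box often hangs before the reboot completes, so a watcher like this is only a partial mitigation.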
[-- Attachment #2: oom2 --] [-- Type: application/octet-stream, Size: 23540 bytes --] Jan 14 03:14:40 firewallfsi kernel: [167409.074463] nmbd invoked oom-killer: gfp_mask=0x27000c0(GFP_KERNEL_ACCOUNT|__GFP_NOTRACK), order=1, oom_score_adj=0 Jan 14 03:14:40 firewallfsi kernel: [167409.074467] nmbd cpuset=/ mems_allowed=0 Jan 14 03:14:40 firewallfsi kernel: [167409.074472] CPU: 4 PID: 1519 Comm: nmbd Not tainted 4.8.13-101.fc23.i686+PAE #1 Jan 14 03:14:40 firewallfsi kernel: [167409.074473] Hardware name: Intel Corporation S1200BTL/S1200BTL, BIOS S1200BT.86B.02.00.0035.030220120927 03/02/2012 Jan 14 03:14:40 firewallfsi kernel: [167409.074478] cd640967 aaaf870b 00000286 efe9fd48 ccf5e9c7 efe9fe74 ed07d8c0 efe9fd90 Jan 14 03:14:40 firewallfsi kernel: [167409.074481] ccde0274 cd559508 ed84cb74 027000c0 efe9fe80 00000001 00000000 efe9fd70 Jan 14 03:14:40 firewallfsi kernel: [167409.074483] cd35cdad efe9fd90 ccf6472f 00000000 f101f0c0 e1284400 ed07d8c0 0000000a Jan 14 03:14:40 firewallfsi kernel: [167409.074484] Call Trace: Jan 14 03:14:40 firewallfsi kernel: [167409.074492] [<ccf5e9c7>] dump_stack+0x58/0x81 Jan 14 03:14:40 firewallfsi kernel: [167409.074497] [<ccde0274>] dump_header+0x4a/0x18f Jan 14 03:14:40 firewallfsi kernel: [167409.074506] [<cd35cdad>] ? _raw_spin_unlock_irqrestore+0xd/0x10 Jan 14 03:14:40 firewallfsi kernel: [167409.074508] [<ccf6472f>] ? ___ratelimit+0x9f/0x100 Jan 14 03:14:40 firewallfsi kernel: [167409.074513] [<ccd7914a>] oom_kill_process+0x1ea/0x3b0 Jan 14 03:14:40 firewallfsi kernel: [167409.074515] [<ccc7687a>] ? has_capability_noaudit+0x1a/0x30 Jan 14 03:14:40 firewallfsi kernel: [167409.074517] [<ccd788eb>] ? 
oom_badness.part.12+0xcb/0x140 Jan 14 03:14:40 firewallfsi kernel: [167409.074522] [<ccd79569>] out_of_memory+0x1f9/0x230 Jan 14 03:14:40 firewallfsi kernel: [167409.074525] [<ccd7df15>] __alloc_pages_nodemask+0xba5/0xbc0 Jan 14 03:14:40 firewallfsi kernel: [167409.074530] [<ccc6a578>] copy_process.part.40+0x108/0x1490 Jan 14 03:14:40 firewallfsi kernel: [167409.074531] [<ccc6bac4>] _do_fork+0xd4/0x370 Jan 14 03:14:40 firewallfsi kernel: [167409.074539] [<ccd11fae>] ? __audit_syscall_exit+0x1ce/0x260 Jan 14 03:14:40 firewallfsi kernel: [167409.074541] [<ccc6be4c>] SyS_clone+0x2c/0x30 Jan 14 03:14:40 firewallfsi kernel: [167409.074548] [<ccc0378d>] do_fast_syscall_32+0x8d/0x140 Jan 14 03:14:40 firewallfsi kernel: [167409.074555] [<cd35d2b2>] sysenter_past_esp+0x47/0x75 Jan 14 03:14:40 firewallfsi kernel: [167409.074556] Mem-Info: Jan 14 03:14:40 firewallfsi kernel: [167409.074566] active_anon:162369 inactive_anon:31581 isolated_anon:0 Jan 14 03:14:40 firewallfsi kernel: [167409.074566] active_file:152473 inactive_file:600139 isolated_file:0 Jan 14 03:14:40 firewallfsi kernel: [167409.074566] unevictable:0 dirty:0 writeback:0 unstable:0 Jan 14 03:14:40 firewallfsi kernel: [167409.074566] slab_reclaimable:180758 slab_unreclaimable:12487 Jan 14 03:14:40 firewallfsi kernel: [167409.074566] mapped:25326 shmem:1271 pagetables:1698 bounce:0 Jan 14 03:14:40 firewallfsi kernel: [167409.074566] free:63080 free_pcp:60 free_cma:0 Jan 14 03:14:40 firewallfsi kernel: [167409.074569] Node 0 active_anon:649476kB inactive_anon:126324kB active_file:609892kB inactive_file:2400556kB unevictable:0kB isolated(anon):0kB isolated(file):0kB mapped:101304kB dirty:0kB writeback:0kB shmem:5084kB writeback_tmp:0kB unstable:0kB pages_scanned:24907078 all_unreclaimable? 
yes Jan 14 03:14:40 firewallfsi kernel: [167409.074572] DMA free:3200kB min:68kB low:84kB high:100kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15976kB managed:15900kB mlocked:0kB slab_reclaimable:11964kB slab_unreclaimable:452kB kernel_stack:56kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB Jan 14 03:14:40 firewallfsi kernel: [167409.074573] lowmem_reserve[]: 0 783 4740 4740 Jan 14 03:14:40 firewallfsi kernel: [167409.074576] Normal free:3484kB min:3544kB low:4428kB high:5312kB active_anon:0kB inactive_anon:0kB active_file:3412kB inactive_file:1560kB unevictable:0kB writepending:0kB present:892920kB managed:815216kB mlocked:0kB slab_reclaimable:711068kB slab_unreclaimable:49496kB kernel_stack:2904kB pagetables:0kB bounce:0kB free_pcp:240kB local_pcp:120kB free_cma:0kB Jan 14 03:14:40 firewallfsi kernel: [167409.074577] lowmem_reserve[]: 0 0 31652 31652 Jan 14 03:14:40 firewallfsi kernel: [167409.074581] HighMem free:245636kB min:512kB low:4988kB high:9464kB active_anon:649476kB inactive_anon:126324kB active_file:606332kB inactive_file:2399136kB unevictable:0kB writepending:0kB present:4051548kB managed:4051548kB mlocked:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:6792kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB Jan 14 03:14:40 firewallfsi kernel: [167409.074582] lowmem_reserve[]: 0 0 0 0 Jan 14 03:14:40 firewallfsi kernel: [167409.074590] DMA: 26*4kB (UME) 19*8kB (UME) 20*16kB (UME) 10*32kB (UE) 8*64kB (UM) 10*128kB (UME) 2*256kB (M) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 3200kB Jan 14 03:14:40 firewallfsi kernel: [167409.074598] Normal: 508*4kB (UMEH) 7*8kB (H) 3*16kB (H) 1*32kB (H) 1*64kB (H) 0*128kB 1*256kB (H) 0*512kB 1*1024kB (H) 0*2048kB 0*4096kB = 3512kB Jan 14 03:14:40 firewallfsi kernel: [167409.074609] HighMem: 317*4kB (UM) 150*8kB (UM) 96*16kB (UM) 43*32kB (UM) 296*64kB (UM) 55*128kB (M) 131*256kB (UM) 77*512kB (UM) 30*1024kB (M) 
16*2048kB (M) 19*4096kB (M) = 245636kB Jan 14 03:14:40 firewallfsi kernel: [167409.074614] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB Jan 14 03:14:40 firewallfsi kernel: [167409.074614] 753924 total pagecache pages Jan 14 03:14:40 firewallfsi kernel: [167409.074615] 12 pages in swap cache Jan 14 03:14:40 firewallfsi kernel: [167409.074616] Swap cache stats: add 177, delete 165, find 0/0 Jan 14 03:14:40 firewallfsi kernel: [167409.074617] Free swap = 33783864kB Jan 14 03:14:40 firewallfsi kernel: [167409.074617] Total swap = 33784572kB Jan 14 03:14:40 firewallfsi kernel: [167409.074619] 1240111 pages RAM Jan 14 03:14:40 firewallfsi kernel: [167409.074620] 1012887 pages HighMem/MovableOnly Jan 14 03:14:40 firewallfsi kernel: [167409.074621] 19445 pages reserved Jan 14 03:14:40 firewallfsi kernel: [167409.074622] 0 pages hwpoisoned Jan 14 03:14:40 firewallfsi kernel: [167409.074625] [ pid ] uid tgid total_vm rss nr_ptes nr_pmds swapents oom_score_adj name Jan 14 03:14:40 firewallfsi kernel: [167409.074650] [ 598] 0 598 5325 3580 14 3 0 0 systemd-journal Jan 14 03:14:40 firewallfsi kernel: [167409.074652] [ 637] 0 637 3585 1121 8 3 0 -1000 systemd-udevd Jan 14 03:14:40 firewallfsi kernel: [167409.074655] [ 747] 0 747 808 481 6 3 0 0 mdadm Jan 14 03:14:40 firewallfsi kernel: [167409.074659] [ 748] 81 748 1700 1020 7 3 0 -900 dbus-daemon Jan 14 03:14:40 firewallfsi kernel: [167409.074662] [ 772] 0 772 1704 626 6 3 0 0 sh Jan 14 03:14:40 firewallfsi kernel: [167409.074664] [ 774] 0 774 1033 637 5 3 0 0 irqbalance Jan 14 03:14:40 firewallfsi kernel: [167409.074666] [ 776] 0 776 988 733 5 3 0 0 systemd-logind Jan 14 03:14:40 firewallfsi kernel: [167409.074672] [ 779] 0 779 1704 663 7 3 0 0 sh Jan 14 03:14:40 firewallfsi kernel: [167409.074673] [ 790] 0 790 1472 794 6 3 0 0 smartd Jan 14 03:14:40 firewallfsi kernel: [167409.074678] [ 792] 0 792 1704 594 7 3 0 0 sh Jan 14 03:14:40 firewallfsi kernel: [167409.074679] [ 800] 0 800 1704 623 
6 3 0 0 sh Jan 14 03:14:40 firewallfsi kernel: [167409.074680] [ 805] 0 805 1704 607 7 3 0 0 sh Jan 14 03:14:40 firewallfsi kernel: [167409.074685] [ 806] 0 806 1704 602 7 3 0 0 sh Jan 14 03:14:40 firewallfsi kernel: [167409.074686] [ 813] 0 813 8633 1106 11 3 0 0 rsyslogd Jan 14 03:14:40 firewallfsi kernel: [167409.074688] [ 814] 288 814 19546 2254 18 3 0 0 milter-greylist Jan 14 03:14:40 firewallfsi kernel: [167409.074690] [ 816] 0 816 1736 694 6 3 0 0 tickle-pog Jan 14 03:14:40 firewallfsi kernel: [167409.074695] [ 817] 0 817 3165 2103 9 3 0 0 mailwarnings Jan 14 03:14:40 firewallfsi kernel: [167409.074698] [ 820] 0 820 1819 809 6 3 0 0 watch-services Jan 14 03:14:40 firewallfsi kernel: [167409.074704] [ 821] 0 821 3264 2173 10 3 0 0 restarter Jan 14 03:14:40 firewallfsi kernel: [167409.074706] [ 824] 0 824 2707 1661 9 3 0 0 udp-sgr Jan 14 03:14:40 firewallfsi kernel: [167409.074708] [ 825] 0 825 3313 2229 11 3 0 0 watch-ip Jan 14 03:14:40 firewallfsi kernel: [167409.074713] [ 852] 0 852 583 440 5 3 0 0 acpid Jan 14 03:14:40 firewallfsi kernel: [167409.074719] [ 853] 0 853 1704 604 6 3 0 0 sh Jan 14 03:14:40 firewallfsi kernel: [167409.074725] [ 854] 0 854 3234 2142 9 3 0 0 dynamic-ip-upda Jan 14 03:14:40 firewallfsi kernel: [167409.074731] [ 857] 0 857 1704 636 5 3 0 0 sh Jan 14 03:14:40 firewallfsi kernel: [167409.074735] [ 858] 0 858 3695 1973 11 3 0 0 fetchmail Jan 14 03:14:40 firewallfsi kernel: [167409.074737] [ 865] 0 865 7287 1135 11 3 0 0 apcupsd Jan 14 03:14:40 firewallfsi kernel: [167409.074738] [ 907] 0 907 860 519 5 3 0 0 atd Jan 14 03:14:40 firewallfsi kernel: [167409.074745] [ 980] 0 980 2665 1030 9 3 0 0 saslauthd Jan 14 03:14:40 firewallfsi kernel: [167409.074750] [ 981] 0 981 2665 1030 9 3 0 0 saslauthd Jan 14 03:14:40 firewallfsi kernel: [167409.074751] [ 982] 0 982 2499 125 8 3 0 0 saslauthd Jan 14 03:14:40 firewallfsi kernel: [167409.074753] [ 983] 0 983 2499 125 8 3 0 0 saslauthd Jan 14 03:14:40 firewallfsi kernel: [167409.074756] [ 984] 0 
984 2499 125 8 3 0 0 saslauthd Jan 14 03:14:40 firewallfsi kernel: [167409.074758] [ 1066] 27 1066 1707 731 7 3 0 0 mysqld_safe Jan 14 03:14:40 firewallfsi kernel: [167409.074760] [ 1199] 0 1199 1116 532 6 3 0 0 agetty Jan 14 03:14:40 firewallfsi kernel: [167409.074764] [ 1200] 0 1200 1116 484 6 3 0 0 agetty Jan 14 03:14:40 firewallfsi kernel: [167409.074765] [ 1201] 0 1201 1116 480 6 3 0 0 agetty Jan 14 03:14:40 firewallfsi kernel: [167409.074767] [ 1202] 0 1202 1116 499 6 3 0 0 agetty Jan 14 03:14:40 firewallfsi kernel: [167409.074769] [ 1203] 0 1203 1116 495 6 3 0 0 agetty Jan 14 03:14:40 firewallfsi kernel: [167409.074777] [ 1204] 0 1204 1116 527 6 3 0 0 agetty Jan 14 03:14:40 firewallfsi kernel: [167409.074783] [ 1373] 27 1373 126042 14546 65 3 0 0 mysqld Jan 14 03:14:40 firewallfsi kernel: [167409.074787] [ 1374] 0 1374 2769 737 9 3 0 -1000 sshd Jan 14 03:14:40 firewallfsi kernel: [167409.074789] [ 1479] 25 1479 82956 48992 116 3 0 0 named Jan 14 03:14:40 firewallfsi kernel: [167409.074790] [ 1519] 0 1519 7049 2436 18 3 0 0 nmbd Jan 14 03:14:40 firewallfsi kernel: [167409.074793] [ 1520] 0 1520 6711 2140 17 3 0 0 nmbd Jan 14 03:14:40 firewallfsi kernel: [167409.074794] [ 1541] 0 1541 12052 6348 28 3 0 0 /usr/sbin/httpd Jan 14 03:14:40 firewallfsi kernel: [167409.074796] [ 1769] 48 1769 49262 4341 46 3 0 0 /usr/sbin/httpd Jan 14 03:14:40 firewallfsi kernel: [167409.074798] [ 1770] 48 1770 16216 4312 29 3 0 0 /usr/sbin/httpd Jan 14 03:14:40 firewallfsi kernel: [167409.074799] [ 1778] 48 1778 16214 4322 29 3 0 0 /usr/sbin/httpd Jan 14 03:14:40 firewallfsi kernel: [167409.074806] [ 1792] 48 1792 16220 4322 29 3 0 0 /usr/sbin/httpd Jan 14 03:14:40 firewallfsi kernel: [167409.074810] [ 1804] 0 1804 9561 3578 22 3 0 0 smbd Jan 14 03:14:40 firewallfsi kernel: [167409.074812] [ 1805] 0 1805 9197 1097 22 3 0 0 smbd Jan 14 03:14:40 firewallfsi kernel: [167409.074814] [ 1806] 0 1806 9446 1166 22 3 0 0 smbd Jan 14 03:14:40 firewallfsi kernel: [167409.074818] [ 1819] 0 
1819 1704 639 6 3 0 0 sh Jan 14 03:14:40 firewallfsi kernel: [167409.074819] [ 1820] 0 1820 2740 1707 9 3 0 0 udp-sgs Jan 14 03:14:40 firewallfsi kernel: [167409.074821] [ 1889] 0 1889 5116 2456 14 3 0 0 dhclient Jan 14 03:14:40 firewallfsi kernel: [167409.074826] [ 1971] 0 1971 594 401 5 3 0 0 pptpd Jan 14 03:14:40 firewallfsi kernel: [167409.074828] [ 1978] 0 1978 954 664 5 3 0 0 dovecot Jan 14 03:14:40 firewallfsi kernel: [167409.074829] [ 1979] 97 1979 904 521 5 3 0 0 anvil Jan 14 03:14:40 firewallfsi kernel: [167409.074832] [ 1980] 0 1980 937 577 5 3 0 0 log Jan 14 03:14:40 firewallfsi kernel: [167409.074839] [ 1982] 0 1982 1131 812 5 3 0 0 config Jan 14 03:14:40 firewallfsi kernel: [167409.074841] [ 1989] 177 1989 6833 4754 16 3 0 0 dhcpd Jan 14 03:14:40 firewallfsi kernel: [167409.074842] [ 1990] 0 1990 1900 731 8 3 0 0 crond Jan 14 03:14:40 firewallfsi kernel: [167409.074843] [ 1992] 38 1992 1536 1057 6 3 0 0 ntpd Jan 14 03:14:40 firewallfsi kernel: [167409.074845] [ 2007] 48 2007 16220 4532 29 3 0 0 /usr/sbin/httpd Jan 14 03:14:40 firewallfsi kernel: [167409.074851] [ 2028] 303 2028 1402 954 6 3 0 0 imap Jan 14 03:14:40 firewallfsi kernel: [167409.074852] [ 2029] 48 2029 16386 4598 30 3 0 0 /usr/sbin/httpd Jan 14 03:14:40 firewallfsi kernel: [167409.074854] [ 2156] 301 2156 1607 1122 6 3 0 0 imap Jan 14 03:14:40 firewallfsi kernel: [167409.074855] [ 2157] 0 2157 4181 1974 11 3 0 0 sshd Jan 14 03:14:40 firewallfsi kernel: [167409.074859] [ 2159] 0 2159 1584 1090 7 3 0 0 systemd Jan 14 03:14:40 firewallfsi kernel: [167409.074861] [ 2165] 0 2165 6809 395 11 3 0 0 (sd-pam) Jan 14 03:14:40 firewallfsi kernel: [167409.074863] [ 2188] 0 2188 4215 973 11 3 0 0 sshd Jan 14 03:14:40 firewallfsi kernel: [167409.074867] [ 2195] 0 2195 1229 973 6 3 0 0 tcsh Jan 14 03:14:40 firewallfsi kernel: [167409.074868] [ 2350] 300 2350 11578 4448 26 3 0 0 smbd Jan 14 03:14:40 firewallfsi kernel: [167409.074870] [ 2555] 304 2555 1399 1004 6 3 0 0 imap Jan 14 03:14:40 firewallfsi 
kernel: [167409.074875] [ 2966] 302 2966 1449 1120 6 3 0 0 imap Jan 14 03:14:40 firewallfsi kernel: [167409.074877] [22531] 301 22531 2007 1275 7 3 0 0 imap Jan 14 03:14:40 firewallfsi kernel: [167409.074878] [22745] 301 22745 1747 1185 8 3 0 0 imap Jan 14 03:14:40 firewallfsi kernel: [167409.074880] [22750] 301 22750 1458 1052 7 3 0 0 imap Jan 14 03:14:40 firewallfsi kernel: [167409.074884] [22753] 301 22753 4216 1688 9 3 0 0 imap Jan 14 03:14:40 firewallfsi kernel: [167409.074885] [25574] 0 25574 4206 1966 11 3 0 0 sshd Jan 14 03:14:40 firewallfsi kernel: [167409.074887] [25577] 0 25577 4206 1255 11 3 0 0 sshd Jan 14 03:14:40 firewallfsi kernel: [167409.074893] [25588] 0 25588 1994 1021 8 3 0 0 tcsh Jan 14 03:14:40 firewallfsi kernel: [167409.074895] [25616] 0 25616 2664 1687 8 3 0 0 ssh Jan 14 03:14:40 firewallfsi kernel: [167409.074897] [29170] 273 29170 2208 1411 8 3 0 0 imap-login Jan 14 03:14:40 firewallfsi kernel: [167409.074898] [29174] 301 29174 1684 1186 6 3 0 0 imap Jan 14 03:14:40 firewallfsi kernel: [167409.074900] [29582] 273 29582 2209 1394 8 3 0 0 imap-login Jan 14 03:14:40 firewallfsi kernel: [167409.074902] [29583] 301 29583 2006 1198 7 3 0 0 imap Jan 14 03:14:40 firewallfsi kernel: [167409.074909] [ 6411] 273 6411 2208 1394 8 3 0 0 imap-login Jan 14 03:14:40 firewallfsi kernel: [167409.074910] [ 6417] 301 6417 1446 1063 6 3 0 0 imap Jan 14 03:14:40 firewallfsi kernel: [167409.074912] [13238] 273 13238 2208 1447 8 3 0 0 imap-login Jan 14 03:14:40 firewallfsi kernel: [167409.074917] [13242] 301 13242 2516 1277 7 3 0 0 imap Jan 14 03:14:40 firewallfsi kernel: [167409.074918] [ 5077] 48 5077 16220 4586 29 3 0 0 /usr/sbin/httpd Jan 14 03:14:40 firewallfsi kernel: [167409.074919] [ 5080] 48 5080 16216 4317 29 3 0 0 /usr/sbin/httpd Jan 14 03:14:40 firewallfsi kernel: [167409.074923] [ 5082] 48 5082 16220 4576 29 3 0 0 /usr/sbin/httpd Jan 14 03:14:40 firewallfsi kernel: [167409.074925] [31732] 0 31732 1307 924 6 3 0 0 reboot-when-oom Jan 14 03:14:40 
firewallfsi kernel: [167409.074928] [31733] 0 31733 1307 909 6 3 0 0 watch-ps
Jan 14 03:14:40 firewallfsi kernel: [167409.074930] [ 7082] 273 7082 2208 1411 8 3 0 0 imap-login
Jan 14 03:14:40 firewallfsi kernel: [167409.074931] [ 7086] 301 7086 3832 1390 9 3 0 0 imap
Jan 14 03:14:40 firewallfsi kernel: [167409.074933] [29840] 276 29840 116776 103156 222 3 68 0 clamd
Jan 14 03:14:40 firewallfsi kernel: [167409.074936] [29858] 290 29858 15448 1219 13 3 0 0 clamav-milter
Jan 14 03:14:40 firewallfsi kernel: [167409.074937] [29881] 0 29881 3830 1710 11 3 0 0 sendmail
Jan 14 03:14:40 firewallfsi kernel: [167409.074939] [29898] 51 29898 3502 769 10 3 0 0 sendmail
Jan 14 03:14:40 firewallfsi kernel: [167409.074941] [30041] 23 30041 5440 724 14 3 0 0 squid
Jan 14 03:14:40 firewallfsi kernel: [167409.074949] [30043] 23 30043 11724 8388 25 3 0 0 squid
Jan 14 03:14:40 firewallfsi kernel: [167409.074950] [30044] 23 30044 1179 261 6 3 0 0 unlinkd
Jan 14 03:14:40 firewallfsi kernel: [167409.074957] [21973] 273 21973 2209 1417 8 3 0 0 imap-login
Jan 14 03:14:40 firewallfsi kernel: [167409.074959] [21977] 302 21977 1429 1059 6 3 0 0 imap
Jan 14 03:14:40 firewallfsi kernel: [167409.074960] [ 3848] 48 3848 16183 3641 29 3 0 0 /usr/sbin/httpd
Jan 14 03:14:40 firewallfsi kernel: [167409.074962] [11059] 273 11059 2209 1442 9 3 0 0 imap-login
Jan 14 03:14:40 firewallfsi kernel: [167409.074968] [11060] 273 11060 2209 1436 8 3 0 0 imap-login
Jan 14 03:14:40 firewallfsi kernel: [167409.074970] [11061] 273 11061 2209 1454 8 3 0 0 imap-login
Jan 14 03:14:40 firewallfsi kernel: [167409.074976] [11062] 273 11062 2209 1422 8 3 0 0 imap-login
Jan 14 03:14:40 firewallfsi kernel: [167409.074978] [11063] 301 11063 1401 978 6 3 0 0 imap
Jan 14 03:14:40 firewallfsi kernel: [167409.074981] [11066] 301 11066 1399 948 6 3 0 0 imap
Jan 14 03:14:40 firewallfsi kernel: [167409.074985] [11067] 301 11067 1616 1096 6 3 0 0 imap
Jan 14 03:14:40 firewallfsi kernel: [167409.074987] [11068] 301 11068 1396 929 6 3 0 0 imap
Jan 14 03:14:40 firewallfsi kernel: [167409.074989] [23879] 0 23879 3839 1924 11 3 0 0 sendmail
Jan 14 03:14:40 firewallfsi kernel: [167409.074996] [25771] 0 25771 1396 158 6 3 0 0 sleep
Jan 14 03:14:40 firewallfsi kernel: [167409.074998] [27161] 273 27161 2209 1418 8 3 0 0 imap-login
Jan 14 03:14:40 firewallfsi kernel: [167409.075000] [27165] 301 27165 1436 1049 6 3 0 0 imap
Jan 14 03:14:40 firewallfsi kernel: [167409.075001] [27432] 0 27432 1654 406 7 3 0 0 anacron
Jan 14 03:14:40 firewallfsi kernel: [167409.075011] [29124] 0 29124 1396 160 7 3 0 0 sleep
Jan 14 03:14:40 firewallfsi kernel: [167409.075014] Out of memory: Kill process 29840 (clamd) score 10 or sacrifice child
Jan 14 03:14:40 firewallfsi kernel: [167409.075026] Killed process 29840 (clamd) total-vm:467104kB, anon-rss:396772kB, file-rss:15852kB, shmem-rss:0kB

^ permalink raw reply	[flat|nested] 40+ messages in thread
* Re: mm, vmscan: commit makes PAE kernel crash nightly (bisected)
From: Mel Gorman @ 2017-01-16 11:09 UTC
To: Trevor Cordes
Cc: Michal Hocko, linux-kernel, Joonsoo Kim, Minchan Kim, Rik van Riel, Srikar Dronamraju

On Sun, Jan 15, 2017 at 12:27:52AM -0600, Trevor Cordes wrote:
> On 2017-01-12 Michal Hocko wrote:
> > On Wed 11-01-17 16:52:32, Trevor Cordes wrote:
> > [...]
> > > I'm not sure how I can tell if my bug is because of memcgs so here
> > > is a full first oom example (attached).
> >
> > The 4.7 kernel doesn't contain 71c799f4982d ("mm: add per-zone lru
> > list stat") so the OOM report will not tell us whether the Normal
> > zone doesn't age active lists, unfortunately.
>
> I compiled the patch Mel provided into the stock F23 kernel
> 4.8.13-100.fc23.i686+PAE and it ran for 2 nights. It didn't oom the
> first night, but did the second night. So the bug persists even with
> that patch. However, it does *seem* a bit "better" since it took 2
> nights (usually takes only one, but maybe 10% of the time it does take
> two) before oom'ing, *and* it allowed my reboot script to reboot it
> cleanly when it saw the oom (which happens only 25% of the time).
>
> I'm attaching the 4.8.13 oom message which should have the memcg info
> (71c799f4982d) you are asking for above? Hopefully?

It shows that there are an extremely large number of reclaimable slab
pages in the lower zones. Other pages have been reclaimed as normal, but
the failure to reclaim slab pages causes a high-order allocation to fail.

> > You can easily check whether this is memcg related by trying to run
> > the same workload with the cgroup_disable=memory kernel command line
> > parameter. This will put all the memcg specifics out of the way.
>
> I will try booting now into cgroup_disable=memory to see if that helps
> at all. I'll reply back in 48 hours, or when it oom's, whichever comes
> first.
>
> Thanks.
>
> Also, should I bother trying the latest git HEAD to see if that solves
> anything? Thanks!

That's worth trying. If that also fails then could you try the following
hack to encourage direct reclaim to reclaim slab when buffers are over
the limit please?

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 532a2a750952..46aac487b89a 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2684,6 +2684,7 @@ static void shrink_zones(struct zonelist *zonelist, struct scan_control *sc)
 			continue;
 
 		if (sc->priority != DEF_PRIORITY &&
+		    !buffer_heads_over_limit &&
 		    !pgdat_reclaimable(zone->zone_pgdat))
 			continue;	/* Let kswapd poll it */

-- 
Mel Gorman
SUSE Labs
* Re: mm, vmscan: commit makes PAE kernel crash nightly (bisected)
From: Michal Hocko @ 2017-01-17 13:52 UTC
To: Mel Gorman
Cc: Trevor Cordes, linux-kernel, Joonsoo Kim, Minchan Kim, Rik van Riel, Srikar Dronamraju

On Mon 16-01-17 11:09:34, Mel Gorman wrote:
[...]
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 532a2a750952..46aac487b89a 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -2684,6 +2684,7 @@ static void shrink_zones(struct zonelist *zonelist, struct scan_control *sc)
>  			continue;
> 
>  		if (sc->priority != DEF_PRIORITY &&
> +		    !buffer_heads_over_limit &&
>  		    !pgdat_reclaimable(zone->zone_pgdat))
>  			continue;	/* Let kswapd poll it */

I think we should rather remove pgdat_reclaimable here. This sounds like
the wrong layer to decide whether we want to reclaim and how much.

But even that won't help very much, I am afraid. As I've noted in the
other response, as long as we scale the slab shrinking based on
nr_scanned we will have a problem in situations where slab outnumbers
the lru lists too much. I do not have a good idea how to fix that,
though...

-- 
Michal Hocko
SUSE Labs
* Re: mm, vmscan: commit makes PAE kernel crash nightly (bisected)
From: Mel Gorman @ 2017-01-17 14:21 UTC
To: Michal Hocko
Cc: Trevor Cordes, linux-kernel, Joonsoo Kim, Minchan Kim, Rik van Riel, Srikar Dronamraju

On Tue, Jan 17, 2017 at 02:52:28PM +0100, Michal Hocko wrote:
> On Mon 16-01-17 11:09:34, Mel Gorman wrote:
> [...]
> > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > index 532a2a750952..46aac487b89a 100644
> > --- a/mm/vmscan.c
> > +++ b/mm/vmscan.c
> > @@ -2684,6 +2684,7 @@ static void shrink_zones(struct zonelist *zonelist, struct scan_control *sc)
> >  			continue;
> > 
> >  		if (sc->priority != DEF_PRIORITY &&
> > +		    !buffer_heads_over_limit &&
> >  		    !pgdat_reclaimable(zone->zone_pgdat))
> >  			continue;	/* Let kswapd poll it */
>
> I think we should rather remove pgdat_reclaimable here. This sounds like
> a wrong layer to decide whether we want to reclaim and how much.

I had considered that, but it'd also be important to add the other 32-bit
patches you have posted to see the impact. Because of the ratio of LRU
pages to slab pages, it may not have an impact, but it'd need to be
eliminated.

> But even that won't help very much I am afraid. As I've noted in the
> other response as long as we will scale the slab shrinking based on
> nr_scanned we will have a problem with situations where slab outnumbers
> lru lists too much. I do not have a good idea how to fix that though...

Right now, I don't either, other than a heavy-handed approach of checking
whether a) it's a pgdat with a highmem node and b) the ratio of LRU pages
to slab pages on the lower zones is out of whack, and if so, ignoring
nr_scanned for the slab shrinker. Before prototyping such a thing, I'd
like to hear the outcome of this heavy hack and then add your 32-bit
patches onto the list. If the problem is still there then I'd next look
at taking slab pages into account in pgdat_reclaimable() instead of an
outright removal that has a much wider impact. If that doesn't work then
I'll prototype a heavy-handed forced slab reclaim when lower zones are
almost all slab pages.

-- 
Mel Gorman
SUSE Labs
* Re: mm, vmscan: commit makes PAE kernel crash nightly (bisected)
From: Michal Hocko @ 2017-01-17 14:54 UTC
To: Trevor Cordes
Cc: Mel Gorman, linux-kernel, Joonsoo Kim, Minchan Kim, Rik van Riel, Srikar Dronamraju

On Tue 17-01-17 14:21:14, Mel Gorman wrote:
> On Tue, Jan 17, 2017 at 02:52:28PM +0100, Michal Hocko wrote:
> > On Mon 16-01-17 11:09:34, Mel Gorman wrote:
> > [...]
> > > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > > index 532a2a750952..46aac487b89a 100644
> > > --- a/mm/vmscan.c
> > > +++ b/mm/vmscan.c
> > > @@ -2684,6 +2684,7 @@ static void shrink_zones(struct zonelist *zonelist, struct scan_control *sc)
> > >  			continue;
> > > 
> > >  		if (sc->priority != DEF_PRIORITY &&
> > > +		    !buffer_heads_over_limit &&
> > >  		    !pgdat_reclaimable(zone->zone_pgdat))
> > >  			continue;	/* Let kswapd poll it */
> >
> > I think we should rather remove pgdat_reclaimable here. This sounds like
> > a wrong layer to decide whether we want to reclaim and how much.
>
> I had considered that but it'd also be important to add the other 32-bit
> patches you have posted to see the impact. Because of the ratio of LRU pages
> to slab pages, it may not have an impact but it'd need to be eliminated.

OK, Trevor, you can pull from the
git://git.kernel.org/pub/scm/linux/kernel/git/mhocko/mm.git tree,
fixes/highmem-node-fixes branch. This contains the current mmotm tree +
the latest highmem fixes. I also do not expect this would help much in
your case, but as Mel said, we should rule that out at least.

> Right now, I don't either other than a heavy-handed approach of checking if
> a) it's a pgdat with a highmem node

I do not think this is the right approach because we have a similar
problem even without highmem. I have already seen cases where the slab
memory has eaten the whole DMA32 zone.

> b) if the ratio of LRU pages to slab
> pages on the lower zones is out of whack and if so, ignore nr_scanned for
> the slab shrinker.

This sounds much more promising.

> Before prototyping such a thing, I'd like to hear the outcome of this
> heavy hack and then add your 32-bit patches onto the list. If the problem
> is still there then I'd next look at taking slab pages into account in
> pgdat_reclaimable() instead of an outright removal that has a much wider
> impact. If that doesn't work then I'll prototype a heavy-handed forced
> slab reclaim when lower zones are almost all slab pages.

I would be really curious to hear whether removing pgdat_reclaimable has
any bad side effects. It just smells wrong from a high-level point of
view. Besides that, I really _hate_ pgdat_reclaimable for any decision
making. It just behaves very randomly... I do not expect it to help much
in this case, though, as the highmem can easily bias the decision.

-- 
Michal Hocko
SUSE Labs
* Re: mm, vmscan: commit makes PAE kernel crash nightly (bisected)
From: Trevor Cordes @ 2017-01-18 7:25 UTC
To: Michal Hocko
Cc: Mel Gorman, linux-kernel, Joonsoo Kim, Minchan Kim, Rik van Riel, Srikar Dronamraju

On 2017-01-17 Michal Hocko wrote:
> On Tue 17-01-17 14:21:14, Mel Gorman wrote:
> > On Tue, Jan 17, 2017 at 02:52:28PM +0100, Michal Hocko wrote:
> > > On Mon 16-01-17 11:09:34, Mel Gorman wrote:
> > > [...]
> > > > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > > > index 532a2a750952..46aac487b89a 100644
> > > > --- a/mm/vmscan.c
> > > > +++ b/mm/vmscan.c
> > > > @@ -2684,6 +2684,7 @@ static void shrink_zones(struct zonelist *zonelist, struct scan_control *sc)
> > > >  			continue;
> > > > 
> > > >  		if (sc->priority != DEF_PRIORITY &&
> > > > +		    !buffer_heads_over_limit &&
> > > >  		    !pgdat_reclaimable(zone->zone_pgdat))
> > > >  			continue;	/* Let kswapd poll it */
> > >
> > > I think we should rather remove pgdat_reclaimable here. This
> > > sounds like a wrong layer to decide whether we want to reclaim
> > > and how much.
> >
> > I had considered that but it'd also be important to add the other
> > 32-bit patches you have posted to see the impact. Because of the
> > ratio of LRU pages to slab pages, it may not have an impact but
> > it'd need to be eliminated.
>
> OK, Trevor you can pull from
> git://git.kernel.org/pub/scm/linux/kernel/git/mhocko/mm.git tree
> fixes/highmem-node-fixes branch. This contains the current mmotm tree
> + the latest highmem fixes. I also do not expect this would help much
> in your case but as Mel said we should rule that out at least.

OK, ignore my last question re: what to do next. I am building this
mhocko git tree now per your above instructions and will reboot into it
in a few hours with*out* the cgroup_disable=memory option. It might take
~50 hours for a result.

I should note that the box with the bug mostly works as a file server
and iptables firewall/router. It routes around 8GB(ytes) a day, plus
periodic file-server loads. That's about it. Everything else that is
running is not doing much, and not using much RAM, except maybe clamav,
by far the biggest RAM user.

I don't see this bug on other nearly identical boxes, including:
F24 4.8.15 32-bit (no PAE) 1GB ram P4
F24 4.8.15 32-bit (no PAE) 2GB ram Core2 Quad

However, I just noticed for the first time today that one other box is
also seeing this bug (gets an oom message), though with much less
frequency: twice in 2 months since upgrading to 4.8. However, it
recovers from the oom without a reboot and hasn't hung (yet). That could
be because this box does not do as much file serving or I/O as the one
I've been building/testing on. Also, this box is a much older Pentium-D
with 4GB (PAE on). If it would be helpful to see its oom log, let me
know. (Scanning all my boxes now, I also found a single oom on yet one
more computer with the same story; but this is a Xeon E3-1220 32-bit
with PAE, 4GB.)

So far the commonality seems to be >2GB RAM and PAE on. It might be
interesting to boot my build/test box with mem=2G and isolate it to
small RAM vs PAE. "mem=2G" would make a great, easy, immediate
workaround for this problem for me (as cgroup_disable=memory also seems
to do, so far). Thanks!
* Re: mm, vmscan: commit makes PAE kernel crash nightly (bisected)
From: Mel Gorman @ 2017-01-18 17:48 UTC
To: Michal Hocko
Cc: Trevor Cordes, linux-kernel, Joonsoo Kim, Minchan Kim, Rik van Riel, Srikar Dronamraju

On Tue, Jan 17, 2017 at 03:54:51PM +0100, Michal Hocko wrote:
> On Tue 17-01-17 14:21:14, Mel Gorman wrote:
> > On Tue, Jan 17, 2017 at 02:52:28PM +0100, Michal Hocko wrote:
> > > On Mon 16-01-17 11:09:34, Mel Gorman wrote:
> > > [...]
> > > > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > > > index 532a2a750952..46aac487b89a 100644
> > > > --- a/mm/vmscan.c
> > > > +++ b/mm/vmscan.c
> > > > @@ -2684,6 +2684,7 @@ static void shrink_zones(struct zonelist *zonelist, struct scan_control *sc)
> > > >  			continue;
> > > > 
> > > >  		if (sc->priority != DEF_PRIORITY &&
> > > > +		    !buffer_heads_over_limit &&
> > > >  		    !pgdat_reclaimable(zone->zone_pgdat))
> > > >  			continue;	/* Let kswapd poll it */
> > >
> > > I think we should rather remove pgdat_reclaimable here. This sounds like
> > > a wrong layer to decide whether we want to reclaim and how much.
> >
> > I had considered that but it'd also be important to add the other 32-bit
> > patches you have posted to see the impact. Because of the ratio of LRU pages
> > to slab pages, it may not have an impact but it'd need to be eliminated.
>
> OK, Trevor you can pull from
> git://git.kernel.org/pub/scm/linux/kernel/git/mhocko/mm.git tree
> fixes/highmem-node-fixes branch. This contains the current mmotm tree +
> the latest highmem fixes. I also do not expect this would help much in
> your case but as Mel said we should rule that out at least.

After considering slab shrinking of lower nodes, it occurs to me that
your fixes also impact slab shrinking. For lowmem-constrained
allocations, we accounted for scans on the lower zones but shrunk slabs
proportional to the total LRU size. If the lower zones had few LRU pages
and were mostly slab pages then the proportional calculation would be
way off. This may have a bigger impact on Trevor Cordes' situation than
I had imagined at the start of today.

-- 
Mel Gorman
SUSE Labs
* Re: mm, vmscan: commit makes PAE kernel crash nightly (bisected)
From: Mel Gorman @ 2017-01-18 18:07 UTC
To: Michal Hocko
Cc: Trevor Cordes, linux-kernel, Joonsoo Kim, Minchan Kim, Rik van Riel, Srikar Dronamraju

On Tue, Jan 17, 2017 at 03:54:51PM +0100, Michal Hocko wrote:
> On Tue 17-01-17 14:21:14, Mel Gorman wrote:
> > On Tue, Jan 17, 2017 at 02:52:28PM +0100, Michal Hocko wrote:
> > > On Mon 16-01-17 11:09:34, Mel Gorman wrote:
> > > [...]
> > > > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > > > index 532a2a750952..46aac487b89a 100644
> > > > --- a/mm/vmscan.c
> > > > +++ b/mm/vmscan.c
> > > > @@ -2684,6 +2684,7 @@ static void shrink_zones(struct zonelist *zonelist, struct scan_control *sc)
> > > >  			continue;
> > > > 
> > > >  		if (sc->priority != DEF_PRIORITY &&
> > > > +		    !buffer_heads_over_limit &&
> > > >  		    !pgdat_reclaimable(zone->zone_pgdat))
> > > >  			continue;	/* Let kswapd poll it */
> > >
> > > I think we should rather remove pgdat_reclaimable here. This sounds like
> > > a wrong layer to decide whether we want to reclaim and how much.
> >
> > I had considered that but it'd also be important to add the other 32-bit
> > patches you have posted to see the impact. Because of the ratio of LRU pages
> > to slab pages, it may not have an impact but it'd need to be eliminated.
>
> OK, Trevor you can pull from
> git://git.kernel.org/pub/scm/linux/kernel/git/mhocko/mm.git tree
> fixes/highmem-node-fixes branch. This contains the current mmotm tree +
> the latest highmem fixes. I also do not expect this would help much in
> your case but as Mel said we should rule that out at least.

After considering slab shrinking of lower nodes, it occurred to me that
your fixes may have a bigger impact than I believed this morning.
For lowmem-constrained allocations, we account for scans on the lower
zones but shrink proportionally to the LRU size for the entire node. If
the lower zones had few LRU pages and were mostly slab pages then the
proportional calculation would be way off, so direct reclaim would
barely touch the slab caches. That is fixed up by "mm, vmscan: consider
eligible zones in get_scan_count" so that the slab shrinking will be
proportional to the LRU pages on the lower zones.

-- 
Mel Gorman
SUSE Labs
* Re: mm, vmscan: commit makes PAE kernel crash nightly (bisected)
From: Trevor Cordes @ 2017-01-19 9:48 UTC
To: Michal Hocko
Cc: Mel Gorman, linux-kernel, Joonsoo Kim, Minchan Kim, Rik van Riel, Srikar Dronamraju

On 2017-01-17 Michal Hocko wrote:
> On Tue 17-01-17 14:21:14, Mel Gorman wrote:
> > On Tue, Jan 17, 2017 at 02:52:28PM +0100, Michal Hocko wrote:
> > > On Mon 16-01-17 11:09:34, Mel Gorman wrote:
> > > [...]
> > > > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > > > index 532a2a750952..46aac487b89a 100644
> > > > --- a/mm/vmscan.c
> > > > +++ b/mm/vmscan.c
> > > > @@ -2684,6 +2684,7 @@ static void shrink_zones(struct zonelist *zonelist, struct scan_control *sc)
> > > >  			continue;
> > > > 
> > > >  		if (sc->priority != DEF_PRIORITY &&
> > > > +		    !buffer_heads_over_limit &&
> > > >  		    !pgdat_reclaimable(zone->zone_pgdat))
> > > >  			continue;	/* Let kswapd poll it */
> > >
> > > I think we should rather remove pgdat_reclaimable here. This
> > > sounds like a wrong layer to decide whether we want to reclaim
> > > and how much.
> >
> > I had considered that but it'd also be important to add the other
> > 32-bit patches you have posted to see the impact. Because of the
> > ratio of LRU pages to slab pages, it may not have an impact but
> > it'd need to be eliminated.
>
> OK, Trevor you can pull from
> git://git.kernel.org/pub/scm/linux/kernel/git/mhocko/mm.git tree
> fixes/highmem-node-fixes branch. This contains the current mmotm tree
> + the latest highmem fixes. I also do not expect this would help much
> in your case but as Mel said we should rule that out at least.

Hi! The git tree version above oom'd after < 24 hours (at 3:02am), so it
doesn't solve the bug. If you need an oom message dump, let me know.

Let me know what to try next, guys, and I'll test it out.

> > Before prototyping such a thing, I'd like to hear the outcome of
> > this heavy hack and then add your 32-bit patches onto the list. If
> > the problem is still there then I'd next look at taking slab pages
> > into account in pgdat_reclaimable() instead of an outright removal
> > that has a much wider impact. If that doesn't work then I'll
> > prototype a heavy-handed forced slab reclaim when lower zones are
> > almost all slab pages.

I don't think I've tried the "heavy hack" patch yet? It's not in the
mhocko tree I just tried? Should I try the heavy hack on top of the
mhocko git, or on vanilla, or what?

I also want to mention that these PAE boxes suffer from another
problem/bug that I've worked around for almost a year now. For some
reason it keeps gnawing at me that it might be related. The disk I/O
goes to pot on this/these PAE boxes after a certain amount of disk
writes (some unknown number of GB, around 10-ish maybe). Writes go from
500MB/s to 10MB/s!! Reboot, and it's magically 500MB/s again. I detail
this here:
https://muug.ca/pipermail/roundtable/2016-June/004669.html

My fix was mem=XG where X is <8 (like 4 or 6) to force the PAE kernel to
be more sane about highmem choices. I never filed a bug because I read a
ton of stuff saying Linus hates PAE, don't use over 4G, blah blah. But
the other fix is to set /proc/sys/vm/highmem_is_dirtyable to 1.

I'm not bringing this up to get attention for a new bug; I bring it up
because it smells like it might be related. If something slowly eats
away at the box's vm to the point that I/O gets horribly slow, perhaps
it's related to the slab and high/lomem issue we have here? And if
related, it may help to solve the oom bug. If I'm way off base here,
just ignore my tangent!

The funny thing is I thought mem=XG where X<8 solved the problem, but it
doesn't! It greatly mitigates it, but I still get a subtle slowdown that
gets worse over time (like weeks instead of days). I now use
highmem_is_dirtyable on most boxes and that seems to solve it for good,
in combination with mem=XG. Let me note, however, that I have NOT set
highmem_is_dirtyable=1 on the test box I am using for all of this
building/testing, as I wanted the config to stay static while I work
through this oom bug. (I'm really curious to see if
highmem_is_dirtyable=1 would have any impact on the oom, though!)
Thanks!
* Re: mm, vmscan: commit makes PAE kernel crash nightly (bisected)
From: Michal Hocko @ 2017-01-19 11:37 UTC
To: Trevor Cordes
Cc: Mel Gorman, linux-kernel, Joonsoo Kim, Minchan Kim, Rik van Riel, Srikar Dronamraju

On Thu 19-01-17 03:48:50, Trevor Cordes wrote:
> On 2017-01-17 Michal Hocko wrote:
> > On Tue 17-01-17 14:21:14, Mel Gorman wrote:
> > > On Tue, Jan 17, 2017 at 02:52:28PM +0100, Michal Hocko wrote:
> > > > On Mon 16-01-17 11:09:34, Mel Gorman wrote:
> > > > [...]
> > > > > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > > > > index 532a2a750952..46aac487b89a 100644
> > > > > --- a/mm/vmscan.c
> > > > > +++ b/mm/vmscan.c
> > > > > @@ -2684,6 +2684,7 @@ static void shrink_zones(struct zonelist *zonelist, struct scan_control *sc)
> > > > >  			continue;
> > > > > 
> > > > >  		if (sc->priority != DEF_PRIORITY &&
> > > > > +		    !buffer_heads_over_limit &&
> > > > >  		    !pgdat_reclaimable(zone->zone_pgdat))
> > > > >  			continue;	/* Let kswapd poll it */
> > > >
> > > > I think we should rather remove pgdat_reclaimable here. This
> > > > sounds like a wrong layer to decide whether we want to reclaim
> > > > and how much.
> > >
> > > I had considered that but it'd also be important to add the other
> > > 32-bit patches you have posted to see the impact. Because of the
> > > ratio of LRU pages to slab pages, it may not have an impact but
> > > it'd need to be eliminated.
> >
> > OK, Trevor you can pull from
> > git://git.kernel.org/pub/scm/linux/kernel/git/mhocko/mm.git tree
> > fixes/highmem-node-fixes branch. This contains the current mmotm tree
> > + the latest highmem fixes. I also do not expect this would help much
> > in your case but as Mel said we should rule that out at least.
>
> Hi! The git tree above version oom'd after < 24 hours (3:02am) so
> it doesn't solve the bug. If you need an oom message dump, let me
> know.

Yes please.

> Let me know what to try next, guys, and I'll test it out.
>
> > > Before prototyping such a thing, I'd like to hear the outcome of
> > > this heavy hack and then add your 32-bit patches onto the list. If
> > > the problem is still there then I'd next look at taking slab pages
> > > into account in pgdat_reclaimable() instead of an outright removal
> > > that has a much wider impact. If that doesn't work then I'll
> > > prototype a heavy-handed forced slab reclaim when lower zones are
> > > almost all slab pages.
>
> I don't think I've tried the "heavy hack" patch yet? It's not in the
> mhocko tree I just tried? Should I try the heavy hack on top of mhocko
> git or on vanilla or what?
>
> I also want to mention that these PAE boxes suffer from another
> problem/bug that I've worked around for almost a year now. For some
> reason it keeps gnawing at me that it might be related. The disk I/O
> goes to pot on this/these PAE boxes after a certain amount of disk
> writes (like some unknown number of GB, around 10-ish maybe). Like
> writes go from 500MB/s to 10MB/s!! Reboot and it's magically 500MB/s
> again. I detail this here:
> https://muug.ca/pipermail/roundtable/2016-June/004669.html
> My fix was to mem=XG where X is <8 (like 4 or 6) to force the PAE
> kernel to be more sane about highmem choices. I never filed a bug
> because I read a ton of stuff saying Linus hates PAE, don't use over
> 4G, blah blah. But the other fix is to:
> set /proc/sys/vm/highmem_is_dirtyable to 1

Yes, this sounds like dirty memory throttling, and there were some
changes in that area. I do not remember when exactly.

> I'm not bringing this up to get attention to a new bug, I bring this up
> because it smells like it might be related. If something slowly eats
> away at the box's vm to the point that I/O gets horribly slow, perhaps
> it's related to the slab and high/lomem issue we have here? And if
> related, it may help to solve the oom bug. If I'm way off base here,
> just ignore my tangent!

From your OOM reports so far it doesn't really seem related, because you
never had a large number of pages under writeback when OOM.

The situation with the PAE kernel is unfortunate, but it is really hard
to do anything about it considering that the kernel and most of its
allocations have to live in a small and scarce lowmem. Moreover, the
more memory you have, the more you have to allocate from that lowmem.
This is why not only Linus hates 32-bit kernels on large memory systems.

-- 
Michal Hocko
SUSE Labs
* Re: mm, vmscan: commit makes PAE kernel crash nightly (bisected)
From: Trevor Cordes @ 2017-01-20 6:35 UTC
To: Michal Hocko
Cc: Mel Gorman, linux-kernel, Joonsoo Kim, Minchan Kim, Rik van Riel, Srikar Dronamraju

[-- Attachment #1: Type: text/plain, Size: 5869 bytes --]

On 2017-01-19 Michal Hocko wrote:
> On Thu 19-01-17 03:48:50, Trevor Cordes wrote:
> > On 2017-01-17 Michal Hocko wrote:
> > > On Tue 17-01-17 14:21:14, Mel Gorman wrote:
> > > > On Tue, Jan 17, 2017 at 02:52:28PM +0100, Michal Hocko wrote:
> > > > > On Mon 16-01-17 11:09:34, Mel Gorman wrote:
> > > > > [...]
> > > > > > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > > > > > index 532a2a750952..46aac487b89a 100644
> > > > > > --- a/mm/vmscan.c
> > > > > > +++ b/mm/vmscan.c
> > > > > > @@ -2684,6 +2684,7 @@ static void shrink_zones(struct zonelist *zonelist, struct scan_control *sc)
> > > > > >  			continue;
> > > > > > 
> > > > > >  		if (sc->priority != DEF_PRIORITY &&
> > > > > > +		    !buffer_heads_over_limit &&
> > > > > >  		    !pgdat_reclaimable(zone->zone_pgdat))
> > > > > >  			continue;	/* Let kswapd poll it */
> > > > >
> > > > > I think we should rather remove pgdat_reclaimable here. This
> > > > > sounds like a wrong layer to decide whether we want to reclaim
> > > > > and how much.
> > > >
> > > > I had considered that but it'd also be important to add the
> > > > other 32-bit patches you have posted to see the impact. Because
> > > > of the ratio of LRU pages to slab pages, it may not have an
> > > > impact but it'd need to be eliminated.
> > >
> > > OK, Trevor you can pull from
> > > git://git.kernel.org/pub/scm/linux/kernel/git/mhocko/mm.git tree
> > > fixes/highmem-node-fixes branch. This contains the current mmotm
> > > tree + the latest highmem fixes. I also do not expect this would
> > > help much in your case but as Mel said we should rule that out at
> > > least.
> >
> > Hi! The git tree above version oom'd after < 24 hours (3:02am) so
> > it doesn't solve the bug. If you need an oom message dump let me
> > know.
>
> Yes please.

The first oom from that night is attached. Note that the oom wasn't as
dire with your mhocko/4.9.0+ as it usually is with stock 4.8.x: my oom
detector and reboot script was able to do its thing cleanly before the
system became unusable. I'll await further instructions and test right
away. Maybe I'll try a few tuning ideas until then. Thanks!

> > Let me know what to try next, guys, and I'll test it out.
> >
> > > > Before prototyping such a thing, I'd like to hear the outcome of
> > > > this heavy hack and then add your 32-bit patches onto the list.
> > > > If the problem is still there then I'd next look at taking slab
> > > > pages into account in pgdat_reclaimable() instead of an
> > > > outright removal that has a much wider impact. If that doesn't
> > > > work then I'll prototype a heavy-handed forced slab reclaim
> > > > when lower zones are almost all slab pages.
> >
> > I don't think I've tried the "heavy hack" patch yet? It's not in
> > the mhocko tree I just tried? Should I try the heavy hack on top
> > of mhocko git or on vanilla or what?
> >
> > I also want to mention that these PAE boxes suffer from another
> > problem/bug that I've worked around for almost a year now. For some
> > reason it keeps gnawing at me that it might be related. The disk
> > I/O goes to pot on this/these PAE boxes after a certain amount of
> > disk writes (like some unknown number of GB, around 10-ish maybe).
> > Like writes go from 500MB/s to 10MB/s!! Reboot and it's magically
> > 500MB/s again. I detail this here:
> > https://muug.ca/pipermail/roundtable/2016-June/004669.html
> > My fix was to mem=XG where X is <8 (like 4 or 6) to force the PAE
> > kernel to be more sane about highmem choices. I never filed a bug
> > because I read a ton of stuff saying Linus hates PAE, don't use over
> > 4G, blah blah. But the other fix is to:
> > set /proc/sys/vm/highmem_is_dirtyable to 1
>
> Yes this sounds like a dirty memory throttling and there were some
> changes in that area. I do not remember when exactly.

I think my PAE-slow-IO bug started way back in Fedora 22 (4.0?); it's
hard to know exactly when, as I didn't discover the bug for maybe a year
because I didn't realize IO was the problem right away. Too late to
bisect that one, I guess. I guess it's not related, so we can ignore my
tangent!

> > I'm not bringing this up to get attention to a new bug, I bring
> > this up because it smells like it might be related. If something
> > slowly eats away at the box's vm to the point that I/O gets
> > horribly slow, perhaps it's related to the slab and high/lomem
> > issue we have here? And if related, it may help to solve the oom
> > bug. If I'm way off base here, just ignore my tangent!
>
> From your OOM reports so far it doesn't really seem related because you
> never had a large number of pages under writeback when OOM.
>
> The situation with the PAE kernel is unfortunate but it is really hard
> to do anything about that considering that the kernel and most its
> allocations have to live in a small and scarce lowmem memory. Moreover
> the more memory you have the more you have to allocate from that
> memory.

You're for sure right that the IO-slow bug was definitely worse the more
ram was in a system! (The mem=4G really helps alleviate this bug and is
good enough for me.)

> This is why not only Linus hates 32b systems on a large memory
> systems.
Completely off-topic: it would be great if rather than pretending PAE should work with large RAM (which seems more broken every day), the kernel guys put out an officially stated policy of a maximum RAM you can use, and try to have the kernel behave for <= that size, and then people could use more RAM but clearly "at your own risk, don't bug us about problems!". Other than a few posts about Linus hating it, there's nothing official I can find about it in documentation, etc. It gives the (mis)impression that it's perfectly fine to run PAE on a zillion GB modern system. Then we later learn the hard way :-) [-- Attachment #2: oom3 --] [-- Type: application/octet-stream, Size: 22921 bytes --] Jan 19 03:02:19 firewallfsi kernel: [85602.797346] smbd invoked oom-killer: gfp_mask=0x26000d0(GFP_TEMPORARY|__GFP_NOTRACK), nodemask=0, order=0, oom_score_adj=0 Jan 19 03:02:19 firewallfsi kernel: [85602.798595] smbd cpuset=/ mems_allowed=0 Jan 19 03:02:19 firewallfsi kernel: [85602.799868] CPU: 0 PID: 5892 Comm: smbd Not tainted 4.9.0+ #1 Jan 19 03:02:19 firewallfsi kernel: [85602.801084] Hardware name: Intel Corporation S1200BTL/S1200BTL, BIOS S1200BT.86B.02.00.0035.030220120927 03/02/2012 Jan 19 03:02:19 firewallfsi kernel: [85602.802306] df18da9c d67603e7 df18dbd4 ee3e2400 df18dacc d65e16a6 df18daac d6b63aad Jan 19 03:02:19 firewallfsi kernel: [85602.803529] df18dacc d676615f df18dad0 f6a6d600 e1172400 ee3e2400 d6d68bde df18dbd4 Jan 19 03:02:19 firewallfsi kernel: [85602.804736] df18db10 d657aff7 d6476d8a df18dafc d657ac6b 00000006 00000000 0000000b Jan 19 03:02:19 firewallfsi kernel: [85602.805933] Call Trace: Jan 19 03:02:19 firewallfsi kernel: [85602.807087] [<d67603e7>] dump_stack+0x58/0x81 Jan 19 03:02:19 firewallfsi kernel: [85602.808227] [<d65e16a6>] dump_header+0x64/0x1a6 Jan 19 03:02:19 firewallfsi kernel: [85602.809347] [<d6b63aad>] ? _raw_spin_unlock_irqrestore+0xd/0x10 Jan 19 03:02:19 firewallfsi kernel: [85602.810454] [<d676615f>] ? 
___ratelimit+0x9f/0x100 Jan 19 03:02:19 firewallfsi kernel: [85602.811544] [<d657aff7>] oom_kill_process+0x207/0x3d0 Jan 19 03:02:19 firewallfsi kernel: [85602.812617] [<d6476d8a>] ? has_capability_noaudit+0x1a/0x30 Jan 19 03:02:19 firewallfsi kernel: [85602.813674] [<d657ac6b>] ? oom_badness.part.13+0xcb/0x140 Jan 19 03:02:19 firewallfsi kernel: [85602.814715] [<d657b4d8>] out_of_memory+0xf8/0x2a0 Jan 19 03:02:19 firewallfsi kernel: [85602.815737] [<d65800ca>] __alloc_pages_nodemask+0xcfa/0xd10 Jan 19 03:02:19 firewallfsi kernel: [85602.816746] [<d65cc928>] new_slab+0x3c8/0x4b0 Jan 19 03:02:19 firewallfsi kernel: [85602.817736] [<d65cddca>] ___slab_alloc.constprop.75+0x42a/0x670 Jan 19 03:02:19 firewallfsi kernel: [85602.818715] [<d65fd583>] ? __d_alloc+0x23/0x190 Jan 19 03:02:19 firewallfsi kernel: [85602.819675] [<d6668ce4>] ? __ext4_get_inode_loc+0x104/0x440 Jan 19 03:02:19 firewallfsi kernel: [85602.820620] [<d65ce03f>] __slab_alloc.constprop.74+0x2f/0x50 Jan 19 03:02:19 firewallfsi kernel: [85602.821550] [<d65cf0ca>] kmem_cache_alloc+0x17a/0x1c0 Jan 19 03:02:19 firewallfsi kernel: [85602.822462] [<d65fd583>] ? __d_alloc+0x23/0x190 Jan 19 03:02:19 firewallfsi kernel: [85602.823355] [<d65fd583>] __d_alloc+0x23/0x190 Jan 19 03:02:19 firewallfsi kernel: [85602.824228] [<d65fd704>] d_alloc+0x14/0x50 Jan 19 03:02:19 firewallfsi kernel: [85602.825082] [<d65fdbd7>] d_alloc_parallel+0x47/0x450 Jan 19 03:02:19 firewallfsi kernel: [85602.825919] [<d65fcecd>] ? d_splice_alias+0x1fd/0x370 Jan 19 03:02:19 firewallfsi kernel: [85602.826739] [<d6677971>] ? ext4_lookup+0x161/0x240 Jan 19 03:02:19 firewallfsi kernel: [85602.827540] [<d676e80b>] ? lockref_get_not_dead+0xb/0x30 Jan 19 03:02:19 firewallfsi kernel: [85602.828325] [<d65ef289>] ? 
unlazy_walk+0xf9/0x1a0 Jan 19 03:02:19 firewallfsi kernel: [85602.829093] [<d65f00ae>] lookup_slow+0x5e/0x130 Jan 19 03:02:19 firewallfsi kernel: [85602.829842] [<d65f0ca4>] walk_component+0x1e4/0x300 Jan 19 03:02:19 firewallfsi kernel: [85602.830574] [<d65efc8d>] ? path_init+0x14d/0x330 Jan 19 03:02:19 firewallfsi kernel: [85602.831288] [<d65f1e83>] path_lookupat+0x53/0xe0 Jan 19 03:02:19 firewallfsi kernel: [85602.831984] [<d65f4027>] filename_lookup+0x97/0x190 Jan 19 03:02:19 firewallfsi kernel: [85602.832661] [<d65cf047>] ? kmem_cache_alloc+0xf7/0x1c0 Jan 19 03:02:19 firewallfsi kernel: [85602.833322] [<d65f3c6a>] ? getname_flags+0x3a/0x1a0 Jan 19 03:02:19 firewallfsi kernel: [85602.833966] [<d65f3c81>] ? getname_flags+0x51/0x1a0 Jan 19 03:02:19 firewallfsi kernel: [85602.834587] [<d65f41f6>] user_path_at_empty+0x36/0x40 Jan 19 03:02:19 firewallfsi kernel: [85602.835190] [<d65e9910>] vfs_fstatat+0x60/0xb0 Jan 19 03:02:19 firewallfsi kernel: [85602.835774] [<d65ea3aa>] SyS_fstatat64+0x2a/0x50 Jan 19 03:02:19 firewallfsi kernel: [85602.836341] [<d642b168>] ? sched_clock+0x8/0x10 Jan 19 03:02:19 firewallfsi kernel: [85602.836888] [<d649a075>] ? sched_clock_cpu+0x125/0x140 Jan 19 03:02:19 firewallfsi kernel: [85602.837415] [<d64de305>] ? 
hrtimer_interrupt+0xa5/0x180 Jan 19 03:02:19 firewallfsi kernel: [85602.837926] [<d640377a>] do_fast_syscall_32+0x8a/0x150 Jan 19 03:02:19 firewallfsi kernel: [85602.838420] [<d6b63fca>] sysenter_past_esp+0x47/0x75 Jan 19 03:02:19 firewallfsi kernel: [85602.838902] Mem-Info: Jan 19 03:02:19 firewallfsi kernel: [85602.840191] active_anon:155033 inactive_anon:32960 isolated_anon:0 Jan 19 03:02:19 firewallfsi kernel: [85602.840191] active_file:171215 inactive_file:640942 isolated_file:0 Jan 19 03:02:19 firewallfsi kernel: [85602.840191] unevictable:0 dirty:1277 writeback:0 unstable:0 Jan 19 03:02:19 firewallfsi kernel: [85602.840191] slab_reclaimable:132504 slab_unreclaimable:11682 Jan 19 03:02:19 firewallfsi kernel: [85602.840191] mapped:22669 shmem:1221 pagetables:1537 bounce:0 Jan 19 03:02:19 firewallfsi kernel: [85602.840191] free:57986 free_pcp:1120 free_cma:0 Jan 19 03:02:19 firewallfsi kernel: [85602.848275] Node 0 active_anon:620132kB inactive_anon:131840kB active_file:684860kB inactive_file:2563768kB unevictable:0kB isolated(anon):0kB isolated(file):0kB mapped:90712kB dirty:5128kB writeback:0kB shmem:4884kB writeback_tmp:0kB unstable:0kB pages_scanned:0 all_unreclaimable? 
no Jan 19 03:02:19 firewallfsi kernel: [85602.852375] DMA free:3176kB min:68kB low:84kB high:100kB active_anon:0kB inactive_anon:0kB active_file:4996kB inactive_file:0kB unevictable:0kB writepending:8kB present:15976kB managed:15900kB mlocked:0kB slab_reclaimable:7724kB slab_unreclaimable:4kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB Jan 19 03:02:19 firewallfsi kernel: lowmem_reserve[]: 0 777 4734 4734 Jan 19 03:02:19 firewallfsi kernel: [85602.858232] Normal free:3436kB min:3532kB low:4412kB high:5292kB active_anon:4kB inactive_anon:8kB active_file:193340kB inactive_file:120kB unevictable:0kB writepending:2516kB present:892920kB managed:816932kB mlocked:0kB slab_reclaimable:522292kB slab_unreclaimable:46724kB kernel_stack:2560kB pagetables:0kB bounce:0kB free_pcp:3468kB local_pcp:176kB free_cma:0kB Jan 19 03:02:19 firewallfsi kernel: lowmem_reserve[]: 0 0 31652 31652 Jan 19 03:02:19 firewallfsi kernel: [85602.864610] HighMem free:225332kB min:512kB low:5004kB high:9496kB active_anon:620128kB inactive_anon:131832kB active_file:486476kB inactive_file:2563648kB unevictable:0kB writepending:2616kB present:4051548kB managed:4051548kB mlocked:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:6148kB bounce:0kB free_pcp:1012kB local_pcp:180kB free_cma:0kB Jan 19 03:02:19 firewallfsi kernel: lowmem_reserve[]: 0 0 0 0 Jan 19 03:02:19 firewallfsi kernel: [85602.871108] DMA: 14*4kB (UME) 4*8kB (UE) 1*16kB (M) 2*32kB (UE) 1*64kB (U) 1*128kB (U) 1*256kB (U) 3*512kB (UME) 1*1024kB (U) 0*2048kB 0*4096kB = 3176kB Jan 19 03:02:19 firewallfsi kernel: Normal: 165*4kB (UME) 61*8kB (UME) 19*16kB (UM) 18*32kB (UME) 2*64kB (U) 4*128kB (UE) 3*256kB (M) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 3436kB Jan 19 03:02:19 firewallfsi kernel: HighMem: 243*4kB (M) 167*8kB (UM) 55*16kB (UM) 28*32kB (UM) 15*64kB (UM) 13*128kB (UM) 6*256kB (UM) 0*512kB 2*1024kB (M) 39*2048kB (M) 33*4096kB (M) = 225332kB Jan 19 03:02:19 firewallfsi kernel: 
[85602.877129] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB Jan 19 03:02:19 firewallfsi kernel: [85602.878150] 813455 total pagecache pages Jan 19 03:02:19 firewallfsi kernel: [85602.879168] 34 pages in swap cache Jan 19 03:02:19 firewallfsi kernel: [85602.880181] Swap cache stats: add 71, delete 37, find 191/200 Jan 19 03:02:19 firewallfsi kernel: [85602.881205] Free swap = 33784424kB Jan 19 03:02:19 firewallfsi kernel: [85602.882242] Total swap = 33784572kB Jan 19 03:02:19 firewallfsi kernel: [85602.883275] 1240111 pages RAM Jan 19 03:02:19 firewallfsi kernel: [85602.884302] 1012887 pages HighMem/MovableOnly Jan 19 03:02:19 firewallfsi kernel: [85602.885317] 19016 pages reserved Jan 19 03:02:19 firewallfsi kernel: [85602.886310] 0 pages hwpoisoned Jan 19 03:02:19 firewallfsi kernel: [85602.887299] [ pid ] uid tgid total_vm rss nr_ptes nr_pmds swapents oom_score_adj name Jan 19 03:02:19 firewallfsi kernel: [85602.888342] [ 596] 0 596 3278 1039 9 3 0 0 systemd-journal Jan 19 03:02:19 firewallfsi kernel: [85602.889382] [ 632] 0 632 3592 1182 9 3 0 -1000 systemd-udevd Jan 19 03:02:19 firewallfsi kernel: [85602.890420] [ 742] 81 742 1700 1069 7 3 0 -900 dbus-daemon Jan 19 03:02:19 firewallfsi kernel: [85602.891461] [ 743] 0 743 1704 663 6 3 0 0 sh Jan 19 03:02:19 firewallfsi kernel: [85602.892502] [ 745] 0 745 1704 646 6 3 0 0 sh Jan 19 03:02:19 firewallfsi kernel: [85602.893540] [ 747] 0 747 1033 641 6 3 0 0 irqbalance Jan 19 03:02:19 firewallfsi kernel: [85602.894574] [ 751] 0 751 808 487 5 3 0 0 mdadm Jan 19 03:02:19 firewallfsi kernel: [85602.895606] [ 752] 288 752 19290 1785 20 3 0 0 milter-greylist Jan 19 03:02:19 firewallfsi kernel: [85602.896595] [ 753] 0 753 988 716 5 3 0 0 systemd-logind Jan 19 03:02:19 firewallfsi kernel: [85602.897577] [ 755] 0 755 1472 953 6 3 0 0 smartd Jan 19 03:02:19 firewallfsi kernel: [85602.898550] [ 757] 0 757 1704 676 7 3 0 0 sh Jan 19 03:02:19 firewallfsi kernel: [85602.899514] [ 764] 0 764 
8633 1110 12 3 0 0 rsyslogd Jan 19 03:02:19 firewallfsi kernel: [85602.900474] [ 765] 0 765 1704 673 7 3 0 0 sh Jan 19 03:02:19 firewallfsi kernel: [85602.901429] [ 768] 0 768 1704 629 6 3 0 0 sh Jan 19 03:02:19 firewallfsi kernel: [85602.902375] [ 770] 0 770 1704 681 7 3 0 0 sh Jan 19 03:02:19 firewallfsi kernel: [85602.903312] [ 771] 0 771 1704 644 6 3 0 0 sh Jan 19 03:02:19 firewallfsi kernel: [85602.904225] [ 774] 0 774 1704 680 6 3 0 0 sh Jan 19 03:02:19 firewallfsi kernel: [85602.905108] [ 799] 0 799 583 436 5 3 0 0 acpid Jan 19 03:02:19 firewallfsi kernel: [85602.905967] [ 803] 0 803 7287 1234 11 3 0 0 apcupsd Jan 19 03:02:19 firewallfsi kernel: [85602.906804] [ 810] 0 810 860 487 5 3 0 0 atd Jan 19 03:02:19 firewallfsi kernel: [85602.907616] [ 868] 0 868 1819 814 8 3 0 0 watch-services Jan 19 03:02:19 firewallfsi kernel: [85602.908388] [ 869] 0 869 3165 2070 9 3 0 0 mailwarnings Jan 19 03:02:19 firewallfsi kernel: [85602.909143] [ 870] 0 870 3238 2164 9 3 0 0 dynamic-ip-upda Jan 19 03:02:19 firewallfsi kernel: [85602.909879] [ 871] 0 871 1736 719 6 3 0 0 tickle-pog Jan 19 03:02:19 firewallfsi kernel: [85602.910589] [ 872] 0 872 3264 2201 10 3 0 0 restarter Jan 19 03:02:19 firewallfsi kernel: [85602.911282] [ 875] 0 875 3695 1986 11 3 0 0 fetchmail Jan 19 03:02:19 firewallfsi kernel: [85602.911955] [ 876] 0 876 2707 1622 9 3 0 0 udp-sgr Jan 19 03:02:19 firewallfsi kernel: [85602.912607] [ 877] 0 877 2499 414 9 3 0 0 saslauthd Jan 19 03:02:19 firewallfsi kernel: [85602.913175] [ 878] 0 878 2518 900 9 3 0 0 saslauthd Jan 19 03:02:19 firewallfsi kernel: [85602.913709] [ 879] 0 879 2499 126 9 3 0 0 saslauthd Jan 19 03:02:19 firewallfsi kernel: [85602.914216] [ 880] 0 880 2499 126 9 3 0 0 saslauthd Jan 19 03:02:19 firewallfsi kernel: [85602.914706] [ 881] 0 881 2499 126 9 3 0 0 saslauthd Jan 19 03:02:19 firewallfsi kernel: [85602.915166] [ 882] 0 882 3310 2223 10 3 0 0 watch-ip Jan 19 03:02:19 firewallfsi kernel: [85602.915609] [ 938] 0 938 2769 621 9 3 0 -1000 
sshd Jan 19 03:02:19 firewallfsi kernel: [85602.916028] [ 1032] 27 1032 1707 733 7 3 0 0 mysqld_safe Jan 19 03:02:19 firewallfsi kernel: [85602.916432] [ 1130] 25 1130 82172 47875 114 3 0 0 named Jan 19 03:02:19 firewallfsi kernel: [85602.916822] [ 1208] 27 1208 126039 14693 65 3 0 0 mysqld Jan 19 03:02:19 firewallfsi kernel: [85602.917197] [ 1245] 0 1245 7049 2401 16 3 0 0 nmbd Jan 19 03:02:19 firewallfsi kernel: [85602.917559] [ 1246] 0 1246 6840 2267 16 3 0 0 nmbd Jan 19 03:02:19 firewallfsi kernel: [85602.917923] [ 1308] 0 1308 12052 6449 26 3 0 0 /usr/sbin/httpd Jan 19 03:02:19 firewallfsi kernel: [85602.918282] [ 1335] 0 1335 1116 497 6 3 0 0 agetty Jan 19 03:02:19 firewallfsi kernel: [85602.918640] [ 1336] 0 1336 1116 479 6 3 0 0 agetty Jan 19 03:02:19 firewallfsi kernel: [85602.918992] [ 1337] 0 1337 1116 521 6 3 0 0 agetty Jan 19 03:02:19 firewallfsi kernel: [85602.919340] [ 1338] 0 1338 1116 505 6 3 0 0 agetty Jan 19 03:02:19 firewallfsi kernel: [85602.919688] [ 1339] 0 1339 1116 462 6 3 0 0 agetty Jan 19 03:02:19 firewallfsi kernel: [85602.920027] [ 1340] 0 1340 1116 473 6 3 0 0 agetty Jan 19 03:02:19 firewallfsi kernel: [85602.920363] [ 1596] 48 1596 49256 4353 44 3 0 0 /usr/sbin/httpd Jan 19 03:02:19 firewallfsi kernel: [85602.920697] [ 1597] 48 1597 16386 4635 29 3 0 0 /usr/sbin/httpd Jan 19 03:02:19 firewallfsi kernel: [85602.921014] [ 1600] 48 1600 16295 5404 29 3 0 0 /usr/sbin/httpd Jan 19 03:02:19 firewallfsi kernel: [85602.921328] [ 1619] 48 1619 16298 5545 29 3 0 0 /usr/sbin/httpd Jan 19 03:02:19 firewallfsi kernel: [85602.921640] [ 1622] 48 1622 16220 4205 27 3 0 0 /usr/sbin/httpd Jan 19 03:02:19 firewallfsi kernel: [85602.921944] [ 1778] 0 1778 1704 682 6 3 0 0 sh Jan 19 03:02:19 firewallfsi kernel: [85602.922253] [ 1779] 0 1779 2743 1705 9 3 0 0 udp-sgs Jan 19 03:02:19 firewallfsi kernel: [85602.922562] [ 1781] 0 1781 9561 3567 23 3 0 0 smbd Jan 19 03:02:19 firewallfsi kernel: [85602.922875] [ 1782] 0 1782 9197 1092 21 3 0 0 smbd Jan 19 
03:02:19 firewallfsi kernel: [85602.923198] [ 1783] 0 1783 9446 1156 21 3 0 0 smbd Jan 19 03:02:19 firewallfsi kernel: [85602.923498] [ 1870] 0 1870 5116 2471 14 3 0 0 dhclient Jan 19 03:02:19 firewallfsi kernel: [85602.923793] [ 1948] 0 1948 594 400 5 3 0 0 pptpd Jan 19 03:02:19 firewallfsi kernel: [85602.924104] [ 1950] 0 1950 954 622 6 3 0 0 dovecot Jan 19 03:02:19 firewallfsi kernel: [85602.924395] [ 1951] 97 1951 904 582 5 3 0 0 anvil Jan 19 03:02:19 firewallfsi kernel: [85602.924689] [ 1952] 0 1952 937 613 6 3 0 0 log Jan 19 03:02:19 firewallfsi kernel: [85602.925000] [ 1954] 0 1954 1133 784 6 3 0 0 config Jan 19 03:02:19 firewallfsi kernel: [85602.925294] [ 1956] 0 1956 4181 1973 12 3 0 0 sshd Jan 19 03:02:19 firewallfsi kernel: [85602.925590] [ 1965] 0 1965 1584 1147 6 3 0 0 systemd Jan 19 03:02:19 firewallfsi kernel: [85602.925904] [ 1971] 0 1971 2455 381 8 3 0 0 (sd-pam) Jan 19 03:02:19 firewallfsi kernel: [85602.926216] [ 2001] 0 2001 4214 1046 12 3 0 0 sshd Jan 19 03:02:19 firewallfsi kernel: [85602.926517] [ 2011] 0 2011 1187 957 5 3 0 0 tcsh Jan 19 03:02:19 firewallfsi kernel: [85602.926821] [ 2035] 48 2035 16220 4208 28 3 0 0 /usr/sbin/httpd Jan 19 03:02:19 firewallfsi kernel: [85602.927144] [ 2040] 48 2040 16298 5544 29 3 0 0 /usr/sbin/httpd Jan 19 03:02:19 firewallfsi kernel: [85602.927445] [ 2042] 48 2042 16216 4586 28 3 0 0 /usr/sbin/httpd Jan 19 03:02:19 firewallfsi kernel: [85602.927746] [ 2055] 177 2055 6833 4786 16 3 0 0 dhcpd Jan 19 03:02:19 firewallfsi kernel: [85602.928062] [ 2062] 0 2062 1899 698 7 3 29 0 crond Jan 19 03:02:19 firewallfsi kernel: [85602.928358] [ 2070] 38 2070 1536 1082 6 3 0 0 ntpd Jan 19 03:02:19 firewallfsi kernel: [85602.928659] [ 2297] 273 2297 2208 1390 8 3 0 0 imap-login Jan 19 03:02:19 firewallfsi kernel: [85602.928977] [ 2299] 301 2299 1762 1248 7 3 0 0 imap Jan 19 03:02:19 firewallfsi kernel: [85602.929276] [ 5376] 0 5376 1307 927 6 3 0 0 reboot-when-oom Jan 19 03:02:19 firewallfsi kernel: [85602.929583] [10505] 
276 10505 115120 103978 222 3 0 0 clamd Jan 19 03:02:19 firewallfsi kernel: [85602.929909] [10519] 290 10519 15704 1191 14 3 0 0 clamav-milter Jan 19 03:02:19 firewallfsi kernel: [85602.930239] [10539] 0 10539 3829 1778 10 3 0 0 sendmail Jan 19 03:02:19 firewallfsi kernel: [85602.930565] [10552] 51 10552 3500 765 11 3 0 0 sendmail Jan 19 03:02:19 firewallfsi kernel: [85602.930902] [10658] 23 10658 5440 724 14 3 0 0 squid Jan 19 03:02:19 firewallfsi kernel: [85602.931234] [10660] 23 10660 9797 6723 22 3 0 0 squid Jan 19 03:02:19 firewallfsi kernel: [85602.931551] [10661] 23 10661 1179 419 6 3 0 0 unlinkd Jan 19 03:02:19 firewallfsi kernel: [85602.931879] [13921] 273 13921 2208 1470 8 3 0 0 imap-login Jan 19 03:02:19 firewallfsi kernel: [85602.932206] [13924] 301 13924 1400 973 6 3 0 0 imap Jan 19 03:02:19 firewallfsi kernel: [85602.932519] [13925] 273 13925 2209 1364 9 3 0 0 imap-login Jan 19 03:02:19 firewallfsi kernel: [85602.932840] [13926] 301 13926 2559 1273 7 3 0 0 imap Jan 19 03:02:19 firewallfsi kernel: [85602.933173] [15323] 48 15323 16220 4208 27 3 0 0 /usr/sbin/httpd Jan 19 03:02:19 firewallfsi kernel: [85602.933491] [14826] 273 14826 2208 1448 7 3 0 0 imap-login Jan 19 03:02:19 firewallfsi kernel: [85602.933836] [14828] 301 14828 3995 1513 8 3 0 0 imap Jan 19 03:02:19 firewallfsi kernel: [85602.934181] [18753] 48 18753 16220 4338 28 3 0 0 /usr/sbin/httpd Jan 19 03:02:19 firewallfsi kernel: [85602.934516] [ 5254] 0 5254 9685 3801 23 3 0 0 smbd Jan 19 03:02:19 firewallfsi kernel: [85602.934879] [ 5892] 300 5892 11287 5069 26 3 0 0 smbd Jan 19 03:02:19 firewallfsi kernel: [85602.935241] [31892] 273 31892 2209 1455 8 3 0 0 imap-login Jan 19 03:02:19 firewallfsi kernel: [85602.935583] [31896] 302 31896 1415 1067 7 3 0 0 imap Jan 19 03:02:19 firewallfsi kernel: [85602.935941] [28129] 0 28129 1396 138 6 3 0 0 sleep Jan 19 03:02:19 firewallfsi kernel: [85602.936298] [30312] 273 30312 2209 1436 8 3 0 0 imap-login Jan 19 03:02:19 firewallfsi kernel: [85602.936656] 
[30315] 301 30315 1432 989 6 3 0 0 imap Jan 19 03:02:19 firewallfsi kernel: [85602.937030] [32695] 0 32695 1900 494 7 3 17 0 crond Jan 19 03:02:19 firewallfsi kernel: [85602.937383] [32696] 0 32696 3184 2117 10 3 0 0 freshclam-tec-w Jan 19 03:02:19 firewallfsi kernel: [85602.937754] [32726] 0 32726 1654 435 7 3 0 0 anacron Jan 19 03:02:19 firewallfsi kernel: [85602.938122] [ 548] 0 548 1396 139 6 3 0 0 sleep Jan 19 03:02:19 firewallfsi kernel: [85602.938481] Out of memory: Kill process 10505 (clamd) score 10 or sacrifice child Jan 19 03:02:19 firewallfsi kernel: [85602.938867] Killed process 10505 (clamd) total-vm:460480kB, anon-rss:399264kB, file-rss:16648kB, shmem-rss:0kB Jan 19 03:02:19 firewallfsi kernel: [85602.970282] audit: type=1131 audit(1484816539.419:2108): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=clamd@scan comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=failed' ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: mm, vmscan: commit makes PAE kernel crash nightly (bisected) 2017-01-20 6:35 ` Trevor Cordes @ 2017-01-20 11:02 ` Mel Gorman 2017-01-20 15:55 ` Mel Gorman 2017-01-24 12:51 ` Michal Hocko 1 sibling, 1 reply; 40+ messages in thread From: Mel Gorman @ 2017-01-20 11:02 UTC (permalink / raw) To: Trevor Cordes Cc: Michal Hocko, linux-kernel, Joonsoo Kim, Minchan Kim, Rik van Riel, Srikar Dronamraju On Fri, Jan 20, 2017 at 12:35:44AM -0600, Trevor Cordes wrote: > > > Hi! The git tree above version oom'd after < 24 hours (3:02am) so > > > it doesn't solve the bug. If you need an oom messages dump let me > > > know. > > > > Yes please. > > The first oom from that night is attached. Note, the oom wasn't as dire > with your mhocko/4.9.0+ as it usually is with stock 4.8.x: my oom > detector and reboot script were able to do their thing cleanly before the > system became unusable. > > I'll await further instructions and test right away. Maybe I'll try a > few tuning ideas until then. Thanks! > Thanks for the OOM report. I was expecting it to be a particular shape and my expectations were not matched, so it took time to consider it further. Can you try the cumulative patch below? It combines three patches that 1. Allow slab shrinking even if the LRU pages are unreclaimable in direct reclaim 2. Shrink slabs once, based on the contents of all memcgs, instead of shrinking one memcg at a time 3. Try to shrink slabs if the lowmem usage is too high Unfortunately it's only boot tested on x86-64 as I didn't get the chance to set up an i386 test bed. > > This is why not only Linus hates 32b systems on large memory > > systems.
> > Completely off-topic: it would be great if rather than pretending PAE > should work with large RAM (which seems more broken every day), the > kernel guys put out an officially stated policy of a maximum RAM you > can use, and try to have the kernel behave for <= that size, and then > people could use more RAM but clearly "at your own risk, don't bug us > about problems!". Other than a few posts about Linus hating it, > there's nothing official I can find about it in documentation, etc. It > gives the (mis)impression that it's perfectly fine to run PAE on a > zillion GB modern system. Then we later learn the hard way :-) The unfortunate reality is that the behaviour is workload dependent so it's impossible to make a general statement other than "your mileage may vary considerably". diff --git a/mm/vmscan.c b/mm/vmscan.c index 2281ad310d06..76d68a8872c7 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -2318,6 +2318,52 @@ static void get_scan_count(struct lruvec *lruvec, struct mem_cgroup *memcg, } } +#ifdef CONFIG_HIGHMEM +static void balance_slab_lowmem(struct pglist_data *pgdat, + struct scan_control *sc) +{ + unsigned long lru_pages = 0; + unsigned long slab_pages = 0; + int zid; + + for (zid = 0; zid < MAX_NR_ZONES; zid++) { + struct zone *zone = &pgdat->node_zones[zid]; + + if (!populated_zone(zone) || !is_highmem_idx(zid)) + continue; + + lru_pages += zone_page_state(zone, NR_ZONE_INACTIVE_FILE); + lru_pages += zone_page_state(zone, NR_ZONE_ACTIVE_FILE); + slab_pages += zone_page_state(zone, NR_SLAB_RECLAIMABLE); + slab_pages += zone_page_state(zone, NR_SLAB_UNRECLAIMABLE); + } + + /* + * Shrink reclaimable slabs if the number of lowmem slab pages is + * over twice the size of LRU pages. Apply pressure relative to + * the imbalance between LRU and slab pages.
+ */ + if (slab_pages > lru_pages << 1) { + struct reclaim_state *reclaim_state = current->reclaim_state; + unsigned long exceed = (lru_pages << 1) - slab_pages; + int nid = pgdat->node_id; + + exceed = min(exceed, slab_pages); + shrink_slab(sc->gfp_mask, nid, NULL, exceed, slab_pages); + if (reclaim_state) { + sc->nr_reclaimed += reclaim_state->reclaimed_slab; + reclaim_state->reclaimed_slab = 0; + } + } +} +#else +static void balance_slab_lowmem(struct pglist_data *pgdat, + struct scan_control *sc) +{ + return false; +} +#endif + /* * This is a basic per-node page freer. Used by both kswapd and direct reclaim. */ @@ -2336,6 +2382,27 @@ static void shrink_node_memcg(struct pglist_data *pgdat, struct mem_cgroup *memc get_scan_count(lruvec, memcg, sc, nr, lru_pages); + /* + * If direct reclaiming at elevated priority and the node is + * unreclaimable then skip LRU reclaim and let kswapd poll it. + */ + if (!current_is_kswapd() && + sc->priority != DEF_PRIORITY && + !pgdat_reclaimable(pgdat)) { + unsigned long nr_scanned; + + /* + * Fake scanning so that slab shrinking will continue. For + * lowmem restricted allocations, shrink aggressively. 
+ */ + nr_scanned = SWAP_CLUSTER_MAX << (DEF_PRIORITY - sc->priority); + if (!(sc->gfp_mask & __GFP_HIGHMEM)) + nr_scanned = max(nr_scanned, *lru_pages); + sc->nr_scanned += nr_scanned; + + return; + } + /* Record the original scan target for proportional adjustments later */ memcpy(targets, nr, sizeof(nr)); @@ -2369,6 +2436,7 @@ static void shrink_node_memcg(struct pglist_data *pgdat, struct mem_cgroup *memc } } + balance_slab_lowmem(pgdat, sc); cond_resched(); if (nr_reclaimed < nr_to_reclaim || scan_adjusted) @@ -2533,7 +2601,8 @@ static bool shrink_node(pg_data_t *pgdat, struct scan_control *sc) .pgdat = pgdat, .priority = sc->priority, }; - unsigned long node_lru_pages = 0; + unsigned long slab_pressure = 0; + unsigned long slab_eligible = 0; struct mem_cgroup *memcg; nr_reclaimed = sc->nr_reclaimed; @@ -2555,12 +2624,8 @@ static bool shrink_node(pg_data_t *pgdat, struct scan_control *sc) scanned = sc->nr_scanned; shrink_node_memcg(pgdat, memcg, sc, &lru_pages); - node_lru_pages += lru_pages; - - if (memcg) - shrink_slab(sc->gfp_mask, pgdat->node_id, - memcg, sc->nr_scanned - scanned, - lru_pages); + slab_eligible += lru_pages; + slab_pressure += sc->nr_reclaimed - reclaimed; /* Record the group's reclaim efficiency */ vmpressure(sc->gfp_mask, memcg, false, @@ -2586,12 +2651,12 @@ static bool shrink_node(pg_data_t *pgdat, struct scan_control *sc) /* * Shrink the slab caches in the same proportion that - * the eligible LRU pages were scanned. + * the eligible LRU pages were scanned. For memcg, this + * will apply the cumulative scanning pressure over all + * memcgs. 
*/ - if (global_reclaim(sc)) - shrink_slab(sc->gfp_mask, pgdat->node_id, NULL, - sc->nr_scanned - nr_scanned, - node_lru_pages); + shrink_slab(sc->gfp_mask, pgdat->node_id, NULL, slab_pressure, + slab_eligible); if (reclaim_state) { sc->nr_reclaimed += reclaim_state->reclaimed_slab; @@ -2683,10 +2748,6 @@ static void shrink_zones(struct zonelist *zonelist, struct scan_control *sc) GFP_KERNEL | __GFP_HARDWALL)) continue; - if (sc->priority != DEF_PRIORITY && - !pgdat_reclaimable(zone->zone_pgdat)) - continue; /* Let kswapd poll it */ - /* * If we already have plenty of memory free for * compaction in this zone, don't free any more. ^ permalink raw reply related [flat|nested] 40+ messages in thread
* Re: mm, vmscan: commit makes PAE kernel crash nightly (bisected) 2017-01-20 11:02 ` Mel Gorman @ 2017-01-20 15:55 ` Mel Gorman 2017-01-23 0:45 ` Trevor Cordes 0 siblings, 1 reply; 40+ messages in thread From: Mel Gorman @ 2017-01-20 15:55 UTC (permalink / raw) To: Trevor Cordes Cc: Michal Hocko, linux-kernel, Joonsoo Kim, Minchan Kim, Rik van Riel, Srikar Dronamraju On Fri, Jan 20, 2017 at 11:02:32AM +0000, Mel Gorman wrote: > On Fri, Jan 20, 2017 at 12:35:44AM -0600, Trevor Cordes wrote: > > > > Hi! The git tree above version oom'd after < 24 hours (3:02am) so > > > > it doesn't solve the bug. If you need an oom messages dump let me > > > > know. > > > > > > Yes please. > > > > The first oom from that night is attached. Note, the oom wasn't as dire > > with your mhocko/4.9.0+ as it usually is with stock 4.8.x: my oom > > detector and reboot script were able to do their thing cleanly before the > > system became unusable. > > > > I'll await further instructions and test right away. Maybe I'll try a > > few tuning ideas until then. Thanks! > > > > Thanks for the OOM report. I was expecting it to be a particular shape and > my expectations were not matched, so it took time to consider it further. Can > you try the cumulative patch below? It combines three patches that > > 1. Allow slab shrinking even if the LRU pages are unreclaimable in > direct reclaim > 2. Shrink slabs once, based on the contents of all memcgs, instead > of shrinking one memcg at a time > 3. Try to shrink slabs if the lowmem usage is too high > > Unfortunately it's only boot tested on x86-64 as I didn't get the chance > to set up an i386 test bed. > There was one major flaw in that patch. This version fixes it and addresses other minor issues. It may still be too aggressive at shrinking slab but worth trying out. Thanks.
diff --git a/mm/vmscan.c b/mm/vmscan.c index 2281ad310d06..2c735ea24a85 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -2318,6 +2318,59 @@ static void get_scan_count(struct lruvec *lruvec, struct mem_cgroup *memcg, } } +#ifdef CONFIG_HIGHMEM +static void balance_slab_lowmem(struct pglist_data *pgdat, + struct scan_control *sc) +{ + unsigned long lru_pages = 0; + unsigned long slab_pages = 0; + unsigned long managed_pages = 0; + int zid; + + for (zid = 0; zid < MAX_NR_ZONES; zid++) { + struct zone *zone = &pgdat->node_zones[zid]; + + if (!populated_zone(zone) || is_highmem_idx(zid)) + continue; + + lru_pages += zone_page_state(zone, NR_ZONE_INACTIVE_FILE); + lru_pages += zone_page_state(zone, NR_ZONE_ACTIVE_FILE); + lru_pages += zone_page_state(zone, NR_ZONE_INACTIVE_ANON); + lru_pages += zone_page_state(zone, NR_ZONE_ACTIVE_ANON); + slab_pages += zone_page_state(zone, NR_SLAB_RECLAIMABLE); + slab_pages += zone_page_state(zone, NR_SLAB_UNRECLAIMABLE); + } + + /* Do not balance until LRU and slab exceeds 50% of lowmem */ + if (lru_pages + slab_pages < (managed_pages >> 1)) + return; + + /* + * Shrink reclaimable slabs if the number of lowmem slab pages is + * over twice the size of LRU pages. Apply pressure relative to + * the imbalance between LRU and slab pages. + */ + if (slab_pages > lru_pages << 1) { + struct reclaim_state *reclaim_state = current->reclaim_state; + unsigned long exceed = slab_pages - (lru_pages << 1); + int nid = pgdat->node_id; + + exceed = min(exceed, slab_pages); + shrink_slab(sc->gfp_mask, nid, NULL, exceed >> 3, slab_pages); + if (reclaim_state) { + sc->nr_reclaimed += reclaim_state->reclaimed_slab; + reclaim_state->reclaimed_slab = 0; + } + } +} +#else +static void balance_slab_lowmem(struct pglist_data *pgdat, + struct scan_control *sc) +{ + return; +} +#endif + /* * This is a basic per-node page freer. Used by both kswapd and direct reclaim. 
*/ @@ -2336,6 +2389,27 @@ static void shrink_node_memcg(struct pglist_data *pgdat, struct mem_cgroup *memc get_scan_count(lruvec, memcg, sc, nr, lru_pages); + /* + * If direct reclaiming at elevated priority and the node is + * unreclaimable then skip LRU reclaim and let kswapd poll it. + */ + if (!current_is_kswapd() && + sc->priority != DEF_PRIORITY && + !pgdat_reclaimable(pgdat)) { + unsigned long nr_scanned; + + /* + * Fake scanning so that slab shrinking will continue. For + * lowmem restricted allocations, shrink aggressively. + */ + nr_scanned = SWAP_CLUSTER_MAX << (DEF_PRIORITY - sc->priority); + if (!(sc->gfp_mask & __GFP_HIGHMEM)) + nr_scanned = max(nr_scanned, *lru_pages); + sc->nr_scanned += nr_scanned; + + return; + } + /* Record the original scan target for proportional adjustments later */ memcpy(targets, nr, sizeof(nr)); @@ -2435,6 +2509,8 @@ static void shrink_node_memcg(struct pglist_data *pgdat, struct mem_cgroup *memc if (inactive_list_is_low(lruvec, false, sc, true)) shrink_active_list(SWAP_CLUSTER_MAX, lruvec, sc, LRU_ACTIVE_ANON); + + balance_slab_lowmem(pgdat, sc); } /* Use reclaim/compaction for costly allocs or under memory pressure */ @@ -2533,7 +2609,8 @@ static bool shrink_node(pg_data_t *pgdat, struct scan_control *sc) .pgdat = pgdat, .priority = sc->priority, }; - unsigned long node_lru_pages = 0; + unsigned long slab_pressure = 0; + unsigned long slab_eligible = 0; struct mem_cgroup *memcg; nr_reclaimed = sc->nr_reclaimed; @@ -2555,12 +2632,8 @@ static bool shrink_node(pg_data_t *pgdat, struct scan_control *sc) scanned = sc->nr_scanned; shrink_node_memcg(pgdat, memcg, sc, &lru_pages); - node_lru_pages += lru_pages; - - if (memcg) - shrink_slab(sc->gfp_mask, pgdat->node_id, - memcg, sc->nr_scanned - scanned, - lru_pages); + slab_eligible += lru_pages; + slab_pressure += sc->nr_reclaimed - reclaimed; /* Record the group's reclaim efficiency */ vmpressure(sc->gfp_mask, memcg, false, @@ -2586,12 +2659,12 @@ static bool 
shrink_node(pg_data_t *pgdat, struct scan_control *sc) /* * Shrink the slab caches in the same proportion that - * the eligible LRU pages were scanned. + * the eligible LRU pages were scanned. For memcg, this + * will apply the cumulative scanning pressure over all + * memcgs. */ - if (global_reclaim(sc)) - shrink_slab(sc->gfp_mask, pgdat->node_id, NULL, - sc->nr_scanned - nr_scanned, - node_lru_pages); + shrink_slab(sc->gfp_mask, pgdat->node_id, NULL, slab_pressure, + slab_eligible); if (reclaim_state) { sc->nr_reclaimed += reclaim_state->reclaimed_slab; @@ -2683,10 +2756,6 @@ static void shrink_zones(struct zonelist *zonelist, struct scan_control *sc) GFP_KERNEL | __GFP_HARDWALL)) continue; - if (sc->priority != DEF_PRIORITY && - !pgdat_reclaimable(zone->zone_pgdat)) - continue; /* Let kswapd poll it */ - /* * If we already have plenty of memory free for * compaction in this zone, don't free any more. -- Mel Gorman SUSE Labs ^ permalink raw reply related [flat|nested] 40+ messages in thread
* Re: mm, vmscan: commit makes PAE kernel crash nightly (bisected)
  2017-01-20 15:55 ` Mel Gorman
@ 2017-01-23  0:45 ` Trevor Cordes
  2017-01-23 10:48   ` Mel Gorman
  2017-01-24 12:54   ` Michal Hocko
  0 siblings, 2 replies; 40+ messages in thread
From: Trevor Cordes @ 2017-01-23 0:45 UTC (permalink / raw)
To: Mel Gorman
Cc: Michal Hocko, linux-kernel, Joonsoo Kim, Minchan Kim, Rik van Riel,
	Srikar Dronamraju

[-- Attachment #1: Type: text/plain, Size: 7523 bytes --]

On 2017-01-20 Mel Gorman wrote:
> >
> > Thanks for the OOM report. I was expecting it to be a particular
> > shape and my expectations were not matched so it took time to
> > consider it further. Can you try the cumulative patch below? It
> > combines three patches that
> >
> > 1. Allow slab shrinking even if the LRU pages are unreclaimable in
> >    direct reclaim
> > 2. Shrinks slab once based on the contents of all memcgs instead of
> >    shrinking one at a time
> > 3. Tries to shrink slabs if the lowmem usage is too high
> >
> > Unfortunately it's only boot tested on x86-64 as I didn't get the
> > chance to set up an i386 test bed.
>
> There was one major flaw in that patch. This version fixes it and
> addresses other minor issues. It may still be too aggressive shrinking
> slab but worth trying out.

Thanks. I ran with your patch below and it oom'd on the first night. It
was weird: it didn't hang the system, and my rebooter script started a
reboot, but the system never got more than halfway down before it just
sat there in a weird state where a local console user could still log in
but not much was working. So the patches don't seem to solve the
problem.

For the above compile I applied your patches to 4.10.0-rc4+, I hope
that's ok.

Attached is the first oom from that night. I include some stuff below
the oom where the kernel is obviously having issues and dumping more
strange output. I don't think I've seen that before. That probably
explains the strange state it was left in.
Also, completely separate from your patch, I ran mhocko's 4.9 tree with
mem=2G to see if a lower ram amount would help, but it didn't. Even with
2G the system oom'd and hung the same as usual.

So far the only thing that helps at all is the cgroup_disable=memory
option, which makes the problem disappear completely for me. I added
that option to 3 other boxes I admin with PAE, and that plus limiting
ram to <4GB gets rid of the bug. However, on the RHBZ on this bug I am
commenting on, someone there reports that cgroup_disable=memory doesn't
help him at all.

Hopefully the oom attached can help you figure out a next step. Thanks!

> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 2281ad310d06..2c735ea24a85 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -2318,6 +2318,59 @@ static void get_scan_count(struct lruvec *lruvec, struct mem_cgroup *memcg,
>  	}
>  }
>
> +#ifdef CONFIG_HIGHMEM
> +static void balance_slab_lowmem(struct pglist_data *pgdat,
> +				struct scan_control *sc)
> +{
> +	unsigned long lru_pages = 0;
> +	unsigned long slab_pages = 0;
> +	unsigned long managed_pages = 0;
> +	int zid;
> +
> +	for (zid = 0; zid < MAX_NR_ZONES; zid++) {
> +		struct zone *zone = &pgdat->node_zones[zid];
> +
> +		if (!populated_zone(zone) || is_highmem_idx(zid))
> +			continue;
> +
> +		lru_pages += zone_page_state(zone, NR_ZONE_INACTIVE_FILE);
> +		lru_pages += zone_page_state(zone, NR_ZONE_ACTIVE_FILE);
> +		lru_pages += zone_page_state(zone, NR_ZONE_INACTIVE_ANON);
> +		lru_pages += zone_page_state(zone, NR_ZONE_ACTIVE_ANON);
> +		slab_pages += zone_page_state(zone, NR_SLAB_RECLAIMABLE);
> +		slab_pages += zone_page_state(zone, NR_SLAB_UNRECLAIMABLE);
> +	}
> +
> +	/* Do not balance until LRU and slab exceeds 50% of lowmem */
> +	if (lru_pages + slab_pages < (managed_pages >> 1))
> +		return;
> +
> +	/*
> +	 * Shrink reclaimable slabs if the number of lowmem slab pages is
> +	 * over twice the size of LRU pages. Apply pressure relative to
> +	 * the imbalance between LRU and slab pages.
> +	 */
> +	if (slab_pages > lru_pages << 1) {
> +		struct reclaim_state *reclaim_state = current->reclaim_state;
> +		unsigned long exceed = slab_pages - (lru_pages << 1);
> +		int nid = pgdat->node_id;
> +
> +		exceed = min(exceed, slab_pages);
> +		shrink_slab(sc->gfp_mask, nid, NULL, exceed >> 3, slab_pages);
> +		if (reclaim_state) {
> +			sc->nr_reclaimed += reclaim_state->reclaimed_slab;
> +			reclaim_state->reclaimed_slab = 0;
> +		}
> +	}
> +}
> +#else
> +static void balance_slab_lowmem(struct pglist_data *pgdat,
> +				struct scan_control *sc)
> +{
> +	return;
> +}
> +#endif
> +
>  /*
>   * This is a basic per-node page freer. Used by both kswapd and direct reclaim.
>   */
> @@ -2336,6 +2389,27 @@ static void shrink_node_memcg(struct pglist_data *pgdat, struct mem_cgroup *memc
>
>  	get_scan_count(lruvec, memcg, sc, nr, lru_pages);
>
> +	/*
> +	 * If direct reclaiming at elevated priority and the node is
> +	 * unreclaimable then skip LRU reclaim and let kswapd poll it.
> +	 */
> +	if (!current_is_kswapd() &&
> +	    sc->priority != DEF_PRIORITY &&
> +	    !pgdat_reclaimable(pgdat)) {
> +		unsigned long nr_scanned;
> +
> +		/*
> +		 * Fake scanning so that slab shrinking will continue. For
> +		 * lowmem restricted allocations, shrink aggressively.
> +		 */
> +		nr_scanned = SWAP_CLUSTER_MAX << (DEF_PRIORITY - sc->priority);
> +		if (!(sc->gfp_mask & __GFP_HIGHMEM))
> +			nr_scanned = max(nr_scanned, *lru_pages);
> +		sc->nr_scanned += nr_scanned;
> +
> +		return;
> +	}
> +
>  	/* Record the original scan target for proportional adjustments later */
>  	memcpy(targets, nr, sizeof(nr));
>
> @@ -2435,6 +2509,8 @@ static void shrink_node_memcg(struct pglist_data *pgdat, struct mem_cgroup *memc
>  	if (inactive_list_is_low(lruvec, false, sc, true))
>  		shrink_active_list(SWAP_CLUSTER_MAX, lruvec, sc, LRU_ACTIVE_ANON);
> +
> +	balance_slab_lowmem(pgdat, sc);
>  }
>
>  /* Use reclaim/compaction for costly allocs or under memory pressure */
> @@ -2533,7 +2609,8 @@ static bool shrink_node(pg_data_t *pgdat, struct scan_control *sc)
>  		.pgdat = pgdat,
>  		.priority = sc->priority,
>  	};
> -	unsigned long node_lru_pages = 0;
> +	unsigned long slab_pressure = 0;
> +	unsigned long slab_eligible = 0;
>  	struct mem_cgroup *memcg;
>
>  	nr_reclaimed = sc->nr_reclaimed;
> @@ -2555,12 +2632,8 @@ static bool shrink_node(pg_data_t *pgdat, struct scan_control *sc)
>  			scanned = sc->nr_scanned;
>
>  			shrink_node_memcg(pgdat, memcg, sc, &lru_pages);
> -			node_lru_pages += lru_pages;
> -
> -			if (memcg)
> -				shrink_slab(sc->gfp_mask, pgdat->node_id,
> -					    memcg, sc->nr_scanned - scanned,
> -					    lru_pages);
> +			slab_eligible += lru_pages;
> +			slab_pressure += sc->nr_reclaimed - reclaimed;
>
>  			/* Record the group's reclaim efficiency */
>  			vmpressure(sc->gfp_mask, memcg, false,
> @@ -2586,12 +2659,12 @@ static bool shrink_node(pg_data_t *pgdat, struct scan_control *sc)
>  		/*
>  		 * Shrink the slab caches in the same proportion that
> -		 * the eligible LRU pages were scanned.
> +		 * the eligible LRU pages were scanned. For memcg, this
> +		 * will apply the cumulative scanning pressure over all
> +		 * memcgs.
>  		 */
> -		if (global_reclaim(sc))
> -			shrink_slab(sc->gfp_mask, pgdat->node_id, NULL,
> -				    sc->nr_scanned - nr_scanned,
> -				    node_lru_pages);
> +		shrink_slab(sc->gfp_mask, pgdat->node_id, NULL, slab_pressure,
> +			    slab_eligible);
>
>  		if (reclaim_state) {
>  			sc->nr_reclaimed += reclaim_state->reclaimed_slab;
> @@ -2683,10 +2756,6 @@ static void shrink_zones(struct zonelist *zonelist, struct scan_control *sc)
>  					    GFP_KERNEL | __GFP_HARDWALL))
>  				continue;
>
> -			if (sc->priority != DEF_PRIORITY &&
> -			    !pgdat_reclaimable(zone->zone_pgdat))
> -				continue;	/* Let kswapd poll it */
> -
>  			/*
>  			 * If we already have plenty of memory free for
>  			 * compaction in this zone, don't free any more.

[-- Attachment #2: oom4 --]
[-- Type: application/octet-stream, Size: 35300 bytes --]

Jan 22 04:04:28 firewallfsi kernel: [38016.566153] sb_mboxtrain.py invoked oom-killer: gfp_mask=0x1420848(GFP_NOFS|__GFP_NOFAIL|__GFP_HARDWALL|__GFP_MOVABLE), nodemask=0, order=0, oom_score_adj=0 Jan 22 04:04:28 firewallfsi kernel: [38016.568190] sb_mboxtrain.py cpuset=/ mems_allowed=0 Jan 22 04:04:28 firewallfsi kernel: [38016.569185] CPU: 7 PID: 6601 Comm: sb_mboxtrain.py Not tainted 4.10.0-rc4+ #17 Jan 22 04:04:28 firewallfsi kernel: [38016.570161] Hardware name: Intel Corporation S1200BTL/S1200BTL, BIOS S1200BT.86B.02.00.0035.030220120927 03/02/2012 Jan 22 04:04:28 firewallfsi kernel: [38016.571145] Call Trace: Jan 22 04:04:28 firewallfsi kernel: [38016.572124] dump_stack+0x58/0x81 Jan 22 04:04:28 firewallfsi kernel: [38016.573086] dump_header+0x64/0x1a6 Jan 22 04:04:28 firewallfsi kernel: [38016.574178] ? _raw_spin_unlock_irqrestore+0xd/0x10 Jan 22 04:04:28 firewallfsi kernel: [38016.575233] ? ___ratelimit+0x9f/0x100 Jan 22 04:04:28 firewallfsi kernel: [38016.576197] oom_kill_process+0x207/0x3d0 Jan 22 04:04:28 firewallfsi kernel: [38016.577150] ? has_capability_noaudit+0x1a/0x30 Jan 22 04:04:28 firewallfsi kernel: [38016.578108] ?
oom_badness.part.13+0xcb/0x140 Jan 22 04:04:28 firewallfsi kernel: [38016.579068] out_of_memory+0xf8/0x2a0 Jan 22 04:04:28 firewallfsi kernel: [38016.580025] __alloc_pages_nodemask+0xb22/0xc30 Jan 22 04:04:28 firewallfsi kernel: [38016.580990] pagecache_get_page+0xbe/0x2d0 Jan 22 04:04:28 firewallfsi kernel: [38016.581948] __getblk_gfp+0x104/0x360 Jan 22 04:04:28 firewallfsi kernel: [38016.582899] __ext4_get_inode_loc+0x104/0x440 Jan 22 04:04:28 firewallfsi kernel: [38016.583840] ext4_reserve_inode_write+0x2a/0x80 Jan 22 04:04:28 firewallfsi kernel: [38016.584766] ? jbd2__journal_start+0xbf/0x1b0 Jan 22 04:04:28 firewallfsi kernel: [38016.585667] ext4_mark_inode_dirty+0x49/0x200 Jan 22 04:04:28 firewallfsi kernel: [38016.586552] ? ext4_dirty_inode+0x48/0x60 Jan 22 04:04:28 firewallfsi kernel: [38016.587417] ext4_dirty_inode+0x48/0x60 Jan 22 04:04:28 firewallfsi kernel: [38016.588270] __mark_inode_dirty+0x150/0x340 Jan 22 04:04:28 firewallfsi kernel: [38016.589114] generic_update_time+0x66/0xb0 Jan 22 04:04:28 firewallfsi kernel: [38016.589939] ? __atime_needs_update+0x6e/0x160 Jan 22 04:04:28 firewallfsi kernel: [38016.590744] ? find_inode_nowait+0xb0/0xb0 Jan 22 04:04:28 firewallfsi kernel: [38016.591530] touch_atime+0x8f/0xb0 Jan 22 04:04:28 firewallfsi kernel: [38016.592295] generic_file_read_iter+0x75a/0x8a0 Jan 22 04:04:28 firewallfsi kernel: [38016.593052] ? 
unlock_page+0x60/0x60 Jan 22 04:04:28 firewallfsi kernel: [38016.593798] ext4_file_read_iter+0x2b/0xa0 Jan 22 04:04:28 firewallfsi kernel: [38016.594528] __vfs_read+0xe0/0x150 Jan 22 04:04:28 firewallfsi kernel: [38016.595237] vfs_read+0x7b/0x140 Jan 22 04:04:28 firewallfsi kernel: [38016.595923] SyS_read+0x49/0xb0 Jan 22 04:04:28 firewallfsi kernel: [38016.596586] do_fast_syscall_32+0x8a/0x150 Jan 22 04:04:28 firewallfsi kernel: [38016.597239] entry_SYSENTER_32+0x4e/0x7c Jan 22 04:04:28 firewallfsi kernel: [38016.597880] EIP: 0xb77cbd25 Jan 22 04:04:28 firewallfsi kernel: [38016.598499] EFLAGS: 00200246 CPU: 7 Jan 22 04:04:28 firewallfsi kernel: [38016.599099] EAX: ffffffda EBX: 00000004 ECX: 80f04384 EDX: 00002000 Jan 22 04:04:28 firewallfsi kernel: [38016.599694] ESI: 80ac1c30 EDI: 00002ff2 EBP: 80f04384 ESP: bf829838 Jan 22 04:04:28 firewallfsi kernel: [38016.600281] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b Jan 22 04:04:28 firewallfsi kernel: [38016.600874] Mem-Info: Jan 22 04:04:28 firewallfsi kernel: [38016.601433] active_anon:195761 inactive_anon:1207 isolated_anon:0 Jan 22 04:04:28 firewallfsi kernel: [38016.601433] active_file:105882 inactive_file:482932 isolated_file:32 Jan 22 04:04:28 firewallfsi kernel: [38016.601433] unevictable:0 dirty:4 writeback:8 unstable:0 Jan 22 04:04:28 firewallfsi kernel: [38016.601433] slab_reclaimable:183989 slab_unreclaimable:11756 Jan 22 04:04:28 firewallfsi kernel: [38016.601433] mapped:27071 shmem:1396 pagetables:1593 bounce:8 Jan 22 04:04:28 firewallfsi kernel: [38016.601433] free:222933 free_pcp:538 free_cma:0 Jan 22 04:04:28 firewallfsi kernel: [38016.604519] Node 0 active_anon:783044kB inactive_anon:4828kB active_file:423528kB inactive_file:1931728kB unevictable:0kB isolated(anon):0kB isolated(file):128kB mapped:108284kB dirty:16kB writeback:32kB shmem:5584kB writeback_tmp:0kB unstable:0kB pages_scanned:59206274 all_unreclaimable? 
yes Jan 22 04:04:28 firewallfsi kernel: [38016.606602] DMA free:3144kB min:68kB low:84kB high:100kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15976kB managed:15900kB mlocked:0kB slab_reclaimable:10812kB slab_unreclaimable:1268kB kernel_stack:0kB pagetables:0kB bounce:4kB free_pcp:0kB local_pcp:0kB free_cma:0kB Jan 22 04:04:28 firewallfsi kernel: [38016.609095] lowmem_reserve[]: 0 777 4733 4733 Jan 22 04:04:28 firewallfsi kernel: [38016.609894] Normal free:3528kB min:3532kB low:4412kB high:5292kB active_anon:0kB inactive_anon:0kB active_file:2732kB inactive_file:152kB unevictable:0kB writepending:0kB present:892920kB managed:816828kB mlocked:0kB slab_reclaimable:725144kB slab_unreclaimable:45756kB kernel_stack:2736kB pagetables:0kB bounce:28kB free_pcp:1480kB local_pcp:280kB free_cma:0kB Jan 22 04:04:28 firewallfsi kernel: [38016.612467] lowmem_reserve[]: 0 0 31652 31652 Jan 22 04:04:28 firewallfsi kernel: [38016.613354] HighMem free:885060kB min:512kB low:5008kB high:9504kB active_anon:783044kB inactive_anon:4828kB active_file:420880kB inactive_file:1931404kB unevictable:0kB writepending:48kB present:4051548kB managed:4051548kB mlocked:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:6372kB bounce:0kB free_pcp:672kB local_pcp:124kB free_cma:0kB Jan 22 04:04:28 firewallfsi kernel: [38016.616233] lowmem_reserve[]: 0 0 0 0 Jan 22 04:04:28 firewallfsi kernel: [38016.617230] DMA: 12*4kB (UE) 13*8kB (E) 63*16kB (UE) 33*32kB (UE) 13*64kB (UE) 1*128kB (U) 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 3176kB Jan 22 04:04:28 firewallfsi kernel: [38016.619309] Normal: 412*4kB (MH) 235*8kB (UMH) 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 3528kB Jan 22 04:04:28 firewallfsi kernel: [38016.621498] HighMem: 1*4kB (U) 54*8kB (UM) 25*16kB (U) 4*32kB (UM) 72*64kB (U) 49*128kB (UM) 37*256kB (UM) 39*512kB (UM) 8*1024kB (UM) 0*2048kB 204*4096kB (M) = 885060kB Jan 
22 04:04:28 firewallfsi kernel: [38016.624026] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB Jan 22 04:04:28 firewallfsi kernel: [38016.625361] 590204 total pagecache pages Jan 22 04:04:28 firewallfsi kernel: [38016.626827] 0 pages in swap cache Jan 22 04:04:28 firewallfsi kernel: [38016.628047] Swap cache stats: add 0, delete 0, find 0/0 Jan 22 04:04:28 firewallfsi kernel: [38016.629385] Free swap = 33784572kB Jan 22 04:04:28 firewallfsi kernel: [38016.630847] Total swap = 33784572kB Jan 22 04:04:28 firewallfsi kernel: [38016.632337] 1240111 pages RAM Jan 22 04:04:28 firewallfsi kernel: [38016.633788] 1012887 pages HighMem/MovableOnly Jan 22 04:04:28 firewallfsi kernel: [38016.635170] 19042 pages reserved Jan 22 04:04:28 firewallfsi kernel: [38016.636634] 0 pages hwpoisoned Jan 22 04:04:28 firewallfsi kernel: [38016.638082] [ pid ] uid tgid total_vm rss nr_ptes nr_pmds swapents oom_score_adj name Jan 22 04:04:28 firewallfsi kernel: [38016.639581] [ 593] 0 593 5324 4066 14 3 0 0 systemd-journal Jan 22 04:04:28 firewallfsi kernel: [38016.641082] [ 629] 0 629 3548 1176 9 3 0 -1000 systemd-udevd Jan 22 04:04:28 firewallfsi kernel: [38016.642490] [ 733] 0 733 1472 998 6 3 0 0 smartd Jan 22 04:04:28 firewallfsi kernel: [38016.643721] [ 748] 0 748 8633 1062 11 3 0 0 rsyslogd Jan 22 04:04:28 firewallfsi kernel: [38016.645057] [ 759] 0 759 1704 655 7 3 0 0 sh Jan 22 04:04:28 firewallfsi kernel: [38016.646322] [ 760] 288 760 14457 1693 15 3 0 0 milter-greylist Jan 22 04:04:28 firewallfsi kernel: [38016.647631] [ 785] 0 785 808 487 5 3 0 0 mdadm Jan 22 04:04:28 firewallfsi kernel: [38016.648860] [ 809] 0 809 583 423 5 3 0 0 acpid Jan 22 04:04:28 firewallfsi kernel: [38016.650077] [ 810] 81 810 1700 1038 7 3 0 -900 dbus-daemon Jan 22 04:04:28 firewallfsi kernel: [38016.651299] [ 857] 0 857 3062 1648 10 3 0 0 fetchmail Jan 22 04:04:28 firewallfsi kernel: [38016.652513] [ 894] 0 894 2499 381 8 3 0 0 saslauthd Jan 22 04:04:28 firewallfsi 
kernel: [38016.653717] [ 895] 0 895 2499 125 8 3 0 0 saslauthd Jan 22 04:04:28 firewallfsi kernel: [38016.654906] [ 896] 0 896 2499 125 8 3 0 0 saslauthd Jan 22 04:04:28 firewallfsi kernel: [38016.656096] [ 897] 0 897 2499 125 8 3 0 0 saslauthd Jan 22 04:04:28 firewallfsi kernel: [38016.657443] [ 898] 0 898 2499 125 8 3 0 0 saslauthd Jan 22 04:04:28 firewallfsi kernel: [38016.658844] [ 979] 0 979 2769 736 9 3 0 -1000 sshd Jan 22 04:04:28 firewallfsi kernel: [38016.660331] [ 1049] 0 1049 988 733 5 3 0 0 systemd-logind Jan 22 04:04:28 firewallfsi kernel: [38016.661467] [ 1053] 0 1053 7287 1267 11 3 0 0 apcupsd Jan 22 04:04:28 firewallfsi kernel: [38016.662809] [ 1093] 0 1093 860 492 4 3 0 0 atd Jan 22 04:04:28 firewallfsi kernel: [38016.664110] [ 1142] 27 1142 1707 723 6 3 0 0 mysqld_safe Jan 22 04:04:28 firewallfsi kernel: [38016.665411] [ 1191] 0 1191 1116 482 6 3 0 0 agetty Jan 22 04:04:28 firewallfsi kernel: [38016.666540] [ 1192] 0 1192 1116 535 6 3 0 0 agetty Jan 22 04:04:28 firewallfsi kernel: [38016.667773] [ 1193] 0 1193 1116 528 5 3 0 0 agetty Jan 22 04:04:28 firewallfsi kernel: [38016.668969] [ 1194] 0 1194 1116 509 6 3 0 0 agetty Jan 22 04:04:28 firewallfsi kernel: [38016.669919] [ 1195] 0 1195 1116 490 6 3 0 0 agetty Jan 22 04:04:28 firewallfsi kernel: [38016.670785] [ 1196] 0 1196 1116 491 6 3 0 0 agetty Jan 22 04:04:28 firewallfsi kernel: [38016.671745] [ 1313] 27 1313 126039 14603 64 3 0 0 mysqld Jan 22 04:04:28 firewallfsi kernel: [38016.672737] [ 1343] 0 1343 7049 2368 17 3 0 0 nmbd Jan 22 04:04:28 firewallfsi kernel: [38016.673702] [ 1345] 0 1345 6840 2264 17 3 0 0 nmbd Jan 22 04:04:28 firewallfsi kernel: [38016.674614] [ 1353] 25 1353 81136 46879 113 3 0 0 named Jan 22 04:04:28 firewallfsi kernel: [38016.675502] [ 1415] 276 1415 115809 104485 223 3 0 0 clamd Jan 22 04:04:28 firewallfsi kernel: [38016.676354] [ 1417] 0 1417 12052 6485 26 3 0 0 /usr/sbin/httpd Jan 22 04:04:28 firewallfsi kernel: [38016.677167] [ 1418] 290 1418 10838 1168 11 3 0 0 
clamav-milter Jan 22 04:04:28 firewallfsi kernel: [38016.677928] [ 1542] 23 1542 5440 1187 15 3 0 0 squid Jan 22 04:04:28 firewallfsi kernel: [38016.678684] [ 1544] 23 1544 9541 6462 20 3 0 0 squid Jan 22 04:04:28 firewallfsi kernel: [38016.679406] [ 1547] 23 1547 1179 420 6 3 0 0 unlinkd Jan 22 04:04:28 firewallfsi kernel: [38016.680105] [ 1585] 48 1585 49256 4559 45 3 0 0 /usr/sbin/httpd Jan 22 04:04:28 firewallfsi kernel: [38016.680813] [ 1586] 48 1586 16220 4558 28 3 0 0 /usr/sbin/httpd Jan 22 04:04:28 firewallfsi kernel: [38016.681460] [ 1592] 48 1592 16216 4557 28 3 0 0 /usr/sbin/httpd Jan 22 04:04:28 firewallfsi kernel: [38016.682102] [ 1608] 48 1608 16216 4556 28 3 0 0 /usr/sbin/httpd Jan 22 04:04:28 firewallfsi kernel: [38016.682730] [ 1611] 48 1611 16218 4294 28 3 0 0 /usr/sbin/httpd Jan 22 04:04:28 firewallfsi kernel: [38016.683322] [ 1768] 0 1768 9561 3584 23 3 0 0 smbd Jan 22 04:04:28 firewallfsi kernel: [38016.683843] [ 1769] 0 1769 9197 1114 22 3 0 0 smbd Jan 22 04:04:28 firewallfsi kernel: [38016.684345] [ 1770] 0 1770 9446 1219 22 3 0 0 smbd Jan 22 04:04:28 firewallfsi kernel: [38016.684844] [ 1842] 0 1842 5116 2167 13 3 0 0 dhclient Jan 22 04:04:28 firewallfsi kernel: [38016.685346] [ 1931] 0 1931 594 434 5 3 0 0 pptpd Jan 22 04:04:28 firewallfsi kernel: [38016.685929] [ 1938] 0 1938 954 638 4 3 0 0 dovecot Jan 22 04:04:28 firewallfsi kernel: [38016.686499] [ 1939] 97 1939 904 565 5 3 0 0 anvil Jan 22 04:04:28 firewallfsi kernel: [38016.687060] [ 1940] 0 1940 937 614 5 3 0 0 log Jan 22 04:04:28 firewallfsi kernel: [38016.687621] [ 1942] 0 1942 1142 783 5 3 0 0 config Jan 22 04:04:28 firewallfsi kernel: [38016.688179] [ 1943] 48 1943 16220 4283 28 3 0 0 /usr/sbin/httpd Jan 22 04:04:28 firewallfsi kernel: [38016.688734] [ 1949] 177 1949 6833 4729 17 3 0 0 dhcpd Jan 22 04:04:28 firewallfsi kernel: [38016.689289] [ 1950] 0 1950 1907 769 7 3 0 0 crond Jan 22 04:04:28 firewallfsi kernel: [38016.689890] [ 1951] 38 1951 1536 1093 6 3 0 0 ntpd Jan 22 
04:04:28 firewallfsi kernel: [38016.690456] [ 1962] 0 1962 11222 4005 26 3 0 0 smbd Jan 22 04:04:28 firewallfsi kernel: [38016.690991] [ 1971] 302 1971 1447 1089 6 3 0 0 imap Jan 22 04:04:28 firewallfsi kernel: [38016.691572] [ 1972] 304 1972 1399 988 6 3 0 0 imap Jan 22 04:04:28 firewallfsi kernel: [38016.692078] [ 1977] 303 1977 1394 941 6 3 0 0 imap Jan 22 04:04:28 firewallfsi kernel: [38016.692626] [ 1979] 301 1979 1638 1161 6 3 0 0 imap Jan 22 04:04:28 firewallfsi kernel: [38016.693151] [ 2763] 273 2763 2208 1442 8 3 0 0 imap-login Jan 22 04:04:28 firewallfsi kernel: [38016.693714] [ 2767] 301 2767 1659 1130 7 3 0 0 imap Jan 22 04:04:28 firewallfsi kernel: [38016.694267] [ 5041] 0 5041 4181 1974 11 3 0 0 sshd Jan 22 04:04:28 firewallfsi kernel: [38016.694832] [ 5043] 0 5043 1584 1144 6 3 0 0 systemd Jan 22 04:04:28 firewallfsi kernel: [38016.695379] [ 5051] 0 5051 2456 390 8 3 0 0 (sd-pam) Jan 22 04:04:28 firewallfsi kernel: [38016.695935] [ 5070] 0 5070 4214 935 11 3 0 0 sshd Jan 22 04:04:28 firewallfsi kernel: [38016.696486] [ 5078] 0 5078 1188 931 5 3 0 0 tcsh Jan 22 04:04:28 firewallfsi kernel: [38016.697024] [11258] 273 11258 2209 1413 8 3 0 0 imap-login Jan 22 04:04:28 firewallfsi kernel: [38016.697577] [11261] 301 11261 1401 928 6 3 0 0 imap Jan 22 04:04:28 firewallfsi kernel: [38016.698073] [18933] 48 18933 16220 4295 28 3 0 0 /usr/sbin/httpd Jan 22 04:04:28 firewallfsi kernel: [38016.698540] [27720] 273 27720 2209 1453 8 3 0 0 imap-login Jan 22 04:04:28 firewallfsi kernel: [38016.699010] [27724] 302 27724 1423 1047 6 3 0 0 imap Jan 22 04:04:28 firewallfsi kernel: [38016.699551] [31627] 48 31627 16216 3530 28 3 0 0 /usr/sbin/httpd Jan 22 04:04:28 firewallfsi kernel: [38016.700148] [ 9644] 0 9644 1307 922 6 3 0 0 reboot-when-oom Jan 22 04:04:28 firewallfsi kernel: [38016.700758] [13818] 301 13818 1462 1086 6 3 0 0 imap Jan 22 04:04:28 firewallfsi kernel: [38016.701368] [14022] 301 14022 3879 1423 8 3 0 0 imap Jan 22 04:04:28 firewallfsi kernel: 
[38016.701982] [15362] 0 15362 1900 565 7 3 0 0 crond Jan 22 04:04:28 firewallfsi kernel: [38016.702595] [15363] 0 15363 1705 681 6 3 0 0 raid-check Jan 22 04:04:28 firewallfsi kernel: [38016.703203] [28256] 167 28256 19314 2952 20 3 0 0 polkitd Jan 22 04:04:28 firewallfsi kernel: [38016.703824] [32762] 0 32762 1704 641 7 3 0 0 sh Jan 22 04:04:28 firewallfsi kernel: [38016.704430] [32764] 0 32764 1704 640 6 3 0 0 sh Jan 22 04:04:28 firewallfsi kernel: [38016.705047] [32766] 0 32766 1736 743 6 3 0 0 tickle-pog Jan 22 04:04:28 firewallfsi kernel: [38016.705579] [32767] 0 32767 3165 2066 10 3 0 0 mailwarnings Jan 22 04:04:28 firewallfsi kernel: [38016.706072] [ 762] 0 762 1908 577 7 3 0 0 crond Jan 22 04:04:28 firewallfsi kernel: [38016.706656] [ 764] 0 764 3209 2085 10 3 0 0 freshclam-tec-w Jan 22 04:04:28 firewallfsi kernel: [38016.707239] [ 986] 0 986 1673 595 7 3 0 0 anacron Jan 22 04:04:28 firewallfsi kernel: [38016.707853] [ 1565] 0 1565 1704 658 7 3 0 0 sh Jan 22 04:04:28 firewallfsi kernel: [38016.708473] [ 1566] 0 1566 1704 641 6 3 0 0 sh Jan 22 04:04:28 firewallfsi kernel: [38016.709095] [ 1567] 0 1567 1704 640 6 3 0 0 sh Jan 22 04:04:28 firewallfsi kernel: [38016.709707] [ 1568] 0 1568 1704 688 6 3 0 0 sh Jan 22 04:04:28 firewallfsi kernel: [38016.710306] [ 1569] 0 1569 2707 1633 9 3 0 0 udp-sgr Jan 22 04:04:28 firewallfsi kernel: [38016.710920] [ 1570] 0 1570 3309 2229 11 3 0 0 watch-ip Jan 22 04:04:28 firewallfsi kernel: [38016.711531] [ 1571] 0 1571 2741 1666 9 3 0 0 udp-sgs Jan 22 04:04:28 firewallfsi kernel: [38016.712121] [ 1572] 0 1572 3238 2121 10 3 0 0 dynamic-ip-upda Jan 22 04:04:28 firewallfsi kernel: [38016.712726] [ 1638] 0 1638 3829 1711 11 3 0 0 sendmail Jan 22 04:04:28 firewallfsi kernel: [38016.713239] [ 1661] 51 1661 3501 736 12 3 0 0 sendmail Jan 22 04:04:28 firewallfsi kernel: [38016.713747] [ 2317] 0 2317 1033 652 6 3 0 0 irqbalance Jan 22 04:04:28 firewallfsi kernel: [38016.714254] [ 2897] 0 2897 1704 633 7 3 0 0 sh Jan 22 04:04:28 
firewallfsi kernel: [38016.714765] [ 2898] 0 2898 3264 2209 10 3 0 0 restarter Jan 22 04:04:28 firewallfsi kernel: [38016.715342] [ 2914] 0 2914 1704 660 7 3 0 0 sh Jan 22 04:04:28 firewallfsi kernel: [38016.715900] [ 2915] 0 2915 1769 783 8 3 0 0 watch-services Jan 22 04:04:28 firewallfsi kernel: [38016.716409] [ 4260] 0 4260 1704 741 6 3 0 0 run-parts Jan 22 04:04:28 firewallfsi kernel: [38016.717003] [ 5595] 0 5595 1704 642 6 3 0 0 train-spam Jan 22 04:04:28 firewallfsi kernel: [38016.717516] [ 5596] 0 5596 1638 265 6 3 0 0 awk Jan 22 04:04:28 firewallfsi kernel: [38016.718022] [ 5601] 0 5601 3199 2118 9 3 0 0 train-spam Jan 22 04:04:28 firewallfsi kernel: [38016.718539] [ 6573] 0 6573 3375 1484 10 3 0 0 sudo Jan 22 04:04:28 firewallfsi kernel: [38016.719134] [ 6577] 301 6577 1603 1184 7 3 0 0 systemd Jan 22 04:04:28 firewallfsi kernel: [38016.719711] [ 6585] 301 6585 4817 703 10 3 0 0 (sd-pam) Jan 22 04:04:28 firewallfsi kernel: [38016.720256] [ 6601] 301 6601 11251 9610 25 3 0 0 sb_mboxtrain.py Jan 22 04:04:28 firewallfsi kernel: [38016.720864] [12163] 273 12163 2209 1416 8 3 0 0 imap-login Jan 22 04:04:28 firewallfsi kernel: [38016.721396] [12167] 301 12167 1433 1065 6 3 0 0 imap Jan 22 04:04:28 firewallfsi kernel: [38016.721930] [12599] 0 12599 1396 169 6 3 0 0 sleep Jan 22 04:04:28 firewallfsi kernel: [38016.722466] [12687] 0 12687 1704 699 6 3 0 0 sh Jan 22 04:04:28 firewallfsi kernel: [38016.722995] [12688] 0 12688 1769 763 6 3 0 0 iptables-regen Jan 22 04:04:28 firewallfsi kernel: [38016.723533] [12974] 0 12974 1396 140 6 3 0 0 sleep Jan 22 04:04:28 firewallfsi kernel: [38016.724061] [13085] 0 13085 3838 1975 11 3 0 0 sendmail Jan 22 04:04:28 firewallfsi kernel: [38016.724674] [13112] 0 13112 1396 142 6 3 0 0 sleep Jan 22 04:04:28 firewallfsi kernel: [38016.725235] Out of memory: Kill process 1415 (clamd) score 10 or sacrifice child Jan 22 04:04:28 firewallfsi kernel: [38016.725776] Killed process 1415 (clamd) total-vm:463236kB, anon-rss:401612kB, 
file-rss:16328kB, shmem-rss:0kB Jan 22 04:04:28 firewallfsi kernel: [38017.327183] iptables-regen: page allocation stalls for 10990ms, order:1, mode:0x17000c0(GFP_KERNEL_ACCOUNT|__GFP_NOTRACK) Jan 22 04:04:28 firewallfsi kernel: [38017.327616] CPU: 1 PID: 12688 Comm: iptables-regen Not tainted 4.10.0-rc4+ #17 Jan 22 04:04:28 firewallfsi kernel: [38017.328032] Hardware name: Intel Corporation S1200BTL/S1200BTL, BIOS S1200BT.86B.02.00.0035.030220120927 03/02/2012 Jan 22 04:04:28 firewallfsi kernel: [38017.328475] Call Trace: Jan 22 04:04:28 firewallfsi kernel: [38017.328907] dump_stack+0x58/0x81 Jan 22 04:04:28 firewallfsi kernel: [38017.329337] warn_alloc+0xf6/0x110 Jan 22 04:04:28 firewallfsi kernel: [38017.329765] __alloc_pages_nodemask+0x9fe/0xc30 Jan 22 04:04:28 firewallfsi kernel: [38017.330196] ? kmem_cache_alloc+0xf7/0x1c0 Jan 22 04:04:28 firewallfsi kernel: [38017.330627] ? __kunmap_atomic+0xa3/0x120 Jan 22 04:04:28 firewallfsi kernel: [38017.331069] ? copy_process.part.44+0x531/0x1590 Jan 22 04:04:28 firewallfsi kernel: [38017.331506] copy_process.part.44+0x108/0x1590 Jan 22 04:04:28 firewallfsi kernel: [38017.331950] ? ext4_llseek+0xad/0x520 Jan 22 04:04:28 firewallfsi kernel: [38017.332384] _do_fork+0xd4/0x370 Jan 22 04:04:28 firewallfsi kernel: [38017.332833] ? __audit_syscall_exit+0x1e6/0x270 Jan 22 04:04:28 firewallfsi kernel: [38017.333269] ? 
_copy_to_user+0x26/0x30 Jan 22 04:04:28 firewallfsi kernel: [38017.333710] SyS_clone+0x2c/0x30 Jan 22 04:04:28 firewallfsi kernel: [38017.334148] do_fast_syscall_32+0x8a/0x150 Jan 22 04:04:28 firewallfsi kernel: [38017.334595] entry_SYSENTER_32+0x4e/0x7c Jan 22 04:04:28 firewallfsi kernel: [38017.335053] EIP: 0xb77a4d25 Jan 22 04:04:28 firewallfsi kernel: [38017.335505] EFLAGS: 00000246 CPU: 1 Jan 22 04:04:28 firewallfsi kernel: [38017.335953] EAX: ffffffda EBX: 01200011 ECX: 00000000 EDX: 00000000 Jan 22 04:04:28 firewallfsi kernel: [38017.336420] ESI: 00000000 EDI: b759e768 EBP: bfd70dd8 ESP: bfd70d90 Jan 22 04:04:28 firewallfsi kernel: [38017.336900] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b Jan 22 04:04:28 firewallfsi kernel: [38017.337376] Mem-Info: Jan 22 04:04:28 firewallfsi kernel: [38017.337848] active_anon:185471 inactive_anon:1207 isolated_anon:0 Jan 22 04:04:28 firewallfsi kernel: [38017.337848] active_file:105818 inactive_file:482899 isolated_file:0 Jan 22 04:04:28 firewallfsi kernel: [38017.337848] unevictable:0 dirty:4 writeback:8 unstable:0 Jan 22 04:04:28 firewallfsi kernel: [38017.337848] slab_reclaimable:183989 slab_unreclaimable:11748 Jan 22 04:04:28 firewallfsi kernel: [38017.337848] mapped:27071 shmem:1396 pagetables:1593 bounce:8 Jan 22 04:04:28 firewallfsi kernel: [38017.337848] free:232808 free_pcp:836 free_cma:0 Jan 22 04:04:28 firewallfsi kernel: [38017.340805] Node 0 active_anon:741100kB inactive_anon:4828kB active_file:423272kB inactive_file:1931596kB unevictable:0kB isolated(anon):0kB isolated(file):0kB mapped:108284kB dirty:16kB writeback:32kB shmem:5584kB writeback_tmp:0kB unstable:0kB pages_scanned:92715 all_unreclaimable? 
no Jan 22 04:04:28 firewallfsi kernel: [38017.342451] DMA free:3176kB min:68kB low:84kB high:100kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15976kB managed:15900kB mlocked:0kB slab_reclaimable:10812kB slab_unreclaimable:1236kB kernel_stack:0kB pagetables:0kB bounce:4kB free_pcp:0kB local_pcp:0kB free_cma:0kB Jan 22 04:04:28 firewallfsi kernel: [38017.344284] lowmem_reserve[]: 0 777 4733 4733 Jan 22 04:04:28 firewallfsi kernel: [38017.344925] Normal free:3404kB min:3532kB low:4412kB high:5292kB active_anon:0kB inactive_anon:0kB active_file:2384kB inactive_file:228kB unevictable:0kB writepending:0kB present:892920kB managed:816828kB mlocked:0kB slab_reclaimable:725144kB slab_unreclaimable:45756kB kernel_stack:2736kB pagetables:0kB bounce:28kB free_pcp:1612kB local_pcp:0kB free_cma:0kB Jan 22 04:04:28 firewallfsi kernel: [38017.346966] lowmem_reserve[]: 0 0 31652 31652 Jan 22 04:04:28 firewallfsi kernel: [38017.347667] HighMem free:924652kB min:512kB low:5008kB high:9504kB active_anon:738748kB inactive_anon:4828kB active_file:420880kB inactive_file:1931404kB unevictable:0kB writepending:48kB present:4051548kB managed:4051548kB mlocked:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:6372kB bounce:0kB free_pcp:1732kB local_pcp:0kB free_cma:0kB Jan 22 04:04:28 firewallfsi kernel: [38017.349941] lowmem_reserve[]: 0 0 0 0 Jan 22 04:04:28 firewallfsi kernel: [38017.350712] DMA: 12*4kB (UE) 13*8kB (E) 63*16kB (UE) 33*32kB (UE) 13*64kB (UE) 1*128kB (U) 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 3176kB Jan 22 04:04:28 firewallfsi kernel: [38017.352292] Normal: 405*4kB (MH) 224*8kB (UMH) 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 3412kB Jan 22 04:04:28 firewallfsi kernel: [38017.353888] HighMem: 589*4kB (UM) 334*8kB (UM) 132*16kB (UM) 41*32kB (UM) 101*64kB (UM) 65*128kB (UM) 47*256kB (UM) 47*512kB (UM) 13*1024kB (UM) 10*2048kB (M) 204*4096kB (M) = 
928708kB Jan 22 04:04:28 firewallfsi kernel: [38017.355508] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB Jan 22 04:04:29 firewallfsi kernel: [38017.356333] 590204 total pagecache pages Jan 22 04:04:29 firewallfsi kernel: [38017.357170] 0 pages in swap cache Jan 22 04:04:29 firewallfsi kernel: [38017.357997] Swap cache stats: add 0, delete 0, find 0/0 Jan 22 04:04:29 firewallfsi kernel: [38017.358828] Free swap = 33784572kB Jan 22 04:04:29 firewallfsi kernel: [38017.359653] Total swap = 33784572kB Jan 22 04:04:29 firewallfsi kernel: [38017.360490] 1240111 pages RAM Jan 22 04:04:29 firewallfsi kernel: [38017.361306] 1012887 pages HighMem/MovableOnly Jan 22 04:04:29 firewallfsi kernel: [38017.362131] 19042 pages reserved Jan 22 04:04:29 firewallfsi kernel: [38017.362955] 0 pages hwpoisoned Jan 22 04:04:30 firewallfsi kernel: [38018.444679] systemd: page allocation stalls for 11297ms, order:1, mode:0x17000c0(GFP_KERNEL_ACCOUNT|__GFP_NOTRACK) Jan 22 04:04:30 firewallfsi kernel: [38018.446183] CPU: 0 PID: 1 Comm: systemd Not tainted 4.10.0-rc4+ #17 Jan 22 04:04:30 firewallfsi kernel: [38018.447717] Hardware name: Intel Corporation S1200BTL/S1200BTL, BIOS S1200BT.86B.02.00.0035.030220120927 03/02/2012 Jan 22 04:04:30 firewallfsi kernel: [38018.449292] Call Trace: Jan 22 04:04:30 firewallfsi kernel: [38018.450837] dump_stack+0x58/0x81 Jan 22 04:04:30 firewallfsi kernel: [38018.452102] warn_alloc+0xf6/0x110 Jan 22 04:04:30 firewallfsi kernel: [38018.453412] __alloc_pages_nodemask+0x9fe/0xc30 Jan 22 04:04:30 firewallfsi kernel: [38018.454677] ? fd_install+0x10/0x30 Jan 22 04:04:30 firewallfsi kernel: [38018.456159] ? kmem_cache_alloc+0xf7/0x1c0 Jan 22 04:04:30 firewallfsi kernel: [38018.457725] ? autofs_dev_ioctl+0x134/0x360 Jan 22 04:04:30 firewallfsi kernel: [38018.459271] ? 
copy_process.part.44+0x531/0x1590 Jan 22 04:04:30 firewallfsi kernel: [38018.460795] copy_process.part.44+0x108/0x1590 Jan 22 04:04:30 firewallfsi kernel: [38018.462066] ? autofs_dev_ioctl_openmount+0xf0/0xf0 Jan 22 04:04:30 firewallfsi kernel: [38018.463330] ? do_vfs_ioctl+0x8c/0x690 Jan 22 04:04:30 firewallfsi kernel: [38018.464565] _do_fork+0xd4/0x370 Jan 22 04:04:30 firewallfsi kernel: [38018.465788] SyS_clone+0x2c/0x30 Jan 22 04:04:30 firewallfsi kernel: [38018.467118] do_int80_syscall_32+0x5c/0xc0 Jan 22 04:04:30 firewallfsi kernel: [38018.468586] entry_INT80_32+0x31/0x31 Jan 22 04:04:30 firewallfsi kernel: [38018.470031] EIP: 0xb756ff18 Jan 22 04:04:30 firewallfsi kernel: [38018.471428] EFLAGS: 00000296 CPU: 0 Jan 22 04:04:30 firewallfsi kernel: [38018.472744] EAX: ffffffda EBX: 003d0f00 ECX: b735c2e4 EDX: b735cba8 Jan 22 04:04:30 firewallfsi kernel: [38018.473853] ESI: bfad32b0 EDI: b735cba8 EBP: bfad3398 ESP: bfad3280 Jan 22 04:04:30 firewallfsi kernel: [38018.474921] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b Jan 22 04:04:30 firewallfsi kernel: [38018.475996] Mem-Info: Jan 22 04:04:30 firewallfsi kernel: [38018.477082] active_anon:175742 inactive_anon:1207 isolated_anon:0 Jan 22 04:04:30 firewallfsi kernel: [38018.477082] active_file:105848 inactive_file:482918 isolated_file:96 Jan 22 04:04:30 firewallfsi kernel: [38018.477082] unevictable:0 dirty:39 writeback:1 unstable:0 Jan 22 04:04:30 firewallfsi kernel: [38018.477082] slab_reclaimable:183989 slab_unreclaimable:11755 Jan 22 04:04:30 firewallfsi kernel: [38018.477082] mapped:27032 shmem:1396 pagetables:1593 bounce:0 Jan 22 04:04:30 firewallfsi kernel: [38018.477082] free:242713 free_pcp:30 free_cma:0 Jan 22 04:04:30 firewallfsi kernel: [38018.483943] Node 0 active_anon:702968kB inactive_anon:4828kB active_file:423392kB inactive_file:1931672kB unevictable:0kB isolated(anon):0kB isolated(file):384kB mapped:108128kB dirty:156kB writeback:4kB shmem:5584kB writeback_tmp:0kB unstable:0kB 
pages_scanned:2255055 all_unreclaimable? no Jan 22 04:04:30 firewallfsi kernel: [38018.486726] DMA free:3144kB min:68kB low:84kB high:100kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15976kB managed:15900kB mlocked:0kB slab_reclaimable:10812kB slab_unreclaimable:1268kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB Jan 22 04:04:30 firewallfsi kernel: [38018.489865] lowmem_reserve[]: 0 777 4733 4733 Jan 22 04:04:30 firewallfsi kernel: [38018.490992] Normal free:4868kB min:3532kB low:4412kB high:5292kB active_anon:0kB inactive_anon:0kB active_file:2652kB inactive_file:212kB unevictable:0kB writepending:0kB present:892920kB managed:816828kB mlocked:0kB slab_reclaimable:725144kB slab_unreclaimable:45752kB kernel_stack:2728kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB Jan 22 04:04:30 firewallfsi kernel: [38018.494375] lowmem_reserve[]: 0 0 31652 31652 Jan 22 04:04:30 firewallfsi kernel: [38018.495509] HighMem free:962840kB min:512kB low:5008kB high:9504kB active_anon:702968kB inactive_anon:4828kB active_file:420880kB inactive_file:1931404kB unevictable:0kB writepending:160kB present:4051548kB managed:4051548kB mlocked:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:6372kB bounce:0kB free_pcp:116kB local_pcp:0kB free_cma:0kB Jan 22 04:04:30 firewallfsi kernel: [38018.498746] lowmem_reserve[]: 0 0 0 0 Jan 22 04:04:30 firewallfsi kernel: [38018.500062] DMA: 12*4kB (UE) 13*8kB (E) 63*16kB (UE) 32*32kB (UE) 13*64kB (UE) 1*128kB (U) 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 3144kB Jan 22 04:04:30 firewallfsi kernel: [38018.502846] Normal: 660*4kB (UMEH) 299*8kB (UMEH) 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 5032kB Jan 22 04:04:30 firewallfsi kernel: [38018.505364] HighMem: 1246*4kB (UM) 841*8kB (UM) 504*16kB (UM) 266*32kB (UM) 191*64kB (UM) 90*128kB (UM) 55*256kB (UM) 50*512kB (UM) 
11*1024kB (UM) 12*2048kB (M) 204*4096kB (M) = 963136kB Jan 22 04:04:30 firewallfsi kernel: [38018.507741] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB Jan 22 04:04:30 firewallfsi kernel: [38018.509066] 590175 total pagecache pages Jan 22 04:04:30 firewallfsi kernel: [38018.510535] 0 pages in swap cache Jan 22 04:04:30 firewallfsi kernel: [38018.512006] Swap cache stats: add 0, delete 0, find 0/0 Jan 22 04:04:30 firewallfsi kernel: [38018.513480] Free swap = 33784572kB Jan 22 04:04:30 firewallfsi kernel: [38018.514871] Total swap = 33784572kB Jan 22 04:04:30 firewallfsi kernel: [38018.516058] 1240111 pages RAM Jan 22 04:04:30 firewallfsi kernel: [38018.517242] 1012887 pages HighMem/MovableOnly Jan 22 04:04:30 firewallfsi kernel: [38018.518427] 19042 pages reserved Jan 22 04:04:30 firewallfsi kernel: [38018.519833] 0 pages hwpoisoned ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: mm, vmscan: commit makes PAE kernel crash nightly (bisected) 2017-01-23 0:45 ` Trevor Cordes @ 2017-01-23 10:48 ` Mel Gorman 2017-01-23 11:04 ` Mel Gorman ` (2 more replies) 2017-01-24 12:54 ` Michal Hocko 1 sibling, 3 replies; 40+ messages in thread From: Mel Gorman @ 2017-01-23 10:48 UTC (permalink / raw) To: Trevor Cordes Cc: Michal Hocko, linux-kernel, Joonsoo Kim, Minchan Kim, Rik van Riel, Srikar Dronamraju On Sun, Jan 22, 2017 at 06:45:59PM -0600, Trevor Cordes wrote: > On 2017-01-20 Mel Gorman wrote: > > > > > > Thanks for the OOM report. I was expecting it to be a particular > > > shape and my expectations were not matched so it took time to > > > consider it further. Can you try the cumulative patch below? It > > > combines three patches that > > > > > > 1. Allow slab shrinking even if the LRU patches are unreclaimable in > > > direct reclaim > > > 2. Shrinks slab based once based on the contents of all memcgs > > > instead of shrinking one at a time > > > 3. Tries to shrink slabs if the lowmem usage is too high > > > > > > Unfortunately it's only boot tested on x86-64 as I didn't get the > > > chance to setup an i386 test bed. > > > > > > > There was one major flaw in that patch. This version fixes it and > > addresses other minor issues. It may still be too agressive shrinking > > slab but worth trying out. Thanks. > > I ran with your patch below and it oom'd on the first night. It was > weird, it didn't hang the system, and my rebooter script started a > reboot but the system never got more than half down before it just sat > there in a weird state where a local console user could still login but > not much was working. So the patches don't seem to solve the problem. > > For the above compile I applied your patches to 4.10.0-rc4+, I hope > that's ok. > It would be strongly preferred to run them on top of Michal's other fixes. 
The main reason it's preferred is because this OOM differs from earlier ones in that it OOM killed from GFP_NOFS|__GFP_NOFAIL context. That meant that the slab shrinking could not happen from direct reclaim so the balancing from my patches would not occur. As Michal's other patches affect how kswapd behaves, it's important. Unfortunately, even that will be race prone for GFP_NOFS callers as they'll effectively be racing to see if kswapd or another direct reclaimer can reclaim before the OOM conditions are hit. It is by design, but it's apparent that a __GFP_NOFAIL request can trigger OOM relatively easily as it's not necessarily throttled or waiting on kswapd to complete any work. I'll keep thinking about it. -- Mel Gorman SUSE Labs ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: mm, vmscan: commit makes PAE kernel crash nightly (bisected) 2017-01-23 10:48 ` Mel Gorman @ 2017-01-23 11:04 ` Mel Gorman 2017-01-25 9:46 ` Michal Hocko 2017-01-24 12:59 ` Michal Hocko 2017-01-25 10:02 ` Trevor Cordes 2 siblings, 1 reply; 40+ messages in thread From: Mel Gorman @ 2017-01-23 11:04 UTC (permalink / raw) To: Trevor Cordes Cc: Michal Hocko, linux-kernel, Joonsoo Kim, Minchan Kim, Rik van Riel, Srikar Dronamraju On Mon, Jan 23, 2017 at 10:48:58AM +0000, Mel Gorman wrote: > On Sun, Jan 22, 2017 at 06:45:59PM -0600, Trevor Cordes wrote: > > On 2017-01-20 Mel Gorman wrote: > > > > > > > > Thanks for the OOM report. I was expecting it to be a particular > > > > shape and my expectations were not matched so it took time to > > > > consider it further. Can you try the cumulative patch below? It > > > > combines three patches that > > > > > > > > 1. Allow slab shrinking even if the LRU patches are unreclaimable in > > > > direct reclaim > > > > 2. Shrinks slab based once based on the contents of all memcgs > > > > instead of shrinking one at a time > > > > 3. Tries to shrink slabs if the lowmem usage is too high > > > > > > > > Unfortunately it's only boot tested on x86-64 as I didn't get the > > > > chance to setup an i386 test bed. > > > > > > > > > > There was one major flaw in that patch. This version fixes it and > > > addresses other minor issues. It may still be too agressive shrinking > > > slab but worth trying out. Thanks. > > > > I ran with your patch below and it oom'd on the first night. It was > > weird, it didn't hang the system, and my rebooter script started a > > reboot but the system never got more than half down before it just sat > > there in a weird state where a local console user could still login but > > not much was working. So the patches don't seem to solve the problem. > > > > For the above compile I applied your patches to 4.10.0-rc4+, I hope > > that's ok. 
> > > > It would be strongly preferred to run them on top of Michal's other > fixes. The main reason it's preferred is because this OOM differs from > earlier ones in that it OOM killed from GFP_NOFS|__GFP_NOFAIL context. > That meant that the slab shrinking could not happen from direct reclaim so > the balancing from my patches would not occur. As Michal's other patches > affect how kswapd behaves, it's important. > > Unfortunately, even that will be race prone for GFP_NOFS callers as > they'll effectively be racing to see if kswapd or another direct > reclaimer can reclaim before the OOM conditions are hit. It is by > design, but it's apparent that a __GFP_NOFAIL request can trigger OOM > relatively easily as it's not necessarily throttled or waiting on kswapd > to complete any work. I'll keep thinking about it. > As a slight follow-up, albeit without patches, further options are to:

1. In should_reclaim_retry, account for SLAB_RECLAIMABLE as available pages when deciding to retry reclaim
2. Stall in should_reclaim_retry for __GFP_NOFAIL|__GFP_NOFS with a comment stating that the intent is to allow kswapd to make progress with the shrinker
3. Stall __GFP_NOFS in direct reclaimer on a workqueue when it's failing to make progress to allow kswapd to do some work. This may be impaired if kswapd is locked up waiting for a lock held by the direct reclaimer
4. Schedule the system workqueue to drain slab for __GFP_NOFS|__GFP_NOFAIL.

3 and 4 are extremely heavy-handed so we should try them one at a time. -- Mel Gorman SUSE Labs ^ permalink raw reply [flat|nested] 40+ messages in thread
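Option 2 is the most self-contained of the four. A rough, non-compilable sketch of what such a stall inside should_reclaim_retry() might look like; the condition, the congestion_wait() timeout, and the placement are purely illustrative assumptions here, not a tested patch:

```c
	/* Hypothetical sketch only: stall __GFP_NOFAIL requests that cannot
	 * recurse into fs shrinkers themselves (no __GFP_FS set) so that
	 * kswapd gets a window to shrink slab on their behalf before the
	 * OOM killer is considered. */
	if ((gfp_mask & __GFP_NOFAIL) && !(gfp_mask & __GFP_FS)) {
		congestion_wait(BLK_RW_ASYNC, HZ / 10);
		return true;	/* retry the allocation attempt */
	}
```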
* Re: mm, vmscan: commit makes PAE kernel crash nightly (bisected) 2017-01-23 11:04 ` Mel Gorman @ 2017-01-25 9:46 ` Michal Hocko 0 siblings, 0 replies; 40+ messages in thread From: Michal Hocko @ 2017-01-25 9:46 UTC (permalink / raw) To: Mel Gorman Cc: Trevor Cordes, linux-kernel, Joonsoo Kim, Minchan Kim, Rik van Riel, Srikar Dronamraju On Mon 23-01-17 11:04:12, Mel Gorman wrote: [...] > 1. In should_reclaim_retry, account for SLAB_RECLAIMABLE as available > pages when deciding to retry reclaim I am pretty sure I have considered this but then decided not to go that way. I do not remember the details so I will think about this some more. It might have been just "let's wait for the real issue here". Anyway we can give it a try and it would be as simple as

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 94ebd30d0f09..87221491be84 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3566,7 +3566,7 @@ should_reclaim_retry(gfp_t gfp_mask, unsigned order,
 		unsigned long min_wmark = min_wmark_pages(zone);
 		bool wmark;
 
-		available = reclaimable = zone_reclaimable_pages(zone);
+		available = reclaimable = zone_reclaimable_pages(zone) + zone_page_state_snapshot(zone, NR_SLAB_RECLAIMABLE);
 		available -= DIV_ROUND_UP((*no_progress_loops) * available,
 					  MAX_RECLAIM_RETRIES);
 		available += zone_page_state_snapshot(zone, NR_FREE_PAGES);

I am not sure it would really help much on its own without further changes to how we scale LRU->slab scanning. Could you give this a try on top of the mmotm or linux-next tree? > 2. Stall in should_reclaim_retry for __GFP_NOFAIL|__GFP_NOFS with a > comment stating that the intent is to allow kswapd make progress > with the shrinker The current mmotm tree doesn't need this because we no longer trigger the oom killer for this combination of flags. > 3. Stall __GFP_NOFS in direct reclaimer on a workqueue when it's 
> failing to make progress to allow kswapd to do some work. This > may be impaired if kswapd is locked up waiting for a lock held > by the direct reclaimer > 4. Schedule the system workqueue to drain slab for > __GFP_NOFS|__GFP_NOFAIL. > > 3 and 4 are extremely heavy handed so we should try them one at a time. I am not even sure they are really necessary. -- Michal Hocko SUSE Labs ^ permalink raw reply related [flat|nested] 40+ messages in thread
* Re: mm, vmscan: commit makes PAE kernel crash nightly (bisected) 2017-01-23 10:48 ` Mel Gorman 2017-01-23 11:04 ` Mel Gorman @ 2017-01-24 12:59 ` Michal Hocko 2017-01-25 10:02 ` Trevor Cordes 2 siblings, 0 replies; 40+ messages in thread From: Michal Hocko @ 2017-01-24 12:59 UTC (permalink / raw) To: Mel Gorman Cc: Trevor Cordes, linux-kernel, Joonsoo Kim, Minchan Kim, Rik van Riel, Srikar Dronamraju On Mon 23-01-17 10:48:58, Mel Gorman wrote: [...] > Unfortunately, even that will be race prone for GFP_NOFS callers as > they'll effectively be racing to see if kswapd or another direct > reclaimer can reclaim before the OOM conditions are hit. It is by > design, but it's apparent that a __GFP_NOFAIL request can trigger OOM > relatively easily as it's not necessarily throttled or waiting on kswapd > to complete any work. I'll keep thinking about it. Yes, we shouldn't trigger the OOM for GFP_NOFS as memory reclaim is much weaker there. And that might really matter here. So the mmotm tree will behave differently in this regard, as we have [1] [1] http://lkml.kernel.org/r/20161220134904.21023-3-mhocko@kernel.org -- Michal Hocko SUSE Labs ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: mm, vmscan: commit makes PAE kernel crash nightly (bisected) 2017-01-23 10:48 ` Mel Gorman 2017-01-23 11:04 ` Mel Gorman 2017-01-24 12:59 ` Michal Hocko @ 2017-01-25 10:02 ` Trevor Cordes 2017-01-25 12:04 ` Michal Hocko 2 siblings, 1 reply; 40+ messages in thread From: Trevor Cordes @ 2017-01-25 10:02 UTC (permalink / raw) To: Mel Gorman Cc: Michal Hocko, linux-kernel, Joonsoo Kim, Minchan Kim, Rik van Riel, Srikar Dronamraju [-- Attachment #1: Type: text/plain, Size: 4154 bytes --] On 2017-01-23 Mel Gorman wrote: > On Sun, Jan 22, 2017 at 06:45:59PM -0600, Trevor Cordes wrote: > > On 2017-01-20 Mel Gorman wrote: > > > > > > > > Thanks for the OOM report. I was expecting it to be a particular > > > > shape and my expectations were not matched so it took time to > > > > consider it further. Can you try the cumulative patch below? It > > > > combines three patches that > > > > > > > > 1. Allow slab shrinking even if the LRU patches are > > > > unreclaimable in direct reclaim > > > > 2. Shrinks slab based once based on the contents of all memcgs > > > > instead of shrinking one at a time > > > > 3. Tries to shrink slabs if the lowmem usage is too high > > > > > > > > Unfortunately it's only boot tested on x86-64 as I didn't get > > > > the chance to setup an i386 test bed. > > > > > > > > > > There was one major flaw in that patch. This version fixes it and > > > addresses other minor issues. It may still be too agressive > > > shrinking slab but worth trying out. Thanks. > > > > I ran with your patch below and it oom'd on the first night. It was > > weird, it didn't hang the system, and my rebooter script started a > > reboot but the system never got more than half down before it just > > sat there in a weird state where a local console user could still > > login but not much was working. So the patches don't seem to solve > > the problem. > > > > For the above compile I applied your patches to 4.10.0-rc4+, I hope > > that's ok. 
> > > > It would be strongly preferred to run them on top of Michal's other > fixes. The main reason it's preferred is because this OOM differs from > earlier ones in that it OOM killed from GFP_NOFS|__GFP_NOFAIL context. > That meant that the slab shrinking could not happen from direct > reclaim so the balancing from my patches would not occur. As > Michal's other patches affect how kswapd behaves, it's important. OK, I patched & compiled mhocko's git tree from the other day 4.9.0+. (To confirm: oddly, the mhocko git tree I'm using from a couple of weeks ago shows that the newest commit (git log) is 69973b830859bc6529a7a0468ba0d80ee5117826 "Linux 4.9"? Let me know if I'm doing something wrong, see below.) Anyhow, it oom'd as usual at ~3am; the system froze after 20 ooms hit in 7 seconds. So no help there. Attached is the oom log from the first oom hit. On 2017-01-24 Michal Hocko wrote: > On Sun 22-01-17 18:45:59, Trevor Cordes wrote: > [...] > > Also, completely separate from your patch I ran mhocko's 4.9 tree > > with mem=2G to see if lower ram amount would help, but it didn't. > > Even with 2G the system oom and hung same as usual. So far the > > only thing that helps at all was the cgroup_disable=memory option, > > which makes the problem disappear completely for me. > > OK, can we reduce the problem space slightly more and could you boot > with kmem accounting enabled? cgroup.memory=nokmem,nosocket I will try that right now, I'll use the mhocko git tree without Mel's emailed patch, and I'll refresh the git tree from origin first (let me know if that's a bad move). As usual, I'll report back within 24-48 hours. Actually, on my tests with the mhocko git tree, I'm a bit confused and want to make sure I'm compiling the right thing. His tree doesn't seem to have recent commits? I did "git fetch origin" and "git reset --hard origin/master" to refresh the tree just now and the latest commit is still the one shown above "Linux 4.9"? Is Michal making changes but not committing? 
How do I ensure I'm compiling the version you guys want me to test? ("git log mm/vmscan.c" shows newest commit is Dec 2??) Am I supposed to be testing a specific branch? If I've been testing the wrong branch, this *only* affects my mhocko tree tests (not the vanilla or fedora-patched tests). Thankfully I think I've only done 1 or 2 mhocko tree tests, and I can easily redo them. If this turns out to be the case, I'm so sorry for the confusion, the non-vanilla git tree thing is all new to me. In any event, I'm still trying the above, and will adjust if necessary if it's confirmed I'm doing something wrong with the mhocko git tree. Thanks! [-- Attachment #2: oom5 --] [-- Type: application/octet-stream, Size: 22459 bytes --] Jan 25 03:01:39 firewallfsi kernel: [91632.892378] smbd invoked oom-killer: gfp_mask=0x2420848(GFP_NOFS|__GFP_NOFAIL|__GFP_HARDWALL|__GFP_MOVABLE), nodemask=0, order=0, oom_score_adj=0 Jan 25 03:01:39 firewallfsi kernel: [91632.894810] smbd cpuset=/ mems_allowed=0 Jan 25 03:01:39 firewallfsi kernel: [91632.896010] CPU: 0 PID: 1973 Comm: smbd Not tainted 4.9.0+ #2 Jan 25 03:01:39 firewallfsi kernel: [91632.897220] Hardware name: Intel Corporation S1200BTL/S1200BTL, BIOS S1200BT.86B.02.00.0035.030220120927 03/02/2012 Jan 25 03:01:39 firewallfsi kernel: [91632.898418] ed683b5c c9f604e7 ed683c94 f69fc800 ed683b8c c9de17a6 ed683b6c ca363bad Jan 25 03:01:39 firewallfsi kernel: [91632.899615] ed683b8c c9f6625f ed683b90 f6bcac00 ecb84000 f69fc800 ca568bce ed683c94 Jan 25 03:01:39 firewallfsi kernel: [91632.900797] ed683bd0 c9d7aff7 c9c76d8a ed683bbc c9d7ac6b 00000007 00000000 0000002d Jan 25 03:01:39 firewallfsi kernel: [91632.901968] Call Trace: Jan 25 03:01:39 firewallfsi kernel: [91632.903098] [<c9f604e7>] dump_stack+0x58/0x81 Jan 25 03:01:39 firewallfsi kernel: [91632.904214] [<c9de17a6>] dump_header+0x64/0x1a6 Jan 25 03:01:39 firewallfsi kernel: [91632.905311] [<ca363bad>] ? 
_raw_spin_unlock_irqrestore+0xd/0x10 Jan 25 03:01:39 firewallfsi kernel: [91632.906398] [<c9f6625f>] ? ___ratelimit+0x9f/0x100 Jan 25 03:01:39 firewallfsi kernel: [91632.907466] [<c9d7aff7>] oom_kill_process+0x207/0x3d0 Jan 25 03:01:39 firewallfsi kernel: [91632.908517] [<c9c76d8a>] ? has_capability_noaudit+0x1a/0x30 Jan 25 03:01:39 firewallfsi kernel: [91632.909554] [<c9d7ac6b>] ? oom_badness.part.13+0xcb/0x140 Jan 25 03:01:39 firewallfsi kernel: [91632.910574] [<c9d7b4d8>] out_of_memory+0xf8/0x2a0 Jan 25 03:01:39 firewallfsi kernel: [91632.911576] [<c9d80016>] __alloc_pages_nodemask+0xc46/0xd10 Jan 25 03:01:39 firewallfsi kernel: [91632.912564] [<c9d768c2>] ? find_get_entry+0x22/0x160 Jan 25 03:01:39 firewallfsi kernel: [91632.913534] [<c9d7727e>] pagecache_get_page+0xbe/0x2d0 Jan 25 03:01:39 firewallfsi kernel: [91632.914488] [<c9e19044>] __getblk_gfp+0x104/0x360 Jan 25 03:01:39 firewallfsi kernel: [91632.915422] [<c9e6b973>] ext4_getblk+0xa3/0x1b0 Jan 25 03:01:39 firewallfsi kernel: [91632.916337] [<c9e6baa3>] ext4_bread+0x23/0xb0 Jan 25 03:01:39 firewallfsi kernel: [91632.917240] [<c9e75617>] __ext4_read_dirblock+0x27/0x3f0 Jan 25 03:01:39 firewallfsi kernel: [91632.918130] [<c9def476>] ? lookup_fast+0x46/0x2c0 Jan 25 03:01:39 firewallfsi kernel: [91632.919003] [<c9e76005>] htree_dirblock_to_tree+0x45/0x1a0 Jan 25 03:01:39 firewallfsi kernel: [91632.919860] [<c9f6e90b>] ? lockref_get_not_dead+0xb/0x30 Jan 25 03:01:39 firewallfsi kernel: [91632.920701] [<c9def389>] ? unlazy_walk+0xf9/0x1a0 Jan 25 03:01:39 firewallfsi kernel: [91632.921523] [<c9e77000>] ext4_htree_fill_tree+0x90/0x2e0 Jan 25 03:01:39 firewallfsi kernel: [91632.922331] [<c9e63ef0>] ? ext4_readdir+0x8c0/0x960 Jan 25 03:01:39 firewallfsi kernel: [91632.923119] [<c9e63ef0>] ? ext4_readdir+0x8c0/0x960 Jan 25 03:01:39 firewallfsi kernel: [91632.923883] [<c9e631b6>] ? 
free_rb_tree_fname+0x16/0x70 Jan 25 03:01:39 firewallfsi kernel: [91632.924630] [<c9e63cff>] ext4_readdir+0x6cf/0x960 Jan 25 03:01:39 firewallfsi kernel: [91632.925358] [<c9f6e7f6>] ? _copy_to_user+0x26/0x30 Jan 25 03:01:39 firewallfsi kernel: [91632.926071] [<c9ee3b6f>] ? security_file_permission+0x9f/0xc0 Jan 25 03:01:39 firewallfsi kernel: [91632.926769] [<c9df84b9>] iterate_dir+0x179/0x1a0 Jan 25 03:01:39 firewallfsi kernel: [91632.927448] [<c9df8a6b>] SyS_getdents64+0x7b/0x110 Jan 25 03:01:39 firewallfsi kernel: [91632.928109] [<c9df84e0>] ? iterate_dir+0x1a0/0x1a0 Jan 25 03:01:39 firewallfsi kernel: [91632.928751] [<c9c0377a>] do_fast_syscall_32+0x8a/0x150 Jan 25 03:01:39 firewallfsi kernel: [91632.929378] [<ca3640ca>] sysenter_past_esp+0x47/0x75 Jan 25 03:01:39 firewallfsi kernel: [91632.929991] Mem-Info: Jan 25 03:01:39 firewallfsi kernel: [91632.930764] active_anon:185890 inactive_anon:1821 isolated_anon:0 Jan 25 03:01:39 firewallfsi kernel: [91632.930764] active_file:123389 inactive_file:655369 isolated_file:0 Jan 25 03:01:39 firewallfsi kernel: [91632.930764] unevictable:0 dirty:1332 writeback:0 unstable:0 Jan 25 03:01:39 firewallfsi kernel: [91632.930764] slab_reclaimable:166383 slab_unreclaimable:11556 Jan 25 03:01:39 firewallfsi kernel: [91632.930764] mapped:24783 shmem:1249 pagetables:1579 bounce:0 Jan 25 03:01:39 firewallfsi kernel: [91632.930764] free:61842 free_pcp:286 free_cma:0 Jan 25 03:01:39 firewallfsi kernel: [91632.934074] Node 0 active_anon:743560kB inactive_anon:7284kB active_file:493556kB inactive_file:2621476kB unevictable:0kB isolated(anon):0kB isolated(file):0kB mapped:99132kB dirty:5328kB writeback:0kB shmem:4996kB writeback_tmp:0kB unstable:0kB pages_scanned:0 all_unreclaimable? 
no Jan 25 03:01:39 firewallfsi kernel: [91632.935671] DMA free:3164kB min:68kB low:84kB high:100kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15976kB managed:15900kB mlocked:0kB slab_reclaimable:12684kB slab_unreclaimable:52kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB Jan 25 03:01:39 firewallfsi kernel: lowmem_reserve[]: 0 778 4734 4734 Jan 25 03:01:39 firewallfsi kernel: [91632.937897] Normal free:3452kB min:3532kB low:4412kB high:5292kB active_anon:0kB inactive_anon:0kB active_file:78388kB inactive_file:120kB unevictable:0kB writepending:2164kB present:892920kB managed:817332kB mlocked:0kB slab_reclaimable:652848kB slab_unreclaimable:46172kB kernel_stack:2592kB pagetables:0kB bounce:0kB free_pcp:1020kB local_pcp:228kB free_cma:0kB Jan 25 03:01:39 firewallfsi kernel: lowmem_reserve[]: 0 0 31652 31652 Jan 25 03:01:39 firewallfsi kernel: [91632.940306] HighMem free:240628kB min:512kB low:5000kB high:9488kB active_anon:743560kB inactive_anon:7284kB active_file:415168kB inactive_file:2621356kB unevictable:0kB writepending:3168kB present:4051548kB managed:4051548kB mlocked:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:6316kB bounce:0kB free_pcp:244kB local_pcp:0kB free_cma:0kB Jan 25 03:01:39 firewallfsi kernel: lowmem_reserve[]: 0 0 0 0 Jan 25 03:01:39 firewallfsi kernel: [91632.942957] DMA: 41*4kB (E) 23*8kB (UE) 2*16kB (UE) 3*32kB (UE) 6*64kB (UE) 6*128kB (UME) 2*256kB (U) 2*512kB (U) 0*1024kB 0*2048kB 0*4096kB = 3164kB Jan 25 03:01:39 firewallfsi kernel: Normal: 27*4kB (MH) 152*8kB (UMH) 81*16kB (MH) 14*32kB (H) 4*64kB (H) 1*128kB (H) 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 3452kB Jan 25 03:01:39 firewallfsi kernel: HighMem: 621*4kB (UM) 168*8kB (UM) 36*16kB (UM) 12*32kB (UM) 11*64kB (UM) 25*128kB (UM) 6*256kB (UM) 60*512kB (M) 13*1024kB (M) 15*2048kB (M) 38*4096kB (M) = 240628kB Jan 25 03:01:39 firewallfsi kernel: 
[91632.947548] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB Jan 25 03:01:39 firewallfsi kernel: [91632.948339] 780026 total pagecache pages Jan 25 03:01:39 firewallfsi kernel: [91632.949153] 0 pages in swap cache Jan 25 03:01:39 firewallfsi kernel: [91632.949929] Swap cache stats: add 0, delete 0, find 0/0 Jan 25 03:01:39 firewallfsi kernel: [91632.950772] Free swap = 33784572kB Jan 25 03:01:39 firewallfsi kernel: [91632.951590] Total swap = 33784572kB Jan 25 03:01:39 firewallfsi kernel: [91632.952402] 1240111 pages RAM Jan 25 03:01:39 firewallfsi kernel: [91632.953205] 1012887 pages HighMem/MovableOnly Jan 25 03:01:39 firewallfsi kernel: [91632.953994] 18916 pages reserved Jan 25 03:01:39 firewallfsi kernel: [91632.954816] 0 pages hwpoisoned Jan 25 03:01:39 firewallfsi kernel: [91632.955624] [ pid ] uid tgid total_vm rss nr_ptes nr_pmds swapents oom_score_adj name Jan 25 03:01:39 firewallfsi kernel: [91632.956471] [ 596] 0 596 5325 3531 14 3 0 0 systemd-journal Jan 25 03:01:39 firewallfsi kernel: [91632.957312] [ 631] 0 631 3592 1206 9 3 0 -1000 systemd-udevd Jan 25 03:01:39 firewallfsi kernel: [91632.958137] [ 735] 0 735 8633 1077 11 3 0 0 rsyslogd Jan 25 03:01:39 firewallfsi kernel: [91632.958979] [ 736] 0 736 1005 740 5 3 0 0 systemd-logind Jan 25 03:01:39 firewallfsi kernel: [91632.959824] [ 737] 0 737 1704 653 7 3 0 0 sh Jan 25 03:01:39 firewallfsi kernel: [91632.960680] [ 738] 0 738 1704 652 8 3 0 0 sh Jan 25 03:01:39 firewallfsi kernel: [91632.961513] [ 742] 0 742 1472 971 7 3 0 0 smartd Jan 25 03:01:39 firewallfsi kernel: [91632.962346] [ 743] 81 743 1700 1041 7 3 0 -900 dbus-daemon Jan 25 03:01:39 firewallfsi kernel: [91632.963173] [ 752] 0 752 2708 1666 9 3 0 0 udp-sgr Jan 25 03:01:39 firewallfsi kernel: [91632.963984] [ 753] 0 753 3695 1992 11 3 0 0 fetchmail Jan 25 03:01:39 firewallfsi kernel: [91632.964807] [ 757] 0 757 1704 625 6 3 0 0 sh Jan 25 03:01:39 firewallfsi kernel: [91632.965622] [ 760] 0 760 1704 704 8 3 
0 0 sh Jan 25 03:01:39 firewallfsi kernel: [91632.966434] [ 761] 0 761 800 496 5 3 0 0 mdadm Jan 25 03:01:39 firewallfsi kernel: [91632.967234] [ 773] 0 773 1033 646 5 3 0 0 irqbalance Jan 25 03:01:39 firewallfsi kernel: [91632.968016] [ 775] 0 775 1736 722 6 3 0 0 tickle-pog Jan 25 03:01:39 firewallfsi kernel: [91632.968813] [ 777] 0 777 3309 2235 10 3 0 0 watch-ip Jan 25 03:01:39 firewallfsi kernel: [91632.969591] [ 788] 0 788 1704 692 6 3 0 0 sh Jan 25 03:01:39 firewallfsi kernel: [91632.970349] [ 789] 0 789 3239 2159 10 3 0 0 dynamic-ip-upda Jan 25 03:01:39 firewallfsi kernel: [91632.971086] [ 790] 0 790 1704 624 7 3 0 0 sh Jan 25 03:01:39 firewallfsi kernel: [91632.971788] [ 791] 0 791 3165 2109 10 3 0 0 mailwarnings Jan 25 03:01:39 firewallfsi kernel: [91632.972492] [ 792] 0 792 1704 692 6 3 0 0 sh Jan 25 03:01:39 firewallfsi kernel: [91632.973171] [ 793] 0 793 3264 2170 10 3 0 0 restarter Jan 25 03:01:39 firewallfsi kernel: [91632.973833] [ 796] 0 796 1704 649 7 3 0 0 sh Jan 25 03:01:39 firewallfsi kernel: [91632.974526] [ 797] 0 797 1819 793 7 3 0 0 watch-services Jan 25 03:01:39 firewallfsi kernel: [91632.975162] [ 799] 288 799 14423 1672 18 3 0 0 milter-greylist Jan 25 03:01:39 firewallfsi kernel: [91632.975765] [ 802] 0 802 583 403 5 3 0 0 acpid Jan 25 03:01:39 firewallfsi kernel: [91632.976372] [ 805] 0 805 7287 1286 11 3 0 0 apcupsd Jan 25 03:01:39 firewallfsi kernel: [91632.976957] [ 822] 0 822 860 539 5 3 0 0 atd Jan 25 03:01:39 firewallfsi kernel: [91632.977505] [ 928] 0 928 2499 430 8 3 0 0 saslauthd Jan 25 03:01:39 firewallfsi kernel: [91632.978051] [ 929] 0 929 2499 126 8 3 0 0 saslauthd Jan 25 03:01:39 firewallfsi kernel: [91632.978562] [ 930] 0 930 2499 126 8 3 0 0 saslauthd Jan 25 03:01:39 firewallfsi kernel: [91632.979065] [ 931] 0 931 2499 126 8 3 0 0 saslauthd Jan 25 03:01:39 firewallfsi kernel: [91632.979529] [ 932] 0 932 2499 126 8 3 0 0 saslauthd Jan 25 03:01:39 firewallfsi kernel: [91632.979985] [ 1034] 0 1034 2769 605 9 3 0 -1000 sshd 
Jan 25 03:01:39 firewallfsi kernel: [91632.980408] [ 1063] 27 1063 1707 736 6 3 0 0 mysqld_safe Jan 25 03:01:39 firewallfsi kernel: [91632.980829] [ 1101] 25 1101 81656 47557 113 3 0 0 named Jan 25 03:01:39 firewallfsi kernel: [91632.981212] [ 1241] 27 1241 126039 14616 64 3 0 0 mysqld Jan 25 03:01:39 firewallfsi kernel: [91632.981586] [ 1270] 0 1270 12052 6520 28 3 0 0 /usr/sbin/httpd Jan 25 03:01:39 firewallfsi kernel: [91632.981968] [ 1362] 0 1362 7049 2424 16 3 0 0 nmbd Jan 25 03:01:39 firewallfsi kernel: [91632.982334] [ 1363] 0 1363 6711 2157 16 3 0 0 nmbd Jan 25 03:01:39 firewallfsi kernel: [91632.982710] [ 1386] 0 1386 1116 517 6 3 0 0 agetty Jan 25 03:01:39 firewallfsi kernel: [91632.983062] [ 1387] 0 1387 1116 546 6 3 0 0 agetty Jan 25 03:01:39 firewallfsi kernel: [91632.983417] [ 1388] 0 1388 1116 531 5 3 0 0 agetty Jan 25 03:01:39 firewallfsi kernel: [91632.983784] [ 1389] 0 1389 1116 536 6 3 0 0 agetty Jan 25 03:01:39 firewallfsi kernel: [91632.984128] [ 1390] 0 1390 1116 512 6 3 0 0 agetty Jan 25 03:01:39 firewallfsi kernel: [91632.984469] [ 1391] 0 1391 1116 530 6 3 0 0 agetty Jan 25 03:01:39 firewallfsi kernel: [91632.984824] [ 1599] 48 1599 49254 4236 46 3 0 0 /usr/sbin/httpd Jan 25 03:01:39 firewallfsi kernel: [91632.985154] [ 1601] 48 1601 16220 4548 29 3 0 0 /usr/sbin/httpd Jan 25 03:01:39 firewallfsi kernel: [91632.985474] [ 1613] 48 1613 16220 4234 29 3 0 0 /usr/sbin/httpd Jan 25 03:01:39 firewallfsi kernel: [91632.985805] [ 1622] 48 1622 16220 4381 29 3 0 0 /usr/sbin/httpd Jan 25 03:01:39 firewallfsi kernel: [91632.986110] [ 1625] 48 1625 16216 4375 29 3 0 0 /usr/sbin/httpd Jan 25 03:01:39 firewallfsi kernel: [91632.986416] [ 1773] 0 1773 1704 645 7 3 0 0 sh Jan 25 03:01:39 firewallfsi kernel: [91632.986736] [ 1774] 0 1774 2744 1706 9 3 0 0 udp-sgs Jan 25 03:01:39 firewallfsi kernel: [91632.987041] [ 1779] 0 1779 9561 3569 22 3 0 0 smbd Jan 25 03:01:39 firewallfsi kernel: [91632.987349] [ 1780] 0 1780 9197 1105 22 3 0 0 smbd Jan 25 03:01:39 
firewallfsi kernel: [91632.987669] [ 1781] 0 1781 9446 1215 22 3 0 0 smbd Jan 25 03:01:39 firewallfsi kernel: [91632.987965] [ 1854] 0 1854 5116 2463 13 3 0 0 dhclient Jan 25 03:01:39 firewallfsi kernel: [91632.988262] [ 1947] 0 1947 594 434 5 3 0 0 pptpd Jan 25 03:01:39 firewallfsi kernel: [91632.988563] [ 1949] 0 1949 954 608 5 3 0 0 dovecot Jan 25 03:01:39 firewallfsi kernel: [91632.988881] [ 1950] 97 1950 904 574 6 3 0 0 anvil Jan 25 03:01:39 firewallfsi kernel: [91632.989180] [ 1951] 0 1951 937 597 5 3 0 0 log Jan 25 03:01:39 firewallfsi kernel: [91632.989481] [ 1953] 0 1953 1136 787 5 3 0 0 config Jan 25 03:01:39 firewallfsi kernel: [91632.989797] [ 1960] 0 1960 1899 726 8 3 0 0 crond Jan 25 03:01:39 firewallfsi kernel: [91632.990092] [ 1961] 177 1961 6832 4736 17 3 0 0 dhcpd Jan 25 03:01:39 firewallfsi kernel: [91632.990390] [ 1962] 38 1962 1536 1092 6 3 0 0 ntpd Jan 25 03:01:39 firewallfsi kernel: [91632.990702] [ 1965] 48 1965 16217 4397 29 3 0 0 /usr/sbin/httpd Jan 25 03:01:39 firewallfsi kernel: [91632.990997] [ 1973] 300 1973 11618 4438 27 3 0 0 smbd Jan 25 03:01:39 firewallfsi kernel: [91632.991291] [ 2562] 273 2562 2208 1393 8 3 0 0 imap-login Jan 25 03:01:39 firewallfsi kernel: [91632.991591] [ 2564] 301 2564 1676 1089 7 3 0 0 imap Jan 25 03:01:39 firewallfsi kernel: [91632.991905] [ 2764] 302 2764 1437 1077 7 3 0 0 imap Jan 25 03:01:39 firewallfsi kernel: [91632.992197] [ 2962] 304 2962 1399 925 6 3 0 0 imap Jan 25 03:01:39 firewallfsi kernel: [91632.992491] [ 2965] 303 2965 1444 1080 6 3 0 0 imap Jan 25 03:01:39 firewallfsi kernel: [91632.992796] [ 2967] 301 2967 1647 1141 7 3 0 0 imap Jan 25 03:01:39 firewallfsi kernel: [91632.993078] [10890] 48 10890 16386 4665 30 3 0 0 /usr/sbin/httpd Jan 25 03:01:39 firewallfsi kernel: [91632.993369] [14351] 0 14351 1584 1132 6 3 0 0 systemd Jan 25 03:01:39 firewallfsi kernel: [91632.993678] [14352] 0 14352 2455 384 8 3 0 0 (sd-pam) Jan 25 03:01:39 firewallfsi kernel: [91632.993967] [14403] 0 14403 1307 906 6 3 
0 0 reboot-when-oom Jan 25 03:01:39 firewallfsi kernel: [91632.994263] [18123] 276 18123 118311 105011 225 3 0 0 clamd Jan 25 03:01:39 firewallfsi kernel: [91632.994569] [18137] 290 18137 10838 1193 11 3 0 0 clamav-milter Jan 25 03:01:39 firewallfsi kernel: [91632.994898] [18160] 0 18160 3829 1728 12 3 0 0 sendmail Jan 25 03:01:39 firewallfsi kernel: [91632.995213] [18175] 51 18175 3501 767 11 3 0 0 sendmail Jan 25 03:01:39 firewallfsi kernel: [91632.995530] [18306] 23 18306 5440 725 15 3 0 0 squid Jan 25 03:01:39 firewallfsi kernel: [91632.995859] [18308] 23 18308 9593 6545 21 3 0 0 squid Jan 25 03:01:39 firewallfsi kernel: [91632.996164] [18309] 23 18309 1179 404 6 3 0 0 unlinkd Jan 25 03:01:39 firewallfsi kernel: [91632.996472] [22303] 0 22303 11505 5578 27 3 0 0 smbd Jan 25 03:01:39 firewallfsi kernel: [91632.996794] [25074] 273 25074 2209 1416 8 3 0 0 imap-login Jan 25 03:01:39 firewallfsi kernel: [91632.997098] [25075] 301 25075 1401 905 6 3 0 0 imap Jan 25 03:01:39 firewallfsi kernel: [91632.997406] [15747] 48 15747 16220 4394 29 3 0 0 /usr/sbin/httpd Jan 25 03:01:39 firewallfsi kernel: [91632.997737] [15750] 48 15750 16216 4253 29 3 0 0 /usr/sbin/httpd Jan 25 03:01:39 firewallfsi kernel: [91632.998049] [15752] 48 15752 16220 4238 29 3 0 0 /usr/sbin/httpd Jan 25 03:01:39 firewallfsi kernel: [91632.998366] [27260] 273 27260 2209 1372 9 3 0 0 imap-login Jan 25 03:01:39 firewallfsi kernel: [91632.998704] [27264] 302 27264 1421 1109 6 3 0 0 imap Jan 25 03:01:39 firewallfsi kernel: [91632.999026] [ 4826] 273 4826 2208 1449 8 3 0 0 imap-login Jan 25 03:01:39 firewallfsi kernel: [91632.999357] [ 4830] 301 4830 1396 980 7 3 0 0 imap Jan 25 03:01:39 firewallfsi kernel: [91632.999702] [25329] 0 25329 4181 1979 11 3 0 0 sshd Jan 25 03:01:39 firewallfsi kernel: [91633.000029] [25331] 0 25331 4214 1018 11 3 0 0 sshd Jan 25 03:01:39 firewallfsi kernel: [91633.000362] [25343] 0 25343 1236 1008 6 3 0 0 tcsh Jan 25 03:01:39 firewallfsi kernel: [91633.000713] [ 3159] 0 3159 
1396 169 7 3 0 0 sleep Jan 25 03:01:39 firewallfsi kernel: [91633.001049] [ 4685] 0 4685 589 165 5 3 0 0 tail Jan 25 03:01:39 firewallfsi kernel: [91633.001384] [ 5660] 0 5660 1900 510 8 3 0 0 crond Jan 25 03:01:39 firewallfsi kernel: [91633.001736] [ 5661] 0 5661 3185 2084 10 3 0 0 freshclam-tec-w Jan 25 03:01:39 firewallfsi kernel: [91633.002074] [ 5770] 0 5770 1396 141 6 3 0 0 sleep Jan 25 03:01:39 firewallfsi kernel: [91633.002416] [ 5771] 0 5771 9575 2310 22 3 0 0 smbd Jan 25 03:01:39 firewallfsi kernel: [91633.002776] [ 5782] 0 5782 1654 402 7 3 0 0 anacron Jan 25 03:01:39 firewallfsi kernel: [91633.003117] Out of memory: Kill process 18123 (clamd) score 10 or sacrifice child Jan 25 03:01:39 firewallfsi kernel: [91633.003473] Killed process 18123 (clamd) total-vm:473244kB, anon-rss:403624kB, file-rss:16420kB, shmem-rss:0kB Jan 25 03:01:39 firewallfsi kernel: [91633.100395] audit: type=1131 audit(1485334899.957:2534): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=clamd@scan comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=failed' ^ permalink raw reply [flat|nested] 40+ messages in thread
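The process table above includes Trevor's "reboot-when-oom" watcher mentioned in the first message. A minimal sketch of such a watchdog might look like the following; the log path and reboot action are assumptions (not Trevor's actual script), and the reboot is left commented out so the sketch is safe to run:

```shell
# Matcher for the "Out of memory: Kill process" line seen in the report
# above, plus the loop that would drive it as a daemon.
is_oom_kill() {
    case $1 in
        *"Out of memory: Kill process"*) return 0 ;;
        *) return 1 ;;
    esac
}

watch_oom() {   # not invoked here; would run until an oom-kill appears
    tail -Fn0 /var/log/messages | while IFS= read -r line; do
        if is_oom_kill "$line"; then
            echo "oom-killer fired: $line"
            # /sbin/reboot   # uncomment to actually reboot
            break
        fi
    done
}
```

As Trevor notes, by the time the watcher fires the box may already be too wedged for an orderly shutdown, so this is a best-effort mitigation, not a fix.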
* Re: mm, vmscan: commit makes PAE kernel crash nightly (bisected) 2017-01-25 10:02 ` Trevor Cordes @ 2017-01-25 12:04 ` Michal Hocko 2017-01-29 22:50 ` Trevor Cordes 0 siblings, 1 reply; 40+ messages in thread From: Michal Hocko @ 2017-01-25 12:04 UTC (permalink / raw) To: Trevor Cordes Cc: Mel Gorman, linux-kernel, Joonsoo Kim, Minchan Kim, Rik van Riel, Srikar Dronamraju On Wed 25-01-17 04:02:46, Trevor Cordes wrote: > On 2017-01-23 Mel Gorman wrote: > > On Sun, Jan 22, 2017 at 06:45:59PM -0600, Trevor Cordes wrote: > > > On 2017-01-20 Mel Gorman wrote: > > > > > > > > > > Thanks for the OOM report. I was expecting it to be a particular > > > > > shape and my expectations were not matched so it took time to > > > > > consider it further. Can you try the cumulative patch below? It > > > > > combines three patches that > > > > > > > > > > 1. Allow slab shrinking even if the LRU patches are > > > > > unreclaimable in direct reclaim > > > > > 2. Shrinks slab based once based on the contents of all memcgs > > > > > instead of shrinking one at a time > > > > > 3. Tries to shrink slabs if the lowmem usage is too high > > > > > > > > > > Unfortunately it's only boot tested on x86-64 as I didn't get > > > > > the chance to setup an i386 test bed. > > > > > > > > > > > > > There was one major flaw in that patch. This version fixes it and > > > > addresses other minor issues. It may still be too agressive > > > > shrinking slab but worth trying out. Thanks. > > > > > > I ran with your patch below and it oom'd on the first night. It was > > > weird, it didn't hang the system, and my rebooter script started a > > > reboot but the system never got more than half down before it just > > > sat there in a weird state where a local console user could still > > > login but not much was working. So the patches don't seem to solve > > > the problem. > > > > > > For the above compile I applied your patches to 4.10.0-rc4+, I hope > > > that's ok. 
> > > > > > > It would be strongly preferred to run them on top of Michal's other > > fixes. The main reason it's preferred is because this OOM differs from > > earlier ones in that it OOM killed from GFP_NOFS|__GFP_NOFAIL context. > > That meant that the slab shrinking could not happen from direct > > reclaim so the balancing from my patches would not occur. As > > Michal's other patches affect how kswapd behaves, it's important. > > OK, I patched & compiled mhocko's git tree from the other day 4.9.0+. > (To confirm, weird, but mhocko's git tree I'm using from a couple of > weeks ago shows the newest commit (git log) is > 69973b830859bc6529a7a0468ba0d80ee5117826 "Linux 4.9"? Let me know if > I'm doing something wrong, see below.) My fault. I should have noted that you should use since-4.9 branch. -- Michal Hocko SUSE Labs ^ permalink raw reply [flat|nested] 40+ messages in thread
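The branch mix-up above (building the "Linux 4.9" head instead of the since-4.9 branch) is easy to hit when juggling several trees. A small helper that prints what is actually checked out before kicking off an overnight test can catch it; this is a sketch, and the example path is illustrative:

```shell
# Print the branch and head commit of a clone, so a multi-day soak test
# isn't wasted on the wrong tree.
describe_tree() {
    git -C "$1" rev-parse --abbrev-ref HEAD
    git -C "$1" log -1 --format='%h %s'
}
# e.g.: describe_tree ~/src/mhocko-linux   # first line should read "since-4.9"
```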
* Re: mm, vmscan: commit makes PAE kernel crash nightly (bisected) 2017-01-25 12:04 ` Michal Hocko @ 2017-01-29 22:50 ` Trevor Cordes 2017-01-30 7:51 ` Michal Hocko 2017-01-30 9:10 ` Mel Gorman 0 siblings, 2 replies; 40+ messages in thread From: Trevor Cordes @ 2017-01-29 22:50 UTC (permalink / raw) To: Michal Hocko Cc: Mel Gorman, linux-kernel, Joonsoo Kim, Minchan Kim, Rik van Riel, Srikar Dronamraju On 2017-01-25 Michal Hocko wrote: > On Wed 25-01-17 04:02:46, Trevor Cordes wrote: > > OK, I patched & compiled mhocko's git tree from the other day > > 4.9.0+. (To confirm, weird, but mhocko's git tree I'm using from a > > couple of weeks ago shows the newest commit (git log) is > > 69973b830859bc6529a7a0468ba0d80ee5117826 "Linux 4.9"? Let me know > > if I'm doing something wrong, see below.) > > My fault. I should have noted that you should use since-4.9 branch. OK, I have good news. I compiled your mhocko git tree (properly this time!) using since-4.9 branch (last commit ca63ff9b11f958efafd8c8fa60fda14baec6149c Jan 25) and the box survived 3 3am's, over 60 hours, and I made sure all the usual oom culprits ran, and I ran extras (finds on the whole tree, extra rdiff-backups) to try to tax it. Based on my previous criteria I would say your since-4.9 as of the above commit solves my bug, at least over a 3 day test span (which it never survives when the bug is present)! I tested WITHOUT any cgroup/mem boot options. I do still have my mem=6G limiter on, though (I've never tested with it off, until I solve the bug with it on, since I've had it on for many months for other reasons). On 2017-01-27 Michal Hocko wrote: > OK, that matches the theory that these OOMs are caused by the > incorrect active list aging fixed by b4536f0c829c ("mm, memcg: fix > the active list aging for lowmem requests when memcg is enabled") b4536f0c829c isn't in the since-4.9 I tested above though? So something else you did must have fixed it (also)? 
I don't think I've run any tests yet with b4536f0c829c in them? I think the vanillas I was doing a couple of weeks ago were before b4536f0c829c, but I can't be sure. What do I test next? Does the since-4.9 stuff get pushed into vanilla (4.9 hopefully?) so it can find its way into Fedora's stuck F24 kernel? I want to also note that the RHBZ https://bugzilla.redhat.com/show_bug.cgi?id=1401012 is garnering more interest as more people start me-too'ing. The situation is almost always the same: large rsync's or similar tree-scan accesses cause oom on PAE boxes. However, I wanted to note that many people there reported that cgroup_disable=memory doesn't fix anything for them, whereas that always makes the problem go away on my boxes. Strange. Thanks Michal and Mel, I really appreciate it! ^ permalink raw reply [flat|nested] 40+ messages in thread
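Since cgroup_disable=memory reportedly helps on some boxes and not others, it may be worth the me-too reporters confirming whether the memcg controller was actually disabled on the boot in question; a quick check (a sketch, relying only on standard /proc files):

```shell
# /proc/cgroups lists one controller per line; the fourth column is the
# "enabled" flag, which cgroup_disable=memory should flip to 0.
awk '$1 == "memory" { print "memory controller enabled:", $4 }' /proc/cgroups
# Confirm the boot option actually made it onto the command line:
cat /proc/cmdline
```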
* Re: mm, vmscan: commit makes PAE kernel crash nightly (bisected) 2017-01-29 22:50 ` Trevor Cordes @ 2017-01-30 7:51 ` Michal Hocko 2017-02-01 9:29 ` Trevor Cordes 2017-01-30 9:10 ` Mel Gorman 1 sibling, 1 reply; 40+ messages in thread From: Michal Hocko @ 2017-01-30 7:51 UTC (permalink / raw) To: Trevor Cordes Cc: Mel Gorman, linux-kernel, Joonsoo Kim, Minchan Kim, Rik van Riel, Srikar Dronamraju On Sun 29-01-17 16:50:03, Trevor Cordes wrote: > On 2017-01-25 Michal Hocko wrote: > > On Wed 25-01-17 04:02:46, Trevor Cordes wrote: > > > OK, I patched & compiled mhocko's git tree from the other day > > > 4.9.0+. (To confirm, weird, but mhocko's git tree I'm using from a > > > couple of weeks ago shows the newest commit (git log) is > > > 69973b830859bc6529a7a0468ba0d80ee5117826 "Linux 4.9"? Let me know > > > if I'm doing something wrong, see below.) > > > > My fault. I should have noted that you should use since-4.9 branch. > > OK, I have good news. I compiled your mhocko git tree (properly this > tim!) using since-4.9 branch (last commit > ca63ff9b11f958efafd8c8fa60fda14baec6149c Jan 25) and the box survived 3 > 3am's, over 60 hours, and I made sure all the usual oom culprits ran, > and I ran extras (finds on the whole tree, extra rdiff-backups) to try > to tax it. Based on my previous criteria I would say your since-4.9 as > of the above commit solves my bug, at least over a 3 day test span > (which it never survives when the bug is present)! > > I tested WITHOUT any cgroup/mem boot options. I do still have my > mem=6G limiter on, though (I've never tested with it off, until I solve > the bug with it on, since I've had it on for many months for other > reasons). Good news indeed. 
> > On 2017-01-27 Michal Hocko wrote: > > OK, that matches the theory that these OOMs are caused by the > > incorrect active list aging fixed by b4536f0c829c ("mm, memcg: fix > > the active list aging for lowmem requests when memcg is enabled") > > b4536f0c829c isn't in the since-4.9 I tested above though? Yes this is a sha1 from Linus tree. The same commit is in the since-4.9 branch under 0759e73ee689f2066a4d64dd90ec5cc3fed28f86. There are some more fixes on top of course. > So > something else you did must have fixed it (also)? I don't think I've > run any tests yet with b4536f0c829c in them? I think the vanillas I > was doing a couple of weeks ago were before b4536f0c829c, but I can't > be sure. > > What do I test next? Does the since-4.9 stuff get pushed into vanilla > (4.9 hopefully?) so it can find its way into Fedora's stuck F24 > kernel? Testing with vanilla rc6 released just yesterday would be a good fit. There are some more fixes sitting on mmotm on top and maybe we want some of them in final 4.10. Anyway all those pending changes should be merged in the next merge window - aka 4.11 > I want to also note that the RHBZ > https://bugzilla.redhat.com/show_bug.cgi?id=1401012 is garnering more > interest as more people start me-too'ing. The situation is almost > always the same: large rsync's or similar tree-scan accesses cause oom > on PAE boxes. I believe your instructions in comment 20 covers it nicely. If the problem still persists with the current mmotm tree I would suggest writing to the mailing list (feel free to CC me) and we will have a look. Thanks! > However, I wanted to note that many people there reported > that cgroup_disable=memory doesn't fix anything for them, whereas that > always makes the problem go away on my boxes. Strange. > > Thanks Michal and Mel, I really appreciate it! -- Michal Hocko SUSE Labs ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: mm, vmscan: commit makes PAE kernel crash nightly (bisected) 2017-01-30 7:51 ` Michal Hocko @ 2017-02-01 9:29 ` Trevor Cordes 2017-02-01 10:14 ` Michal Hocko 0 siblings, 1 reply; 40+ messages in thread From: Trevor Cordes @ 2017-02-01 9:29 UTC (permalink / raw) To: Michal Hocko Cc: Mel Gorman, linux-kernel, Joonsoo Kim, Minchan Kim, Rik van Riel, Srikar Dronamraju [-- Attachment #1: Type: text/plain, Size: 3033 bytes --] On 2017-01-30 Michal Hocko wrote: > On Sun 29-01-17 16:50:03, Trevor Cordes wrote: > > On 2017-01-25 Michal Hocko wrote: > > > On Wed 25-01-17 04:02:46, Trevor Cordes wrote: > > > > OK, I patched & compiled mhocko's git tree from the other day > > > > 4.9.0+. (To confirm, weird, but mhocko's git tree I'm using > > > > from a couple of weeks ago shows the newest commit (git log) is > > > > 69973b830859bc6529a7a0468ba0d80ee5117826 "Linux 4.9"? Let me > > > > know if I'm doing something wrong, see below.) > > > > > > My fault. I should have noted that you should use since-4.9 > > > branch. > > > > OK, I have good news. I compiled your mhocko git tree (properly > > this tim!) using since-4.9 branch (last commit > > ca63ff9b11f958efafd8c8fa60fda14baec6149c Jan 25) and the box > > survived 3 3am's, over 60 hours, and I made sure all the usual oom > > culprits ran, and I ran extras (finds on the whole tree, extra > > rdiff-backups) to try to tax it. Based on my previous criteria I > > would say your since-4.9 as of the above commit solves my bug, at > > least over a 3 day test span (which it never survives when the bug > > is present)! > > > > I tested WITHOUT any cgroup/mem boot options. I do still have my > > mem=6G limiter on, though (I've never tested with it off, until I > > solve the bug with it on, since I've had it on for many months for > > other reasons). > > Good news indeed. Even better, another guy on the rhbz reported the mhocko git tree since-4.9 solves the bug for him too! 
And it ran another night (4+ total) without problems on my box. Whatever is in since-4.9 fixes it, as I reported before. But... > Testing with vanilla rc6 released just yesterday would be a good fit. > There are some more fixes sitting on mmotm on top and maybe we want > some of them in final 4.10. Anyway all those pending changes should > be merged in the next merge window - aka 4.11 After 30 hours of running vanilla 4.10.0-rc6, the box started to go bonkers at 3am, so vanilla does not fix the bug :-( But, the bug hit differently this time, the box just bogged down like crazy and gave really weird top output. Starting nano would take 10s, then would run full speed, then when saving a file would take 5s. Starting any prog not in cache took equally as long. However, no oom hit. I waited about 15 minutes and things seemed to bog more, so I rebooted into since-4.9. Maybe if I had kept waiting the box would have oom'd, but I didn't want to take the chance (it's remote, and I can't reset it). I did capture a lot of the weird top, meminfo and slabinfo data before rebooting. I'll attach the output to this email. Messages show a lot of "page allocation stalls" during the bogged-down time. So my hunch at this moment is 4.10.0-rc6 might help alleviate the problem somewhat, but it's other things you have in since-4.9 that solve it completely. Let me know if you need any more testing or some bisecting or something. I'll keep on running since-4.9 in the meantime. Thanks! 
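The top/meminfo/slabinfo capture described above can be scripted, so a snapshot is grabbed the moment a stall is noticed rather than typed command by command on a bogged-down box. A sketch (the output file name is arbitrary, and reading /proc/slabinfo typically requires root):

```shell
# Snapshot the memory state in one pass; each section is labelled so the
# file can be read back later.
snap=/tmp/mem-snapshot.$(date +%s)
{
    echo "== date ==";     date
    echo "== meminfo ==";  cat /proc/meminfo
    echo "== slabinfo =="; cat /proc/slabinfo 2>/dev/null || echo "(need root)"
    echo "== top ==";      top -bn1 | head -n 20
} > "$snap"
echo "wrote $snap"
```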
[-- Attachment #2: 4.10.rc6-bogged --] [-- Type: application/octet-stream, Size: 24771 bytes --] Feb 1 03:02:53 firewallfsi kernel: [170853.129545] nmbd: page allocation stalls for 10404ms, order:1, mode:0x17000c0(GFP_KERNEL_ACCOUNT|__GFP_NOTRACK) Feb 1 03:02:53 firewallfsi kernel: [170853.133448] CPU: 2 PID: 1545 Comm: nmbd Not tainted 4.10.0-rc6 #19 Feb 1 03:02:53 firewallfsi kernel: [170853.136949] Hardware name: Intel Corporation S1200BTL/S1200BTL, BIOS S1200BT.86B.02.00.0035.030220120927 03/02/2012 Feb 1 03:02:53 firewallfsi kernel: [170853.140563] Call Trace: Feb 1 03:02:53 firewallfsi kernel: [170853.144083] dump_stack+0x58/0x81 Feb 1 03:02:53 firewallfsi kernel: [170853.147591] warn_alloc+0xf6/0x110 Feb 1 03:02:53 firewallfsi kernel: [170853.150810] __alloc_pages_nodemask+0x97c/0xc70 Feb 1 03:02:53 firewallfsi kernel: [170853.153463] ? kmem_cache_alloc+0xf7/0x1c0 Feb 1 03:02:53 firewallfsi kernel: [170853.156094] ? copy_process.part.44+0x531/0x1590 Feb 1 03:02:53 firewallfsi kernel: [170853.158670] copy_process.part.44+0x108/0x1590 Feb 1 03:02:53 firewallfsi kernel: [170853.161214] _do_fork+0xd4/0x370 Feb 1 03:02:53 firewallfsi kernel: [170853.163707] ? 
__audit_syscall_exit+0x1e6/0x270 Feb 1 03:02:53 firewallfsi kernel: [170853.166189] SyS_clone+0x2c/0x30 Feb 1 03:02:53 firewallfsi kernel: [170853.168626] do_fast_syscall_32+0x8a/0x150 Feb 1 03:02:53 firewallfsi kernel: [170853.171328] entry_SYSENTER_32+0x4e/0x7c Feb 1 03:02:53 firewallfsi kernel: [170853.173988] EIP: 0xb77cdd25 Feb 1 03:02:53 firewallfsi kernel: [170853.176635] EFLAGS: 00200246 CPU: 2 Feb 1 03:02:53 firewallfsi kernel: [170853.179222] EAX: ffffffda EBX: 01200011 ECX: 00000000 EDX: 00000000 Feb 1 03:02:53 firewallfsi kernel: [170853.181765] ESI: 00000000 EDI: b619f828 EBP: bfd858d8 ESP: bfd85830 Feb 1 03:02:53 firewallfsi kernel: [170853.184325] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b Feb 1 03:02:53 firewallfsi kernel: [170853.186914] Mem-Info: Feb 1 03:02:53 firewallfsi kernel: [170853.188979] active_anon:140761 inactive_anon:50039 isolated_anon:0 Feb 1 03:02:53 firewallfsi kernel: [170853.188979] active_file:158493 inactive_file:643928 isolated_file:3 Feb 1 03:02:53 firewallfsi kernel: [170853.188979] unevictable:0 dirty:192 writeback:0 unstable:0 Feb 1 03:02:53 firewallfsi kernel: [170853.188979] slab_reclaimable:135593 slab_unreclaimable:10598 Feb 1 03:02:53 firewallfsi kernel: [170853.188979] mapped:23318 shmem:450 pagetables:1591 bounce:0 Feb 1 03:02:53 firewallfsi kernel: [170853.188979] free:66507 free_pcp:0 free_cma:0 Feb 1 03:02:53 firewallfsi kernel: [170853.200037] Node 0 active_anon:563044kB inactive_anon:200156kB active_file:633972kB inactive_file:2575712kB unevictable:0kB isolated(anon):0kB isolated(file):12kB mapped:93272kB dirty:768kB writeback:0kB shmem:1800kB writeback_tmp:0kB unstable:0kB pages_scanned:1682156 all_unreclaimable? 
no Feb 1 03:02:53 firewallfsi kernel: [170853.205225] DMA free:3152kB min:68kB low:84kB high:100kB active_anon:0kB inactive_anon:0kB active_file:20kB inactive_file:0kB unevictable:0kB writepending:0kB present:15976kB managed:15900kB mlocked:0kB slab_reclaimable:11856kB slab_unreclaimable:524kB kernel_stack:8kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB Feb 1 03:02:53 firewallfsi kernel: [170853.210620] lowmem_reserve[]: 0 777 4733 4733 Feb 1 03:02:53 firewallfsi kernel: [170853.212306] Normal free:3976kB min:3532kB low:4412kB high:5292kB active_anon:20kB inactive_anon:124kB active_file:124808kB inactive_file:80960kB unevictable:0kB writepending:4kB present:892920kB managed:816852kB mlocked:0kB slab_reclaimable:530516kB slab_unreclaimable:41868kB kernel_stack:2648kB pagetables:4kB bounce:0kB free_pcp:124kB local_pcp:0kB free_cma:0kB Feb 1 03:02:53 firewallfsi kernel: [170853.217476] lowmem_reserve[]: 0 0 31652 31652 Feb 1 03:02:53 firewallfsi kernel: [170853.218869] HighMem free:258900kB min:512kB low:5008kB high:9504kB active_anon:563024kB inactive_anon:200032kB active_file:509144kB inactive_file:2494752kB unevictable:0kB writepending:764kB present:4051548kB managed:4051548kB mlocked:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:6360kB bounce:0kB free_pcp:120kB local_pcp:0kB free_cma:0kB Feb 1 03:02:53 firewallfsi kernel: [170853.223109] lowmem_reserve[]: 0 0 0 0 Feb 1 03:02:53 firewallfsi kernel: [170853.224489] DMA: 52*4kB (UME) 28*8kB (UME) 26*16kB (UM) 12*32kB (M) 10*64kB (UME) 6*128kB (UME) 2*256kB (ME) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 3152kB Feb 1 03:02:53 firewallfsi kernel: [170853.227261] Normal: 453*4kB (UMEH) 219*8kB (UMEH) 14*16kB (MEH) 10*32kB (H) 2*64kB (H) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 4236kB Feb 1 03:02:53 firewallfsi kernel: [170853.230025] HighMem: 62*4kB (U) 104*8kB (UM) 48*16kB (UM) 27*32kB (UM) 921*64kB (UM) 586*128kB (M) 183*256kB (M) 125*512kB (M) 7*1024kB (M) 
2*2048kB (M) 0*4096kB = 258776kB Feb 1 03:02:53 firewallfsi kernel: [170853.232796] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB Feb 1 03:02:53 firewallfsi kernel: [170853.234106] 802968 total pagecache pages Feb 1 03:02:53 firewallfsi kernel: [170853.235472] 91 pages in swap cache Feb 1 03:02:53 firewallfsi kernel: [170853.236732] Swap cache stats: add 2347, delete 2256, find 242/258 Feb 1 03:02:53 firewallfsi kernel: [170853.238070] Free swap = 33775564kB Feb 1 03:02:53 firewallfsi kernel: [170853.239383] Total swap = 33784572kB Feb 1 03:02:53 firewallfsi kernel: [170853.240707] 1240111 pages RAM Feb 1 03:02:53 firewallfsi kernel: [170853.242032] 1012887 pages HighMem/MovableOnly Feb 1 03:02:53 firewallfsi kernel: [170853.243334] 19036 pages reserved Feb 1 03:02:53 firewallfsi kernel: [170853.244685] 0 pages hwpoisoned ################### SAMPLE TOPS (normally none of the user ps's would be more than 1-5% CPU) note the 100%si stat sometimes!!! ################### %Cpu(s): 0.1 us, 3.2 sy, 0.0 ni, 95.8 id, 0.6 wa, 0.1 hi, 0.3 si, 0.0 st KiB Mem : 4884300 total, 273220 free, 815488 used, 3795592 buff/cache KiB Swap: 33784572 total, 33775564 free, 9008 used. 3848136 avail Mem PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 77 root 20 0 0 0 0 R 100.0 0.0 11:06.21 kswapd0 1773 samba 20 0 46880 17716 14776 R 99.3 0.4 5:10.76 smbd 854 root 20 0 7276 3180 2524 S 72.8 0.1 1:33.27 watch-services 25135 root 20 0 9476 3520 3224 R 27.6 0.1 0:00.83 ps 28939 root 20 0 0 0 0 S 1.0 0.0 2:44.76 kworker/0:0 %Cpu(s): 0.1 us, 4.8 sy, 0.0 ni, 93.0 id, 1.5 wa, 0.1 hi, 0.3 si, 0.0 st KiB Mem : 4884300 total, 272800 free, 815976 used, 3795524 buff/cache KiB Swap: 33784572 total, 33775564 free, 9008 used. 
3847396 avail Mem PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 1773 samba 20 0 46880 17716 14776 R 100.0 0.4 5:13.80 smbd 77 root 20 0 0 0 0 R 99.0 0.0 11:09.20 kswapd0 854 root 20 0 7276 3180 2524 R 90.4 0.1 1:36.00 watch-services 25136 root 20 0 0 0 0 Z 55.0 0.0 0:01.66 ps 1545 root 20 0 28196 9644 8112 R 6.3 0.2 0:45.65 nmbd 939 root 20 0 12936 8544 4996 R 5.0 0.2 1:28.30 dynamic-ip-upda %Cpu(s): 0.0 us, 0.0 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.0 hi,100.0 si, 0.0 st KiB Mem : 4884300 total, 273456 free, 820436 used, 3790408 buff/cache KiB Swap: 33784572 total, 33775564 free, 9008 used. 3843044 avail Mem PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 77 root 20 0 0 0 0 R 100.0 0.0 11:12.23 kswapd0 25138 root 20 0 4188 2396 2144 R 64.2 0.0 0:01.94 systemd-coredum 939 root 20 0 12936 8544 4996 S 58.3 0.2 1:30.06 dynamic-ip-upda 1773 samba 20 0 46880 17716 14776 D 41.4 0.4 5:15.05 smbd 854 root 20 0 7276 3180 2524 S 39.1 0.1 1:37.18 watch-services 25148 root 20 0 9984 3732 3388 R 24.2 0.1 0:00.73 ps 22995 root 20 0 0 0 0 S 21.2 0.0 0:40.36 kworker/u16:0 %Cpu(s): 0.0 us, 0.0 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.0 hi,100.0 si, 0.0 st KiB Mem : 4884300 total, 268536 free, 818808 used, 3796956 buff/cache KiB Swap: 33784572 total, 33775564 free, 9008 used. 
3844856 avail Mem PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 77 root 20 0 0 0 0 R 100.0 0.0 11:18.25 kswapd0 25138 root 20 0 4188 2396 2144 R 99.7 0.0 0:07.95 systemd-coredum 25160 root 20 0 7276 1932 1272 S 86.1 0.0 0:02.60 watch-services 22995 root 20 0 0 0 0 D 77.5 0.0 0:42.70 kworker/u16:0 1773 samba 20 0 46880 17716 14776 S 25.2 0.4 5:16.93 smbd 25161 root 20 0 8076 2020 1852 R 11.3 0.0 0:00.34 systemctl 25162 root 20 0 8500 3208 2940 R 10.9 0.1 0:00.33 perl #cat /proc/meminfo MemTotal: 4884300 kB MemFree: 218488 kB MemAvailable: 3798704 kB Buffers: 166592 kB Cached: 3019200 kB SwapCached: 580 kB Active: 1207728 kB Inactive: 2780088 kB Active(anon): 618944 kB Inactive(anon): 184880 kB Active(file): 588784 kB Inactive(file): 2595208 kB Unevictable: 0 kB Mlocked: 0 kB HighTotal: 4051548 kB HighFree: 203096 kB LowTotal: 832752 kB LowFree: 15392 kB SwapTotal: 33784572 kB SwapFree: 33775628 kB Dirty: 1448 kB Writeback: 0 kB AnonPages: 801756 kB Mapped: 93060 kB Shmem: 1528 kB Slab: 614016 kB SReclaimable: 572792 kB SUnreclaim: 41224 kB KernelStack: 2608 kB PageTables: 6292 kB NFS_Unstable: 0 kB Bounce: 0 kB WritebackTmp: 0 kB CommitLimit: 36226720 kB Committed_AS: 1940852 kB VmallocTotal: 122880 kB VmallocUsed: 0 kB VmallocChunk: 0 kB HardwareCorrupted: 0 kB HugePages_Total: 0 HugePages_Free: 0 HugePages_Rsvd: 0 HugePages_Surp: 0 Hugepagesize: 2048 kB DirectMap4k: 10232 kB DirectMap2M: 899072 kB 3:06am ~#cat /proc/slabinfo slabinfo - version: 2.1 # name <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> : tunables <limit> <batchcount> <sharedfactor> : slabdata <active_slabs> <num_slabs> <sharedavail> nf_conntrack_expect 0 0 184 22 1 : tunables 0 0 0 : slabdata 0 0 0 nf_conntrack 654 800 256 32 2 : tunables 0 0 0 : slabdata 25 25 0 ext4_groupinfo_4k 21816 21816 112 36 1 : tunables 0 0 0 : slabdata 606 606 0 raid6-md126 405 972 1384 23 8 : tunables 0 0 0 : slabdata 45 45 0 ip6-frags 0 0 160 25 1 : tunables 0 0 0 : slabdata 0 0 0 UDPv6 351 351 
832 39 8 : tunables 0 0 0 : slabdata 9 9 0 tw_sock_TCPv6 273 273 208 39 2 : tunables 0 0 0 : slabdata 7 7 0 request_sock_TCPv6 32 32 256 32 2 : tunables 0 0 0 : slabdata 1 1 0 TCPv6 144 144 1728 18 8 : tunables 0 0 0 : slabdata 8 8 0 kcopyd_job 0 0 2344 13 8 : tunables 0 0 0 : slabdata 0 0 0 dm_uevent 0 0 2472 13 8 : tunables 0 0 0 : slabdata 0 0 0 cfq_io_cq 459 459 80 51 1 : tunables 0 0 0 : slabdata 9 9 0 cfq_queue 414 414 176 23 1 : tunables 0 0 0 : slabdata 18 18 0 bsg_cmd 0 0 288 28 2 : tunables 0 0 0 : slabdata 0 0 0 mqueue_inode_cache 28 28 576 28 4 : tunables 0 0 0 : slabdata 1 1 0 jbd2_journal_head 7808 7808 64 64 1 : tunables 0 0 0 : slabdata 122 122 0 jbd2_revoke_table_s 512 512 16 256 1 : tunables 0 0 0 : slabdata 2 2 0 ext4_inode_cache 686230 688859 696 23 4 : tunables 0 0 0 : slabdata 49255 49255 0 ext4_allocation_context 351 351 104 39 1 : tunables 0 0 0 : slabdata 9 9 0 ext4_prealloc_space 1176 1176 72 56 1 : tunables 0 0 0 : slabdata 21 21 0 ext4_extent_status 9600 9600 32 128 1 : tunables 0 0 0 : slabdata 75 75 0 mbcache 1938 1938 40 102 1 : tunables 0 0 0 : slabdata 19 19 0 userfaultfd_ctx_cache 0 0 64 64 1 : tunables 0 0 0 : slabdata 0 0 0 pid_namespace 0 0 112 36 1 : tunables 0 0 0 : slabdata 0 0 0 posix_timers_cache 168 168 168 24 1 : tunables 0 0 0 : slabdata 7 7 0 xfrm_dst_cache 275 275 320 25 2 : tunables 0 0 0 : slabdata 11 11 0 RAW 184 184 704 23 4 : tunables 0 0 0 : slabdata 8 8 0 tw_sock_TCP 312 312 208 39 2 : tunables 0 0 0 : slabdata 8 8 0 request_sock_TCP 160 160 256 32 2 : tunables 0 0 0 : slabdata 5 5 0 TCP 180 180 1600 20 8 : tunables 0 0 0 : slabdata 9 9 0 hugetlbfs_inode_cache 46 46 352 23 2 : tunables 0 0 0 : slabdata 2 2 0 dquot 252 252 192 21 1 : tunables 0 0 0 : slabdata 12 12 0 eventpoll_pwq 918 918 40 102 1 : tunables 0 0 0 : slabdata 9 9 0 request_queue 147 147 1496 21 8 : tunables 0 0 0 : slabdata 7 7 0 blkdev_requests 884 884 240 34 2 : tunables 0 0 0 : slabdata 33 33 0 biovec-256 163 183 3072 10 8 : tunables 0 0 0 : 
slabdata 21 21 0
biovec-128 275 294 1536 21 8 : tunables 0 0 0 : slabdata 14 14 0
biovec-64 557 651 768 21 4 : tunables 0 0 0 : slabdata 31 31 0
user_namespace 0 0 360 22 2 : tunables 0 0 0 : slabdata 0 0 0
sock_inode_cache 1155 1155 384 21 2 : tunables 0 0 0 : slabdata 55 55 0
skbuff_fclone_cache 531 531 448 36 4 : tunables 0 0 0 : slabdata 20 20 0
file_lock_cache 272 272 120 34 1 : tunables 0 0 0 : slabdata 8 8 0
net_namespace 0 0 3840 8 8 : tunables 0 0 0 : slabdata 0 0 0
shmem_inode_cache 2740 2740 400 20 2 : tunables 0 0 0 : slabdata 137 137 0
taskstats 192 192 328 24 2 : tunables 0 0 0 : slabdata 8 8 0
proc_inode_cache 2827 3118 384 21 2 : tunables 0 0 0 : slabdata 160 160 0
sigqueue 224 224 144 28 1 : tunables 0 0 0 : slabdata 8 8 0
bdev_cache 256 256 512 32 4 : tunables 0 0 0 : slabdata 8 8 0
kernfs_node_cache 43399 44688 72 56 1 : tunables 0 0 0 : slabdata 798 798 0
inode_cache 12256 13547 352 23 2 : tunables 0 0 0 : slabdata 589 589 0
dentry 421047 427360 128 32 1 : tunables 0 0 0 : slabdata 13355 13355 0
avc_node 584 584 56 73 1 : tunables 0 0 0 : slabdata 8 8 0
buffer_head 101998 105923 56 73 1 : tunables 0 0 0 : slabdata 1451 1451 0
vm_area_struct 25524 26250 96 42 1 : tunables 0 0 0 : slabdata 625 625 0
mm_struct 1088 1088 512 32 4 : tunables 0 0 0 : slabdata 34 34 0
files_cache 992 992 256 32 2 : tunables 0 0 0 : slabdata 31 31 0
signal_cache 1150 1150 704 23 4 : tunables 0 0 0 : slabdata 50 50 0
sighand_cache 1056 1056 1344 24 8 : tunables 0 0 0 : slabdata 44 44 0
task_struct 578 616 4544 7 8 : tunables 0 0 0 : slabdata 88 88 0
cred_jar 6504 7328 128 32 1 : tunables 0 0 0 : slabdata 229 229 0
Acpi-Namespace 5780 5780 24 170 1 : tunables 0 0 0 : slabdata 34 34 0
anon_vma_chain 42793 44544 32 128 1 : tunables 0 0 0 : slabdata 348 348 0
anon_vma 25451 26775 48 85 1 : tunables 0 0 0 : slabdata 315 315 0
pid 3712 3712 64 64 1 : tunables 0 0 0 : slabdata 58 58 0
radix_tree_node 33608 36712 304 26 2 : tunables 0 0 0 : slabdata 1412 1412 0
trace_event_file 5525 5525 48 85 1 : tunables 0 0 0 : slabdata 65 65 0
idr_layer_cache 899 899 280 29 2 : tunables 0 0 0 : slabdata 31 31 0
task_group 294 294 384 21 2 : tunables 0 0 0 : slabdata 14 14 0
dma-kmalloc-8192 0 0 8192 4 8 : tunables 0 0 0 : slabdata 0 0 0
dma-kmalloc-4096 0 0 4096 8 8 : tunables 0 0 0 : slabdata 0 0 0
dma-kmalloc-2048 0 0 2048 16 8 : tunables 0 0 0 : slabdata 0 0 0
dma-kmalloc-1024 0 0 1024 32 8 : tunables 0 0 0 : slabdata 0 0 0
dma-kmalloc-512 0 0 512 32 4 : tunables 0 0 0 : slabdata 0 0 0
dma-kmalloc-256 0 0 256 32 2 : tunables 0 0 0 : slabdata 0 0 0
dma-kmalloc-128 0 0 128 32 1 : tunables 0 0 0 : slabdata 0 0 0
dma-kmalloc-64 0 0 64 64 1 : tunables 0 0 0 : slabdata 0 0 0
dma-kmalloc-32 0 0 32 128 1 : tunables 0 0 0 : slabdata 0 0 0
dma-kmalloc-16 0 0 16 256 1 : tunables 0 0 0 : slabdata 0 0 0
dma-kmalloc-8 0 0 8 512 1 : tunables 0 0 0 : slabdata 0 0 0
dma-kmalloc-192 0 0 192 21 1 : tunables 0 0 0 : slabdata 0 0 0
dma-kmalloc-96 0 0 96 42 1 : tunables 0 0 0 : slabdata 0 0 0
kmalloc-8192 68 68 8192 4 8 : tunables 0 0 0 : slabdata 17 17 0
kmalloc-4096 249 296 4096 8 8 : tunables 0 0 0 : slabdata 37 37 0
kmalloc-2048 585 596 2048 16 8 : tunables 0 0 0 : slabdata 39 39 0
kmalloc-1024 2071 2112 1024 32 8 : tunables 0 0 0 : slabdata 66 66 0
kmalloc-512 4033 4096 512 32 4 : tunables 0 0 0 : slabdata 128 128 0
kmalloc-256 1662 1824 256 32 2 : tunables 0 0 0 : slabdata 57 57 0
kmalloc-192 5282 6195 192 21 1 : tunables 0 0 0 : slabdata 295 295 0
kmalloc-128 2699 3008 128 32 1 : tunables 0 0 0 : slabdata 94 94 0
kmalloc-96 26722 28014 96 42 1 : tunables 0 0 0 : slabdata 667 667 0
kmalloc-64 53750 57984 64 64 1 : tunables 0 0 0 : slabdata 906 906 0
kmalloc-32 93952 93952 32 128 1 : tunables 0 0 0 : slabdata 734 734 0
kmalloc-16 12544 12544 16 256 1 : tunables 0 0 0 : slabdata 49 49 0
kmalloc-8 7168 7168 8 512 1 : tunables 0 0 0 : slabdata 14 14 0
kmem_cache_node 1024 1024 32 128 1 : tunables 0 0 0 : slabdata 8 8 0
kmem_cache 777 777 192 21 1 : tunables 0 0 0 : slabdata 37 37 0

%Cpu(s): 0.0 us, 19.5 sy, 0.0 ni, 78.9 id, 0.8 wa, 0.4 hi, 0.4 si, 0.0 st
KiB Mem : 4884300 total, 285300 free, 814736 used, 3784264 buff/cache
KiB Swap: 33784572 total, 33775636 free, 8936 used. 3850592 avail Mem

  PID USER     PR NI  VIRT   RES   SHR S %CPU %MEM   TIME+ COMMAND
  939 root     20  0 12936  8544  4996 R 100.0 0.2  1:49.84 dynamic-ip-upda
 1942 root     20  0  3816  2692  2456 R 100.0 0.1  1:10.07 dovecot
 1545 root     20  0 28196  9644  8112 R 100.0 0.2  2:12.90 nmbd
  854 root     20  0  7276  3180  2524 R 100.0 0.1  2:24.54 watch-services
 1773 samba    20  0 46880 17716 14776 R 100.0 0.4  7:22.75 smbd
    2 root     20  0     0     0     0 R  99.0 0.0  1:26.01 kthreadd
   77 root     20  0     0     0     0 R  92.3 0.0 14:52.22 kswapd0

%Cpu(s): 0.0 us, 0.0 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.0 hi,100.0 si, 0.0 st
KiB Mem : 4884300 total, 283808 free, 815836 used, 3784656 buff/cache
KiB Swap: 33784572 total, 33775636 free, 8936 used. 3849492 avail Mem

  PID USER     PR NI  VIRT   RES   SHR S %CPU %MEM   TIME+ COMMAND
   77 root     20  0     0     0     0 R 100.0 0.0 14:55.29 kswapd0
 1942 root     20  0  3816  2692  2456 R 100.0 0.1  1:13.13 dovecot
  854 root     20  0  7276  3180  2524 R  99.7 0.1  2:27.58 watch-services
 1545 root     20  0 28196  9644  8112 R  99.3 0.2  2:15.93 nmbd
    2 root     20  0     0     0     0 R  98.7 0.0  1:29.02 kthreadd
 1773 samba    20  0 46880 17716 14776 S  64.9 0.4  7:24.73 smbd
  939 root     20  0 12936  8544  4996 S  39.0 0.2  1:51.03 dynamic-ip-upda

^ permalink raw reply	[flat|nested] 40+ messages in thread
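The top snapshots above show kswapd0 and several daemons pinned near 100% CPU just before the box wedges, which is why the reporter's watchdog approach (the "reboot-when-oom" process visible in the attached oom dump later in the thread) can fire at all. His actual script is not posted anywhere in the thread, so the sketch below is purely an illustrative reconstruction of the detection step; the sample log line is modelled on the attached report.

```shell
#!/bin/sh
# Hypothetical reconstruction of an oom watchdog like the reporter's
# "reboot-when-oom" process; his real script is not shown in the thread.
# Detection step only: look for the kernel's oom-killer signature in a log.
check_oom() {
    grep -q 'invoked oom-killer' "$1"
}

# Demo against a sample line modelled on the attached oom report. A live
# watchdog would poll `dmesg` or /var/log/messages every few seconds and
# run `reboot` on a hit instead of echoing.
sample=$(mktemp)
echo 'kernel: [104057.183579] rdiff-backup invoked oom-killer: gfp_mask=0x2420848' > "$sample"
if check_oom "$sample"; then
    echo "oom detected"
fi
rm -f "$sample"
```

As the reporter notes in the opening message, such a watchdog only works about 20% of the time here, because the box usually hangs before the reboot completes.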
* Re: mm, vmscan: commit makes PAE kernel crash nightly (bisected)
  2017-02-01  9:29 ` Trevor Cordes
@ 2017-02-01 10:14 ` Michal Hocko
  2017-02-04  0:36 ` Trevor Cordes
  0 siblings, 1 reply; 40+ messages in thread

From: Michal Hocko @ 2017-02-01 10:14 UTC (permalink / raw)
To: Trevor Cordes
Cc: Mel Gorman, linux-kernel, Joonsoo Kim, Minchan Kim, Rik van Riel, Srikar Dronamraju

On Wed 01-02-17 03:29:28, Trevor Cordes wrote:
> On 2017-01-30 Michal Hocko wrote:
[...]
> > Testing with vanilla rc6 released just yesterday would be a good fit.
> > There are some more fixes sitting on mmotm on top and maybe we want
> > some of them in final 4.10. Anyway all those pending changes should
> > be merged in the next merge window - aka 4.11
>
> After 30 hours of running vanilla 4.10.0-rc6, the box started to go
> bonkers at 3am, so vanilla does not fix the bug :-( But, the bug hit
> differently this time, the box just bogged down like crazy and gave
> really weird top output. Starting nano would take 10s, then would run
> full speed, then when saving a file would take 5s. Starting any prog
> not in cache took equally as long.

Could you try the to_test/linus-tree/oom_hickups branch on the same
git tree? I have cherry-picked "mm, vmscan: consider eligible zones in
get_scan_count" which might be the missing part.

Thanks!
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 40+ messages in thread
* Re: mm, vmscan: commit makes PAE kernel crash nightly (bisected)
  2017-02-01 10:14 ` Michal Hocko
@ 2017-02-04  0:36 ` Trevor Cordes
  2017-02-04 20:05 ` Rik van Riel
  2017-02-05 10:03 ` Michal Hocko
  0 siblings, 2 replies; 40+ messages in thread

From: Trevor Cordes @ 2017-02-04 0:36 UTC (permalink / raw)
To: Michal Hocko
Cc: Mel Gorman, linux-kernel, Joonsoo Kim, Minchan Kim, Rik van Riel, Srikar Dronamraju

On 2017-02-01 Michal Hocko wrote:
> On Wed 01-02-17 03:29:28, Trevor Cordes wrote:
> > On 2017-01-30 Michal Hocko wrote:
> [...]
> > > Testing with vanilla rc6 released just yesterday would be a good
> > > fit. There are some more fixes sitting on mmotm on top and maybe
> > > we want some of them in final 4.10. Anyway all those pending
> > > changes should be merged in the next merge window - aka 4.11
> >
> > After 30 hours of running vanilla 4.10.0-rc6, the box started to go
> > bonkers at 3am, so vanilla does not fix the bug :-( But, the bug
> > hit differently this time, the box just bogged down like crazy and
> > gave really weird top output. Starting nano would take 10s, then
> > would run full speed, then when saving a file would take 5s.
> > Starting any prog not in cache took equally as long.
>
> Could you try the to_test/linus-tree/oom_hickups branch on the same
> git tree? I have cherry-picked "mm, vmscan: consider eligible zones in
> get_scan_count" which might be the missing part.

I ran the to_test/linus-tree/oom_hickups branch (4.10.0-rc6+) for 50
hours and it does NOT have the bug! No problems at all so far.

So I think whatever since-4.9 fix to_test/linus-tree/oom_hickups
carries that vanilla 4.10-rc6 does *not* have is indeed the fix.

For my reference, and I know you guys aren't distro-specific, what is
the best way to get this fix into Fedora 24 (currently 4.9)? Can it be
backported or made as a patch they can apply to 4.9? Or 4.10? If this
fix only goes into 4.11 then I fear we'll never see it in Fedora and we
rhbz guys will not have a stock-Fedora fix for this until F25 or F26.
Again, I'm not trying to force this out of scope, I'm just wondering
about the logistics in these situations.

Once again, thanks to all for your great work and help!

P.S. I'll try a couple of the other ideas Mel had about ramping the RAM
back up, etc.

^ permalink raw reply	[flat|nested] 40+ messages in thread
* Re: mm, vmscan: commit makes PAE kernel crash nightly (bisected)
  2017-02-04  0:36 ` Trevor Cordes
@ 2017-02-04 20:05 ` Rik van Riel
  0 siblings, 0 replies; 40+ messages in thread

From: Rik van Riel @ 2017-02-04 20:05 UTC (permalink / raw)
To: Trevor Cordes, Michal Hocko
Cc: Mel Gorman, linux-kernel, Joonsoo Kim, Minchan Kim, Srikar Dronamraju

[-- Attachment #1: Type: text/plain, Size: 1796 bytes --]

On Fri, 2017-02-03 at 18:36 -0600, Trevor Cordes wrote:
> On 2017-02-01 Michal Hocko wrote:
> > On Wed 01-02-17 03:29:28, Trevor Cordes wrote:
> > > On 2017-01-30 Michal Hocko wrote:
> > [...]
> > > > Testing with vanilla rc6 released just yesterday would be a good
> > > > fit. There are some more fixes sitting on mmotm on top and maybe
> > > > we want some of them in final 4.10. Anyway all those pending
> > > > changes should be merged in the next merge window - aka 4.11
> > >
> > > After 30 hours of running vanilla 4.10.0-rc6, the box started to go
> > > bonkers at 3am, so vanilla does not fix the bug :-( But, the bug
> > > hit differently this time, the box just bogged down like crazy and
> > > gave really weird top output. Starting nano would take 10s, then
> > > would run full speed, then when saving a file would take 5s.
> > > Starting any prog not in cache took equally as long.
> >
> > Could you try the to_test/linus-tree/oom_hickups branch on the same
> > git tree? I have cherry-picked "mm, vmscan: consider eligible zones
> > in get_scan_count" which might be the missing part.
>
> I ran the to_test/linus-tree/oom_hickups branch (4.10.0-rc6+) for 50
> hours and it does NOT have the bug! No problems at all so far.
>
> So I think whatever since-4.9 fix to_test/linus-tree/oom_hickups
> carries that vanilla 4.10-rc6 does *not* have is indeed the fix.
>
> For my reference, and I know you guys aren't distro-specific, what is
> the best way to get this fix into Fedora 24 (currently 4.9)? Can it
> be backported or made as a patch they can apply to 4.9? Or 4.10?

The best way would be to open a Fedora bug, and CC me on it :)

-- 
All Rights Reversed.

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 473 bytes --]

^ permalink raw reply	[flat|nested] 40+ messages in thread
* Re: mm, vmscan: commit makes PAE kernel crash nightly (bisected)
  2017-02-04  0:36 ` Trevor Cordes
  2017-02-04 20:05 ` Rik van Riel
@ 2017-02-05 10:03 ` Michal Hocko
  2017-02-05 22:53 ` Trevor Cordes
  1 sibling, 1 reply; 40+ messages in thread

From: Michal Hocko @ 2017-02-05 10:03 UTC (permalink / raw)
To: Trevor Cordes
Cc: Mel Gorman, linux-kernel, Joonsoo Kim, Minchan Kim, Rik van Riel, Srikar Dronamraju

On Fri 03-02-17 18:36:54, Trevor Cordes wrote:
> On 2017-02-01 Michal Hocko wrote:
> > On Wed 01-02-17 03:29:28, Trevor Cordes wrote:
> > > On 2017-01-30 Michal Hocko wrote:
> > [...]
> > > > Testing with vanilla rc6 released just yesterday would be a good
> > > > fit. There are some more fixes sitting on mmotm on top and maybe
> > > > we want some of them in final 4.10. Anyway all those pending
> > > > changes should be merged in the next merge window - aka 4.11
> > >
> > > After 30 hours of running vanilla 4.10.0-rc6, the box started to go
> > > bonkers at 3am, so vanilla does not fix the bug :-( But, the bug
> > > hit differently this time, the box just bogged down like crazy and
> > > gave really weird top output. Starting nano would take 10s, then
> > > would run full speed, then when saving a file would take 5s.
> > > Starting any prog not in cache took equally as long.
> >
> > Could you try the to_test/linus-tree/oom_hickups branch on the same
> > git tree? I have cherry-picked "mm, vmscan: consider eligible zones
> > in get_scan_count" which might be the missing part.
>
> I ran the to_test/linus-tree/oom_hickups branch (4.10.0-rc6+) for 50
> hours and it does NOT have the bug! No problems at all so far.

OK, that is definitely good to know. My other fix ("mm, vmscan:
consider eligible zones in get_scan_count") was more theoretical than
bug driven. I would add your
Tested-by: Trevor Cordes <trevor@tecnopolis.ca>

unless you have anything against that.

> So I think whatever since-4.9 fix to_test/linus-tree/oom_hickups
> carries that vanilla 4.10-rc6 does *not* have is indeed the fix.
>
> For my reference, and I know you guys aren't distro-specific, what is
> the best way to get this fix into Fedora 24 (currently 4.9)?

I will send this patch to 4.9+ stable as soon as it hits Linus tree.

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 40+ messages in thread
* Re: mm, vmscan: commit makes PAE kernel crash nightly (bisected)
  2017-02-05 10:03 ` Michal Hocko
@ 2017-02-05 22:53 ` Trevor Cordes
  0 siblings, 0 replies; 40+ messages in thread

From: Trevor Cordes @ 2017-02-05 22:53 UTC (permalink / raw)
To: Michal Hocko
Cc: Mel Gorman, linux-kernel, Joonsoo Kim, Minchan Kim, Rik van Riel, Srikar Dronamraju

On 2017-02-05 Michal Hocko wrote:
> On Fri 03-02-17 18:36:54, Trevor Cordes wrote:
> > I ran the to_test/linus-tree/oom_hickups branch (4.10.0-rc6+) for 50
> > hours and it does NOT have the bug! No problems at all so far.
>
> OK, that is definitely good to know. My other fix ("mm, vmscan:
> consider eligible zones in get_scan_count") was more theoretical than
> bug driven. I would add your
> Tested-by: Trevor Cordes <trevor@tecnopolis.ca>
>
> unless you have anything against that.

I am happy to be in the tested-by; go ahead.

> > So I think whatever since-4.9 fix to_test/linus-tree/oom_hickups
> > carries that vanilla 4.10-rc6 does *not* have is indeed the fix.
> >
> > For my reference, and I know you guys aren't distro-specific, what
> > is the best way to get this fix into Fedora 24 (currently 4.9)?
>
> I will send this patch to 4.9+ stable as soon as it hits Linus tree.

That's great news! It will make everyone on the rhbz happy. Thank you!

^ permalink raw reply	[flat|nested] 40+ messages in thread
* Re: mm, vmscan: commit makes PAE kernel crash nightly (bisected)
  2017-01-29 22:50 ` Trevor Cordes
  2017-01-30  7:51 ` Michal Hocko
@ 2017-01-30  9:10 ` Mel Gorman
  1 sibling, 0 replies; 40+ messages in thread

From: Mel Gorman @ 2017-01-30 9:10 UTC (permalink / raw)
To: Trevor Cordes
Cc: Michal Hocko, linux-kernel, Joonsoo Kim, Minchan Kim, Rik van Riel, Srikar Dronamraju

On Sun, Jan 29, 2017 at 04:50:03PM -0600, Trevor Cordes wrote:
> On 2017-01-25 Michal Hocko wrote:
> > On Wed 25-01-17 04:02:46, Trevor Cordes wrote:
> > > OK, I patched & compiled mhocko's git tree from the other day
> > > 4.9.0+. (To confirm, weird, but mhocko's git tree I'm using from a
> > > couple of weeks ago shows the newest commit (git log) is
> > > 69973b830859bc6529a7a0468ba0d80ee5117826 "Linux 4.9"? Let me know
> > > if I'm doing something wrong, see below.)
> >
> > My fault. I should have noted that you should use the since-4.9 branch.
>
> OK, I have good news. I compiled your mhocko git tree (properly this
> time!) using the since-4.9 branch (last commit
> ca63ff9b11f958efafd8c8fa60fda14baec6149c Jan 25) and the box survived
> 3 3am's, over 60 hours, and I made sure all the usual oom culprits
> ran, and I ran extras (finds on the whole tree, extra rdiff-backups)
> to try to tax it. Based on my previous criteria I would say your
> since-4.9 as of the above commit solves my bug, at least over a 3 day
> test span (which it never survives when the bug is present)!

That's good news. It means the more extreme options may not be
necessary.

> I tested WITHOUT any cgroup/mem boot options. I do still have my
> mem=6G limiter on, though (I've never tested with it off, until I
> solve the bug with it on, since I've had it on for many months for
> other reasons).

It may be an option to try relaxing that and see at what point it
fails. You may find at some point that memory is not utilised as there
is not enough lowmem for metadata to track data in highmem. That's not
unexpected.

> What do I test next? Does the since-4.9 stuff get pushed into vanilla
> (4.9 hopefully?) so it can find its way into Fedora's stuck F24
> kernel?

Michal has already made suggestions here and I've nothing to add.

> I want to also note that the RHBZ
> https://bugzilla.redhat.com/show_bug.cgi?id=1401012 is garnering more
> interest as more people start me-too'ing. The situation is almost
> always the same: large rsync's or similar tree-scan accesses cause
> oom on PAE boxes. However, I wanted to note that many people there
> reported that cgroup_disable=memory doesn't fix anything for them,
> whereas that always makes the problem go away on my boxes. Strange.

It could simply be down to whether memcgs were actually in use or not.

> Thanks Michal and Mel, I really appreciate it!

I appreciate the detailed testing and reporting!

-- 
Mel Gorman
SUSE Labs

^ permalink raw reply	[flat|nested] 40+ messages in thread
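Mel's point about lowmem metadata is easy to quantify: on a 32-bit highmem kernel, /proc/meminfo exposes the split as LowTotal/HighTotal. The sketch below computes the lowmem fraction; the sample numbers are not a real /proc/meminfo capture but are derived from the managed zone sizes in the oom report attached later in the thread (DMA 15900 kB + Normal 817180 kB = 833080 kB of lowmem).

```shell
#!/bin/sh
# Compute the low/high split on a 32-bit highmem kernel. On the live box
# you would feed /proc/meminfo to awk instead of this here-document; the
# sample values are derived from the zone sizes in the attached oom
# report, not from an actual /proc/meminfo capture.
awk '/^LowTotal:/  { low  = $2 }
     /^HighTotal:/ { high = $2 }
     END { printf "lowmem %d kB, highmem %d kB, low fraction %.1f%%\n",
                  low, high, 100 * low / (low + high) }' <<'EOF'
HighTotal:       4051548 kB
LowTotal:         833080 kB
EOF
```

With only about a sixth of RAM addressable as lowmem at mem=6G, all kernel metadata (dentries, inodes, buffer_heads) competes for that one slice, which is part of why the nightly tree scans in this thread are so punishing.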
* Re: mm, vmscan: commit makes PAE kernel crash nightly (bisected)
  2017-01-23  0:45 ` Trevor Cordes
  2017-01-23 10:48 ` Mel Gorman
@ 2017-01-24 12:54 ` Michal Hocko
  2017-01-26 23:18 ` Trevor Cordes
  1 sibling, 1 reply; 40+ messages in thread

From: Michal Hocko @ 2017-01-24 12:54 UTC (permalink / raw)
To: Trevor Cordes
Cc: Mel Gorman, linux-kernel, Joonsoo Kim, Minchan Kim, Rik van Riel, Srikar Dronamraju

On Sun 22-01-17 18:45:59, Trevor Cordes wrote:
[...]
> Also, completely separate from your patch I ran mhocko's 4.9 tree with
> mem=2G to see if lower ram amount would help, but it didn't. Even with
> 2G the system oom and hung same as usual. So far the only thing that
> helps at all was the cgroup_disable=memory option, which makes the
> problem disappear completely for me.

OK, can we reduce the problem space slightly more and could you boot
with kmem accounting enabled? cgroup.memory=nokmem,nosocket

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 40+ messages in thread
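Boot options like the ones being traded in this thread (mem=6G, cgroup_disable=memory, cgroup.memory=nokmem,nosocket) only take effect after a reboot, and kernel parameters show up verbatim in /proc/cmdline, so it is worth verifying they were actually picked up before starting a 24-hour test. A small sketch; the cmdline string below is a hypothetical example modelled on this box, and on a live system it would come from `cat /proc/cmdline`:

```shell
#!/bin/sh
# Sanity-check that suggested boot options actually took effect.
# Hypothetical sample cmdline modelled on this box's setup; on a live
# system use: cmdline=$(cat /proc/cmdline)
cmdline='BOOT_IMAGE=/vmlinuz-4.9.0 ro mem=6G cgroup.memory=nokmem,nosocket'

for opt in mem= cgroup.memory= cgroup_disable=; do
    case " $cmdline " in
        *" $opt"*) echo "$opt present" ;;
        *)         echo "$opt absent"  ;;
    esac
done
```

With the sample string this reports mem= and cgroup.memory= as present and cgroup_disable= as absent, matching the configuration being tested at this point in the thread.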
* Re: mm, vmscan: commit makes PAE kernel crash nightly (bisected)
  2017-01-24 12:54 ` Michal Hocko
@ 2017-01-26 23:18 ` Trevor Cordes
  2017-01-27  7:36 ` Michal Hocko
  0 siblings, 1 reply; 40+ messages in thread

From: Trevor Cordes @ 2017-01-26 23:18 UTC (permalink / raw)
To: Michal Hocko
Cc: Mel Gorman, linux-kernel, Joonsoo Kim, Minchan Kim, Rik van Riel, Srikar Dronamraju

[-- Attachment #1: Type: text/plain, Size: 1717 bytes --]

On 2017-01-24 Michal Hocko wrote:
> On Sun 22-01-17 18:45:59, Trevor Cordes wrote:
> [...]
> > Also, completely separate from your patch I ran mhocko's 4.9 tree
> > with mem=2G to see if lower ram amount would help, but it didn't.
> > Even with 2G the system oom and hung same as usual. So far the
> > only thing that helps at all was the cgroup_disable=memory option,
> > which makes the problem disappear completely for me.
>
> OK, can we reduce the problem space slightly more and could you boot
> with kmem accounting enabled? cgroup.memory=nokmem,nosocket

I ran for 30 hours with cgroup.memory=nokmem,nosocket using vanilla
4.9.0+ and it oom'd during a big rdiff-backup at 9am. My script was
able to reboot it before it hung. Only one oom occurred before the
reboot, which is a bit odd, usually there are 5-50. See attached
messages log (oom6).

So, still, only cgroup_disable=memory mitigates this bug (so far). If
you need me to test cgroup.memory=nokmem,nosocket with your since-4.9
branch specifically, let me know and I'll add it to the to-test list.

On 2017-01-25 Michal Hocko wrote:
> On Wed 25-01-17 04:02:46, Trevor Cordes wrote:
> > OK, I patched & compiled mhocko's git tree from the other day
> > 4.9.0+. (To confirm, weird, but mhocko's git tree I'm using from a
> > couple of weeks ago shows the newest commit (git log) is
> > 69973b830859bc6529a7a0468ba0d80ee5117826 "Linux 4.9"? Let me know
> > if I'm doing something wrong, see below.)
>
> My fault. I should have noted that you should use the since-4.9 branch.
OK, I got it now, I'm retesting the runs I did (with/without the various patches) on your git tree and will re-report the (correct) results. Will take a few days. Thanks! [-- Attachment #2: oom6 --] [-- Type: application/octet-stream, Size: 23915 bytes --] Jan 26 08:59:46 firewallfsi kernel: [104057.183579] rdiff-backup invoked oom-killer: gfp_mask=0x2420848(GFP_NOFS|__GFP_NOFAIL|__GFP_HARDWALL|__GFP_MOVABLE), nodemask=0, order=0, oom_score_adj=0 Jan 26 08:59:46 firewallfsi kernel: [104057.186272] rdiff-backup cpuset=/ mems_allowed=0 Jan 26 08:59:46 firewallfsi kernel: [104057.187565] CPU: 3 PID: 18243 Comm: rdiff-backup Not tainted 4.9.0+ #2 Jan 26 08:59:46 firewallfsi kernel: [104057.188821] Hardware name: Intel Corporation S1200BTL/S1200BTL, BIOS S1200BT.86B.02.00.0035.030220120927 03/02/2012 Jan 26 08:59:46 firewallfsi kernel: [104057.190083] de14fab0 ceb604e7 de14fbe8 eeb95a00 de14fae0 ce9e17a6 de14fac0 cef63bad Jan 26 08:59:46 firewallfsi kernel: [104057.191343] de14fae0 ceb6625f de14fae4 ea682c00 f6ba2600 eeb95a00 cf168bce de14fbe8 Jan 26 08:59:46 firewallfsi kernel: [104057.192588] de14fb24 ce97aff7 ce876d8a de14fb10 ce97ac6b 00000006 00000000 0000000c Jan 26 08:59:46 firewallfsi kernel: [104057.193821] Call Trace: Jan 26 08:59:46 firewallfsi kernel: [104057.195025] [<ceb604e7>] dump_stack+0x58/0x81 Jan 26 08:59:46 firewallfsi kernel: [104057.196213] [<ce9e17a6>] dump_header+0x64/0x1a6 Jan 26 08:59:46 firewallfsi kernel: [104057.197381] [<cef63bad>] ? _raw_spin_unlock_irqrestore+0xd/0x10 Jan 26 08:59:46 firewallfsi kernel: [104057.198536] [<ceb6625f>] ? ___ratelimit+0x9f/0x100 Jan 26 08:59:46 firewallfsi kernel: [104057.199672] [<ce97aff7>] oom_kill_process+0x207/0x3d0 Jan 26 08:59:46 firewallfsi kernel: [104057.200790] [<ce876d8a>] ? has_capability_noaudit+0x1a/0x30 Jan 26 08:59:46 firewallfsi kernel: [104057.201893] [<ce97ac6b>] ? 
oom_badness.part.13+0xcb/0x140 Jan 26 08:59:46 firewallfsi kernel: [104057.202979] [<ce97b4d8>] out_of_memory+0xf8/0x2a0 Jan 26 08:59:46 firewallfsi kernel: [104057.204045] [<ce980016>] __alloc_pages_nodemask+0xc46/0xd10 Jan 26 08:59:46 firewallfsi kernel: [104057.205096] [<ce9768c2>] ? find_get_entry+0x22/0x160 Jan 26 08:59:46 firewallfsi kernel: [104057.206131] [<ceb32010>] ? generic_make_request+0xd0/0x1b0 Jan 26 08:59:46 firewallfsi kernel: [104057.207148] [<ce97727e>] pagecache_get_page+0xbe/0x2d0 Jan 26 08:59:46 firewallfsi kernel: [104057.208148] [<cea19044>] __getblk_gfp+0x104/0x360 Jan 26 08:59:46 firewallfsi kernel: [104057.209127] [<cea19a3a>] ? submit_bh_wbc+0x14a/0x1f0 Jan 26 08:59:46 firewallfsi kernel: [104057.210097] [<cea1a6bb>] __breadahead+0x2b/0x70 Jan 26 08:59:46 firewallfsi kernel: [104057.211039] [<cea690ee>] __ext4_get_inode_loc+0x40e/0x440 Jan 26 08:59:46 firewallfsi kernel: [104057.211966] [<ce9feaf9>] ? inode_init_always+0x119/0x1c0 Jan 26 08:59:46 firewallfsi kernel: [104057.212874] [<cea6c07b>] ext4_iget+0x7b/0xaf0 Jan 26 08:59:46 firewallfsi kernel: [104057.213762] [<ce9fd683>] ? __d_alloc+0x23/0x190 Jan 26 08:59:46 firewallfsi kernel: [104057.214632] [<cea6cb1f>] ext4_iget_normal+0x2f/0x40 Jan 26 08:59:46 firewallfsi kernel: [104057.215483] [<cea779c5>] ext4_lookup+0xb5/0x240 Jan 26 08:59:46 firewallfsi kernel: [104057.216315] [<ce9ef257>] ? legitimize_path.isra.33+0x27/0x60 Jan 26 08:59:46 firewallfsi kernel: [104057.217132] [<ce9ef3fe>] ? unlazy_walk+0x16e/0x1a0 Jan 26 08:59:46 firewallfsi kernel: [104057.217929] [<ce9f01cc>] lookup_slow+0x7c/0x130 Jan 26 08:59:46 firewallfsi kernel: [104057.218710] [<ce9f0da4>] walk_component+0x1e4/0x300 Jan 26 08:59:46 firewallfsi kernel: [104057.219475] [<ce9efdde>] ? path_init+0x19e/0x330 Jan 26 08:59:46 firewallfsi kernel: [104057.220221] [<ce9ee69f>] ? 
terminate_walk+0x8f/0x100 Jan 26 08:59:46 firewallfsi kernel: [104057.220949] [<ce9f1f83>] path_lookupat+0x53/0xe0 Jan 26 08:59:46 firewallfsi kernel: [104057.221657] [<ce9f4127>] filename_lookup+0x97/0x190 Jan 26 08:59:46 firewallfsi kernel: [104057.222348] [<ce9cf1ad>] ? kmem_cache_alloc+0x15d/0x1c0 Jan 26 08:59:46 firewallfsi kernel: [104057.223021] [<ce9f3d6a>] ? getname_flags+0x3a/0x1a0 Jan 26 08:59:46 firewallfsi kernel: [104057.223676] [<ce9f42f6>] user_path_at_empty+0x36/0x40 Jan 26 08:59:46 firewallfsi kernel: [104057.224311] [<ce9e9a10>] vfs_fstatat+0x60/0xb0 Jan 26 08:59:46 firewallfsi kernel: [104057.224928] [<ce9ea40d>] SyS_lstat64+0x2d/0x50 Jan 26 08:59:46 firewallfsi kernel: [104057.225524] [<ce912e8e>] ? __audit_syscall_exit+0x1ce/0x260 Jan 26 08:59:46 firewallfsi kernel: [104057.226105] [<ce8035a6>] ? syscall_slow_exit_work+0xd6/0xe0 Jan 26 08:59:46 firewallfsi kernel: [104057.226667] [<ce80377a>] do_fast_syscall_32+0x8a/0x150 Jan 26 08:59:46 firewallfsi kernel: [104057.227212] [<cef640ca>] sysenter_past_esp+0x47/0x75 Jan 26 08:59:46 firewallfsi kernel: [104057.227745] Mem-Info: Jan 26 08:59:46 firewallfsi kernel: [104057.228569] active_anon:126164 inactive_anon:65161 isolated_anon:0 Jan 26 08:59:46 firewallfsi kernel: [104057.228569] active_file:193794 inactive_file:602556 isolated_file:0 Jan 26 08:59:46 firewallfsi kernel: [104057.228569] unevictable:0 dirty:198 writeback:0 unstable:0 Jan 26 08:59:46 firewallfsi kernel: [104057.228569] slab_reclaimable:97990 slab_unreclaimable:9475 Jan 26 08:59:46 firewallfsi kernel: [104057.228569] mapped:24098 shmem:324 pagetables:1512 bounce:0 Jan 26 08:59:46 firewallfsi kernel: [104057.228569] free:109080 free_pcp:1076 free_cma:0 Jan 26 08:59:46 firewallfsi kernel: [104057.233530] Node 0 active_anon:504656kB inactive_anon:260644kB active_file:775176kB inactive_file:2410224kB unevictable:0kB isolated(anon):0kB isolated(file):0kB mapped:96412kB dirty:852kB writeback:0kB shmem:1296kB writeback_tmp:0kB 
unstable:0kB pages_scanned:0 all_unreclaimable? no Jan 26 08:59:46 firewallfsi kernel: [104057.236152] DMA free:3180kB min:68kB low:84kB high:100kB active_anon:0kB inactive_anon:0kB active_file:5536kB inactive_file:0kB unevictable:0kB writepending:0kB present:15976kB managed:15900kB mlocked:0kB slab_reclaimable:7016kB slab_unreclaimable:24kB kernel_stack:8kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB Jan 26 08:59:46 firewallfsi kernel: lowmem_reserve[]: 0 778 4734 4734 Jan 26 08:59:46 firewallfsi kernel: [104057.240078] Normal free:3504kB min:3532kB low:4412kB high:5292kB active_anon:260kB inactive_anon:0kB active_file:348276kB inactive_file:100kB unevictable:0kB writepending:380kB present:892920kB managed:817180kB mlocked:0kB slab_reclaimable:384944kB slab_unreclaimable:37876kB kernel_stack:2552kB pagetables:0kB bounce:0kB free_pcp:2000kB local_pcp:276kB free_cma:0kB Jan 26 08:59:46 firewallfsi kernel: lowmem_reserve[]: 0 0 31652 31652 Jan 26 08:59:46 firewallfsi kernel: [104057.244247] HighMem free:429636kB min:512kB low:5000kB high:9488kB active_anon:504396kB inactive_anon:260644kB active_file:421364kB inactive_file:2410124kB unevictable:0kB writepending:500kB present:4051548kB managed:4051548kB mlocked:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:6048kB bounce:0kB free_pcp:2300kB local_pcp:308kB free_cma:0kB Jan 26 08:59:46 firewallfsi kernel: lowmem_reserve[]: 0 0 0 0 Jan 26 08:59:46 firewallfsi kernel: [104057.248837] DMA: 5*4kB (UE) 21*8kB (UME) 17*16kB (UME) 17*32kB (UE) 8*64kB (UME) 3*128kB (UE) 3*256kB (ME) 1*512kB (U) 0*1024kB 0*2048kB 0*4096kB = 3180kB Jan 26 08:59:46 firewallfsi kernel: Normal: 432*4kB (UMEH) 24*8kB (UH) 13*16kB (H) 11*32kB (H) 8*64kB (H) 4*128kB (H) 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 3504kB Jan 26 08:59:46 firewallfsi kernel: HighMem: 7*4kB (U) 205*8kB (UM) 58*16kB (UM) 21*32kB (UM) 12*64kB (UM) 11*128kB (UM) 3*256kB (UM) 31*512kB (M) 4*1024kB (M) 1*2048kB (M) 98*4096kB 
(M) = 429636kB Jan 26 08:59:46 firewallfsi kernel: [104057.256399] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB Jan 26 08:59:46 firewallfsi kernel: [104057.257716] 796796 total pagecache pages Jan 26 08:59:46 firewallfsi kernel: [104057.259050] 99 pages in swap cache Jan 26 08:59:46 firewallfsi kernel: [104057.260364] Swap cache stats: add 1286, delete 1187, find 150/171 Jan 26 08:59:46 firewallfsi kernel: [104057.261639] Free swap = 33779604kB Jan 26 08:59:46 firewallfsi kernel: [104057.262912] Total swap = 33784572kB Jan 26 08:59:46 firewallfsi kernel: [104057.264179] 1240111 pages RAM Jan 26 08:59:46 firewallfsi kernel: [104057.265440] 1012887 pages HighMem/MovableOnly Jan 26 08:59:46 firewallfsi kernel: [104057.266706] 18954 pages reserved Jan 26 08:59:46 firewallfsi kernel: [104057.267969] 0 pages hwpoisoned Jan 26 08:59:46 firewallfsi kernel: [104057.269225] [ pid ] uid tgid total_vm rss nr_ptes nr_pmds swapents oom_score_adj name Jan 26 08:59:46 firewallfsi kernel: [104057.270519] [ 588] 0 588 3279 1647 10 3 29 0 systemd-journal Jan 26 08:59:46 firewallfsi kernel: [104057.271779] [ 624] 0 624 3592 832 8 3 344 -1000 systemd-udevd Jan 26 08:59:46 firewallfsi kernel: [104057.273037] [ 728] 0 728 8633 1065 10 3 0 0 rsyslogd Jan 26 08:59:46 firewallfsi kernel: [104057.274298] [ 733] 0 733 1704 625 6 3 0 0 sh Jan 26 08:59:46 firewallfsi kernel: [104057.275559] [ 736] 0 736 1033 628 5 3 0 0 irqbalance Jan 26 08:59:46 firewallfsi kernel: [104057.276816] [ 737] 288 737 16729 1733 19 3 0 0 milter-greylist Jan 26 08:59:46 firewallfsi kernel: [104057.278074] [ 739] 0 739 1704 616 6 3 0 0 sh Jan 26 08:59:46 firewallfsi kernel: [104057.279326] [ 750] 0 750 1704 644 6 3 0 0 sh Jan 26 08:59:46 firewallfsi kernel: [104057.280573] [ 751] 0 751 1704 676 6 3 0 0 sh Jan 26 08:59:46 firewallfsi kernel: [104057.281764] [ 767] 0 767 1005 714 6 3 0 0 systemd-logind Jan 26 08:59:46 firewallfsi kernel: [104057.282949] [ 768] 0 768 1472 995 6 3 0 0 
smartd Jan 26 08:59:46 firewallfsi kernel: [104057.284124] [ 771] 0 771 1704 675 7 3 0 0 sh Jan 26 08:59:46 firewallfsi kernel: [104057.285288] [ 772] 0 772 1704 657 6 3 0 0 sh Jan 26 08:59:46 firewallfsi kernel: [104057.286442] [ 777] 0 777 800 486 6 3 0 0 mdadm Jan 26 08:59:46 firewallfsi kernel: [104057.287590] [ 779] 0 779 1704 627 7 3 0 0 sh Jan 26 08:59:46 firewallfsi kernel: [104057.288729] [ 790] 81 790 1700 1036 7 3 0 -900 dbus-daemon Jan 26 08:59:46 firewallfsi kernel: [104057.289848] [ 810] 0 810 3266 2194 10 3 0 0 restarter Jan 26 08:59:46 firewallfsi kernel: [104057.290942] [ 811] 0 811 3234 2159 9 3 0 0 dynamic-ip-upda Jan 26 08:59:46 firewallfsi kernel: [104057.291965] [ 812] 0 812 3165 2073 10 3 0 0 mailwarnings Jan 26 08:59:46 firewallfsi kernel: [104057.292949] [ 814] 0 814 1736 717 6 3 0 0 tickle-pog Jan 26 08:59:46 firewallfsi kernel: [104057.293907] [ 815] 0 815 1819 804 6 3 0 0 watch-services Jan 26 08:59:46 firewallfsi kernel: [104057.294839] [ 816] 0 816 2708 1666 9 3 0 0 udp-sgr Jan 26 08:59:46 firewallfsi kernel: [104057.295754] [ 817] 0 817 3695 1987 10 3 0 0 fetchmail Jan 26 08:59:46 firewallfsi kernel: [104057.296646] [ 845] 0 845 2499 370 9 3 0 0 saslauthd Jan 26 08:59:46 firewallfsi kernel: [104057.297512] [ 846] 0 846 2518 912 9 3 0 0 saslauthd Jan 26 08:59:46 firewallfsi kernel: [104057.298343] [ 847] 0 847 2499 126 9 3 0 0 saslauthd Jan 26 08:59:46 firewallfsi kernel: [104057.299150] [ 848] 0 848 2518 912 9 3 0 0 saslauthd Jan 26 08:59:46 firewallfsi kernel: [104057.299942] [ 849] 0 849 2499 126 9 3 0 0 saslauthd Jan 26 08:59:46 firewallfsi kernel: [104057.300691] [ 888] 0 888 2769 731 9 3 0 -1000 sshd Jan 26 08:59:46 firewallfsi kernel: [104057.301407] [ 930] 0 930 1704 634 6 3 0 0 sh Jan 26 08:59:46 firewallfsi kernel: [104057.302105] [ 931] 0 931 3309 2240 10 3 0 0 watch-ip Jan 26 08:59:46 firewallfsi kernel: [104057.302765] [ 932] 0 932 583 403 5 3 0 0 acpid Jan 26 08:59:46 firewallfsi kernel: [104057.303380] [ 937] 0 937 7287 
1292 11 3 0 0 apcupsd
Jan 26 08:59:46 firewallfsi kernel: [104057.303975] [ 973] 0 973 860 491 5 3 0 0 atd
Jan 26 08:59:46 firewallfsi kernel: [104057.304544] [ 1036] 0 1036 7049 2372 16 3 0 0 nmbd
Jan 26 08:59:46 firewallfsi kernel: [104057.305081] [ 1037] 0 1037 6840 2278 16 3 0 0 nmbd
Jan 26 08:59:46 firewallfsi kernel: [104057.305593] [ 1052] 25 1052 82757 48632 115 3 0 0 named
Jan 26 08:59:46 firewallfsi kernel: [104057.306084] [ 1065] 27 1065 1707 767 6 3 0 0 mysqld_safe
Jan 26 08:59:46 firewallfsi kernel: [104057.306568] [ 1235] 27 1235 126038 14727 64 3 0 0 mysqld
Jan 26 08:59:46 firewallfsi kernel: [104057.307046] [ 1239] 0 1239 12052 6480 27 3 0 0 /usr/sbin/httpd
Jan 26 08:59:46 firewallfsi kernel: [104057.307528] [ 1357] 0 1357 1116 484 6 3 0 0 agetty
Jan 26 08:59:46 firewallfsi kernel: [104057.307996] [ 1358] 0 1358 1116 545 6 3 0 0 agetty
Jan 26 08:59:46 firewallfsi kernel: [104057.308466] [ 1359] 0 1359 1116 546 6 3 0 0 agetty
Jan 26 08:59:46 firewallfsi kernel: [104057.308926] [ 1360] 0 1360 1116 487 5 3 0 0 agetty
Jan 26 08:59:46 firewallfsi kernel: [104057.309382] [ 1361] 0 1361 1116 519 6 3 0 0 agetty
Jan 26 08:59:46 firewallfsi kernel: [104057.309836] [ 1362] 0 1362 1116 477 6 3 0 0 agetty
Jan 26 08:59:46 firewallfsi kernel: [104057.310281] [ 1578] 48 1578 49258 4298 45 3 0 0 /usr/sbin/httpd
Jan 26 08:59:46 firewallfsi kernel: [104057.310724] [ 1579] 48 1579 16216 4543 28 3 0 0 /usr/sbin/httpd
Jan 26 08:59:46 firewallfsi kernel: [104057.311144] [ 1582] 48 1582 16220 4299 28 3 0 0 /usr/sbin/httpd
Jan 26 08:59:46 firewallfsi kernel: [104057.311561] [ 1599] 48 1599 16386 4796 29 3 0 0 /usr/sbin/httpd
Jan 26 08:59:46 firewallfsi kernel: [104057.311966] [ 1604] 48 1604 16386 4558 29 3 0 0 /usr/sbin/httpd
Jan 26 08:59:46 firewallfsi kernel: [104057.312366] [ 1754] 0 1754 1704 632 6 3 0 0 sh
Jan 26 08:59:46 firewallfsi kernel: [104057.312773] [ 1755] 0 1755 2741 1664 9 3 0 0 udp-sgs
Jan 26 08:59:46 firewallfsi kernel: [104057.313167] [ 1759] 0 1759 9561 3551 22 3 0 0 smbd
Jan 26 08:59:46 firewallfsi kernel: [104057.313551] [ 1760] 0 1760 9197 1115 22 3 0 0 smbd
Jan 26 08:59:46 firewallfsi kernel: [104057.313931] [ 1761] 0 1761 9446 1192 22 3 0 0 smbd
Jan 26 08:59:46 firewallfsi kernel: [104057.314311] [ 1850] 0 1850 5116 2511 13 3 0 0 dhclient
Jan 26 08:59:46 firewallfsi kernel: [104057.314698] [ 1927] 0 1927 594 420 5 3 0 0 pptpd
Jan 26 08:59:46 firewallfsi kernel: [104057.315080] [ 1929] 0 1929 954 672 6 3 0 0 dovecot
Jan 26 08:59:46 firewallfsi kernel: [104057.315467] [ 1930] 97 1930 904 548 5 3 0 0 anvil
Jan 26 08:59:46 firewallfsi kernel: [104057.315853] [ 1931] 0 1931 937 581 5 3 0 0 log
Jan 26 08:59:46 firewallfsi kernel: [104057.316238] [ 1938] 0 1938 1899 739 7 3 0 0 crond
Jan 26 08:59:46 firewallfsi kernel: [104057.316629] [ 1939] 177 1939 6832 4716 17 3 0 0 dhcpd
Jan 26 08:59:46 firewallfsi kernel: [104057.317013] [ 1941] 38 1941 1536 1054 6 3 0 0 ntpd
Jan 26 08:59:46 firewallfsi kernel: [104057.317393] [ 1944] 48 1944 16216 4289 28 3 0 0 /usr/sbin/httpd
Jan 26 08:59:46 firewallfsi kernel: [104057.317783] [ 1947] 48 1947 16220 4571 28 3 0 0 /usr/sbin/httpd
Jan 26 08:59:46 firewallfsi kernel: [104057.318161] [ 1948] 48 1948 16220 4500 28 3 0 0 /usr/sbin/httpd
Jan 26 08:59:46 firewallfsi kernel: [104057.318539] [ 1960] 302 1960 1446 1049 7 3 0 0 imap
Jan 26 08:59:46 firewallfsi kernel: [104057.318913] [ 1962] 301 1962 1647 1133 8 3 0 0 imap
Jan 26 08:59:46 firewallfsi kernel: [104057.319282] [ 1965] 303 1965 1394 921 6 3 0 0 imap
Jan 26 08:59:46 firewallfsi kernel: [104057.319652] [ 2090] 0 2090 4181 1978 11 3 5 0 sshd
Jan 26 08:59:46 firewallfsi kernel: [104057.320019] [ 2092] 0 2092 1584 1140 6 3 0 0 systemd
Jan 26 08:59:46 firewallfsi kernel: [104057.320390] [ 2093] 0 2093 6809 333 11 3 62 0 (sd-pam)
Jan 26 08:59:46 firewallfsi kernel: [104057.320771] [ 2096] 0 2096 4214 1103 11 3 1 0 sshd
Jan 26 08:59:46 firewallfsi kernel: [104057.321149] [ 2097] 0 2097 1239 989 6 3 0 0 tcsh
Jan 26 08:59:46 firewallfsi kernel: [104057.321536] [ 2410] 0 2410 1142 800 5 3 0 0 config
Jan 26 08:59:46 firewallfsi kernel: [104057.321916] [ 2619] 304 2619 1399 906 6 3 0 0 imap
Jan 26 08:59:46 firewallfsi kernel: [104057.322295] [ 6076] 0 6076 1307 923 6 3 0 0 reboot-when-oom
Jan 26 08:59:46 firewallfsi kernel: [104057.322685] [10382] 48 10382 16216 4285 28 3 0 0 /usr/sbin/httpd
Jan 26 08:59:46 firewallfsi kernel: [104057.323076] [12318] 273 12318 2208 1470 8 3 0 0 imap-login
Jan 26 08:59:46 firewallfsi kernel: [104057.323475] [12319] 301 12319 1672 1079 7 3 0 0 imap
Jan 26 08:59:46 firewallfsi kernel: [104057.323863] [10772] 273 10772 2209 1439 8 3 0 0 imap-login
Jan 26 08:59:46 firewallfsi kernel: [104057.324243] [10776] 301 10776 1396 1008 6 3 0 0 imap
Jan 26 08:59:46 firewallfsi kernel: [104057.324630] [11099] 0 11099 11587 4409 26 3 0 0 smbd
Jan 26 08:59:46 firewallfsi kernel: [104057.325012] [12143] 0 12143 589 167 5 3 0 0 tail
Jan 26 08:59:46 firewallfsi kernel: [104057.325392] [18786] 276 18786 118337 104981 225 3 0 0 clamd
Jan 26 08:59:46 firewallfsi kernel: [104057.325779] [18801] 290 18801 13143 1160 15 3 0 0 clamav-milter
Jan 26 08:59:46 firewallfsi kernel: [104057.326164] [18822] 0 18822 3830 1760 11 3 0 0 sendmail
Jan 26 08:59:46 firewallfsi kernel: [104057.326550] [18838] 51 18838 3501 781 10 3 0 0 sendmail
Jan 26 08:59:46 firewallfsi kernel: [104057.326935] [18953] 23 18953 5440 723 13 3 0 0 squid
Jan 26 08:59:46 firewallfsi kernel: [104057.327324] [18955] 23 18955 9136 5994 20 3 0 0 squid
Jan 26 08:59:46 firewallfsi kernel: [104057.327720] [18956] 23 18956 1179 411 6 3 0 0 unlinkd
Jan 26 08:59:46 firewallfsi kernel: [104057.328116] [13386] 0 13386 1396 169 6 3 0 0 sleep
Jan 26 08:59:46 firewallfsi kernel: [104057.328520] [13604] 273 13604 2209 1435 8 3 0 0 imap-login
Jan 26 08:59:46 firewallfsi kernel: [104057.328918] [13608] 302 13608 1419 1042 6 3 0 0 imap
Jan 26 08:59:46 firewallfsi kernel: [104057.329316] [18231] 0 18231 4181 1967 11 3 0 0 sshd
Jan 26 08:59:46 firewallfsi kernel: [104057.329725] [18233] 299 18233 1603 1165 7 3 0 0 systemd
Jan 26 08:59:46 firewallfsi kernel: [104057.330148] [18234] 299 18234 6809 364 11 3 32 0 (sd-pam)
Jan 26 08:59:46 firewallfsi kernel: [104057.330564] [18237] 299 18237 4214 1127 11 3 0 0 sshd
Jan 26 08:59:46 firewallfsi kernel: [104057.330988] [18238] 299 18238 879 564 5 3 0 0 bash
Jan 26 08:59:46 firewallfsi kernel: [104057.331394] [18243] 299 18243 7292 6386 18 3 0 0 rdiff-backup
Jan 26 08:59:46 firewallfsi kernel: [104057.331810] [18244] 299 18244 619 220 4 3 0 0 pv
Jan 26 08:59:46 firewallfsi kernel: [104057.332238] [18905] 0 18905 1396 164 6 3 0 0 sleep
Jan 26 08:59:46 firewallfsi kernel: [104057.332661] Out of memory: Kill process 18786 (clamd) score 10 or sacrifice child
Jan 26 08:59:46 firewallfsi kernel: [104057.333107] Killed process 18786 (clamd) total-vm:473348kB, anon-rss:403644kB, file-rss:16280kB, shmem-rss:0kB
Jan 26 08:59:46 firewallfsi kernel: [104057.366106] audit: type=1131 audit(1485442786.940:3251): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=clamd@scan comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=failed'
Jan 26 08:59:47 firewallfsi kernel: [104057.493007] audit: type=1130 audit(1485442787.067:3252): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=clamd@scan comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Jan 26 08:59:47 firewallfsi kernel: [104057.495502] audit: type=1131 audit(1485442787.070:3253): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=clamd@scan comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Jan 26 08:59:47 firewallfsi kernel: [104057.497128] audit: type=1130 audit(1485442787.071:3254): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=clamd@scan comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
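For context, the "score 10" in the kill line above comes from the kernel's badness heuristic, which picks the victim. The sketch below is a simplification of the heuristic of that era, not the exact mm/oom_kill.c code; the helper name, the rounding, and the totalpages value are mine.

```python
def oom_badness(rss_pages, swap_pages, pgtable_pages, oom_score_adj, totalpages):
    # Rough shape of the heuristic: a task's cost is its resident set
    # plus swap plus page-table pages, biased by oom_score_adj in
    # thousandths of totalpages.  The printed "score" is roughly this
    # value normalised against totalpages.
    points = rss_pages + swap_pages + pgtable_pages
    points += oom_score_adj * totalpages // 1000
    return max(points, 1)

# clamd's ~105k resident pages (rss 104981 in the table above) made it
# by far the largest target; the totalpages figure here is made up.
clamd_points = oom_badness(104_981, 0, 225, 0, 1_500_000)
```

This is why clamd, the biggest RSS consumer in the table, was the one sacrificed even though the trigger was lowmem exhaustion rather than a global shortage.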
* Re: mm, vmscan: commit makes PAE kernel crash nightly (bisected)
  2017-01-26 23:18 ` Trevor Cordes
@ 2017-01-27  7:36 ` Michal Hocko
  0 siblings, 0 replies; 40+ messages in thread
From: Michal Hocko @ 2017-01-27 7:36 UTC (permalink / raw)
To: Trevor Cordes
Cc: Mel Gorman, linux-kernel, Joonsoo Kim, Minchan Kim, Rik van Riel, Srikar Dronamraju

On Thu 26-01-17 17:18:58, Trevor Cordes wrote:
> On 2017-01-24 Michal Hocko wrote:
> > On Sun 22-01-17 18:45:59, Trevor Cordes wrote:
> > [...]
> > > Also, completely separate from your patch, I ran mhocko's 4.9 tree
> > > with mem=2G to see if a lower RAM amount would help, but it didn't.
> > > Even with 2G the system oom'd and hung the same as usual. So far the
> > > only thing that helps at all is the cgroup_disable=memory option,
> > > which makes the problem disappear completely for me.
> >
> > OK, can we reduce the problem space slightly more and could you boot
> > with kmem accounting enabled? cgroup.memory=nokmem,nosocket
>
> I ran for 30 hours with cgroup.memory=nokmem,nosocket using vanilla
> 4.9.0+ and it oom'd during a big rdiff-backup at 9am. My script was
> able to reboot it before it hung. Only one oom occurred before the
> reboot, which is a bit odd; usually there are 5-50. See the attached
> messages log (oom6).
>
> So, still, only cgroup_disable=memory mitigates this bug (so far). If
> you need me to test cgroup.memory=nokmem,nosocket with your since-4.9
> branch specifically, let me know and I'll add it to the to-test list.

OK, that matches the theory that these OOMs are caused by the incorrect
active list aging fixed by b4536f0c829c ("mm, memcg: fix the active list
aging for lowmem requests when memcg is enabled").
-- 
Michal Hocko
SUSE Labs
* Re: mm, vmscan: commit makes PAE kernel crash nightly (bisected)
  2017-01-20  6:35 ` Trevor Cordes
  2017-01-20 11:02 ` Mel Gorman
@ 2017-01-24 12:51 ` Michal Hocko
  1 sibling, 0 replies; 40+ messages in thread
From: Michal Hocko @ 2017-01-24 12:51 UTC (permalink / raw)
To: Trevor Cordes
Cc: Mel Gorman, linux-kernel, Joonsoo Kim, Minchan Kim, Rik van Riel, Srikar Dronamraju

On Fri 20-01-17 00:35:44, Trevor Cordes wrote:
> On 2017-01-19 Michal Hocko wrote:
> > On Thu 19-01-17 03:48:50, Trevor Cordes wrote:
> > > On 2017-01-17 Michal Hocko wrote:
> > > > On Tue 17-01-17 14:21:14, Mel Gorman wrote:
> > > > > On Tue, Jan 17, 2017 at 02:52:28PM +0100, Michal Hocko wrote:
> > > > > > On Mon 16-01-17 11:09:34, Mel Gorman wrote:
> > > > > > [...]
> > > > > > > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > > > > > > index 532a2a750952..46aac487b89a 100644
> > > > > > > --- a/mm/vmscan.c
> > > > > > > +++ b/mm/vmscan.c
> > > > > > > @@ -2684,6 +2684,7 @@ static void shrink_zones(struct zonelist *zonelist, struct scan_control *sc)
> > > > > > >  			continue;
> > > > > > >
> > > > > > >  		if (sc->priority != DEF_PRIORITY &&
> > > > > > > +		    !buffer_heads_over_limit &&
> > > > > > >  		    !pgdat_reclaimable(zone->zone_pgdat))
> > > > > > >  			continue;	/* Let kswapd poll it */
> > > > > >
> > > > > > I think we should rather remove pgdat_reclaimable here. This
> > > > > > sounds like a wrong layer to decide whether we want to reclaim
> > > > > > and how much.
> > > > >
> > > > > I had considered that but it'd also be important to add the
> > > > > other 32-bit patches you have posted to see the impact. Because
> > > > > of the ratio of LRU pages to slab pages, it may not have an
> > > > > impact but it'd need to be eliminated.
> > > >
> > > > OK, Trevor, you can pull from the
> > > > git://git.kernel.org/pub/scm/linux/kernel/git/mhocko/mm.git tree,
> > > > fixes/highmem-node-fixes branch. This contains the current mmotm
> > > > tree + the latest highmem fixes. I also do not expect this would
> > > > help much in your case but, as Mel said, we should rule that out
> > > > at least.
> > >
> > > Hi! The git tree version above oom'd after < 24 hours (3:02am) so
> > > it doesn't solve the bug. If you need an oom messages dump let me
> > > know.
> >
> > Yes please.
>
> The first oom from that night is attached. Note, the oom wasn't as dire
> with your mhocko/4.9.0+ as it usually is with stock 4.8.x: my oom
> detector and reboot script was able to do its thing cleanly before the
> system became unusable.

Just for reference: this oom was due to the bug with active LRU aging
fixed in the Linus tree by b4536f0c829c ("mm, memcg: fix the active list
aging for lowmem requests when memcg is enabled"), in 4.10-rc4.

Jan 19 03:02:19 firewallfsi kernel: [85602.858232] Normal free:3436kB min:3532kB low:4412kB high:5292kB active_anon:4kB inactive_anon:8kB active_file:193340kB inactive_file:120kB unevictable:0kB writepending:2516kB present:892920kB managed:816932kB mlocked:0kB slab_reclaimable:522292kB slab_unreclaimable:46724kB kernel_stack:2560kB pagetables:0kB bounce:0kB free_pcp:3468kB local_pcp:176kB free_cma:0kB

Look at how nearly all the reclaimable memory is stuck on active_file,
with almost nothing on inactive_file...
-- 
Michal Hocko
SUSE Labs
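Zone report lines like the one above are easy to pull apart mechanically, which makes the imbalance Michal points at jump out. A small sketch (the helper is mine, using the numbers from this report):

```python
import re

def parse_zone_line(line):
    # Pull every "key:<value>kB" pair out of a show_mem() zone report.
    return {k: int(v) for k, v in re.findall(r"(\w+):(\d+)kB", line)}

# The Normal zone line from the Jan 19 oom report:
zone = parse_zone_line(
    "Normal free:3436kB min:3532kB low:4412kB high:5292kB "
    "active_anon:4kB inactive_anon:8kB active_file:193340kB "
    "inactive_file:120kB slab_reclaimable:522292kB "
    "slab_unreclaimable:46724kB")

below_min = zone["free"] < zone["min"]                      # watermark breached
aging_stalled = zone["inactive_file"] < zone["active_file"] // 100
```

Both flags come out true here: free is below the min watermark (which is what pushes allocations into the OOM path), and the inactive file list holds under 1% of what the active list does, the signature of the active-list aging bug fixed by b4536f0c829c.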
* Re: mm, vmscan: commit makes PAE kernel crash nightly (bisected)
  2017-01-16 11:09 ` Mel Gorman
  2017-01-17 13:52 ` Michal Hocko
@ 2017-01-18  6:52 ` Trevor Cordes
  1 sibling, 0 replies; 40+ messages in thread
From: Trevor Cordes @ 2017-01-18 6:52 UTC (permalink / raw)
To: Mel Gorman
Cc: Michal Hocko, linux-kernel, Joonsoo Kim, Minchan Kim, Rik van Riel, Srikar Dronamraju

On 2017-01-16 Mel Gorman wrote:
> > > You can easily check whether this is memcg related by trying to
> > > run the same workload with the cgroup_disable=memory kernel
> > > command line parameter. This will put all the memcg specifics out
> > > of the way.
> >
> > I will try booting now into cgroup_disable=memory to see if that
> > helps at all. I'll reply back in 48 hours, or when it oom's,
> > whichever comes first.
>
> Thanks.

It has successfully survived 70 hours and two 3am cycles (when it
normally oom's) with your first patch *and* cgroup_disable=memory
grafted onto Fedora's 4.8.13. Since it has never before survived two
3am cycles, I strongly suspect cgroup_disable=memory mitigates my bug.

> > Also, should I bother trying the latest git HEAD to see if that
> > solves anything? Thanks!
>
> That's worth trying. If that also fails then could you try the
> following hack to encourage direct reclaim to reclaim slab when
> buffers are over the limit please?
>
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 532a2a750952..46aac487b89a 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -2684,6 +2684,7 @@ static void shrink_zones(struct zonelist *zonelist, struct scan_control *sc)
>  			continue;
>
>  		if (sc->priority != DEF_PRIORITY &&
> +		    !buffer_heads_over_limit &&
>  		    !pgdat_reclaimable(zone->zone_pgdat))
>  			continue;	/* Let kswapd poll it */

What's the next best step? HEAD? HEAD + the above patch? A new patch?
I'll start a HEAD compile until I hear more. I assume I should test
without cgroup_disable=memory since that's just a kludge/workaround,
right?

Also, is there a way to spot the slab pressure you are talking about
before the oom's occur? slabinfo? I suppose I'd be able to see some
counter slowly getting too high or low? Thanks!
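One rough way to watch for the pressure being asked about is to track reclaimable slab against free plus file-LRU memory over time. This is an illustrative sketch rather than anything proposed in the thread; the field names follow /proc/meminfo, but the helper and the ratio itself are made up for illustration.

```python
def lowmem_slab_ratio(meminfo_text):
    # Reclaimable slab versus free + file-LRU memory.  A value that
    # climbs steadily over hours is the kind of creep to watch for.
    # Note /proc/meminfo is system-wide; on a PAE box the per-zone
    # numbers in /proc/zoneinfo would be the sharper signal, since the
    # problem here is confined to the Normal (lowmem) zone.
    info = {}
    for line in meminfo_text.splitlines():
        key, rest = line.split(":", 1)
        info[key] = int(rest.split()[0])  # values are in kB
    lru = info["Active(file)"] + info["Inactive(file)"]
    return info["SReclaimable"] / max(info["MemFree"] + lru, 1)

# Figures resembling the lowmem state in the oom reports above:
sample = """MemFree: 3484 kB
Active(file): 3412 kB
Inactive(file): 1560 kB
SReclaimable: 711068 kB"""
ratio = lowmem_slab_ratio(sample)
```

On a healthy box this ratio tends to stay small; a sample like the one above, where slab dwarfs everything the LRUs hold by nearly two orders of magnitude, is exactly the pre-OOM state seen in these logs.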
* Re: mm, vmscan: commit makes PAE kernel crash nightly (bisected)
  2017-01-15  6:27 ` Trevor Cordes
  2017-01-16 11:09 ` Mel Gorman
@ 2017-01-17 13:45 ` Michal Hocko
  1 sibling, 0 replies; 40+ messages in thread
From: Michal Hocko @ 2017-01-17 13:45 UTC (permalink / raw)
To: Trevor Cordes
Cc: Mel Gorman, linux-kernel, Joonsoo Kim, Minchan Kim, Rik van Riel, Srikar Dronamraju

On Sun 15-01-17 00:27:52, Trevor Cordes wrote:
> On 2017-01-12 Michal Hocko wrote:
> > On Wed 11-01-17 16:52:32, Trevor Cordes wrote:
> > [...]
> > > I'm not sure how I can tell if my bug is because of memcgs so here
> > > is a full first oom example (attached).
> >
> > The 4.7 kernel doesn't contain 71c799f4982d ("mm: add per-zone lru
> > list stat") so the OOM report will not tell us whether the Normal
> > zone doesn't age active lists, unfortunately.
>
> I compiled the patch Mel provided into the stock F23 kernel
> 4.8.13-100.fc23.i686+PAE and it ran for 2 nights. It didn't oom the
> first night, but did the second night. So the bug persists even with
> that patch. However, it does *seem* a bit "better" since it took 2
> nights (usually takes only one, but maybe 10% of the time it does take
> two) before oom'ing, *and* it allowed my reboot script to reboot it
> cleanly when it saw the oom (which happens only 25% of the time).
>
> I'm attaching the 4.8.13 oom message which should have the memcg info
> (71c799f4982d) you are asking for above? Hopefully?

It doesn't have the memcg info, which is not part of the current
vanilla kernel output either. But we do have per-zone LRU counters,
which is what I was after. So you have the correct patch. Sorry if I
confused you.

[167409.074463] nmbd invoked oom-killer: gfp_mask=0x27000c0(GFP_KERNEL_ACCOUNT|__GFP_NOTRACK), order=1, oom_score_adj=0

again a lowmem request
[...]
[167409.074576] Normal free:3484kB min:3544kB low:4428kB high:5312kB active_anon:0kB inactive_anon:0kB active_file:3412kB inactive_file:1560kB unevictable:0kB writepending:0kB present:892920kB managed:815216kB mlocked:0kB slab_reclaimable:711068kB slab_unreclaimable:49496kB kernel_stack:2904kB pagetables:0kB bounce:0kB free_pcp:240kB local_pcp:120kB free_cma:0kB

but have a look here. There are basically no pages on the Normal zone
LRU lists. There is a huge amount of slab allocated here, but we are
not able to reclaim it because we scale slab reclaim based on the LRU
reclaim. This is an inherent problem of the current design and we
should address it. It is nothing really new; we just didn't have many
affected users, because having a majority of memory consumed by slab is
not a usual situation. It seems you just hit a more aggressive slab
user with newer kernels. Using the 32b kernel really makes all this
worse because all those allocations go to the Normal and DMA zones,
which pushes LRU pages out of those zones.

> > You can easily check whether this is memcg related by trying to run
> > the same workload with the cgroup_disable=memory kernel command line
> > parameter. This will put all the memcg specifics out of the way.
>
> I will try booting now into cgroup_disable=memory to see if that helps
> at all. I'll reply back in 48 hours, or when it oom's, whichever comes
> first.

This will most probably not help.

> Also, should I bother trying the latest git HEAD to see if that solves
> anything? Thanks!

It might help wrt. the slab consumers, but there is nothing there that
I would consider a fix for the general problem of slab shrinking, I am
afraid.
-- 
Michal Hocko
SUSE Labs
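The proportional scaling described here — slab reclaim keyed to LRU scanning — can be sketched as a toy model. This is my own simplification of the shape of vmscan's shrinker accounting, not the kernel's actual arithmetic; the function name and constants are illustrative.

```python
def slab_scan_target(freeable, lru_scanned, lru_eligible, seeks=2):
    # Toy version of the proportionality: shrinkers are asked to scan a
    # number of slab objects proportional to how many LRU pages were
    # just scanned out of those eligible.  When a zone's LRU lists are
    # nearly empty, lru_scanned stays tiny, so even a slab that
    # dominates the zone is barely nibbled at -- the imbalance in the
    # Normal zone report above (711MB reclaimable slab, ~5MB file LRU).
    delta = (4 * lru_scanned // seeks) * freeable // (lru_eligible + 1)
    return min(delta, freeable)

# A tiny LRU scan against a huge slab reclaims almost nothing:
starved = slab_scan_target(1_000_000, 10, 200_000)
# A full LRU scan would be allowed to go after all of it:
healthy = slab_scan_target(1_000_000, 200_000, 200_000)
```

With ten LRU pages scanned, the model asks shrinkers for only a double-digit number of objects out of a million freeable, which mirrors why the Normal zone here fills with slab until lowmem allocations fail.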
end of thread, other threads:[~2017-02-05 22:54 UTC | newest]

Thread overview: 40+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-01-11 10:32 mm, vmscan: commit makes PAE kernel crash nightly (bisected) Trevor Cordes
2017-01-11 12:11 ` Mel Gorman
2017-01-11 12:14 ` Mel Gorman
2017-01-11 22:52 ` Trevor Cordes
2017-01-12  9:36 ` Michal Hocko
2017-01-15  6:27 ` Trevor Cordes
2017-01-16 11:09 ` Mel Gorman
2017-01-17 13:52 ` Michal Hocko
2017-01-17 14:21 ` Mel Gorman
2017-01-17 14:54 ` Michal Hocko
2017-01-18  7:25 ` Trevor Cordes
2017-01-18 17:48 ` Mel Gorman
2017-01-18 18:07 ` Mel Gorman
2017-01-19  9:48 ` Trevor Cordes
2017-01-19 11:37 ` Michal Hocko
2017-01-20  6:35 ` Trevor Cordes
2017-01-20 11:02 ` Mel Gorman
2017-01-20 15:55 ` Mel Gorman
2017-01-23  0:45 ` Trevor Cordes
2017-01-23 10:48 ` Mel Gorman
2017-01-23 11:04 ` Mel Gorman
2017-01-25  9:46 ` Michal Hocko
2017-01-24 12:59 ` Michal Hocko
2017-01-25 10:02 ` Trevor Cordes
2017-01-25 12:04 ` Michal Hocko
2017-01-29 22:50 ` Trevor Cordes
2017-01-30  7:51 ` Michal Hocko
2017-02-01  9:29 ` Trevor Cordes
2017-02-01 10:14 ` Michal Hocko
2017-02-04  0:36 ` Trevor Cordes
2017-02-04 20:05 ` Rik van Riel
2017-02-05 10:03 ` Michal Hocko
2017-02-05 22:53 ` Trevor Cordes
2017-01-30  9:10 ` Mel Gorman
2017-01-24 12:54 ` Michal Hocko
2017-01-26 23:18 ` Trevor Cordes
2017-01-27  7:36 ` Michal Hocko
2017-01-24 12:51 ` Michal Hocko
2017-01-18  6:52 ` Trevor Cordes
2017-01-17 13:45 ` Michal Hocko
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).