Subject: [PATCH 0/2 -mmotm] oom: show single slab cache in oom whose size > 10% of total system memory
From: Yang Shi
Date: 2017-10-25 22:48 UTC
To: cl, penberg, rientjes, iamjoonsoo.kim, akpm, mhocko
Cc: Yang Shi, linux-mm, linux-kernel

Per the suggestion from David [1], this implementation dumps a single slab cache if its size is over 10% of total system memory. In the current implementation, the ratio is fixed at 10%.

To get the size of total system memory, patch #1 extracts the common code from show_mem() so that it can be reused for checking the ratio.

The patchset is based on the mmotm tree.

[1] https://marc.info/?l=linux-mm&m=150819933626604&w=2

Yang Shi (2):
      mm: extract common code for calculating total memory size
      mm: oom: dump single excessive slab cache when oom

 include/linux/mm.h | 25 +++++++++++++++++++++++++
 lib/show_mem.c     | 20 +-------------------
 mm/oom_kill.c      | 22 +---------------------
 mm/slab.h          |  4 ++--
 mm/slab_common.c   | 21 ++++++++++++++++-----
 5 files changed, 45 insertions(+), 47 deletions(-)
Subject: [PATCH 1/2] mm: extract common code for calculating total memory size
From: Yang Shi
Date: 2017-10-25 22:48 UTC
To: cl, penberg, rientjes, iamjoonsoo.kim, akpm, mhocko
Cc: Yang Shi, linux-mm, linux-kernel

Total memory size is needed by the unreclaimable-slab OOM path to check whether significant memory is used by a single slab. The calculation is currently done in show_mem(), so extract the common code so that it can be shared.

Signed-off-by: Yang Shi <yang.s@alibaba-inc.com>
---
 include/linux/mm.h | 25 +++++++++++++++++++++++++
 lib/show_mem.c     | 20 +-------------------
 2 files changed, 26 insertions(+), 19 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 935c4d4..e21b81e 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2050,6 +2050,31 @@ extern int __meminit __early_pfn_to_nid(unsigned long pfn,
 static inline void zero_resv_unavail(void) {}
 #endif
 
+static inline void calc_mem_size(unsigned long *total, unsigned long *reserved,
+				 unsigned long *highmem)
+{
+	pg_data_t *pgdat;
+
+	for_each_online_pgdat(pgdat) {
+		unsigned long flags;
+		int zoneid;
+
+		pgdat_resize_lock(pgdat, &flags);
+		for (zoneid = 0; zoneid < MAX_NR_ZONES; zoneid++) {
+			struct zone *zone = &pgdat->node_zones[zoneid];
+			if (!populated_zone(zone))
+				continue;
+
+			*total += zone->present_pages;
+			*reserved += zone->present_pages - zone->managed_pages;
+
+			if (is_highmem_idx(zoneid))
+				*highmem += zone->present_pages;
+		}
+		pgdat_resize_unlock(pgdat, &flags);
+	}
+}
+
 extern void set_dma_reserve(unsigned long new_dma_reserve);
 extern void memmap_init_zone(unsigned long, int, unsigned long,
 				unsigned long, enum memmap_context);
diff --git a/lib/show_mem.c b/lib/show_mem.c
index 0beaa1d..115475e 100644
--- a/lib/show_mem.c
+++ b/lib/show_mem.c
@@ -11,30 +11,12 @@
 void show_mem(unsigned int filter, nodemask_t *nodemask)
 {
-	pg_data_t *pgdat;
 	unsigned long total = 0, reserved = 0, highmem = 0;
 
 	printk("Mem-Info:\n");
 	show_free_areas(filter, nodemask);
 
-	for_each_online_pgdat(pgdat) {
-		unsigned long flags;
-		int zoneid;
-
-		pgdat_resize_lock(pgdat, &flags);
-		for (zoneid = 0; zoneid < MAX_NR_ZONES; zoneid++) {
-			struct zone *zone = &pgdat->node_zones[zoneid];
-			if (!populated_zone(zone))
-				continue;
-
-			total += zone->present_pages;
-			reserved += zone->present_pages - zone->managed_pages;
-
-			if (is_highmem_idx(zoneid))
-				highmem += zone->present_pages;
-		}
-		pgdat_resize_unlock(pgdat, &flags);
-	}
+	calc_mem_size(&total, &reserved, &highmem);
 
 	printk("%lu pages RAM\n", total);
 	printk("%lu pages HighMem/MovableOnly\n", highmem);
-- 
1.8.3.1
Subject: Re: [PATCH 1/2] mm: extract common code for calculating total memory size
From: Christopher Lameter
Date: 2017-10-27 10:00 UTC
To: Yang Shi
Cc: penberg, rientjes, iamjoonsoo.kim, akpm, mhocko, linux-mm, linux-kernel

On Thu, 26 Oct 2017, Yang Shi wrote:

> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 935c4d4..e21b81e 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -2050,6 +2050,31 @@ extern int __meminit __early_pfn_to_nid(unsigned long pfn,
>  static inline void zero_resv_unavail(void) {}
>  #endif
> 
> +static inline void calc_mem_size(unsigned long *total, unsigned long *reserved,
> +				 unsigned long *highmem)
> +{

Huge inline function. This needs to go into mm/page_alloc.c or
mm/slab_common.c.
Subject: Re: [PATCH 1/2] mm: extract common code for calculating total memory size
From: Yang Shi
Date: 2017-10-27 16:51 UTC
To: Christopher Lameter
Cc: penberg, rientjes, iamjoonsoo.kim, akpm, mhocko, linux-mm, linux-kernel

On 10/27/17 3:00 AM, Christopher Lameter wrote:
> On Thu, 26 Oct 2017, Yang Shi wrote:
>
>> +static inline void calc_mem_size(unsigned long *total, unsigned long *reserved,
>> +				 unsigned long *highmem)
>> +{
>
> Huge inline function. This needs to go into mm/page_alloc.c or
> mm/slab_common.c

It is used by lib/show_mem.c too. But since it is definitely on a hot patch, I think I can change it to non-inline.

Thanks,
Yang
Subject: Re: [PATCH 1/2] mm: extract common code for calculating total memory size
From: Yang Shi
Date: 2017-10-31 16:45 UTC
To: Christopher Lameter
Cc: penberg, rientjes, iamjoonsoo.kim, akpm, mhocko, linux-mm, linux-kernel

On 10/27/17 9:51 AM, Yang Shi wrote:
> On 10/27/17 3:00 AM, Christopher Lameter wrote:
> [...]
>> Huge inline function. This needs to go into mm/page_alloc.c or
>> mm/slab_common.c
>
> It is used by lib/show_mem.c too. But since it is definitely on a hot
> patch, I think I can change it to non-inline.

I mean it is *not* on the hot path. Sorry for the typo and the inconvenience.

Yang
Subject: [PATCH 2/2] mm: oom: dump single excessive slab cache when oom
From: Yang Shi
Date: 2017-10-25 22:49 UTC
To: cl, penberg, rientjes, iamjoonsoo.kim, akpm, mhocko
Cc: Yang Shi, linux-mm, linux-kernel

Per the discussion with David [1], it looks more reasonable to just dump
the single excessive slab cache instead of dumping all slab caches when
oom.

Dump single excessive slab cache if its size is > 10% of total system
memory size when oom regardless it is unreclaimable.

[1] https://marc.info/?l=linux-mm&m=150819933626604&w=2

Suggested-by: David Rientjes <rientjes@google.com>
Signed-off-by: Yang Shi <yang.s@alibaba-inc.com>
---
 mm/oom_kill.c    | 22 +---------------------
 mm/slab.h        |  4 ++--
 mm/slab_common.c | 21 ++++++++++++++++-----
 3 files changed, 19 insertions(+), 28 deletions(-)

diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index 26add8a..f996f29 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -162,25 +162,6 @@ static bool oom_unkillable_task(struct task_struct *p,
 	return false;
 }
 
-/*
- * Print out unreclaimble slabs info when unreclaimable slabs amount is greater
- * than all user memory (LRU pages)
- */
-static bool is_dump_unreclaim_slabs(void)
-{
-	unsigned long nr_lru;
-
-	nr_lru = global_node_page_state(NR_ACTIVE_ANON) +
-		 global_node_page_state(NR_INACTIVE_ANON) +
-		 global_node_page_state(NR_ACTIVE_FILE) +
-		 global_node_page_state(NR_INACTIVE_FILE) +
-		 global_node_page_state(NR_ISOLATED_ANON) +
-		 global_node_page_state(NR_ISOLATED_FILE) +
-		 global_node_page_state(NR_UNEVICTABLE);
-
-	return (global_node_page_state(NR_SLAB_UNRECLAIMABLE) > nr_lru);
-}
-
 /**
  * oom_badness - heuristic function to determine which candidate task to kill
  * @p: task struct of which task we should calculate
@@ -443,8 +424,7 @@ static void dump_header(struct oom_control *oc, struct task_struct *p)
 		mem_cgroup_print_oom_info(oc->memcg, p);
 	else {
 		show_mem(SHOW_MEM_FILTER_NODES, oc->nodemask);
-		if (is_dump_unreclaim_slabs())
-			dump_unreclaimable_slab();
+		dump_slab_cache();
 	}
 	if (sysctl_oom_dump_tasks)
 		dump_tasks(oc->memcg, oc->nodemask);
diff --git a/mm/slab.h b/mm/slab.h
index 6a86025..818b569 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -507,9 +507,9 @@ static inline struct kmem_cache_node *get_node(struct kmem_cache *s, int node)
 int memcg_slab_show(struct seq_file *m, void *p);
 
 #if defined(CONFIG_SLAB) || defined(CONFIG_SLUB_DEBUG)
-void dump_unreclaimable_slab(void);
+void dump_slab_cache(void);
 #else
-static inline void dump_unreclaimable_slab(void)
+static inline void dump_slab_cache(void)
 {
 }
 #endif
diff --git a/mm/slab_common.c b/mm/slab_common.c
index 1b14fe0..e5bfa07 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -1311,7 +1311,18 @@ static int slab_show(struct seq_file *m, void *p)
 	return 0;
 }
 
-void dump_unreclaimable_slab(void)
+static bool inline is_dump_slabs(struct kmem_cache *s, struct slabinfo *sinfo)
+{
+	unsigned long total = 0, reserved = 0, highmem = 0;
+	unsigned long slab_size = sinfo->num_objs * s->size;
+
+	calc_mem_size(&total, &reserved, &highmem);
+
+	/* Check if single slab > 10% of total memory size */
+	return (slab_size > (total * PAGE_SIZE / 10));
+}
+
+void dump_slab_cache(void)
 {
 	struct kmem_cache *s, *s2;
 	struct slabinfo sinfo;
@@ -1324,20 +1335,20 @@ void dump_unreclaimable_slab(void)
 	 * without acquiring the mutex.
 	 */
 	if (!mutex_trylock(&slab_mutex)) {
-		pr_warn("excessive unreclaimable slab but cannot dump stats\n");
+		pr_warn("excessive slab cache but cannot dump stats\n");
 		return;
 	}
 
-	pr_info("Unreclaimable slab info:\n");
+	pr_info("The list of excessive single slab cache:\n");
 	pr_info("Name                      Used          Total\n");
 
 	list_for_each_entry_safe(s, s2, &slab_caches, list) {
-		if (!is_root_cache(s) || (s->flags & SLAB_RECLAIM_ACCOUNT))
+		if (!is_root_cache(s))
 			continue;
 
 		get_slabinfo(s, &sinfo);
 
-		if (sinfo.num_objs > 0)
+		if (is_dump_slabs(s, &sinfo))
 			pr_info("%-17s %10luKB %10luKB\n", cache_name(s),
 				(sinfo.active_objs * s->size) / 1024,
 				(sinfo.num_objs * s->size) / 1024);
-- 
1.8.3.1
Subject: Re: [PATCH 2/2] mm: oom: dump single excessive slab cache when oom
From: Michal Hocko
Date: 2017-10-26 14:53 UTC
To: Yang Shi
Cc: cl, penberg, rientjes, iamjoonsoo.kim, akpm, linux-mm, linux-kernel

On Thu 26-10-17 06:49:00, Yang Shi wrote:
> Per the discussion with David [1], it looks more reasonable to just dump

Please try to avoid external references in the changelog as much as possible.

> the single excessive slab cache instead of dumping all slab caches when
> oom.

You meant to say "to just dump all slab caches which exceed 10% of the total memory."

While we are at it: abusing calc_mem_size() seems rather clumsy, and it is not nodemask aware, so the whole thing is dubious for NUMA-constrained OOMs.

The more I think about this, the more I am convinced that this is just fiddling with the code without a good reason and without a much better outcome.

> Dump single excessive slab cache if its size is > 10% of total system

s@single@all@

> memory size when oom regardless it is unreclaimable.
> [...]

-- 
Michal Hocko
SUSE Labs
Subject: Re: [PATCH 2/2] mm: oom: dump single excessive slab cache when oom
From: Yang Shi
Date: 2017-10-26 16:15 UTC
To: Michal Hocko
Cc: cl, penberg, rientjes, iamjoonsoo.kim, akpm, linux-mm, linux-kernel

On 10/26/17 7:53 AM, Michal Hocko wrote:
> On Thu 26-10-17 06:49:00, Yang Shi wrote:
>> Per the discussion with David [1], it looks more reasonable to just dump
>
> Please try to avoid external references in the changelog as much as
> possible.

OK.

>> the single excessive slab cache instead of dumping all slab caches when
>> oom.
>
> You meant to say "to just dump all slab caches which exceed 10% of the
> total memory."
>
> While we are at it: abusing calc_mem_size() seems rather clumsy, and it
> is not nodemask aware, so the whole thing is dubious for NUMA-constrained
> OOMs.

Since we just need the total memory size of the node for a NUMA-constrained OOM, we should be able to use show_mem_node_skip() to bring in the nodemask.

> The more I think about this, the more I am convinced that this is just
> fiddling with the code without a good reason and without a much better
> outcome.

I don't get you. Do you mean the benefit is not that much with just dumping excessive slab caches?

Thanks,
Yang

>> Dump single excessive slab cache if its size is > 10% of total system
>
> s@single@all@
>
>> memory size when oom regardless it is unreclaimable.
>> [...]
Subject: Re: [PATCH 2/2] mm: oom: dump single excessive slab cache when oom
From: Michal Hocko
Date: 2017-10-26 16:27 UTC
To: Yang Shi
Cc: cl, penberg, rientjes, iamjoonsoo.kim, akpm, linux-mm, linux-kernel

On Fri 27-10-17 00:15:17, Yang Shi wrote:
> On 10/26/17 7:53 AM, Michal Hocko wrote:
> [...]
> > While we are at it: abusing calc_mem_size() seems rather clumsy, and it
> > is not nodemask aware, so the whole thing is dubious for NUMA-constrained
> > OOMs.
>
> Since we just need the total memory size of the node for a NUMA-constrained
> OOM, we should be able to use show_mem_node_skip() to bring in the nodemask.

yes

> > The more I think about this, the more I am convinced that this is just
> > fiddling with the code without a good reason and without a much better
> > outcome.
>
> I don't get you. Do you mean the benefit is not that much with just dumping
> excessive slab caches?

Yes, I am not sure it makes sense to touch it without further experience. I am not saying this is a wrong approach; I would just give it some more time to see how it behaves in the wild and then make changes based on that experience.

-- 
Michal Hocko
SUSE Labs
Subject: Re: [PATCH 2/2] mm: oom: dump single excessive slab cache when oom
From: Michal Hocko
Date: 2017-10-26 17:14 UTC
To: Yang Shi
Cc: cl, penberg, rientjes, iamjoonsoo.kim, akpm, linux-mm, linux-kernel

On Thu 26-10-17 18:27:01, Michal Hocko wrote:
> On Fri 27-10-17 00:15:17, Yang Shi wrote:
> [...]
> > Since we just need the total memory size of the node for a NUMA-constrained
> > OOM, we should be able to use show_mem_node_skip() to bring in the nodemask.
>
> yes

To be more specific: this would work for the total-number-of-pages calculation. It is still not enough, though. You would also have to filter slabs per NUMA node, and this is getting more and more complicated for a marginal improvement.

-- 
Michal Hocko
SUSE Labs
Subject: Re: [PATCH 2/2] mm: oom: dump single excessive slab cache when oom
From: Yang Shi
Date: 2017-10-31 16:47 UTC
To: Michal Hocko
Cc: cl, penberg, rientjes, iamjoonsoo.kim, akpm, linux-mm, linux-kernel

On 10/26/17 10:14 AM, Michal Hocko wrote:
> On Thu 26-10-17 18:27:01, Michal Hocko wrote:
> [...]
> To be more specific: this would work for the total-number-of-pages
> calculation. It is still not enough, though. You would also have to
> filter slabs per NUMA node, and this is getting more and more complicated
> for a marginal improvement.

Yes, it sounds so. Basically, I agree with you to wait for a while to see how the current implementation is doing.

Thanks,
Yang