* [PATCH 1/2] mm/vmstat: expose min_slab_pages in /proc/zoneinfo
@ 2019-05-09 8:07 Yafang Shao
2019-05-09 8:07 ` [PATCH 2/2] mm/vmscan: shrink slab in node reclaim Yafang Shao
0 siblings, 1 reply; 4+ messages in thread
From: Yafang Shao @ 2019-05-09 8:07 UTC (permalink / raw)
To: mhocko, akpm; +Cc: linux-mm, shaoyafang, Yafang Shao
On one of our servers, we found that the dentry cache was growing
continuously without ever being shrunk. We were not sure whether that
was because the reclaimable slab was still below min_slab_pages;
exposing min_slab_pages in /proc/zoneinfo makes that comparison easy.

Since min_slab_ratio can already be set via sysctl, the effective
min_slab_pages should be exposed to userspace as well. The same
applies to min_unmapped_pages.
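For illustration, with this patch applied the per-node stats section of
/proc/zoneinfo gains two lines, whose values are derived from the
vm.min_slab_ratio and vm.min_unmapped_ratio sysctls (the numbers below
are made up):

  $ sysctl vm.min_slab_ratio vm.min_unmapped_ratio
  vm.min_slab_ratio = 5
  vm.min_unmapped_ratio = 1
  $ grep -E 'min_slab|min_unmapped' /proc/zoneinfo
   min_slab     201030
   min_unmapped 40206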
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
---
mm/vmstat.c | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/mm/vmstat.c b/mm/vmstat.c
index a7d4933..bb76cfe 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -1549,7 +1549,15 @@ static void zoneinfo_show_print(struct seq_file *m, pg_data_t *pgdat,
 				NR_VM_NUMA_STAT_ITEMS],
 				node_page_state(pgdat, i));
 		}
+
+#ifdef CONFIG_NUMA
+		seq_printf(m, "\n %-12s %lu", "min_slab",
+			   pgdat->min_slab_pages);
+		seq_printf(m, "\n %-12s %lu", "min_unmapped",
+			   pgdat->min_unmapped_pages);
+#endif
 	}
+
 	seq_printf(m,
 		   "\n  pages free     %lu"
 		   "\n        min      %lu"
--
1.8.3.1
* [PATCH 2/2] mm/vmscan: shrink slab in node reclaim
2019-05-09 8:07 [PATCH 1/2] mm/vmstat: expose min_slab_pages in /proc/zoneinfo Yafang Shao
@ 2019-05-09 8:07 ` Yafang Shao
2019-05-22 21:40 ` Andrew Morton
0 siblings, 1 reply; 4+ messages in thread
From: Yafang Shao @ 2019-05-09 8:07 UTC (permalink / raw)
To: mhocko, akpm; +Cc: linux-mm, shaoyafang, Yafang Shao
In node reclaim, may_shrinkslab is 0 by default, hence shrink_slab()
is never performed there. However, shrink_slab() should be performed
when the reclaimable slab exceeds the min slab limit.

This issue is very easy to reproduce: first continuously cat random
non-existent files to produce more and more dentries, then read a big
file to fill the page cache. You will find that the dentries are never
shrunk.
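A hypothetical reproducer sketch (file names and counts are made up):

  # each failed lookup leaves a negative dentry behind
  i=0
  while [ $i -lt 10000000 ]; do
          cat /tmp/no-such-file-$i
          i=$((i+1))
  done 2>/dev/null
  # then fill the page cache by streaming a large file
  cat /path/to/big-file > /dev/null
  # the dentry count keeps climbing and is never shrunk
  grep '^dentry' /proc/slabinfo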
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
---
mm/vmscan.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/mm/vmscan.c b/mm/vmscan.c
index d9c3e87..2c73223 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -4141,6 +4141,8 @@ static int __node_reclaim(struct pglist_data *pgdat, gfp_t gfp_mask, unsigned in
 		.may_unmap = !!(node_reclaim_mode & RECLAIM_UNMAP),
 		.may_swap = 1,
 		.reclaim_idx = gfp_zone(gfp_mask),
+		.may_shrinkslab = node_page_state(pgdat, NR_SLAB_RECLAIMABLE) >
+				  pgdat->min_slab_pages,
 	};

 	trace_mm_vmscan_node_reclaim_begin(pgdat->node_id, order,
@@ -4158,15 +4160,13 @@ static int __node_reclaim(struct pglist_data *pgdat, gfp_t gfp_mask, unsigned in
 	reclaim_state.reclaimed_slab = 0;
 	p->reclaim_state = &reclaim_state;

-	if (node_pagecache_reclaimable(pgdat) > pgdat->min_unmapped_pages) {
 		/*
 		 * Free memory by calling shrink node with increasing
 		 * priorities until we have enough memory freed.
 		 */
-		do {
-			shrink_node(pgdat, &sc);
-		} while (sc.nr_reclaimed < nr_pages && --sc.priority >= 0);
-	}
+	do {
+		shrink_node(pgdat, &sc);
+	} while (sc.nr_reclaimed < nr_pages && --sc.priority >= 0);

 	p->reclaim_state = NULL;
 	current->flags &= ~PF_SWAPWRITE;
--
1.8.3.1
* Re: [PATCH 2/2] mm/vmscan: shrink slab in node reclaim
2019-05-09 8:07 ` [PATCH 2/2] mm/vmscan: shrink slab in node reclaim Yafang Shao
@ 2019-05-22 21:40 ` Andrew Morton
2019-05-23 4:56 ` Yafang Shao
0 siblings, 1 reply; 4+ messages in thread
From: Andrew Morton @ 2019-05-22 21:40 UTC (permalink / raw)
To: Yafang Shao; +Cc: mhocko, linux-mm, shaoyafang
On Thu, 9 May 2019 16:07:49 +0800 Yafang Shao <laoar.shao@gmail.com> wrote:
> In node reclaim, may_shrinkslab is 0 by default, hence shrink_slab()
> is never performed there. However, shrink_slab() should be performed
> when the reclaimable slab exceeds the min slab limit.
>
> This issue is very easy to reproduce: first continuously cat random
> non-existent files to produce more and more dentries, then read a big
> file to fill the page cache. You will find that the dentries are never
> shrunk.
It does sound like an oversight.
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -4141,6 +4141,8 @@ static int __node_reclaim(struct pglist_data *pgdat, gfp_t gfp_mask, unsigned in
>  		.may_unmap = !!(node_reclaim_mode & RECLAIM_UNMAP),
>  		.may_swap = 1,
>  		.reclaim_idx = gfp_zone(gfp_mask),
> +		.may_shrinkslab = node_page_state(pgdat, NR_SLAB_RECLAIMABLE) >
> +				  pgdat->min_slab_pages,
>  	};
>
>  	trace_mm_vmscan_node_reclaim_begin(pgdat->node_id, order,
> @@ -4158,15 +4160,13 @@ static int __node_reclaim(struct pglist_data *pgdat, gfp_t gfp_mask, unsigned in
>  	reclaim_state.reclaimed_slab = 0;
>  	p->reclaim_state = &reclaim_state;
>
> -	if (node_pagecache_reclaimable(pgdat) > pgdat->min_unmapped_pages) {
Would it be better to do
	if (node_pagecache_reclaimable(pgdat) > pgdat->min_unmapped_pages ||
	    sc.may_shrinkslab) {
>  		/*
>  		 * Free memory by calling shrink node with increasing
>  		 * priorities until we have enough memory freed.
>  		 */
The above will want re-indenting and re-right-justifying.
> -		do {
> -			shrink_node(pgdat, &sc);
> -		} while (sc.nr_reclaimed < nr_pages && --sc.priority >= 0);
> -	}
> +	do {
> +		shrink_node(pgdat, &sc);
> +	} while (sc.nr_reclaimed < nr_pages && --sc.priority >= 0);
Won't this cause pagecache reclaim and compaction which previously did
not occur? If yes, what are the effects of this and are they
desirable? If no, perhaps call shrink_slab() directly in this case.
Or something like that.
It's unclear why min_unmapped_pages (min_unmapped_ratio) exists. Is it
a batch-things-up efficiency thing?
* Re: [PATCH 2/2] mm/vmscan: shrink slab in node reclaim
2019-05-22 21:40 ` Andrew Morton
@ 2019-05-23 4:56 ` Yafang Shao
0 siblings, 0 replies; 4+ messages in thread
From: Yafang Shao @ 2019-05-23 4:56 UTC (permalink / raw)
To: Andrew Morton; +Cc: Michal Hocko, Linux MM, shaoyafang
On Thu, May 23, 2019 at 5:40 AM Andrew Morton <akpm@linux-foundation.org> wrote:
>
> On Thu, 9 May 2019 16:07:49 +0800 Yafang Shao <laoar.shao@gmail.com> wrote:
>
> > In node reclaim, may_shrinkslab is 0 by default, hence shrink_slab()
> > is never performed there. However, shrink_slab() should be performed
> > when the reclaimable slab exceeds the min slab limit.
> >
> > This issue is very easy to reproduce: first continuously cat random
> > non-existent files to produce more and more dentries, then read a big
> > file to fill the page cache. You will find that the dentries are never
> > shrunk.
>
> It does sound like an oversight.
>
> > --- a/mm/vmscan.c
> > +++ b/mm/vmscan.c
> > @@ -4141,6 +4141,8 @@ static int __node_reclaim(struct pglist_data *pgdat, gfp_t gfp_mask, unsigned in
> >  		.may_unmap = !!(node_reclaim_mode & RECLAIM_UNMAP),
> >  		.may_swap = 1,
> >  		.reclaim_idx = gfp_zone(gfp_mask),
> > +		.may_shrinkslab = node_page_state(pgdat, NR_SLAB_RECLAIMABLE) >
> > +				  pgdat->min_slab_pages,
> >  	};
> >
> >  	trace_mm_vmscan_node_reclaim_begin(pgdat->node_id, order,
> > @@ -4158,15 +4160,13 @@ static int __node_reclaim(struct pglist_data *pgdat, gfp_t gfp_mask, unsigned in
> >  	reclaim_state.reclaimed_slab = 0;
> >  	p->reclaim_state = &reclaim_state;
> >
> > -	if (node_pagecache_reclaimable(pgdat) > pgdat->min_unmapped_pages) {
>
> Would it be better to do
>
> 	if (node_pagecache_reclaimable(pgdat) > pgdat->min_unmapped_pages ||
> 	    sc.may_shrinkslab) {
>
This if condition is always true here, because we already check both
limits in node_reclaim(); see below:

	if (node_pagecache_reclaimable(pgdat) <= pgdat->min_unmapped_pages &&
	    node_page_state(pgdat, NR_SLAB_RECLAIMABLE) <= pgdat->min_slab_pages)
		return NODE_RECLAIM_FULL;
> >  		/*
> >  		 * Free memory by calling shrink node with increasing
> >  		 * priorities until we have enough memory freed.
> >  		 */
>
> The above will want re-indenting and re-right-justifying.
>
Sorry about the carelessness.
> > -		do {
> > -			shrink_node(pgdat, &sc);
> > -		} while (sc.nr_reclaimed < nr_pages && --sc.priority >= 0);
> > -	}
> > +	do {
> > +		shrink_node(pgdat, &sc);
> > +	} while (sc.nr_reclaimed < nr_pages && --sc.priority >= 0);
>
> Won't this cause pagecache reclaim and compaction which previously did
> not occur? If yes, what are the effects of this and are they
> desirable? If no, perhaps call shrink_slab() directly in this case.
> Or something like that.
>
It may cause pagecache reclaim and compaction even when
node_pagecache_reclaimable() is still below pgdat->min_unmapped_pages:
the active file pages get deactivated and the inactive file pages get
reclaimed. (I traced this behavior with the mm_vmscan_lru_shrink_active
and mm_vmscan_lru_shrink_inactive tracepoints.)
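For example, those tracepoints can be watched with something like the
following (a hypothetical invocation, not from the original test):

  trace-cmd record -e vmscan:mm_vmscan_lru_shrink_active \
                   -e vmscan:mm_vmscan_lru_shrink_inactive
  trace-cmd report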
If we don't like this behavior, what about the change below?
@@ -4166,6 +4166,17 @@ static int __node_reclaim(struct pglist_data *pgdat, gfp_t gfp_mask, unsigned in
 		do {
 			shrink_node(pgdat, &sc);
 		} while (sc.nr_reclaimed < nr_pages && --sc.priority >= 0);
+	} else {
+		struct mem_cgroup *memcg;
+		struct mem_cgroup_reclaim_cookie reclaim = {
+			.pgdat = pgdat,
+			.priority = sc.priority,
+		};
+
+		memcg = mem_cgroup_iter(NULL, NULL, &reclaim);
+		do {
+			shrink_slab(sc.gfp_mask, pgdat->node_id, memcg, sc.priority);
+		} while ((memcg = mem_cgroup_iter(NULL, memcg, &reclaim)));
 	}
> It's unclear why min_unmapped_pages (min_unmapped_ratio) exists. Is it
> a batch-things-up efficiency thing?

I guess so. I have tried to understand it, but I still don't have a
clear idea yet, so I just left it as-is.
Thanks
Yafang