* [PATCH v3 0/3] mm: improvement in shrink slab
From: Yafang Shao @ 2019-06-02  9:22 UTC
  To: mhocko, akpm; +Cc: linux-mm, shaoyafang, Yafang Shao

In the past few days, I found an issue in shrink slab.
While I was trying to fix it, I found a few things in shrink slab that
need to be improved.

- #1 exposes min_slab_pages to help us analyze shrink slab.

- #2 is a code improvement.

- #3 fixes an issue. The issue is very easy to reproduce in zone
reclaim mode: first, continuously cat random non-existent files to
produce more and more dentries, then read a big file to produce page
cache. Finally you will find that the dentries will never be shrunk.
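
A minimal reproducer sketch of the first step (assuming
vm.zone_reclaim_mode is enabled; the paths are illustrative). Each
failed lookup leaves a negative dentry in the dcache, which only
shrink_slab() can reclaim:

	#include <stdio.h>
	#include <sys/stat.h>

	int main(void)
	{
		struct stat st;
		char path[64];
		long i;

		for (i = 0; ; i++) {
			snprintf(path, sizeof(path), "/tmp/no-such-file-%ld", i);
			/* fails with ENOENT but creates a negative dentry */
			stat(path, &st);
		}
		return 0;
	}

Run that while reading a large file to fill the page cache, then watch
the dentry count in /proc/sys/fs/dentry-state.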


Yafang Shao (3):
  mm/vmstat: expose min_slab_pages in /proc/zoneinfo
  mm/vmscan: change return type of shrink_node() to void
  mm/vmscan: shrink slab in node reclaim

 mm/vmscan.c | 33 +++++++++++++++++++++++++++++----
 mm/vmstat.c |  8 ++++++++
 2 files changed, 37 insertions(+), 4 deletions(-)

-- 
1.8.3.1



* [PATCH v3 1/3] mm/vmstat: expose min_slab_pages in /proc/zoneinfo
From: Yafang Shao @ 2019-06-02  9:22 UTC
  To: mhocko, akpm; +Cc: linux-mm, shaoyafang, Yafang Shao

On one of our servers, we found the dentry cache continuously growing
without being shrunk. We were not sure whether that was because the
reclaimable slab was still less than min_slab_pages, and exposing
min_slab_pages would make that easy to check.

As we can set min_slab_ratio with sysctl, we should expose the
effective min_slab_pages to the user as well.

The same applies to min_unmapped_pages.
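
With this patch applied, each node's section in /proc/zoneinfo gains
two lines like the following (the values here are only illustrative):

      min_slab     11264
      min_unmapped 4096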

Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
---
 mm/vmstat.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/mm/vmstat.c b/mm/vmstat.c
index a7d4933..bb76cfe 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -1549,7 +1549,15 @@ static void zoneinfo_show_print(struct seq_file *m, pg_data_t *pgdat,
 				NR_VM_NUMA_STAT_ITEMS],
 				node_page_state(pgdat, i));
 		}
+
+#ifdef CONFIG_NUMA
+		seq_printf(m, "\n      %-12s %lu", "min_slab",
+			   pgdat->min_slab_pages);
+		seq_printf(m, "\n      %-12s %lu", "min_unmapped",
+			   pgdat->min_unmapped_pages);
+#endif
 	}
+
 	seq_printf(m,
 		   "\n  pages free     %lu"
 		   "\n        min      %lu"
-- 
1.8.3.1



* [PATCH v3 2/3] mm/vmscan: change return type of shrink_node() to void
From: Yafang Shao @ 2019-06-02  9:22 UTC
  To: mhocko, akpm; +Cc: linux-mm, shaoyafang, Yafang Shao

As the return value of shrink_node() isn't used by any call site,
we'd better change its return type from bool to void.

Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
---
 mm/vmscan.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index d9c3e87..e0c5669 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2657,7 +2657,7 @@ static bool pgdat_memcg_congested(pg_data_t *pgdat, struct mem_cgroup *memcg)
 		(memcg && memcg_congested(pgdat, memcg));
 }
 
-static bool shrink_node(pg_data_t *pgdat, struct scan_control *sc)
+static void shrink_node(pg_data_t *pgdat, struct scan_control *sc)
 {
 	struct reclaim_state *reclaim_state = current->reclaim_state;
 	unsigned long nr_reclaimed, nr_scanned;
@@ -2827,8 +2827,6 @@ static bool shrink_node(pg_data_t *pgdat, struct scan_control *sc)
 	 */
 	if (reclaimable)
 		pgdat->kswapd_failures = 0;
-
-	return reclaimable;
 }
 
 /*
-- 
1.8.3.1



* [PATCH v3 3/3] mm/vmscan: shrink slab in node reclaim
From: Yafang Shao @ 2019-06-02  9:23 UTC
  To: mhocko, akpm; +Cc: linux-mm, shaoyafang, Yafang Shao

In node reclaim, may_shrinkslab is 0 by default, hence shrink_slab()
will never be performed there, while shrink_slab() should be performed
if the reclaimable slab is over the min slab limit.

If the reclaimable pagecache is less than min_unmapped_pages while the
reclaimable slab is greater than min_slab_pages, we only shrink the
slab. Otherwise min_unmapped_pages would be useless under this
condition.

reclaim_state.reclaimed_slab tells us how many pages are reclaimed by
shrink_slab().

This issue is very easy to reproduce: first, continuously cat random
non-existent files to produce more and more dentries, then read a big
file to produce page cache. Finally you will find that the dentries
will never be shrunk.

Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
---
 mm/vmscan.c | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index e0c5669..d52014f 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -4157,6 +4157,8 @@ static int __node_reclaim(struct pglist_data *pgdat, gfp_t gfp_mask, unsigned in
 	p->reclaim_state = &reclaim_state;
 
 	if (node_pagecache_reclaimable(pgdat) > pgdat->min_unmapped_pages) {
+		sc.may_shrinkslab = (pgdat->min_slab_pages <
+				node_page_state(pgdat, NR_SLAB_RECLAIMABLE));
 		/*
 		 * Free memory by calling shrink node with increasing
 		 * priorities until we have enough memory freed.
@@ -4164,6 +4166,28 @@ static int __node_reclaim(struct pglist_data *pgdat, gfp_t gfp_mask, unsigned in
 		do {
 			shrink_node(pgdat, &sc);
 		} while (sc.nr_reclaimed < nr_pages && --sc.priority >= 0);
+	} else {
+		/*
+		 * If the reclaimable pagecache is not greater than
+		 * min_unmapped_pages, only reclaim the slab.
+		 */
+		struct mem_cgroup *memcg;
+		struct mem_cgroup_reclaim_cookie reclaim = {
+			.pgdat = pgdat,
+		};
+
+		do {
+			reclaim.priority = sc.priority;
+			memcg = mem_cgroup_iter(NULL, NULL, &reclaim);
+			do {
+				shrink_slab(sc.gfp_mask, pgdat->node_id,
+					    memcg, sc.priority);
+			} while ((memcg = mem_cgroup_iter(NULL, memcg,
+							  &reclaim)));
+
+			sc.nr_reclaimed += reclaim_state.reclaimed_slab;
+			reclaim_state.reclaimed_slab = 0;
+		} while (sc.nr_reclaimed < nr_pages && --sc.priority >= 0);
 	}
 
 	p->reclaim_state = NULL;
-- 
1.8.3.1



* Re: [PATCH v3 3/3] mm/vmscan: shrink slab in node reclaim
From: Bharath Vedartham @ 2019-06-02 13:58 UTC
  To: Yafang Shao; +Cc: mhocko, akpm, linux-mm, shaoyafang

On Sun, Jun 02, 2019 at 05:23:00PM +0800, Yafang Shao wrote:
> In node reclaim, may_shrinkslab is 0 by default, hence shrink_slab()
> will never be performed there, while shrink_slab() should be performed
> if the reclaimable slab is over the min slab limit.
>
> If the reclaimable pagecache is less than min_unmapped_pages while the
> reclaimable slab is greater than min_slab_pages, we only shrink the
> slab. Otherwise min_unmapped_pages would be useless under this
> condition.
>
> reclaim_state.reclaimed_slab tells us how many pages are reclaimed by
> shrink_slab().
>
> This issue is very easy to reproduce: first, continuously cat random
> non-existent files to produce more and more dentries, then read a big
> file to produce page cache. Finally you will find that the dentries
> will never be shrunk.
> 
> Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
> ---
>  mm/vmscan.c | 24 ++++++++++++++++++++++++
>  1 file changed, 24 insertions(+)
> 
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index e0c5669..d52014f 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -4157,6 +4157,8 @@ static int __node_reclaim(struct pglist_data *pgdat, gfp_t gfp_mask, unsigned in
>  	p->reclaim_state = &reclaim_state;
>  
>  	if (node_pagecache_reclaimable(pgdat) > pgdat->min_unmapped_pages) {
> +		sc.may_shrinkslab = (pgdat->min_slab_pages <
> +				node_page_state(pgdat, NR_SLAB_RECLAIMABLE));
>  		/*
>  		 * Free memory by calling shrink node with increasing
>  		 * priorities until we have enough memory freed.
> @@ -4164,6 +4166,28 @@ static int __node_reclaim(struct pglist_data *pgdat, gfp_t gfp_mask, unsigned in
>  		do {
>  			shrink_node(pgdat, &sc);
>  		} while (sc.nr_reclaimed < nr_pages && --sc.priority >= 0);
> +	} else {
> +		/*
> +		 * If the reclaimable pagecache is not greater than
> +		 * min_unmapped_pages, only reclaim the slab.
> +		 */
> +		struct mem_cgroup *memcg;
> +		struct mem_cgroup_reclaim_cookie reclaim = {
> +			.pgdat = pgdat,
> +		};
> +
> +		do {
> +			reclaim.priority = sc.priority;
> +			memcg = mem_cgroup_iter(NULL, NULL, &reclaim);
> +			do {
> +				shrink_slab(sc.gfp_mask, pgdat->node_id,
> +					    memcg, sc.priority);
> +			} while ((memcg = mem_cgroup_iter(NULL, memcg,
> +							  &reclaim)));
> +
> +			sc.nr_reclaimed += reclaim_state.reclaimed_slab;
> +			reclaim_state.reclaimed_slab = 0;
> +		} while (sc.nr_reclaimed < nr_pages && --sc.priority >= 0);
>  	}
>  
>  	p->reclaim_state = NULL;
> -- 
> 1.8.3.1
>

Hi Yafang,

Just a few questions regarding this patch.

Don't you want to check whether the number of reclaimable slab pages
is greater than pgdat->min_slab_pages before reclaiming from slab in
your else branch? Where is that check? It looks like you're shrinking
the slab on the condition that (node_pagecache_reclaimable(pgdat) >
pgdat->min_unmapped_pages) is false, not on (pgdat->min_slab_pages <
node_page_state(pgdat, NR_SLAB_RECLAIMABLE)). What do you think?

Also, would it be better to move the update of sc.may_shrinkslab
outside the if statement where we check min_unmapped_pages, and then
use an else if (sc.may_shrinkslab) rather than a plain else before
shrinking the slab? Something like the sketch below.
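
A rough idea of what I mean (an untested sketch, reusing the names
from your patch):

	sc.may_shrinkslab = (pgdat->min_slab_pages <
			     node_page_state(pgdat, NR_SLAB_RECLAIMABLE));

	if (node_pagecache_reclaimable(pgdat) > pgdat->min_unmapped_pages) {
		/* reclaim pagecache; shrink_node() honours may_shrinkslab */
		do {
			shrink_node(pgdat, &sc);
		} while (sc.nr_reclaimed < nr_pages && --sc.priority >= 0);
	} else if (sc.may_shrinkslab) {
		/* only the slab is over its limit: slab-only reclaim
		 * loop from your patch goes here */
	}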

Thank you 
Bharath



* Re: [PATCH v3 3/3] mm/vmscan: shrink slab in node reclaim
From: Yafang Shao @ 2019-06-02 14:25 UTC
  To: Bharath Vedartham; +Cc: Michal Hocko, Andrew Morton, Linux MM, shaoyafang

On Sun, Jun 2, 2019 at 9:58 PM Bharath Vedartham <linux.bhar@gmail.com> wrote:
>
> On Sun, Jun 02, 2019 at 05:23:00PM +0800, Yafang Shao wrote:
> > [...]
>
> Hi Yafang,
>
> Just a few questions regarding this patch.
>
> Don't you want to check whether the number of reclaimable slab pages
> is greater than pgdat->min_slab_pages before reclaiming from slab in
> your else branch? Where is that check? It looks like you're shrinking
> the slab on the condition that (node_pagecache_reclaimable(pgdat) >
> pgdat->min_unmapped_pages) is false, not on (pgdat->min_slab_pages <
> node_page_state(pgdat, NR_SLAB_RECLAIMABLE)). What do you think?
>

Hi Bharath,

Because in __node_reclaim(), if node_pagecache_reclaimable(pgdat) is
not greater than pgdat->min_unmapped_pages, then the reclaimable slab
pages must be greater than pgdat->min_slab_pages, so we don't need to
check that again.

Please see the code in node_reclaim():

node_reclaim()
    if (node_pagecache_reclaimable(pgdat) <= pgdat->min_unmapped_pages &&
        node_page_state(pgdat, NR_SLAB_RECLAIMABLE) <= pgdat->min_slab_pages)
        return NODE_RECLAIM_FULL;
    __node_reclaim();

> Also, would it be better to move the update of sc.may_shrinkslab
> outside the if statement where we check min_unmapped_pages, and then
> use an else if (sc.may_shrinkslab) rather than a plain else before
> shrinking the slab?
>

Because sc.may_shrinkslab is used in shrink_node() only, and it is not
used in the else branch, we don't need to update sc.may_shrinkslab
outside the if statement.

Hope that clarifies it.
Feel free to ask if you still have any questions.

Thanks
Yafang


