[PATCH v1] mm:memcg: skip memcg of current in mem_cgroup_soft_limit

linux-kernel.vger.kernel.org archive mirror

* [PATCH v1] mm:memcg: skip memcg of current in mem_cgroup_soft_limit_reclaim
@ 2018-08-03  5:48 Zhaoyang Huang
  2018-08-03  6:11 ` Zhaoyang Huang
  2018-08-03  6:15 ` Michal Hocko
  0 siblings, 2 replies; 6+ messages in thread
From: Zhaoyang Huang @ 2018-08-03  5:48 UTC
  To: Steven Rostedt, Ingo Molnar, Johannes Weiner, Michal Hocko,
	Vladimir Davydov, linux-mm, cgroups, linux-kernel,
	kernel-patch-test

for the soft_limit reclaim has more directivity than global reclaim, we
have current memcg be skipped to avoid potential page thrashing.

Signed-off-by: Zhaoyang Huang <zhaoyang.huang@spreadtrum.com>
---
 mm/memcontrol.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 8c0280b..9d09e95 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2537,12 +2537,21 @@ unsigned long mem_cgroup_soft_limit_reclaim(pg_data_t *pgdat, int order,
 			mz = mem_cgroup_largest_soft_limit_node(mctz);
 		if (!mz)
 			break;
-
+		/*
+		 * skip current memcg to avoid page thrashing, for the
+		 * mem_cgroup_soft_reclaim has more directivity than
+		 * global reclaim.
+		 */
+		if (get_mem_cgroup_from_mm(current->mm) == mz->memcg) {
+			reclaimed = 0;
+			goto next;
+		}
 		nr_scanned = 0;
 		reclaimed = mem_cgroup_soft_reclaim(mz->memcg, pgdat,
 						    gfp_mask, &nr_scanned);
 		nr_reclaimed += reclaimed;
 		*total_scanned += nr_scanned;
+next:
 		spin_lock_irq(&mctz->lock);
 		__mem_cgroup_remove_exceeded(mz, mctz);
 
-- 
1.9.1



* Re: [PATCH v1] mm:memcg: skip memcg of current in mem_cgroup_soft_limit_reclaim
  2018-08-03  5:48 [PATCH v1] mm:memcg: skip memcg of current in mem_cgroup_soft_limit_reclaim Zhaoyang Huang
@ 2018-08-03  6:11 ` Zhaoyang Huang
  2018-08-03  6:18   ` Michal Hocko
  2018-08-03  6:15 ` Michal Hocko
  1 sibling, 1 reply; 6+ messages in thread
From: Zhaoyang Huang @ 2018-08-03  6:11 UTC
  To: Steven Rostedt, Ingo Molnar, Johannes Weiner, Michal Hocko,
	Vladimir Davydov, open list:MEMORY MANAGEMENT, cgroups, LKML,
	kernel-patch-test

On Fri, Aug 3, 2018 at 1:48 PM Zhaoyang Huang <huangzhaoyang@gmail.com> wrote:
>
> for the soft_limit reclaim has more directivity than global reclaim, we
> have current memcg be skipped to avoid potential page thrashing.
>
The patch was tested on our Android system with 2GB of RAM. The test
case focuses on smoothly sliding through pictures in a gallery, which
used to stall in direct reclaim for several hundred milliseconds or
more. With further debugging, we found that direct reclaim spends most
of its time reclaiming pages from its own memcg, whose soft limit is
set to 40960KB. I added an ftrace event to verify that the patch helps
to escape this scenario. Furthermore, we also measured the major faults
of this process (via dumpsys on Android). The result is that the patch
reduces major faults by 20% during the test.

> Signed-off-by: Zhaoyang Huang <zhaoyang.huang@spreadtrum.com>
> ---
>  mm/memcontrol.c | 11 ++++++++++-
>  1 file changed, 10 insertions(+), 1 deletion(-)
>
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 8c0280b..9d09e95 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -2537,12 +2537,21 @@ unsigned long mem_cgroup_soft_limit_reclaim(pg_data_t *pgdat, int order,
>                         mz = mem_cgroup_largest_soft_limit_node(mctz);
>                 if (!mz)
>                         break;
> -
> +               /*
> +                * skip current memcg to avoid page thrashing, for the
> +                * mem_cgroup_soft_reclaim has more directivity than
> +                * global reclaim.
> +                */
> +               if (get_mem_cgroup_from_mm(current->mm) == mz->memcg) {
> +                       reclaimed = 0;
> +                       goto next;
> +               }
>                 nr_scanned = 0;
>                 reclaimed = mem_cgroup_soft_reclaim(mz->memcg, pgdat,
>                                                     gfp_mask, &nr_scanned);
>                 nr_reclaimed += reclaimed;
>                 *total_scanned += nr_scanned;
> +next:
>                 spin_lock_irq(&mctz->lock);
>                 __mem_cgroup_remove_exceeded(mz, mctz);
>
> --
> 1.9.1
>


* Re: [PATCH v1] mm:memcg: skip memcg of current in mem_cgroup_soft_limit_reclaim
  2018-08-03  5:48 [PATCH v1] mm:memcg: skip memcg of current in mem_cgroup_soft_limit_reclaim Zhaoyang Huang
  2018-08-03  6:11 ` Zhaoyang Huang
@ 2018-08-03  6:15 ` Michal Hocko
  1 sibling, 0 replies; 6+ messages in thread
From: Michal Hocko @ 2018-08-03  6:15 UTC
  To: Zhaoyang Huang
  Cc: Steven Rostedt, Ingo Molnar, Johannes Weiner, Vladimir Davydov,
	linux-mm, cgroups, linux-kernel, kernel-patch-test

On Fri 03-08-18 13:48:05, Zhaoyang Huang wrote:
> for the soft_limit reclaim has more directivity than global reclaim, we
> have current memcg be skipped to avoid potential page thrashing.

a) this changelog doesn't really explain the problem nor does it explain
   why the proposed solution is reasonable or why it works at all and
b) no, this doesn't really work. You could easily break the current soft
   limit semantic.

I understand that you are not really happy about how the soft limit
works. Me neither, but this whole interface is a huge mistake of the past and
the general recommendation is to not use it. We simply cannot fix it
because it is unfixable. The semantic is just broken and somebody might
really depend on it.

> Signed-off-by: Zhaoyang Huang <zhaoyang.huang@spreadtrum.com>
> ---
>  mm/memcontrol.c | 11 ++++++++++-
>  1 file changed, 10 insertions(+), 1 deletion(-)
> 
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 8c0280b..9d09e95 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -2537,12 +2537,21 @@ unsigned long mem_cgroup_soft_limit_reclaim(pg_data_t *pgdat, int order,
>  			mz = mem_cgroup_largest_soft_limit_node(mctz);
>  		if (!mz)
>  			break;
> -
> +		/*
> +		 * skip current memcg to avoid page thrashing, for the
> +		 * mem_cgroup_soft_reclaim has more directivity than
> +		 * global reclaim.
> +		 */
> +		if (get_mem_cgroup_from_mm(current->mm) == mz->memcg) {
> +			reclaimed = 0;
> +			goto next;
> +		}
>  		nr_scanned = 0;
>  		reclaimed = mem_cgroup_soft_reclaim(mz->memcg, pgdat,
>  						    gfp_mask, &nr_scanned);
>  		nr_reclaimed += reclaimed;
>  		*total_scanned += nr_scanned;
> +next:
>  		spin_lock_irq(&mctz->lock);
>  		__mem_cgroup_remove_exceeded(mz, mctz);
>  
> -- 
> 1.9.1

-- 
Michal Hocko
SUSE Labs


* Re: [PATCH v1] mm:memcg: skip memcg of current in mem_cgroup_soft_limit_reclaim
  2018-08-03  6:11 ` Zhaoyang Huang
@ 2018-08-03  6:18   ` Michal Hocko
  2018-08-03  6:59     ` Zhaoyang Huang
  0 siblings, 1 reply; 6+ messages in thread
From: Michal Hocko @ 2018-08-03  6:18 UTC
  To: Zhaoyang Huang
  Cc: Steven Rostedt, Ingo Molnar, Johannes Weiner, Vladimir Davydov,
	open list:MEMORY MANAGEMENT, cgroups, LKML, kernel-patch-test

On Fri 03-08-18 14:11:26, Zhaoyang Huang wrote:
> On Fri, Aug 3, 2018 at 1:48 PM Zhaoyang Huang <huangzhaoyang@gmail.com> wrote:
> >
> > for the soft_limit reclaim has more directivity than global reclaim, we
> > have current memcg be skipped to avoid potential page thrashing.
> >
> The patch was tested on our Android system with 2GB of RAM. The test
> case focuses on smoothly sliding through pictures in a gallery, which
> used to stall in direct reclaim for several hundred milliseconds or
> more. With further debugging, we found that direct reclaim spends most
> of its time reclaiming pages from its own memcg, whose soft limit is
> set to 40960KB. I added an ftrace event to verify that the patch helps
> to escape this scenario. Furthermore, we also measured the major faults
> of this process (via dumpsys on Android). The result is that the patch
> reduces major faults by 20% during the test.

I have already asked. Why do you use the soft limit in the first
place? It is known to cause excessive reclaim and long stalls.
-- 
Michal Hocko
SUSE Labs


* Re: [PATCH v1] mm:memcg: skip memcg of current in mem_cgroup_soft_limit_reclaim
  2018-08-03  6:18   ` Michal Hocko
@ 2018-08-03  6:59     ` Zhaoyang Huang
  2018-08-03  7:07       ` Michal Hocko
  0 siblings, 1 reply; 6+ messages in thread
From: Zhaoyang Huang @ 2018-08-03  6:59 UTC
  To: Michal Hocko
  Cc: Steven Rostedt, Ingo Molnar, Johannes Weiner, Vladimir Davydov,
	open list:MEMORY MANAGEMENT, cgroups, LKML, kernel-patch-test

On Fri, Aug 3, 2018 at 2:18 PM Michal Hocko <mhocko@kernel.org> wrote:
>
> On Fri 03-08-18 14:11:26, Zhaoyang Huang wrote:
> > On Fri, Aug 3, 2018 at 1:48 PM Zhaoyang Huang <huangzhaoyang@gmail.com> wrote:
> > >
> > > for the soft_limit reclaim has more directivity than global reclaim, we
> > > have current memcg be skipped to avoid potential page thrashing.
> > >
> > The patch was tested on our Android system with 2GB of RAM. The test
> > case focuses on smoothly sliding through pictures in a gallery, which
> > used to stall in direct reclaim for several hundred milliseconds or
> > more. With further debugging, we found that direct reclaim spends most
> > of its time reclaiming pages from its own memcg, whose soft limit is
> > set to 40960KB. I added an ftrace event to verify that the patch helps
> > to escape this scenario. Furthermore, we also measured the major faults
> > of this process (via dumpsys on Android). The result is that the patch
> > reduces major faults by 20% during the test.
>
> I have already asked. Why do you use the soft limit in the first
> place? It is known to cause excessive reclaim and long stalls.

It is required by Google for adopting the new version of the Android
system. Previous Android versions had a mechanism called LMK, which
kills processes under memory contention, as the OOM killer does. I
think Google wants to drop such a rough way of reclaiming pages and
turn to memcg. They set up different memcg groups for the different
processes of the system and set their soft limits according to
oom_adj. Their original purpose is to reclaim pages gently in direct
reclaim and kswapd. During debugging, it seemed to me that memcg may
be tunable somehow. At least, the patch works on our system.
> --
> Michal Hocko
> SUSE Labs


* Re: [PATCH v1] mm:memcg: skip memcg of current in mem_cgroup_soft_limit_reclaim
  2018-08-03  6:59     ` Zhaoyang Huang
@ 2018-08-03  7:07       ` Michal Hocko
  0 siblings, 0 replies; 6+ messages in thread
From: Michal Hocko @ 2018-08-03  7:07 UTC
  To: Zhaoyang Huang
  Cc: Steven Rostedt, Ingo Molnar, Johannes Weiner, Vladimir Davydov,
	open list:MEMORY MANAGEMENT, cgroups, LKML, kernel-patch-test

On Fri 03-08-18 14:59:34, Zhaoyang Huang wrote:
> On Fri, Aug 3, 2018 at 2:18 PM Michal Hocko <mhocko@kernel.org> wrote:
> >
> > On Fri 03-08-18 14:11:26, Zhaoyang Huang wrote:
> > > On Fri, Aug 3, 2018 at 1:48 PM Zhaoyang Huang <huangzhaoyang@gmail.com> wrote:
> > > >
> > > > for the soft_limit reclaim has more directivity than global reclaim, we
> > > > have current memcg be skipped to avoid potential page thrashing.
> > > >
> > > The patch was tested on our Android system with 2GB of RAM. The test
> > > case focuses on smoothly sliding through pictures in a gallery, which
> > > used to stall in direct reclaim for several hundred milliseconds or
> > > more. With further debugging, we found that direct reclaim spends most
> > > of its time reclaiming pages from its own memcg, whose soft limit is
> > > set to 40960KB. I added an ftrace event to verify that the patch helps
> > > to escape this scenario. Furthermore, we also measured the major faults
> > > of this process (via dumpsys on Android). The result is that the patch
> > > reduces major faults by 20% during the test.
> >
> > I have already asked. Why do you use the soft limit in the first
> > place? It is known to cause excessive reclaim and long stalls.
> 
> It is required by Google for adopting the new version of the Android
> system. Previous Android versions had a mechanism called LMK, which
> kills processes under memory contention, as the OOM killer does. I
> think Google wants to drop such a rough way of reclaiming pages and
> turn to memcg. They set up different memcg groups for the different
> processes of the system and set their soft limits according to
> oom_adj. Their original purpose is to reclaim pages gently in direct
> reclaim and kswapd. During debugging, it seemed to me that memcg may
> be tunable somehow. At least, the patch works on our system.

Then the suggestion is to use v2 and the high limit. This is a much less
disruptive method for pro-active reclaim. Really, the softlimit semantic
has been established for many years and you cannot change it even when
it sucks for your workload. Others might depend on the traditional
behavior.

I have tried to change the semantic in the past and there was a general
consensus that changing the semantic is just too risky. So it is nice
that it helps for your particular workload but this is not upstream
material, I am sorry.

-- 
Michal Hocko
SUSE Labs
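
[Editorial sketch, not part of the thread.] The suggestion above to use
cgroup v2 and the high limit could look roughly like the following from
user space. This is a minimal sketch under stated assumptions: the
helper name set_memory_high() is hypothetical, and the caller is assumed
to pass the path of an existing cgroup v2 directory in which the memory
controller is enabled. The only kernel interface it relies on is the
cgroup v2 memory.high file, which accepts a limit in bytes.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: write a byte limit to the memory.high file
 * inside cgroup_dir. cgroup_dir is an assumption supplied by the
 * caller (a cgroup v2 directory). Returns 0 on success or a negative
 * errno value on failure.
 */
static int set_memory_high(const char *cgroup_dir, unsigned long long bytes)
{
	char path[4096];
	FILE *f;
	int n, ret = 0;

	/* Build "<cgroup_dir>/memory.high" and reject truncated paths. */
	n = snprintf(path, sizeof(path), "%s/memory.high", cgroup_dir);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -ENAMETOOLONG;

	f = fopen(path, "w");
	if (!f)
		return -errno;

	/* The limit is written as a decimal byte count. */
	if (fprintf(f, "%llu\n", bytes) < 0)
		ret = -EIO;

	/* Buffered data is flushed at fclose(), so check it as well. */
	if (fclose(f) != 0 && ret == 0)
		ret = -errno;

	return ret;
}
```

A call such as set_memory_high("/sys/fs/cgroup/gallery", 40960ULL * 1024)
would reuse the 40960KB figure from the test report purely as an
illustration; the "gallery" group name and the /sys/fs/cgroup mount
point are assumptions. The high limit is not a drop-in replacement for
the v1 soft limit: a group that goes over memory.high is throttled and
put under reclaim pressure itself, so the value would need to be
re-tuned for the workload.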

