[PATCH v3 RFC] mm/vmscan: more restrictive condition for retry of shrink

All of lore.kernel.org
 help / color / mirror / Atom feed

* [PATCH v3 RFC] mm/vmscan: more restrictive condition for retry of shrink_zones
@ 2017-03-12 11:06 ` Yisheng Xie
  0 siblings, 0 replies; 12+ messages in thread
From: Yisheng Xie @ 2017-03-12 11:06 UTC (permalink / raw)
  To: akpm, hannes, mgorman, vbabka, mhocko, riel, shakeelb
  Cc: linux-mm, linux-kernel, xieyisheng1, guohanjun, qiuxishi

From: Yisheng Xie <xieyisheng1@huawei.com>

When we enter do_try_to_free_pages, the may_thrash is always clear, and
it will retry shrink zones to tap cgroup's reserves memory by setting
may_thrash when the former shrink_zones reclaim nothing.

However, when memcg is disabled or on legacy hierarchy, it should not do
this useless retry at all, for we do not have any cgroup's reserves
memory to tap, and we have already done hard work but made no progress.

To avoid this time costly and useless retrying, add a stub function
mem_cgroup_thrashed() and return true when memcg is disabled or on
legacy hierarchy.

Signed-off-by: Yisheng Xie <xieyisheng1@huawei.com>
Suggested-by: Shakeel Butt <shakeelb@google.com>
---
v3:
 - rename function may_thrash() to mem_cgroup_thrashed() to avoid confusing.

v2:
 - more restrictive condition for retry of shrink_zones (restricting
   cgroup_disabled=memory boot option and cgroup legacy hierarchy) - Shakeel

 - add a stub function may_thrash() to avoid compile error or warning.

 - rename subject from "donot retry shrink zones when memcg is disable"
   to "more restrictive condition for retry in do_try_to_free_pages"

Any comment is more than welcome!

Thanks
Yisheng Xie

 mm/vmscan.c | 20 +++++++++++++++++++-
 1 file changed, 19 insertions(+), 1 deletion(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index bc8031e..a76475af 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -184,6 +184,19 @@ static bool sane_reclaim(struct scan_control *sc)
 #endif
 	return false;
 }
+
+static bool mem_cgroup_thrashed(struct scan_control *sc)
+{
+	/*
+	 * When memcg is disabled or on legacy hierarchy, there is no cgroup
+	 * reserves memory to tap. So fake it as thrashed.
+	 */
+	if (!cgroup_subsys_enabled(memory_cgrp_subsys) ||
+	    !cgroup_subsys_on_dfl(memory_cgrp_subsys))
+		return true;
+
+	return sc->may_thrash;
+}
 #else
 static bool global_reclaim(struct scan_control *sc)
 {
@@ -194,6 +207,11 @@ static bool sane_reclaim(struct scan_control *sc)
 {
 	return true;
 }
+
+static bool mem_cgroup_thrashed(struct scan_control *sc)
+{
+	return true;
+}
 #endif
 
 /*
@@ -2808,7 +2826,7 @@ static unsigned long do_try_to_free_pages(struct zonelist *zonelist,
 		return 1;
 
 	/* Untapped cgroup reserves?  Don't OOM, retry. */
-	if (!sc->may_thrash) {
+	if (!mem_cgroup_thrashed(sc)) {
 		sc->priority = initial_priority;
 		sc->may_thrash = 1;
 		goto retry;
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v3 RFC] mm/vmscan: more restrictive condition for retry of shrink_zones
@ 2017-03-12 11:06 ` Yisheng Xie
  0 siblings, 0 replies; 12+ messages in thread
From: Yisheng Xie @ 2017-03-12 11:06 UTC (permalink / raw)
  To: akpm, hannes, mgorman, vbabka, mhocko, riel, shakeelb
  Cc: linux-mm, linux-kernel, xieyisheng1, guohanjun, qiuxishi

From: Yisheng Xie <xieyisheng1@huawei.com>

When we enter do_try_to_free_pages, the may_thrash is always clear, and
it will retry shrink zones to tap cgroup's reserves memory by setting
may_thrash when the former shrink_zones reclaim nothing.

However, when memcg is disabled or on legacy hierarchy, it should not do
this useless retry at all, for we do not have any cgroup's reserves
memory to tap, and we have already done hard work but made no progress.

To avoid this time costly and useless retrying, add a stub function
mem_cgroup_thrashed() and return true when memcg is disabled or on
legacy hierarchy.

Signed-off-by: Yisheng Xie <xieyisheng1@huawei.com>
Suggested-by: Shakeel Butt <shakeelb@google.com>
---
v3:
 - rename function may_thrash() to mem_cgroup_thrashed() to avoid confusing.

v2:
 - more restrictive condition for retry of shrink_zones (restricting
   cgroup_disabled=memory boot option and cgroup legacy hierarchy) - Shakeel

 - add a stub function may_thrash() to avoid compile error or warning.

 - rename subject from "donot retry shrink zones when memcg is disable"
   to "more restrictive condition for retry in do_try_to_free_pages"

Any comment is more than welcome!

Thanks
Yisheng Xie

 mm/vmscan.c | 20 +++++++++++++++++++-
 1 file changed, 19 insertions(+), 1 deletion(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index bc8031e..a76475af 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -184,6 +184,19 @@ static bool sane_reclaim(struct scan_control *sc)
 #endif
 	return false;
 }
+
+static bool mem_cgroup_thrashed(struct scan_control *sc)
+{
+	/*
+	 * When memcg is disabled or on legacy hierarchy, there is no cgroup
+	 * reserves memory to tap. So fake it as thrashed.
+	 */
+	if (!cgroup_subsys_enabled(memory_cgrp_subsys) ||
+	    !cgroup_subsys_on_dfl(memory_cgrp_subsys))
+		return true;
+
+	return sc->may_thrash;
+}
 #else
 static bool global_reclaim(struct scan_control *sc)
 {
@@ -194,6 +207,11 @@ static bool sane_reclaim(struct scan_control *sc)
 {
 	return true;
 }
+
+static bool mem_cgroup_thrashed(struct scan_control *sc)
+{
+	return true;
+}
 #endif
 
 /*
@@ -2808,7 +2826,7 @@ static unsigned long do_try_to_free_pages(struct zonelist *zonelist,
 		return 1;
 
 	/* Untapped cgroup reserves?  Don't OOM, retry. */
-	if (!sc->may_thrash) {
+	if (!mem_cgroup_thrashed(sc)) {
 		sc->priority = initial_priority;
 		sc->may_thrash = 1;
 		goto retry;
-- 
1.9.1

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH v3 RFC] mm/vmscan: more restrictive condition for retry of shrink_zones
  2017-03-12 11:06 ` Yisheng Xie
@ 2017-03-12 20:20   ` Shakeel Butt
  -1 siblings, 0 replies; 12+ messages in thread
From: Shakeel Butt @ 2017-03-12 20:20 UTC (permalink / raw)
  To: Yisheng Xie
  Cc: Andrew Morton, Johannes Weiner, Mel Gorman, Vlastimil Babka,
	Michal Hocko, riel, Linux MM, LKML, xieyisheng1, guohanjun,
	Xishi Qiu

On Sun, Mar 12, 2017 at 4:06 AM, Yisheng Xie <ysxie@foxmail.com> wrote:
> From: Yisheng Xie <xieyisheng1@huawei.com>
>
> When we enter do_try_to_free_pages, the may_thrash is always clear, and
> it will retry shrink zones to tap cgroup's reserves memory by setting
> may_thrash when the former shrink_zones reclaim nothing.
>
> However, when memcg is disabled or on legacy hierarchy, it should not do
> this useless retry at all, for we do not have any cgroup's reserves
> memory to tap, and we have already done hard work but made no progress.
>
> To avoid this time costly and useless retrying, add a stub function
> mem_cgroup_thrashed() and return true when memcg is disabled or on
> legacy hierarchy.
>
> Signed-off-by: Yisheng Xie <xieyisheng1@huawei.com>
> Suggested-by: Shakeel Butt <shakeelb@google.com>

Thanks.
Reviewed-by: Shakeel Butt <shakeelb@google.com>

> ---
> v3:
>  - rename function may_thrash() to mem_cgroup_thrashed() to avoid confusing.
>
> v2:
>  - more restrictive condition for retry of shrink_zones (restricting
>    cgroup_disabled=memory boot option and cgroup legacy hierarchy) - Shakeel
>
>  - add a stub function may_thrash() to avoid compile error or warning.
>
>  - rename subject from "donot retry shrink zones when memcg is disable"
>    to "more restrictive condition for retry in do_try_to_free_pages"
>
> Any comment is more than welcome!
>
> Thanks
> Yisheng Xie
>
>  mm/vmscan.c | 20 +++++++++++++++++++-
>  1 file changed, 19 insertions(+), 1 deletion(-)
>
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index bc8031e..a76475af 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -184,6 +184,19 @@ static bool sane_reclaim(struct scan_control *sc)
>  #endif
>         return false;
>  }
> +
> +static bool mem_cgroup_thrashed(struct scan_control *sc)
> +{
> +       /*
> +        * When memcg is disabled or on legacy hierarchy, there is no cgroup
> +        * reserves memory to tap. So fake it as thrashed.
> +        */
> +       if (!cgroup_subsys_enabled(memory_cgrp_subsys) ||
> +           !cgroup_subsys_on_dfl(memory_cgrp_subsys))
> +               return true;
> +
> +       return sc->may_thrash;
> +}
>  #else
>  static bool global_reclaim(struct scan_control *sc)
>  {
> @@ -194,6 +207,11 @@ static bool sane_reclaim(struct scan_control *sc)
>  {
>         return true;
>  }
> +
> +static bool mem_cgroup_thrashed(struct scan_control *sc)
> +{
> +       return true;
> +}
>  #endif
>
>  /*
> @@ -2808,7 +2826,7 @@ static unsigned long do_try_to_free_pages(struct zonelist *zonelist,
>                 return 1;
>
>         /* Untapped cgroup reserves?  Don't OOM, retry. */
> -       if (!sc->may_thrash) {
> +       if (!mem_cgroup_thrashed(sc)) {
>                 sc->priority = initial_priority;
>                 sc->may_thrash = 1;
>                 goto retry;
> --
> 1.9.1
>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v3 RFC] mm/vmscan: more restrictive condition for retry of shrink_zones
@ 2017-03-12 20:20   ` Shakeel Butt
  0 siblings, 0 replies; 12+ messages in thread
From: Shakeel Butt @ 2017-03-12 20:20 UTC (permalink / raw)
  To: Yisheng Xie
  Cc: Andrew Morton, Johannes Weiner, Mel Gorman, Vlastimil Babka,
	Michal Hocko, riel, Linux MM, LKML, xieyisheng1, guohanjun,
	Xishi Qiu

On Sun, Mar 12, 2017 at 4:06 AM, Yisheng Xie <ysxie@foxmail.com> wrote:
> From: Yisheng Xie <xieyisheng1@huawei.com>
>
> When we enter do_try_to_free_pages, the may_thrash is always clear, and
> it will retry shrink zones to tap cgroup's reserves memory by setting
> may_thrash when the former shrink_zones reclaim nothing.
>
> However, when memcg is disabled or on legacy hierarchy, it should not do
> this useless retry at all, for we do not have any cgroup's reserves
> memory to tap, and we have already done hard work but made no progress.
>
> To avoid this time costly and useless retrying, add a stub function
> mem_cgroup_thrashed() and return true when memcg is disabled or on
> legacy hierarchy.
>
> Signed-off-by: Yisheng Xie <xieyisheng1@huawei.com>
> Suggested-by: Shakeel Butt <shakeelb@google.com>

Thanks.
Reviewed-by: Shakeel Butt <shakeelb@google.com>

> ---
> v3:
>  - rename function may_thrash() to mem_cgroup_thrashed() to avoid confusing.
>
> v2:
>  - more restrictive condition for retry of shrink_zones (restricting
>    cgroup_disabled=memory boot option and cgroup legacy hierarchy) - Shakeel
>
>  - add a stub function may_thrash() to avoid compile error or warning.
>
>  - rename subject from "donot retry shrink zones when memcg is disable"
>    to "more restrictive condition for retry in do_try_to_free_pages"
>
> Any comment is more than welcome!
>
> Thanks
> Yisheng Xie
>
>  mm/vmscan.c | 20 +++++++++++++++++++-
>  1 file changed, 19 insertions(+), 1 deletion(-)
>
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index bc8031e..a76475af 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -184,6 +184,19 @@ static bool sane_reclaim(struct scan_control *sc)
>  #endif
>         return false;
>  }
> +
> +static bool mem_cgroup_thrashed(struct scan_control *sc)
> +{
> +       /*
> +        * When memcg is disabled or on legacy hierarchy, there is no cgroup
> +        * reserves memory to tap. So fake it as thrashed.
> +        */
> +       if (!cgroup_subsys_enabled(memory_cgrp_subsys) ||
> +           !cgroup_subsys_on_dfl(memory_cgrp_subsys))
> +               return true;
> +
> +       return sc->may_thrash;
> +}
>  #else
>  static bool global_reclaim(struct scan_control *sc)
>  {
> @@ -194,6 +207,11 @@ static bool sane_reclaim(struct scan_control *sc)
>  {
>         return true;
>  }
> +
> +static bool mem_cgroup_thrashed(struct scan_control *sc)
> +{
> +       return true;
> +}
>  #endif
>
>  /*
> @@ -2808,7 +2826,7 @@ static unsigned long do_try_to_free_pages(struct zonelist *zonelist,
>                 return 1;
>
>         /* Untapped cgroup reserves?  Don't OOM, retry. */
> -       if (!sc->may_thrash) {
> +       if (!mem_cgroup_thrashed(sc)) {
>                 sc->priority = initial_priority;
>                 sc->may_thrash = 1;
>                 goto retry;
> --
> 1.9.1
>

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v3 RFC] mm/vmscan: more restrictive condition for retry of shrink_zones
  2017-03-12 11:06 ` Yisheng Xie
@ 2017-03-13  8:33   ` Michal Hocko
  -1 siblings, 0 replies; 12+ messages in thread
From: Michal Hocko @ 2017-03-13  8:33 UTC (permalink / raw)
  To: Yisheng Xie
  Cc: akpm, hannes, mgorman, vbabka, riel, shakeelb, linux-mm,
	linux-kernel, xieyisheng1, guohanjun, qiuxishi

Please do not post new version after a single feedback and try to wait
for more review to accumulate. This is in the 3rd version and it is not
clear why it is still an RFC.

On Sun 12-03-17 19:06:10, Yisheng Xie wrote:
> From: Yisheng Xie <xieyisheng1@huawei.com>
> 
> When we enter do_try_to_free_pages, the may_thrash is always clear, and
> it will retry shrink zones to tap cgroup's reserves memory by setting
> may_thrash when the former shrink_zones reclaim nothing.
> 
> However, when memcg is disabled or on legacy hierarchy, it should not do
> this useless retry at all, for we do not have any cgroup's reserves
> memory to tap, and we have already done hard work but made no progress.
> 
> To avoid this time costly and useless retrying, add a stub function
> mem_cgroup_thrashed() and return true when memcg is disabled or on
> legacy hierarchy.

Have you actually seen this as a bad behavior? On which workload? Or
have spotted this by the code review?

Please note that more than _what_ it is more interesting _why_ the patch
has been prepared.

I agree the current additional round of reclaim is just lame because we
are trying hard to control the retry logic from the page allocator which
is a sufficient justification to fix this IMO. But I really hate the
name. At this point we do not have any idea that the memcg is trashing
as the name of the function suggests.

All of them simply might not have any reclaimable pages. So I would
suggest either a better name e.g. memcg_allow_lowmem_reclaim() or,
preferably, fix this properly. E.g. something like the following.
---
diff --git a/mm/vmscan.c b/mm/vmscan.c
index bae698484e8e..989ba9761921 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -99,6 +99,9 @@ struct scan_control {
 	/* Can cgroups be reclaimed below their normal consumption range? */
 	unsigned int may_thrash:1;
 
+	/* Did we have any memcg protected by the low limit */
+	unsigned int memcg_low_protection:1;
+
 	unsigned int hibernation_mode:1;
 
 	/* One of the zones is ready for compaction */
@@ -2513,6 +2516,7 @@ static bool shrink_node(pg_data_t *pgdat, struct scan_control *sc)
 			if (mem_cgroup_low(root, memcg)) {
 				if (!sc->may_thrash)
 					continue;
+				sc->memcg_low_protection = true;
 				mem_cgroup_events(memcg, MEMCG_LOW, 1);
 			}
 
@@ -2774,7 +2778,7 @@ static unsigned long do_try_to_free_pages(struct zonelist *zonelist,
 		return 1;
 
 	/* Untapped cgroup reserves?  Don't OOM, retry. */
-	if (!sc->may_thrash) {
+	if ( sc->memcg_low_protection && !sc->may_thrash) {
 		sc->priority = initial_priority;
 		sc->may_thrash = 1;
 		goto retry;
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH v3 RFC] mm/vmscan: more restrictive condition for retry of shrink_zones
@ 2017-03-13  8:33   ` Michal Hocko
  0 siblings, 0 replies; 12+ messages in thread
From: Michal Hocko @ 2017-03-13  8:33 UTC (permalink / raw)
  To: Yisheng Xie
  Cc: akpm, hannes, mgorman, vbabka, riel, shakeelb, linux-mm,
	linux-kernel, xieyisheng1, guohanjun, qiuxishi

Please do not post new version after a single feedback and try to wait
for more review to accumulate. This is in the 3rd version and it is not
clear why it is still an RFC.

On Sun 12-03-17 19:06:10, Yisheng Xie wrote:
> From: Yisheng Xie <xieyisheng1@huawei.com>
> 
> When we enter do_try_to_free_pages, the may_thrash is always clear, and
> it will retry shrink zones to tap cgroup's reserves memory by setting
> may_thrash when the former shrink_zones reclaim nothing.
> 
> However, when memcg is disabled or on legacy hierarchy, it should not do
> this useless retry at all, for we do not have any cgroup's reserves
> memory to tap, and we have already done hard work but made no progress.
> 
> To avoid this time costly and useless retrying, add a stub function
> mem_cgroup_thrashed() and return true when memcg is disabled or on
> legacy hierarchy.

Have you actually seen this as a bad behavior? On which workload? Or
have spotted this by the code review?

Please note that more than _what_ it is more interesting _why_ the patch
has been prepared.

I agree the current additional round of reclaim is just lame because we
are trying hard to control the retry logic from the page allocator which
is a sufficient justification to fix this IMO. But I really hate the
name. At this point we do not have any idea that the memcg is trashing
as the name of the function suggests.

All of them simply might not have any reclaimable pages. So I would
suggest either a better name e.g. memcg_allow_lowmem_reclaim() or,
preferably, fix this properly. E.g. something like the following.
---
diff --git a/mm/vmscan.c b/mm/vmscan.c
index bae698484e8e..989ba9761921 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -99,6 +99,9 @@ struct scan_control {
 	/* Can cgroups be reclaimed below their normal consumption range? */
 	unsigned int may_thrash:1;
 
+	/* Did we have any memcg protected by the low limit */
+	unsigned int memcg_low_protection:1;
+
 	unsigned int hibernation_mode:1;
 
 	/* One of the zones is ready for compaction */
@@ -2513,6 +2516,7 @@ static bool shrink_node(pg_data_t *pgdat, struct scan_control *sc)
 			if (mem_cgroup_low(root, memcg)) {
 				if (!sc->may_thrash)
 					continue;
+				sc->memcg_low_protection = true;
 				mem_cgroup_events(memcg, MEMCG_LOW, 1);
 			}
 
@@ -2774,7 +2778,7 @@ static unsigned long do_try_to_free_pages(struct zonelist *zonelist,
 		return 1;
 
 	/* Untapped cgroup reserves?  Don't OOM, retry. */
-	if (!sc->may_thrash) {
+	if ( sc->memcg_low_protection && !sc->may_thrash) {
 		sc->priority = initial_priority;
 		sc->may_thrash = 1;
 		goto retry;
-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH v3 RFC] mm/vmscan: more restrictive condition for retry of shrink_zones
  2017-03-13  8:33   ` Michal Hocko
@ 2017-03-13 12:00     ` Yisheng Xie
  -1 siblings, 0 replies; 12+ messages in thread
From: Yisheng Xie @ 2017-03-13 12:00 UTC (permalink / raw)
  To: Michal Hocko, Yisheng Xie
  Cc: akpm, hannes, mgorman, vbabka, riel, shakeelb, linux-mm,
	linux-kernel, guohanjun, qiuxishi

Hi Michal,

Thanks for reviewing.
On 2017/3/13 16:33, Michal Hocko wrote:
> Please do not post new version after a single feedback and try to wait
> for more review to accumulate. This is in the 3rd version and it is not
> clear why it is still an RFC.
Get it, thanks for pointing out these.

> 
> On Sun 12-03-17 19:06:10, Yisheng Xie wrote:
>> From: Yisheng Xie <xieyisheng1@huawei.com>
>>
>> When we enter do_try_to_free_pages, the may_thrash is always clear, and
>> it will retry shrink zones to tap cgroup's reserves memory by setting
>> may_thrash when the former shrink_zones reclaim nothing.
>>
>> However, when memcg is disabled or on legacy hierarchy, it should not do
>> this useless retry at all, for we do not have any cgroup's reserves
>> memory to tap, and we have already done hard work but made no progress.
>>
>> To avoid this time costly and useless retrying, add a stub function
>> mem_cgroup_thrashed() and return true when memcg is disabled or on
>> legacy hierarchy.
> 
> Have you actually seen this as a bad behavior? On which workload? Or
> have spotted this by the code review?
Sorry, this is just spotted by code review. I will point it out changelog.

> 
> Please note that more than _what_ it is more interesting _why_ the patch
> has been prepared.
Get it.

> 
> I agree the current additional round of reclaim is just lame because we
> are trying hard to control the retry logic from the page allocator which
> is a sufficient justification to fix this IMO. 
Right.

But I really hate the
> name. At this point we do not have any idea that the memcg is trashing
> as the name of the function suggests.
> 
> All of them simply might not have any reclaimable pages. So I would
> suggest either a better name e.g. memcg_allow_lowmem_reclaim() or,
> preferably, fix this properly. E.g. something like the following.
hmm, it is pretty a good idea than just rename the function, I will check
it's logical carefully, and then send out a new version based on this,
if you do not mind.

Thanks
Yisheng Xie.

> ---
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index bae698484e8e..989ba9761921 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -99,6 +99,9 @@ struct scan_control {
>  	/* Can cgroups be reclaimed below their normal consumption range? */
>  	unsigned int may_thrash:1;
>  
> +	/* Did we have any memcg protected by the low limit */
> +	unsigned int memcg_low_protection:1;
> +
>  	unsigned int hibernation_mode:1;
>  
>  	/* One of the zones is ready for compaction */
> @@ -2513,6 +2516,7 @@ static bool shrink_node(pg_data_t *pgdat, struct scan_control *sc)
>  			if (mem_cgroup_low(root, memcg)) {
>  				if (!sc->may_thrash)
>  					continue;
> +				sc->memcg_low_protection = true;
>  				mem_cgroup_events(memcg, MEMCG_LOW, 1);
>  			}
>  
> @@ -2774,7 +2778,7 @@ static unsigned long do_try_to_free_pages(struct zonelist *zonelist,
>  		return 1;
>  
>  	/* Untapped cgroup reserves?  Don't OOM, retry. */
> -	if (!sc->may_thrash) {
> +	if ( sc->memcg_low_protection && !sc->may_thrash) {
>  		sc->priority = initial_priority;
>  		sc->may_thrash = 1;
>  		goto retry;
> 

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v3 RFC] mm/vmscan: more restrictive condition for retry of shrink_zones
@ 2017-03-13 12:00     ` Yisheng Xie
  0 siblings, 0 replies; 12+ messages in thread
From: Yisheng Xie @ 2017-03-13 12:00 UTC (permalink / raw)
  To: Michal Hocko, Yisheng Xie
  Cc: akpm, hannes, mgorman, vbabka, riel, shakeelb, linux-mm,
	linux-kernel, guohanjun, qiuxishi

Hi Michal,

Thanks for reviewing.
On 2017/3/13 16:33, Michal Hocko wrote:
> Please do not post new version after a single feedback and try to wait
> for more review to accumulate. This is in the 3rd version and it is not
> clear why it is still an RFC.
Get it, thanks for pointing out these.

> 
> On Sun 12-03-17 19:06:10, Yisheng Xie wrote:
>> From: Yisheng Xie <xieyisheng1@huawei.com>
>>
>> When we enter do_try_to_free_pages, the may_thrash is always clear, and
>> it will retry shrink zones to tap cgroup's reserves memory by setting
>> may_thrash when the former shrink_zones reclaim nothing.
>>
>> However, when memcg is disabled or on legacy hierarchy, it should not do
>> this useless retry at all, for we do not have any cgroup's reserves
>> memory to tap, and we have already done hard work but made no progress.
>>
>> To avoid this time costly and useless retrying, add a stub function
>> mem_cgroup_thrashed() and return true when memcg is disabled or on
>> legacy hierarchy.
> 
> Have you actually seen this as a bad behavior? On which workload? Or
> have spotted this by the code review?
Sorry, this is just spotted by code review. I will point it out changelog.

> 
> Please note that more than _what_ it is more interesting _why_ the patch
> has been prepared.
Get it.

> 
> I agree the current additional round of reclaim is just lame because we
> are trying hard to control the retry logic from the page allocator which
> is a sufficient justification to fix this IMO. 
Right.

But I really hate the
> name. At this point we do not have any idea that the memcg is trashing
> as the name of the function suggests.
> 
> All of them simply might not have any reclaimable pages. So I would
> suggest either a better name e.g. memcg_allow_lowmem_reclaim() or,
> preferably, fix this properly. E.g. something like the following.
hmm, it is pretty a good idea than just rename the function, I will check
it's logical carefully, and then send out a new version based on this,
if you do not mind.

Thanks
Yisheng Xie.

> ---
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index bae698484e8e..989ba9761921 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -99,6 +99,9 @@ struct scan_control {
>  	/* Can cgroups be reclaimed below their normal consumption range? */
>  	unsigned int may_thrash:1;
>  
> +	/* Did we have any memcg protected by the low limit */
> +	unsigned int memcg_low_protection:1;
> +
>  	unsigned int hibernation_mode:1;
>  
>  	/* One of the zones is ready for compaction */
> @@ -2513,6 +2516,7 @@ static bool shrink_node(pg_data_t *pgdat, struct scan_control *sc)
>  			if (mem_cgroup_low(root, memcg)) {
>  				if (!sc->may_thrash)
>  					continue;
> +				sc->memcg_low_protection = true;
>  				mem_cgroup_events(memcg, MEMCG_LOW, 1);
>  			}
>  
> @@ -2774,7 +2778,7 @@ static unsigned long do_try_to_free_pages(struct zonelist *zonelist,
>  		return 1;
>  
>  	/* Untapped cgroup reserves?  Don't OOM, retry. */
> -	if (!sc->may_thrash) {
> +	if ( sc->memcg_low_protection && !sc->may_thrash) {
>  		sc->priority = initial_priority;
>  		sc->may_thrash = 1;
>  		goto retry;
> 

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v3 RFC] mm/vmscan: more restrictive condition for retry of shrink_zones
  2017-03-13  8:33   ` Michal Hocko
@ 2017-03-13 15:17     ` Shakeel Butt
  -1 siblings, 0 replies; 12+ messages in thread
From: Shakeel Butt @ 2017-03-13 15:17 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Yisheng Xie, Andrew Morton, Johannes Weiner, Mel Gorman,
	Vlastimil Babka, riel, Linux MM, LKML, xieyisheng1, guohanjun,
	Xishi Qiu

On Mon, Mar 13, 2017 at 1:33 AM, Michal Hocko <mhocko@kernel.org> wrote:
> Please do not post new version after a single feedback and try to wait
> for more review to accumulate. This is in the 3rd version and it is not
> clear why it is still an RFC.
>
> On Sun 12-03-17 19:06:10, Yisheng Xie wrote:
>> From: Yisheng Xie <xieyisheng1@huawei.com>
>>
>> When we enter do_try_to_free_pages, the may_thrash is always clear, and
>> it will retry shrink zones to tap cgroup's reserves memory by setting
>> may_thrash when the former shrink_zones reclaim nothing.
>>
>> However, when memcg is disabled or on legacy hierarchy, it should not do
>> this useless retry at all, for we do not have any cgroup's reserves
>> memory to tap, and we have already done hard work but made no progress.
>>
>> To avoid this time costly and useless retrying, add a stub function
>> mem_cgroup_thrashed() and return true when memcg is disabled or on
>> legacy hierarchy.
>
> Have you actually seen this as a bad behavior? On which workload? Or
> have spotted this by the code review?
>
> Please note that more than _what_ it is more interesting _why_ the patch
> has been prepared.
>
> I agree the current additional round of reclaim is just lame because we
> are trying hard to control the retry logic from the page allocator which
> is a sufficient justification to fix this IMO. But I really hate the
> name. At this point we do not have any idea that the memcg is trashing
> as the name of the function suggests.
>
> All of them simply might not have any reclaimable pages. So I would
> suggest either a better name e.g. memcg_allow_lowmem_reclaim() or,
> preferably, fix this properly. E.g. something like the following.
> ---
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index bae698484e8e..989ba9761921 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -99,6 +99,9 @@ struct scan_control {
>         /* Can cgroups be reclaimed below their normal consumption range? */
>         unsigned int may_thrash:1;
>
> +       /* Did we have any memcg protected by the low limit */
> +       unsigned int memcg_low_protection:1;
> +
>         unsigned int hibernation_mode:1;
>
>         /* One of the zones is ready for compaction */
> @@ -2513,6 +2516,7 @@ static bool shrink_node(pg_data_t *pgdat, struct scan_control *sc)
>                         if (mem_cgroup_low(root, memcg)) {
>                                 if (!sc->may_thrash)
>                                         continue;
> +                               sc->memcg_low_protection = true;

I think you wanted to put this statement before the continue otherwise
it will just disable the sc->may_thrash (second reclaim pass)
altogether.

>                                 mem_cgroup_events(memcg, MEMCG_LOW, 1);
>                         }
>
> @@ -2774,7 +2778,7 @@ static unsigned long do_try_to_free_pages(struct zonelist *zonelist,
>                 return 1;
>
>         /* Untapped cgroup reserves?  Don't OOM, retry. */
> -       if (!sc->may_thrash) {
> +       if ( sc->memcg_low_protection && !sc->may_thrash) {
>                 sc->priority = initial_priority;
>                 sc->may_thrash = 1;
>                 goto retry;
> --
> Michal Hocko
> SUSE Labs

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v3 RFC] mm/vmscan: more restrictive condition for retry of shrink_zones
@ 2017-03-13 15:17     ` Shakeel Butt
  0 siblings, 0 replies; 12+ messages in thread
From: Shakeel Butt @ 2017-03-13 15:17 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Yisheng Xie, Andrew Morton, Johannes Weiner, Mel Gorman,
	Vlastimil Babka, riel, Linux MM, LKML, xieyisheng1, guohanjun,
	Xishi Qiu

On Mon, Mar 13, 2017 at 1:33 AM, Michal Hocko <mhocko@kernel.org> wrote:
> Please do not post new version after a single feedback and try to wait
> for more review to accumulate. This is in the 3rd version and it is not
> clear why it is still an RFC.
>
> On Sun 12-03-17 19:06:10, Yisheng Xie wrote:
>> From: Yisheng Xie <xieyisheng1@huawei.com>
>>
>> When we enter do_try_to_free_pages, the may_thrash is always clear, and
>> it will retry shrink zones to tap cgroup's reserves memory by setting
>> may_thrash when the former shrink_zones reclaim nothing.
>>
>> However, when memcg is disabled or on legacy hierarchy, it should not do
>> this useless retry at all, for we do not have any cgroup's reserves
>> memory to tap, and we have already done hard work but made no progress.
>>
>> To avoid this time costly and useless retrying, add a stub function
>> mem_cgroup_thrashed() and return true when memcg is disabled or on
>> legacy hierarchy.
>
> Have you actually seen this as a bad behavior? On which workload? Or
> have spotted this by the code review?
>
> Please note that more than _what_ it is more interesting _why_ the patch
> has been prepared.
>
> I agree the current additional round of reclaim is just lame because we
> are trying hard to control the retry logic from the page allocator which
> is a sufficient justification to fix this IMO. But I really hate the
> name. At this point we do not have any idea that the memcg is trashing
> as the name of the function suggests.
>
> All of them simply might not have any reclaimable pages. So I would
> suggest either a better name e.g. memcg_allow_lowmem_reclaim() or,
> preferably, fix this properly. E.g. something like the following.
> ---
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index bae698484e8e..989ba9761921 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -99,6 +99,9 @@ struct scan_control {
>         /* Can cgroups be reclaimed below their normal consumption range? */
>         unsigned int may_thrash:1;
>
> +       /* Did we have any memcg protected by the low limit */
> +       unsigned int memcg_low_protection:1;
> +
>         unsigned int hibernation_mode:1;
>
>         /* One of the zones is ready for compaction */
> @@ -2513,6 +2516,7 @@ static bool shrink_node(pg_data_t *pgdat, struct scan_control *sc)
>                         if (mem_cgroup_low(root, memcg)) {
>                                 if (!sc->may_thrash)
>                                         continue;
> +                               sc->memcg_low_protection = true;

I think you wanted to put this statement before the continue otherwise
it will just disable the sc->may_thrash (second reclaim pass)
altogether.

>                                 mem_cgroup_events(memcg, MEMCG_LOW, 1);
>                         }
>
> @@ -2774,7 +2778,7 @@ static unsigned long do_try_to_free_pages(struct zonelist *zonelist,
>                 return 1;
>
>         /* Untapped cgroup reserves?  Don't OOM, retry. */
> -       if (!sc->may_thrash) {
> +       if ( sc->memcg_low_protection && !sc->may_thrash) {
>                 sc->priority = initial_priority;
>                 sc->may_thrash = 1;
>                 goto retry;
> --
> Michal Hocko
> SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v3 RFC] mm/vmscan: more restrictive condition for retry of shrink_zones
  2017-03-13 15:17     ` Shakeel Butt
@ 2017-03-13 15:48       ` Michal Hocko
  -1 siblings, 0 replies; 12+ messages in thread
From: Michal Hocko @ 2017-03-13 15:48 UTC (permalink / raw)
  To: Shakeel Butt
  Cc: Yisheng Xie, Andrew Morton, Johannes Weiner, Mel Gorman,
	Vlastimil Babka, riel, Linux MM, LKML, xieyisheng1, guohanjun,
	Xishi Qiu

On Mon 13-03-17 08:17:56, Shakeel Butt wrote:
> On Mon, Mar 13, 2017 at 1:33 AM, Michal Hocko <mhocko@kernel.org> wrote:
> > Please do not post new version after a single feedback and try to wait
> > for more review to accumulate. This is in the 3rd version and it is not
> > clear why it is still an RFC.
> >
> > On Sun 12-03-17 19:06:10, Yisheng Xie wrote:
> >> From: Yisheng Xie <xieyisheng1@huawei.com>
> >>
> >> When we enter do_try_to_free_pages, the may_thrash is always clear, and
> >> it will retry shrink zones to tap cgroup's reserves memory by setting
> >> may_thrash when the former shrink_zones reclaim nothing.
> >>
> >> However, when memcg is disabled or on legacy hierarchy, it should not do
> >> this useless retry at all, for we do not have any cgroup's reserves
> >> memory to tap, and we have already done hard work but made no progress.
> >>
> >> To avoid this time costly and useless retrying, add a stub function
> >> mem_cgroup_thrashed() and return true when memcg is disabled or on
> >> legacy hierarchy.
> >
> > Have you actually seen this as a bad behavior? On which workload? Or
> > have spotted this by the code review?
> >
> > Please note that more than _what_ it is more interesting _why_ the patch
> > has been prepared.
> >
> > I agree the current additional round of reclaim is just lame because we
> > are trying hard to control the retry logic from the page allocator which
> > is a sufficient justification to fix this IMO. But I really hate the
> > name. At this point we do not have any idea that the memcg is trashing
> > as the name of the function suggests.
> >
> > All of them simply might not have any reclaimable pages. So I would
> > suggest either a better name e.g. memcg_allow_lowmem_reclaim() or,
> > preferably, fix this properly. E.g. something like the following.
> > ---
> > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > index bae698484e8e..989ba9761921 100644
> > --- a/mm/vmscan.c
> > +++ b/mm/vmscan.c
> > @@ -99,6 +99,9 @@ struct scan_control {
> >         /* Can cgroups be reclaimed below their normal consumption range? */
> >         unsigned int may_thrash:1;
> >
> > +       /* Did we have any memcg protected by the low limit */
> > +       unsigned int memcg_low_protection:1;
> > +
> >         unsigned int hibernation_mode:1;
> >
> >         /* One of the zones is ready for compaction */
> > @@ -2513,6 +2516,7 @@ static bool shrink_node(pg_data_t *pgdat, struct scan_control *sc)
> >                         if (mem_cgroup_low(root, memcg)) {
> >                                 if (!sc->may_thrash)
> >                                         continue;
> > +                               sc->memcg_low_protection = true;
> 
> I think you wanted to put this statement before the continue otherwise
> it will just disable the sc->may_thrash (second reclaim pass)
> altogether.

yes, of course, just a quick and dirty hack to show my point.
Sorry about the confusion.
 
> >                                 mem_cgroup_events(memcg, MEMCG_LOW, 1);
> >                         }
> >
> > @@ -2774,7 +2778,7 @@ static unsigned long do_try_to_free_pages(struct zonelist *zonelist,
> >                 return 1;
> >
> >         /* Untapped cgroup reserves?  Don't OOM, retry. */
> > -       if (!sc->may_thrash) {
> > +       if ( sc->memcg_low_protection && !sc->may_thrash) {
> >                 sc->priority = initial_priority;
> >                 sc->may_thrash = 1;
> >                 goto retry;
> > --
> > Michal Hocko
> > SUSE Labs
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v3 RFC] mm/vmscan: more restrictive condition for retry of shrink_zones
@ 2017-03-13 15:48       ` Michal Hocko
  0 siblings, 0 replies; 12+ messages in thread
From: Michal Hocko @ 2017-03-13 15:48 UTC (permalink / raw)
  To: Shakeel Butt
  Cc: Yisheng Xie, Andrew Morton, Johannes Weiner, Mel Gorman,
	Vlastimil Babka, riel, Linux MM, LKML, xieyisheng1, guohanjun,
	Xishi Qiu

On Mon 13-03-17 08:17:56, Shakeel Butt wrote:
> On Mon, Mar 13, 2017 at 1:33 AM, Michal Hocko <mhocko@kernel.org> wrote:
> > Please do not post new version after a single feedback and try to wait
> > for more review to accumulate. This is in the 3rd version and it is not
> > clear why it is still an RFC.
> >
> > On Sun 12-03-17 19:06:10, Yisheng Xie wrote:
> >> From: Yisheng Xie <xieyisheng1@huawei.com>
> >>
> >> When we enter do_try_to_free_pages, the may_thrash is always clear, and
> >> it will retry shrink zones to tap cgroup's reserves memory by setting
> >> may_thrash when the former shrink_zones reclaim nothing.
> >>
> >> However, when memcg is disabled or on legacy hierarchy, it should not do
> >> this useless retry at all, for we do not have any cgroup's reserves
> >> memory to tap, and we have already done hard work but made no progress.
> >>
> >> To avoid this time costly and useless retrying, add a stub function
> >> mem_cgroup_thrashed() and return true when memcg is disabled or on
> >> legacy hierarchy.
> >
> > Have you actually seen this as a bad behavior? On which workload? Or
> > have spotted this by the code review?
> >
> > Please note that more than _what_ it is more interesting _why_ the patch
> > has been prepared.
> >
> > I agree the current additional round of reclaim is just lame because we
> > are trying hard to control the retry logic from the page allocator which
> > is a sufficient justification to fix this IMO. But I really hate the
> > name. At this point we do not have any idea that the memcg is trashing
> > as the name of the function suggests.
> >
> > All of them simply might not have any reclaimable pages. So I would
> > suggest either a better name e.g. memcg_allow_lowmem_reclaim() or,
> > preferably, fix this properly. E.g. something like the following.
> > ---
> > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > index bae698484e8e..989ba9761921 100644
> > --- a/mm/vmscan.c
> > +++ b/mm/vmscan.c
> > @@ -99,6 +99,9 @@ struct scan_control {
> >         /* Can cgroups be reclaimed below their normal consumption range? */
> >         unsigned int may_thrash:1;
> >
> > +       /* Did we have any memcg protected by the low limit */
> > +       unsigned int memcg_low_protection:1;
> > +
> >         unsigned int hibernation_mode:1;
> >
> >         /* One of the zones is ready for compaction */
> > @@ -2513,6 +2516,7 @@ static bool shrink_node(pg_data_t *pgdat, struct scan_control *sc)
> >                         if (mem_cgroup_low(root, memcg)) {
> >                                 if (!sc->may_thrash)
> >                                         continue;
> > +                               sc->memcg_low_protection = true;
> 
> I think you wanted to put this statement before the continue otherwise
> it will just disable the sc->may_thrash (second reclaim pass)
> altogether.

yes, of course, just a quick and dirty hack to show my point.
Sorry about the confusion.
 
> >                                 mem_cgroup_events(memcg, MEMCG_LOW, 1);
> >                         }
> >
> > @@ -2774,7 +2778,7 @@ static unsigned long do_try_to_free_pages(struct zonelist *zonelist,
> >                 return 1;
> >
> >         /* Untapped cgroup reserves?  Don't OOM, retry. */
> > -       if (!sc->may_thrash) {
> > +       if ( sc->memcg_low_protection && !sc->may_thrash) {
> >                 sc->priority = initial_priority;
> >                 sc->may_thrash = 1;
> >                 goto retry;
> > --
> > Michal Hocko
> > SUSE Labs
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2017-03-13 15:49 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-03-12 11:06 [PATCH v3 RFC] mm/vmscan: more restrictive condition for retry of shrink_zones Yisheng Xie
2017-03-12 11:06 ` Yisheng Xie
2017-03-12 20:20 ` Shakeel Butt
2017-03-12 20:20   ` Shakeel Butt
2017-03-13  8:33 ` Michal Hocko
2017-03-13  8:33   ` Michal Hocko
2017-03-13 12:00   ` Yisheng Xie
2017-03-13 12:00     ` Yisheng Xie
2017-03-13 15:17   ` Shakeel Butt
2017-03-13 15:17     ` Shakeel Butt
2017-03-13 15:48     ` Michal Hocko
2017-03-13 15:48       ` Michal Hocko

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.