linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 1/2] mm: propagate memory effective protection on setting memory.min/low
@ 2018-05-22 13:25 Roman Gushchin
  2018-05-22 13:25 ` [PATCH 2/2] mm: don't skip memory guarantee calculations Roman Gushchin
  2018-06-04 12:26 ` [PATCH 1/2] mm: propagate memory effective protection on setting memory.min/low Michal Hocko
  0 siblings, 2 replies; 8+ messages in thread
From: Roman Gushchin @ 2018-05-22 13:25 UTC (permalink / raw)
  To: linux-mm
  Cc: kernel-team, linux-kernel, Roman Gushchin, Johannes Weiner,
	Michal Hocko, Vladimir Davydov, Greg Thelen, Tejun Heo,
	Andrew Morton

Explicitly propagate effective memory min/low values down by the tree.

If there is the global memory pressure, it's not really necessary.
Effective memory guarantees will be propagated automatically
as we traverse memory cgroup tree in the reclaim path.

But if there is no global memory pressure, effective memory protection
still matters for local (memcg-scoped) memory pressure.
So, we have to update effective limits in the subtree,
if a user changes memory.min and memory.low values.

Signed-off-by: Roman Gushchin <guro@fb.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
---
 mm/memcontrol.c | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index ab5673dbfc4e..b9cd0bb63759 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -5374,7 +5374,7 @@ static int memory_min_show(struct seq_file *m, void *v)
 static ssize_t memory_min_write(struct kernfs_open_file *of,
 				char *buf, size_t nbytes, loff_t off)
 {
-	struct mem_cgroup *memcg = mem_cgroup_from_css(of_css(of));
+	struct mem_cgroup *iter, *memcg = mem_cgroup_from_css(of_css(of));
 	unsigned long min;
 	int err;
 
@@ -5385,6 +5385,11 @@ static ssize_t memory_min_write(struct kernfs_open_file *of,
 
 	page_counter_set_min(&memcg->memory, min);
 
+	rcu_read_lock();
+	for_each_mem_cgroup_tree(iter, memcg)
+		mem_cgroup_protected(NULL, iter);
+	rcu_read_unlock();
+
 	return nbytes;
 }
 
@@ -5404,7 +5409,7 @@ static int memory_low_show(struct seq_file *m, void *v)
 static ssize_t memory_low_write(struct kernfs_open_file *of,
 				char *buf, size_t nbytes, loff_t off)
 {
-	struct mem_cgroup *memcg = mem_cgroup_from_css(of_css(of));
+	struct mem_cgroup *iter, *memcg = mem_cgroup_from_css(of_css(of));
 	unsigned long low;
 	int err;
 
@@ -5415,6 +5420,11 @@ static ssize_t memory_low_write(struct kernfs_open_file *of,
 
 	page_counter_set_low(&memcg->memory, low);
 
+	rcu_read_lock();
+	for_each_mem_cgroup_tree(iter, memcg)
+		mem_cgroup_protected(NULL, iter);
+	rcu_read_unlock();
+
 	return nbytes;
 }
 
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH 2/2] mm: don't skip memory guarantee calculations
  2018-05-22 13:25 [PATCH 1/2] mm: propagate memory effective protection on setting memory.min/low Roman Gushchin
@ 2018-05-22 13:25 ` Roman Gushchin
  2018-06-04 12:29   ` Michal Hocko
  2018-06-04 12:26 ` [PATCH 1/2] mm: propagate memory effective protection on setting memory.min/low Michal Hocko
  1 sibling, 1 reply; 8+ messages in thread
From: Roman Gushchin @ 2018-05-22 13:25 UTC (permalink / raw)
  To: linux-mm
  Cc: kernel-team, linux-kernel, Roman Gushchin, Johannes Weiner,
	Michal Hocko, Vladimir Davydov, Greg Thelen, Tejun Heo,
	Andrew Morton

There are two cases when effective memory guarantee calculation
is mistakenly skipped:

1) If memcg is a child of the root cgroup, and the root
cgroup is not root_mem_cgroup (in other words, if the reclaim
is targeted). Top-level memory cgroups are handled specially
in mem_cgroup_protected(), because the root memory cgroup doesn't
have memory guarantee and can't limit its children guarantees.
So, all effective guarantee calculation is skipped.
But in case of targeted reclaim things are different:
cgroups, which parent exceeded its memory limit aren't special.

2) If memcg has no charged memory (memory usage is 0). In this
case mem_cgroup_protected() always returns MEMCG_PROT_NONE, which
is correct and prevents to generate fake memory low events for
empty cgroups. But skipping memory emin/elow calculation is wrong:
if there is no global memory pressure there might be no good
chance again, so we can end up with effective guarantees set to 0
without any reason.

Signed-off-by: Roman Gushchin <guro@fb.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
---
 mm/memcontrol.c | 15 ++++++++-------
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index b9cd0bb63759..20c4f0a97d4c 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -5809,20 +5809,15 @@ enum mem_cgroup_protection mem_cgroup_protected(struct mem_cgroup *root,
 	if (mem_cgroup_disabled())
 		return MEMCG_PROT_NONE;
 
-	if (!root)
-		root = root_mem_cgroup;
-	if (memcg == root)
+	if (memcg == root_mem_cgroup)
 		return MEMCG_PROT_NONE;
 
 	usage = page_counter_read(&memcg->memory);
-	if (!usage)
-		return MEMCG_PROT_NONE;
-
 	emin = memcg->memory.min;
 	elow = memcg->memory.low;
 
 	parent = parent_mem_cgroup(memcg);
-	if (parent == root)
+	if (parent == root_mem_cgroup)
 		goto exit;
 
 	parent_emin = READ_ONCE(parent->memory.emin);
@@ -5857,6 +5852,12 @@ enum mem_cgroup_protection mem_cgroup_protected(struct mem_cgroup *root,
 	memcg->memory.emin = emin;
 	memcg->memory.elow = elow;
 
+	if (root && memcg == root)
+		return MEMCG_PROT_NONE;
+
+	if (!usage)
+		return MEMCG_PROT_NONE;
+
 	if (usage <= emin)
 		return MEMCG_PROT_MIN;
 	else if (usage <= elow)
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH 1/2] mm: propagate memory effective protection on setting memory.min/low
  2018-05-22 13:25 [PATCH 1/2] mm: propagate memory effective protection on setting memory.min/low Roman Gushchin
  2018-05-22 13:25 ` [PATCH 2/2] mm: don't skip memory guarantee calculations Roman Gushchin
@ 2018-06-04 12:26 ` Michal Hocko
  1 sibling, 0 replies; 8+ messages in thread
From: Michal Hocko @ 2018-06-04 12:26 UTC (permalink / raw)
  To: Roman Gushchin
  Cc: linux-mm, kernel-team, linux-kernel, Johannes Weiner,
	Vladimir Davydov, Greg Thelen, Tejun Heo, Andrew Morton

On Tue 22-05-18 14:25:27, Roman Gushchin wrote:
> Explicitly propagate effective memory min/low values down by the tree.
> 
> If there is the global memory pressure, it's not really necessary.
> Effective memory guarantees will be propagated automatically
> as we traverse memory cgroup tree in the reclaim path.
> 
> But if there is no global memory pressure, effective memory protection
> still matters for local (memcg-scoped) memory pressure.
> So, we have to update effective limits in the subtree,
> if a user changes memory.min and memory.low values.

Please be explicit about the exact problem. Ideally with a memcg tree example.

> Signed-off-by: Roman Gushchin <guro@fb.com>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Michal Hocko <mhocko@kernel.org>
> Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
> Cc: Greg Thelen <gthelen@google.com>
> Cc: Tejun Heo <tj@kernel.org>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> ---
>  mm/memcontrol.c | 14 ++++++++++++--
>  1 file changed, 12 insertions(+), 2 deletions(-)
> 
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index ab5673dbfc4e..b9cd0bb63759 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -5374,7 +5374,7 @@ static int memory_min_show(struct seq_file *m, void *v)
>  static ssize_t memory_min_write(struct kernfs_open_file *of,
>  				char *buf, size_t nbytes, loff_t off)
>  {
> -	struct mem_cgroup *memcg = mem_cgroup_from_css(of_css(of));
> +	struct mem_cgroup *iter, *memcg = mem_cgroup_from_css(of_css(of));
>  	unsigned long min;
>  	int err;
>  
> @@ -5385,6 +5385,11 @@ static ssize_t memory_min_write(struct kernfs_open_file *of,
>  
>  	page_counter_set_min(&memcg->memory, min);
>  
> +	rcu_read_lock();
> +	for_each_mem_cgroup_tree(iter, memcg)
> +		mem_cgroup_protected(NULL, iter);
> +	rcu_read_unlock();
> +
>  	return nbytes;
>  }
>  
> @@ -5404,7 +5409,7 @@ static int memory_low_show(struct seq_file *m, void *v)
>  static ssize_t memory_low_write(struct kernfs_open_file *of,
>  				char *buf, size_t nbytes, loff_t off)
>  {
> -	struct mem_cgroup *memcg = mem_cgroup_from_css(of_css(of));
> +	struct mem_cgroup *iter, *memcg = mem_cgroup_from_css(of_css(of));
>  	unsigned long low;
>  	int err;
>  
> @@ -5415,6 +5420,11 @@ static ssize_t memory_low_write(struct kernfs_open_file *of,
>  
>  	page_counter_set_low(&memcg->memory, low);
>  
> +	rcu_read_lock();
> +	for_each_mem_cgroup_tree(iter, memcg)
> +		mem_cgroup_protected(NULL, iter);
> +	rcu_read_unlock();
> +
>  	return nbytes;
>  }
>  
> -- 
> 2.14.3

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 2/2] mm: don't skip memory guarantee calculations
  2018-05-22 13:25 ` [PATCH 2/2] mm: don't skip memory guarantee calculations Roman Gushchin
@ 2018-06-04 12:29   ` Michal Hocko
  2018-06-04 16:23     ` Roman Gushchin
  0 siblings, 1 reply; 8+ messages in thread
From: Michal Hocko @ 2018-06-04 12:29 UTC (permalink / raw)
  To: Roman Gushchin
  Cc: linux-mm, kernel-team, linux-kernel, Johannes Weiner,
	Vladimir Davydov, Greg Thelen, Tejun Heo, Andrew Morton

On Tue 22-05-18 14:25:28, Roman Gushchin wrote:
> There are two cases when effective memory guarantee calculation
> is mistakenly skipped:
> 
> 1) If memcg is a child of the root cgroup, and the root
> cgroup is not root_mem_cgroup (in other words, if the reclaim
> is targeted). Top-level memory cgroups are handled specially
> in mem_cgroup_protected(), because the root memory cgroup doesn't
> have memory guarantee and can't limit its children guarantees.
> So, all effective guarantee calculation is skipped.
> But in case of targeted reclaim things are different:
> cgroups, which parent exceeded its memory limit aren't special.
> 
> 2) If memcg has no charged memory (memory usage is 0). In this
> case mem_cgroup_protected() always returns MEMCG_PROT_NONE, which
> is correct and prevents to generate fake memory low events for
> empty cgroups. But skipping memory emin/elow calculation is wrong:
> if there is no global memory pressure there might be no good
> chance again, so we can end up with effective guarantees set to 0
> without any reason.

Roman, so these two patches are on top of the min limit patches, right?
The fact that they come after just makes me feel this whole thing is not
completely thought through and I would like to see all 4 patch in one
series describing the whole design. We are getting really close to the
merge window and last minute updates makes me really nervouse. Can you
please repost the whole thing after the merge window, please?

As I've said earlier I am not even sure we really want to have a hard
guarantee once we decided to go with low limit. So a very good reasoning
should be added for the whole thing.

Thanks!
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 2/2] mm: don't skip memory guarantee calculations
  2018-06-04 12:29   ` Michal Hocko
@ 2018-06-04 16:23     ` Roman Gushchin
  2018-06-05  9:03       ` Michal Hocko
  0 siblings, 1 reply; 8+ messages in thread
From: Roman Gushchin @ 2018-06-04 16:23 UTC (permalink / raw)
  To: Michal Hocko
  Cc: linux-mm, kernel-team, linux-kernel, Johannes Weiner,
	Vladimir Davydov, Greg Thelen, Tejun Heo, Andrew Morton

On Mon, Jun 04, 2018 at 02:29:53PM +0200, Michal Hocko wrote:
> On Tue 22-05-18 14:25:28, Roman Gushchin wrote:
> > There are two cases when effective memory guarantee calculation
> > is mistakenly skipped:
> > 
> > 1) If memcg is a child of the root cgroup, and the root
> > cgroup is not root_mem_cgroup (in other words, if the reclaim
> > is targeted). Top-level memory cgroups are handled specially
> > in mem_cgroup_protected(), because the root memory cgroup doesn't
> > have memory guarantee and can't limit its children guarantees.
> > So, all effective guarantee calculation is skipped.
> > But in case of targeted reclaim things are different:
> > cgroups, which parent exceeded its memory limit aren't special.
> > 
> > 2) If memcg has no charged memory (memory usage is 0). In this
> > case mem_cgroup_protected() always returns MEMCG_PROT_NONE, which
> > is correct and prevents to generate fake memory low events for
> > empty cgroups. But skipping memory emin/elow calculation is wrong:
> > if there is no global memory pressure there might be no good
> > chance again, so we can end up with effective guarantees set to 0
> > without any reason.
> 
> Roman, so these two patches are on top of the min limit patches, right?
> The fact that they come after just makes me feel this whole thing is not
> completely thought through and I would like to see all 4 patch in one
> series describing the whole design. We are getting really close to the
> merge window and last minute updates makes me really nervouse. Can you
> please repost the whole thing after the merge window, please?

Hi, Michal!

These changes are fixing some edge cases which I've discovered
when I started writing unit tests for the memory controller
(see in tools/testing/selftesting/cgroup/). All these edge cases
are temporarily effects which exist only when there is no
global memory pressure.

We're already using my implementation in production for some time,
and so far had no issues with it.

Please note, that the existing implementation of memory.low has much more
serious problems: it barely works without some significant configuration
tweaks (e.g. set all memory.low in the hierarchy to max, except leaves),
which are painful in production.

I'm happy to discuss any concrete issues/concerns, but I really see
no reasons to drop it from the mm tree now and start the discussion
from scratch.

Thank you!

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 2/2] mm: don't skip memory guarantee calculations
  2018-06-04 16:23     ` Roman Gushchin
@ 2018-06-05  9:03       ` Michal Hocko
  2018-06-05 10:15         ` Roman Gushchin
  0 siblings, 1 reply; 8+ messages in thread
From: Michal Hocko @ 2018-06-05  9:03 UTC (permalink / raw)
  To: Roman Gushchin
  Cc: linux-mm, kernel-team, linux-kernel, Johannes Weiner,
	Vladimir Davydov, Greg Thelen, Tejun Heo, Andrew Morton

On Mon 04-06-18 17:23:06, Roman Gushchin wrote:
[...]
> I'm happy to discuss any concrete issues/concerns, but I really see
> no reasons to drop it from the mm tree now and start the discussion
> from scratch.

I do not think this is ready for the current merge window. Sorry! I
would really prefer to see the whole thing in one series to have a
better picture.

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 2/2] mm: don't skip memory guarantee calculations
  2018-06-05  9:03       ` Michal Hocko
@ 2018-06-05 10:15         ` Roman Gushchin
  2018-06-05 10:28           ` Michal Hocko
  0 siblings, 1 reply; 8+ messages in thread
From: Roman Gushchin @ 2018-06-05 10:15 UTC (permalink / raw)
  To: Michal Hocko
  Cc: linux-mm, kernel-team, linux-kernel, Johannes Weiner,
	Vladimir Davydov, Greg Thelen, Tejun Heo, Andrew Morton

On Tue, Jun 05, 2018 at 11:03:49AM +0200, Michal Hocko wrote:
> On Mon 04-06-18 17:23:06, Roman Gushchin wrote:
> [...]
> > I'm happy to discuss any concrete issues/concerns, but I really see
> > no reasons to drop it from the mm tree now and start the discussion
> > from scratch.
> 
> I do not think this is ready for the current merge window. Sorry! I
> would really prefer to see the whole thing in one series to have a
> better picture.

Please, provide any specific reason for that. I appreciate your opinion,
but *I think* it's not an argument, seriously.

We've discussed the patchset back to March and I made several iterations
based on the received feedback. Later we had a separate discussion with Greg,
who proposed an alternative solution, which, unfortunately, had some serious
shortcomings. And, as I remember, some time ago we've discussed memory.min
with you.
And now you want to start from scratch without providing any reason.
I find it counter-productive, sorry.

Thanks!

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 2/2] mm: don't skip memory guarantee calculations
  2018-06-05 10:15         ` Roman Gushchin
@ 2018-06-05 10:28           ` Michal Hocko
  0 siblings, 0 replies; 8+ messages in thread
From: Michal Hocko @ 2018-06-05 10:28 UTC (permalink / raw)
  To: Roman Gushchin
  Cc: linux-mm, kernel-team, linux-kernel, Johannes Weiner,
	Vladimir Davydov, Greg Thelen, Tejun Heo, Andrew Morton

On Tue 05-06-18 11:15:45, Roman Gushchin wrote:
> On Tue, Jun 05, 2018 at 11:03:49AM +0200, Michal Hocko wrote:
> > On Mon 04-06-18 17:23:06, Roman Gushchin wrote:
> > [...]
> > > I'm happy to discuss any concrete issues/concerns, but I really see
> > > no reasons to drop it from the mm tree now and start the discussion
> > > from scratch.
> > 
> > I do not think this is ready for the current merge window. Sorry! I
> > would really prefer to see the whole thing in one series to have a
> > better picture.
> 
> Please, provide any specific reason for that. I appreciate your opinion,
> but *I think* it's not an argument, seriously.

Seeing two follow up fixes close to the merge window just speaks for
itself. Besides that there is not need to rush this now.
 
> We've discussed the patchset back to March and I made several iterations
> based on the received feedback. Later we had a separate discussion with Greg,
> who proposed an alternative solution, which, unfortunately, had some serious
> shortcomings. And, as I remember, some time ago we've discussed memory.min
> with you.
> And now you want to start from scratch without providing any reason.
> I find it counter-productive, sorry.

I am sorry I couldn't give it more time, but this release cycle was even
crazier than usual.
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2018-06-05 10:28 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-05-22 13:25 [PATCH 1/2] mm: propagate memory effective protection on setting memory.min/low Roman Gushchin
2018-05-22 13:25 ` [PATCH 2/2] mm: don't skip memory guarantee calculations Roman Gushchin
2018-06-04 12:29   ` Michal Hocko
2018-06-04 16:23     ` Roman Gushchin
2018-06-05  9:03       ` Michal Hocko
2018-06-05 10:15         ` Roman Gushchin
2018-06-05 10:28           ` Michal Hocko
2018-06-04 12:26 ` [PATCH 1/2] mm: propagate memory effective protection on setting memory.min/low Michal Hocko

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).