* [PATCH] memcg: put soft limit reclaim out of way if the excess tree is empty
@ 2016-08-01 10:00 ` Michal Hocko
  0 siblings, 0 replies; 16+ messages in thread
From: Michal Hocko @ 2016-08-01 10:00 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Johannes Weiner, Vladimir Davydov, linux-mm, LKML, Michal Hocko

From: Michal Hocko <mhocko@suse.com>

We've had a report about soft lockups caused by lock bouncing in the
soft reclaim path:

[331404.849734] BUG: soft lockup - CPU#0 stuck for 22s! [kav4proxy-kavic:3128]
[331404.849920] RIP: 0010:[<ffffffff81469798>]  [<ffffffff81469798>] _raw_spin_lock+0x18/0x20
[331404.849997] Call Trace:
[331404.850010]  [<ffffffff811557ea>] mem_cgroup_soft_limit_reclaim+0x25a/0x280
[331404.850020]  [<ffffffff8111041d>] shrink_zones+0xed/0x200
[331404.850027]  [<ffffffff81111a94>] do_try_to_free_pages+0x74/0x320
[331404.850034]  [<ffffffff81112072>] try_to_free_pages+0x112/0x180
[331404.850042]  [<ffffffff81104a6f>] __alloc_pages_slowpath+0x3ff/0x820
[331404.850049]  [<ffffffff81105079>] __alloc_pages_nodemask+0x1e9/0x200
[331404.850056]  [<ffffffff81141e01>] alloc_pages_vma+0xe1/0x290
[331404.850064]  [<ffffffff8112402f>] do_wp_page+0x19f/0x840
[331404.850071]  [<ffffffff811257cd>] handle_pte_fault+0x1cd/0x230
[331404.850079]  [<ffffffff8146d3ed>] do_page_fault+0x1fd/0x4c0
[331404.850087]  [<ffffffff81469ec5>] page_fault+0x25/0x30

There are no memcgs created so there cannot be any in the soft limit
excess obviously:
[...]
memory  0       1       1

so all this seems to be mem_cgroup_largest_soft_limit_node trying to
take spin_lock_irq(&mctz->lock) just to find out that the soft limit
excess tree is empty. This is a pointless waste of cycles and causes
cache line bouncing during heavy parallel reclaim on large machines.
The particular machine wasn't very healthy and was most probably
suffering from a memory leak, which caused the memory reclaim to thrash
heavily. But bouncing on the lock certainly didn't help...

Introduce soft_limit_tree_empty, which does an optimistic lockless check
and bails out early if the tree is empty. This is theoretically racy, but
that shouldn't matter much: soft limit is a best-effort feature, it is
slowly being deprecated, and its usage should be really scarce. Bouncing
on a lock without a good reason is surely a much bigger problem,
especially on machines with many CPUs.

Signed-off-by: Michal Hocko <mhocko@suse.com>
---
 mm/memcontrol.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index c265212bec8c..eb7e39c2d948 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2543,6 +2543,11 @@ static int mem_cgroup_resize_memsw_limit(struct mem_cgroup *memcg,
 	return ret;
 }
 
+static inline bool soft_limit_tree_empty(struct mem_cgroup_tree_per_node *mctz)
+{
+	return rb_last(&mctz->rb_root) == NULL;
+}
+
 unsigned long mem_cgroup_soft_limit_reclaim(pg_data_t *pgdat, int order,
 					    gfp_t gfp_mask,
 					    unsigned long *total_scanned)
@@ -2559,6 +2564,9 @@ unsigned long mem_cgroup_soft_limit_reclaim(pg_data_t *pgdat, int order,
 		return 0;
 
 	mctz = soft_limit_tree_node(pgdat->node_id);
+	if (soft_limit_tree_empty(mctz))
+		return 0;
+
 	/*
 	 * This loop can run a while, specially if mem_cgroup's continuously
 	 * keep exceeding their soft limit and putting the system under
-- 
2.8.1

^ permalink raw reply related	[flat|nested] 16+ messages in thread
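
The early-bail idea described in the changelog above can be sketched in
userspace C11. This is only an illustration under stated assumptions, not
the kernel code: struct excess_tree, tree_lock(), largest_excess() and
soft_reclaim() are hypothetical stand-ins for mem_cgroup_tree_per_node,
spin_lock_irq(), mem_cgroup_largest_soft_limit_node() and
mem_cgroup_soft_limit_reclaim(), the tree is reduced to a single pointer,
and a lock_acquisitions counter is added purely so the effect is visible.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-node soft limit excess tree. */
struct excess_tree {
	atomic_flag lock;			/* toy spinlock */
	_Atomic(void *) top;			/* NULL when nothing is in excess */
	unsigned long lock_acquisitions;	/* instrumentation for this sketch */
};

static void excess_tree_init(struct excess_tree *t)
{
	assert(t != NULL);
	atomic_flag_clear(&t->lock);
	atomic_init(&t->top, NULL);
	t->lock_acquisitions = 0;
}

static void tree_lock(struct excess_tree *t)
{
	/* Spin until the flag was previously clear, i.e. we own the lock. */
	while (atomic_flag_test_and_set_explicit(&t->lock, memory_order_acquire))
		;
	t->lock_acquisitions++;
}

static void tree_unlock(struct excess_tree *t)
{
	atomic_flag_clear_explicit(&t->lock, memory_order_release);
}

/* Writers always modify the tree under the lock. */
static void excess_tree_insert(struct excess_tree *t, void *entry)
{
	tree_lock(t);
	atomic_store_explicit(&t->top, entry, memory_order_relaxed);
	tree_unlock(t);
}

/* Slow path: look up the largest excess entry under the lock. */
static void *largest_excess(struct excess_tree *t)
{
	void *found;

	tree_lock(t);
	found = atomic_load_explicit(&t->top, memory_order_relaxed);
	tree_unlock(t);
	return found;
}

/* Returns true if there was an entry to reclaim from. */
static bool soft_reclaim(struct excess_tree *t)
{
	/*
	 * Optimistic lockless check. A stale answer is tolerated: a missed
	 * insert only skips one best-effort pass, and a stale "non-empty"
	 * falls through to the locked lookup, which re-reads the pointer.
	 */
	if (atomic_load_explicit(&t->top, memory_order_relaxed) == NULL)
		return false;

	return largest_excess(t) != NULL;
}
```

With an empty tree, soft_reclaim() returns false without ever touching the
lock, which is the case the soft lockup report was hitting. The relaxed
atomic load is a userspace substitute for the plain pointer read in the
patch; the kernel follows its own memory model.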

* Re: [PATCH] memcg: put soft limit reclaim out of way if the excess tree is empty
  2016-08-01 10:00 ` Michal Hocko
@ 2016-08-01 13:57   ` Vladimir Davydov
  -1 siblings, 0 replies; 16+ messages in thread
From: Vladimir Davydov @ 2016-08-01 13:57 UTC (permalink / raw)
  To: Michal Hocko; +Cc: Andrew Morton, Johannes Weiner, linux-mm, LKML, Michal Hocko

On Mon, Aug 01, 2016 at 12:00:21PM +0200, Michal Hocko wrote:
...
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index c265212bec8c..eb7e39c2d948 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -2543,6 +2543,11 @@ static int mem_cgroup_resize_memsw_limit(struct mem_cgroup *memcg,
>  	return ret;
>  }
>  
> +static inline bool soft_limit_tree_empty(struct mem_cgroup_tree_per_node *mctz)
> +{
> +	return rb_last(&mctz->rb_root) == NULL;
> +}
> +

I don't think traversing rb tree as rb_last() does w/o holding the lock
is a good idea. Why is RB_EMPTY_ROOT() insufficient here?

^ permalink raw reply	[flat|nested] 16+ messages in thread
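
To make the distinction raised in this review concrete, here is a minimal
userspace sketch. It assumes simplified stand-ins (struct node, struct
root, tree_empty(), tree_last()) rather than the real <linux/rbtree.h>
types; tree_empty() mirrors what RB_EMPTY_ROOT() does (a single load of the
root pointer), while tree_last() mirrors rb_last() (a walk down the
right-child pointers). The visited counter is added only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for struct rb_node and struct rb_root. */
struct node {
	struct node *left;
	struct node *right;
};

struct root {
	struct node *top;	/* NULL when the tree is empty */
};

/* In the spirit of RB_EMPTY_ROOT(): reads only the root pointer. */
static bool tree_empty(const struct root *root)
{
	assert(root != NULL);
	return root->top == NULL;
}

/*
 * In the spirit of rb_last(): follows right-child pointers to the
 * rightmost node. *visited reports how many nodes were dereferenced,
 * each of which could be concurrently rebalanced or freed if the
 * caller does not hold the tree lock.
 */
static struct node *tree_last(const struct root *root, size_t *visited)
{
	struct node *n = root->top;

	*visited = 0;
	if (!n)
		return NULL;

	*visited = 1;
	while (n->right) {
		n = n->right;
		(*visited)++;
	}
	return n;
}
```

Both helpers agree on whether the tree is empty, but only tree_last()
dereferences nodes inside the tree, which is why it is the riskier choice
for an unlocked check.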

* Re: [PATCH] memcg: put soft limit reclaim out of way if the excess tree is empty
  2016-08-01 13:57   ` Vladimir Davydov
@ 2016-08-01 14:12     ` Michal Hocko
  -1 siblings, 0 replies; 16+ messages in thread
From: Michal Hocko @ 2016-08-01 14:12 UTC (permalink / raw)
  To: Vladimir Davydov; +Cc: Andrew Morton, Johannes Weiner, linux-mm, LKML

On Mon 01-08-16 16:57:57, Vladimir Davydov wrote:
> On Mon, Aug 01, 2016 at 12:00:21PM +0200, Michal Hocko wrote:
> ...
> > diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> > index c265212bec8c..eb7e39c2d948 100644
> > --- a/mm/memcontrol.c
> > +++ b/mm/memcontrol.c
> > @@ -2543,6 +2543,11 @@ static int mem_cgroup_resize_memsw_limit(struct mem_cgroup *memcg,
> >  	return ret;
> >  }
> >  
> > +static inline bool soft_limit_tree_empty(struct mem_cgroup_tree_per_node *mctz)
> > +{
> > +	return rb_last(&mctz->rb_root) == NULL;
> > +}
> > +
> 
> I don't think traversing rb tree as rb_last() does w/o holding the lock
> is a good idea. Why is RB_EMPTY_ROOT() insufficient here?

Of course it is not. Dohh, forgot to refresh the patch! Sorry about
that.

Updated patch.
---
From 9076cc87cbc49d8c16cee4120c7f5e518511b953 Mon Sep 17 00:00:00 2001
From: Michal Hocko <mhocko@suse.com>
Date: Mon, 1 Aug 2016 10:42:06 +0200
Subject: [PATCH] memcg: put soft limit reclaim out of way if the excess tree
 is empty

We've had a report about soft lockups caused by lock bouncing in the
soft reclaim path:

[331404.849734] BUG: soft lockup - CPU#0 stuck for 22s! [kav4proxy-kavic:3128]
[331404.849920] RIP: 0010:[<ffffffff81469798>]  [<ffffffff81469798>] _raw_spin_lock+0x18/0x20
[331404.849997] Call Trace:
[331404.850010]  [<ffffffff811557ea>] mem_cgroup_soft_limit_reclaim+0x25a/0x280
[331404.850020]  [<ffffffff8111041d>] shrink_zones+0xed/0x200
[331404.850027]  [<ffffffff81111a94>] do_try_to_free_pages+0x74/0x320
[331404.850034]  [<ffffffff81112072>] try_to_free_pages+0x112/0x180
[331404.850042]  [<ffffffff81104a6f>] __alloc_pages_slowpath+0x3ff/0x820
[331404.850049]  [<ffffffff81105079>] __alloc_pages_nodemask+0x1e9/0x200
[331404.850056]  [<ffffffff81141e01>] alloc_pages_vma+0xe1/0x290
[331404.850064]  [<ffffffff8112402f>] do_wp_page+0x19f/0x840
[331404.850071]  [<ffffffff811257cd>] handle_pte_fault+0x1cd/0x230
[331404.850079]  [<ffffffff8146d3ed>] do_page_fault+0x1fd/0x4c0
[331404.850087]  [<ffffffff81469ec5>] page_fault+0x25/0x30

There are no memcgs created so there cannot be any in the soft limit
excess obviously:
[...]
memory  0       1       1

so all this seems to be mem_cgroup_largest_soft_limit_node trying to
take spin_lock_irq(&mctz->lock) just to find out that the soft limit
excess tree is empty. This is a pointless waste of cycles and causes
cache line bouncing during heavy parallel reclaim on large machines.
The particular machine wasn't very healthy and was most probably
suffering from a memory leak, which caused the memory reclaim to thrash
heavily. But bouncing on the lock certainly didn't help...

Introduce soft_limit_tree_empty, which does an optimistic lockless check
and bails out early if the tree is empty. This is theoretically racy, but
that shouldn't matter much: soft limit is a best-effort feature, it is
slowly being deprecated, and its usage should be really scarce. Bouncing
on a lock without a good reason is surely a much bigger problem,
especially on machines with many CPUs.

Signed-off-by: Michal Hocko <mhocko@suse.com>
---
 mm/memcontrol.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index c265212bec8c..c0b57b6a194e 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2543,6 +2543,11 @@ static int mem_cgroup_resize_memsw_limit(struct mem_cgroup *memcg,
 	return ret;
 }
 
+static inline bool soft_limit_tree_empty(struct mem_cgroup_tree_per_node *mctz)
+{
+	return RB_EMPTY_ROOT(&mctz->rb_root);
+}
+
 unsigned long mem_cgroup_soft_limit_reclaim(pg_data_t *pgdat, int order,
 					    gfp_t gfp_mask,
 					    unsigned long *total_scanned)
@@ -2559,6 +2564,9 @@ unsigned long mem_cgroup_soft_limit_reclaim(pg_data_t *pgdat, int order,
 		return 0;
 
 	mctz = soft_limit_tree_node(pgdat->node_id);
+	if (soft_limit_tree_empty(mctz))
+		return 0;
+
 	/*
 	 * This loop can run a while, specially if mem_cgroup's continuously
 	 * keep exceeding their soft limit and putting the system under
-- 
2.8.1

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH] memcg: put soft limit reclaim out of way if the excess tree is empty
  2016-08-01 14:12     ` Michal Hocko
@ 2016-08-01 14:26       ` Vladimir Davydov
  -1 siblings, 0 replies; 16+ messages in thread
From: Vladimir Davydov @ 2016-08-01 14:26 UTC (permalink / raw)
  To: Michal Hocko; +Cc: Andrew Morton, Johannes Weiner, linux-mm, LKML

On Mon, Aug 01, 2016 at 04:12:28PM +0200, Michal Hocko wrote:
...
> From: Michal Hocko <mhocko@suse.com>
> Date: Mon, 1 Aug 2016 10:42:06 +0200
> Subject: [PATCH] memcg: put soft limit reclaim out of way if the excess tree
>  is empty
> 
> We've had a report about soft lockups caused by lock bouncing in the
> soft reclaim path:
> 
> [331404.849734] BUG: soft lockup - CPU#0 stuck for 22s! [kav4proxy-kavic:3128]
> [331404.849920] RIP: 0010:[<ffffffff81469798>]  [<ffffffff81469798>] _raw_spin_lock+0x18/0x20
> [331404.849997] Call Trace:
> [331404.850010]  [<ffffffff811557ea>] mem_cgroup_soft_limit_reclaim+0x25a/0x280
> [331404.850020]  [<ffffffff8111041d>] shrink_zones+0xed/0x200
> [331404.850027]  [<ffffffff81111a94>] do_try_to_free_pages+0x74/0x320
> [331404.850034]  [<ffffffff81112072>] try_to_free_pages+0x112/0x180
> [331404.850042]  [<ffffffff81104a6f>] __alloc_pages_slowpath+0x3ff/0x820
> [331404.850049]  [<ffffffff81105079>] __alloc_pages_nodemask+0x1e9/0x200
> [331404.850056]  [<ffffffff81141e01>] alloc_pages_vma+0xe1/0x290
> [331404.850064]  [<ffffffff8112402f>] do_wp_page+0x19f/0x840
> [331404.850071]  [<ffffffff811257cd>] handle_pte_fault+0x1cd/0x230
> [331404.850079]  [<ffffffff8146d3ed>] do_page_fault+0x1fd/0x4c0
> [331404.850087]  [<ffffffff81469ec5>] page_fault+0x25/0x30
> 
> There are no memcgs created so there cannot be any in the soft limit
> excess obviously:
> [...]
> memory  0       1       1
> 
> so all this seems to be mem_cgroup_largest_soft_limit_node trying to
> take spin_lock_irq(&mctz->lock) just to find out that the soft limit
> excess tree is empty. This is a pointless waste of cycles and causes
> cache line bouncing during heavy parallel reclaim on large machines.
> The particular machine wasn't very healthy and was most probably
> suffering from a memory leak, which caused the memory reclaim to thrash
> heavily. But bouncing on the lock certainly didn't help...
> 
> Introduce soft_limit_tree_empty, which does an optimistic lockless check
> and bails out early if the tree is empty. This is theoretically racy, but
> that shouldn't matter much: soft limit is a best-effort feature, it is
> slowly being deprecated, and its usage should be really scarce. Bouncing
> on a lock without a good reason is surely a much bigger problem,
> especially on machines with many CPUs.
> 
> Signed-off-by: Michal Hocko <mhocko@suse.com>

Acked-by: Vladimir Davydov <vdavydov@virtuozzo.com>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] memcg: put soft limit reclaim out of way if the excess tree is empty
  2016-08-01 14:12     ` Michal Hocko
@ 2016-08-01 15:03       ` Johannes Weiner
  -1 siblings, 0 replies; 16+ messages in thread
From: Johannes Weiner @ 2016-08-01 15:03 UTC (permalink / raw)
  To: Michal Hocko; +Cc: Vladimir Davydov, Andrew Morton, linux-mm, LKML

On Mon, Aug 01, 2016 at 04:12:28PM +0200, Michal Hocko wrote:
> From: Michal Hocko <mhocko@suse.com>
> Date: Mon, 1 Aug 2016 10:42:06 +0200
> Subject: [PATCH] memcg: put soft limit reclaim out of way if the excess tree
>  is empty
> 
> We've had a report about soft lockups caused by lock bouncing in the
> soft reclaim path:
> 
> [331404.849734] BUG: soft lockup - CPU#0 stuck for 22s! [kav4proxy-kavic:3128]
> [331404.849920] RIP: 0010:[<ffffffff81469798>]  [<ffffffff81469798>] _raw_spin_lock+0x18/0x20
> [331404.849997] Call Trace:
> [331404.850010]  [<ffffffff811557ea>] mem_cgroup_soft_limit_reclaim+0x25a/0x280
> [331404.850020]  [<ffffffff8111041d>] shrink_zones+0xed/0x200
> [331404.850027]  [<ffffffff81111a94>] do_try_to_free_pages+0x74/0x320
> [331404.850034]  [<ffffffff81112072>] try_to_free_pages+0x112/0x180
> [331404.850042]  [<ffffffff81104a6f>] __alloc_pages_slowpath+0x3ff/0x820
> [331404.850049]  [<ffffffff81105079>] __alloc_pages_nodemask+0x1e9/0x200
> [331404.850056]  [<ffffffff81141e01>] alloc_pages_vma+0xe1/0x290
> [331404.850064]  [<ffffffff8112402f>] do_wp_page+0x19f/0x840
> [331404.850071]  [<ffffffff811257cd>] handle_pte_fault+0x1cd/0x230
> [331404.850079]  [<ffffffff8146d3ed>] do_page_fault+0x1fd/0x4c0
> [331404.850087]  [<ffffffff81469ec5>] page_fault+0x25/0x30
> 
> There are no memcgs created so there cannot be any in the soft limit
> excess obviously:
> [...]
> memory  0       1       1
> 
> so all this seems to be mem_cgroup_largest_soft_limit_node trying to
> take spin_lock_irq(&mctz->lock) just to find out that the soft limit
> excess tree is empty. This is a pointless waste of cycles and causes
> cache line bouncing during heavy parallel reclaim on large machines.
> The particular machine wasn't very healthy and was most probably
> suffering from a memory leak, which caused the memory reclaim to thrash
> heavily. But bouncing on the lock certainly didn't help...
> 
> Introduce soft_limit_tree_empty, which does an optimistic lockless check
> and bails out early if the tree is empty. This is theoretically racy, but
> that shouldn't matter much: soft limit is a best-effort feature, it is
> slowly being deprecated, and its usage should be really scarce. Bouncing
> on a lock without a good reason is surely a much bigger problem,
> especially on machines with many CPUs.
> 
> Signed-off-by: Michal Hocko <mhocko@suse.com>
> ---
>  mm/memcontrol.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
> 
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index c265212bec8c..c0b57b6a194e 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -2543,6 +2543,11 @@ static int mem_cgroup_resize_memsw_limit(struct mem_cgroup *memcg,
>  	return ret;
>  }
>  
> +static inline bool soft_limit_tree_empty(struct mem_cgroup_tree_per_node *mctz)
> +{
> +	return RB_EMPTY_ROOT(&mctz->rb_root);
> +}

Can you please fold this into the caller? It should be obvious enough.

Other than that, this patch makes sense to me.

Acked-by: Johannes Weiner <hannes@cmpxchg.org>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] memcg: put soft limit reclaim out of way if the excess tree is empty
  2016-08-01 15:03       ` Johannes Weiner
@ 2016-08-01 15:24         ` Michal Hocko
  -1 siblings, 0 replies; 16+ messages in thread
From: Michal Hocko @ 2016-08-01 15:24 UTC (permalink / raw)
  To: Johannes Weiner; +Cc: Vladimir Davydov, Andrew Morton, linux-mm, LKML

On Mon 01-08-16 11:03:43, Johannes Weiner wrote:
> On Mon, Aug 01, 2016 at 04:12:28PM +0200, Michal Hocko wrote:
[...]
> > diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> > index c265212bec8c..c0b57b6a194e 100644
> > --- a/mm/memcontrol.c
> > +++ b/mm/memcontrol.c
> > @@ -2543,6 +2543,11 @@ static int mem_cgroup_resize_memsw_limit(struct mem_cgroup *memcg,
> >  	return ret;
> >  }
> >  
> > +static inline bool soft_limit_tree_empty(struct mem_cgroup_tree_per_node *mctz)
> > +{
> > +	return RB_EMPTY_ROOT(&mctz->rb_root);
> > +}
> 
> Can you please fold this into the caller? It should be obvious enough.

OK, fair enough. There will probably be no other callers. I've added a
comment as well.

> Other than that, this patch makes sense to me.
> 
> Acked-by: Johannes Weiner <hannes@cmpxchg.org>

Thanks!

If the following sounds good I will resend v2.
---
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index c0b57b6a194e..e56d6a0f92ac 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2543,11 +2543,6 @@ static int mem_cgroup_resize_memsw_limit(struct mem_cgroup *memcg,
 	return ret;
 }
 
-static inline bool soft_limit_tree_empty(struct mem_cgroup_tree_per_node *mctz)
-{
-	return RB_EMPTY_ROOT(&mctz->rb_root);
-}
-
 unsigned long mem_cgroup_soft_limit_reclaim(pg_data_t *pgdat, int order,
 					    gfp_t gfp_mask,
 					    unsigned long *total_scanned)
@@ -2564,7 +2559,13 @@ unsigned long mem_cgroup_soft_limit_reclaim(pg_data_t *pgdat, int order,
 		return 0;
 
 	mctz = soft_limit_tree_node(pgdat->node_id);
-	if (soft_limit_tree_empty(mctz))
+
+	/*
+	 * Do not even bother to check the largest node if the node
+	 * is empty. Do it lockless to prevent lock bouncing. Races
+	 * are acceptable as soft limit is best effort anyway.
+	 */
+	if (RB_EMPTY_ROOT(&mctz->rb_root))
 		return 0;
 
 	/*

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH] memcg: put soft limit reclaim out of way if the excess tree is empty
  2016-08-01 15:24         ` Michal Hocko
@ 2016-08-01 17:17           ` Johannes Weiner
  -1 siblings, 0 replies; 16+ messages in thread
From: Johannes Weiner @ 2016-08-01 17:17 UTC (permalink / raw)
  To: Michal Hocko; +Cc: Vladimir Davydov, Andrew Morton, linux-mm, LKML

On Mon, Aug 01, 2016 at 05:24:54PM +0200, Michal Hocko wrote:
> @@ -2564,7 +2559,13 @@ unsigned long mem_cgroup_soft_limit_reclaim(pg_data_t *pgdat, int order,
>  		return 0;
>  
>  	mctz = soft_limit_tree_node(pgdat->node_id);
> -	if (soft_limit_tree_empty(mctz))
> +
> +	/*
> +	 * Do not even bother to check the largest node if the node

                                                               root

> +	 * is empty. Do it lockless to prevent lock bouncing. Races
> +	 * are acceptable as soft limit is best effort anyway.
> +	 */
> +	if (RB_EMPTY_ROOT(&mctz->rb_root))
>  		return 0;

Other than that, looks good. Please retain my

Acked-by: Johannes Weiner <hannes@cmpxchg.org>

in version 2.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] memcg: put soft limit reclaim out of way if the excess tree is empty
  2016-08-01 17:17           ` Johannes Weiner
@ 2016-08-01 17:39             ` Michal Hocko
  -1 siblings, 0 replies; 16+ messages in thread
From: Michal Hocko @ 2016-08-01 17:39 UTC (permalink / raw)
  To: Johannes Weiner; +Cc: Vladimir Davydov, Andrew Morton, linux-mm, LKML

On Mon 01-08-16 13:17:17, Johannes Weiner wrote:
> On Mon, Aug 01, 2016 at 05:24:54PM +0200, Michal Hocko wrote:
> > @@ -2564,7 +2559,13 @@ unsigned long mem_cgroup_soft_limit_reclaim(pg_data_t *pgdat, int order,
> >  		return 0;
> >  
> >  	mctz = soft_limit_tree_node(pgdat->node_id);
> > -	if (soft_limit_tree_empty(mctz))
> > +
> > +	/*
> > +	 * Do not even bother to check the largest node if the node
> 
>                                                                root

Fixed

> 
> > +	 * is empty. Do it lockless to prevent lock bouncing. Races
> > +	 * are acceptable as soft limit is best effort anyway.
> > +	 */
> > +	if (RB_EMPTY_ROOT(&mctz->rb_root))
> >  		return 0;
> 
> Other than that, looks good. Please retain my
> 
> Acked-by: Johannes Weiner <hannes@cmpxchg.org>

Thanks.

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2016-08-02  5:11 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
2016-08-01 10:00 [PATCH] memcg: put soft limit reclaim out of way if the excess tree is empty Michal Hocko
2016-08-01 10:00 ` Michal Hocko
2016-08-01 13:57 ` Vladimir Davydov
2016-08-01 13:57   ` Vladimir Davydov
2016-08-01 14:12   ` Michal Hocko
2016-08-01 14:12     ` Michal Hocko
2016-08-01 14:26     ` Vladimir Davydov
2016-08-01 14:26       ` Vladimir Davydov
2016-08-01 15:03     ` Johannes Weiner
2016-08-01 15:03       ` Johannes Weiner
2016-08-01 15:24       ` Michal Hocko
2016-08-01 15:24         ` Michal Hocko
2016-08-01 17:17         ` Johannes Weiner
2016-08-01 17:17           ` Johannes Weiner
2016-08-01 17:39           ` Michal Hocko
2016-08-01 17:39             ` Michal Hocko
