linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [patch] mm, memcg: periodically schedule when emptying page list
@ 2014-06-02 23:13 David Rientjes
  2014-06-02 23:44 ` Hugh Dickins
  2014-06-03  0:51 ` [patch v2] " David Rientjes
  0 siblings, 2 replies; 7+ messages in thread
From: David Rientjes @ 2014-06-02 23:13 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Johannes Weiner, Michal Hocko, linux-kernel, linux-mm, cgroups

mem_cgroup_force_empty_list() can iterate a large number of pages on an lru and 
mem_cgroup_move_parent() doesn't return an errno unless certain criteria, none 
of which indicate that the iteration may be taking too long, is met.

We have encountered the following stack trace many times indicating
"need_resched set for > 51000020 ns (51 ticks) without schedule", for example:

	scheduler_tick()
	<timer irq>
	mem_cgroup_move_account+0x4d/0x1d5
	mem_cgroup_move_parent+0x8d/0x109
	mem_cgroup_reparent_charges+0x149/0x2ba
	mem_cgroup_css_offline+0xeb/0x11b
	cgroup_offline_fn+0x68/0x16b
	process_one_work+0x129/0x350

If this iteration is taking too long, indicated by need_resched(), then 
periodically schedule and continue from where we last left off.

Signed-off-by: David Rientjes <rientjes@google.com>
---
 mm/memcontrol.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -4764,6 +4764,7 @@ static void mem_cgroup_force_empty_list(struct mem_cgroup *memcg,
 	do {
 		struct page_cgroup *pc;
 		struct page *page;
+		int ret;
 
 		spin_lock_irqsave(&zone->lru_lock, flags);
 		if (list_empty(list)) {
@@ -4781,8 +4782,13 @@ static void mem_cgroup_force_empty_list(struct mem_cgroup *memcg,
 
 		pc = lookup_page_cgroup(page);
 
-		if (mem_cgroup_move_parent(page, pc, memcg)) {
-			/* found lock contention or "pc" is obsolete. */
+		ret = mem_cgroup_move_parent(page, pc, memcg);
+		if (ret || need_resched()) {
+			/*
+			 * Couldn't grab the page reference, isolate the page,
+			 * there was a pc mismatch, or we simply need to
+			 * schedule because this is taking too long.
+			 */
 			busy = page;
 			cond_resched();
 		} else

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [patch] mm, memcg: periodically schedule when emptying page list
  2014-06-02 23:13 [patch] mm, memcg: periodically schedule when emptying page list David Rientjes
@ 2014-06-02 23:44 ` Hugh Dickins
  2014-06-02 23:48   ` David Rientjes
  2014-06-03  0:51 ` [patch v2] " David Rientjes
  1 sibling, 1 reply; 7+ messages in thread
From: Hugh Dickins @ 2014-06-02 23:44 UTC (permalink / raw)
  To: David Rientjes
  Cc: Andrew Morton, Johannes Weiner, Michal Hocko, linux-kernel,
	linux-mm, cgroups

On Mon, 2 Jun 2014, David Rientjes wrote:

> mem_cgroup_force_empty_list() can iterate a large number of pages on an lru and 
> mem_cgroup_move_parent() doesn't return an errno unless certain criteria, none 
> of which indicate that the iteration may be taking too long, is met.
> 
> We have encountered the following stack trace many times indicating
> "need_resched set for > 51000020 ns (51 ticks) without schedule", for example:
> 
> 	scheduler_tick()
> 	<timer irq>
> 	mem_cgroup_move_account+0x4d/0x1d5
> 	mem_cgroup_move_parent+0x8d/0x109
> 	mem_cgroup_reparent_charges+0x149/0x2ba
> 	mem_cgroup_css_offline+0xeb/0x11b
> 	cgroup_offline_fn+0x68/0x16b
> 	process_one_work+0x129/0x350
> 
> If this iteration is taking too long, indicated by need_resched(), then 
> periodically schedule and continue from where we last left off.
> 
> Signed-off-by: David Rientjes <rientjes@google.com>
> ---
>  mm/memcontrol.c | 10 ++++++++--
>  1 file changed, 8 insertions(+), 2 deletions(-)
> 
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -4764,6 +4764,7 @@ static void mem_cgroup_force_empty_list(struct mem_cgroup *memcg,
>  	do {
>  		struct page_cgroup *pc;
>  		struct page *page;
> +		int ret;
>  
>  		spin_lock_irqsave(&zone->lru_lock, flags);
>  		if (list_empty(list)) {
> @@ -4781,8 +4782,13 @@ static void mem_cgroup_force_empty_list(struct mem_cgroup *memcg,
>  
>  		pc = lookup_page_cgroup(page);
>  
> -		if (mem_cgroup_move_parent(page, pc, memcg)) {
> -			/* found lock contention or "pc" is obsolete. */
> +		ret = mem_cgroup_move_parent(page, pc, memcg);
> +		if (ret || need_resched()) {
> +			/*
> +			 * Couldn't grab the page reference, isolate the page,
> +			 * there was a pc mismatch, or we simply need to
> +			 * schedule because this is taking too long.
> +			 */
>  			busy = page;
>  			cond_resched();
>  		} else

Why not just move that cond_resched() down below the if/else?
No need to test need_resched() separately, and this page is not busy.

Hugh

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [patch] mm, memcg: periodically schedule when emptying page list
  2014-06-02 23:44 ` Hugh Dickins
@ 2014-06-02 23:48   ` David Rientjes
  2014-06-03  0:01     ` Hugh Dickins
  0 siblings, 1 reply; 7+ messages in thread
From: David Rientjes @ 2014-06-02 23:48 UTC (permalink / raw)
  To: Hugh Dickins
  Cc: Andrew Morton, Johannes Weiner, Michal Hocko, linux-kernel,
	linux-mm, cgroups

On Mon, 2 Jun 2014, Hugh Dickins wrote:

> Why not just move that cond_resched() down below the if/else?
> No need to test need_resched() separately, and this page is not busy.
> 

Would you like to propose your version from our kernel instead?

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [patch] mm, memcg: periodically schedule when emptying page list
  2014-06-02 23:48   ` David Rientjes
@ 2014-06-03  0:01     ` Hugh Dickins
  0 siblings, 0 replies; 7+ messages in thread
From: Hugh Dickins @ 2014-06-03  0:01 UTC (permalink / raw)
  To: David Rientjes
  Cc: Hugh Dickins, Andrew Morton, Johannes Weiner, Michal Hocko,
	linux-kernel, linux-mm, cgroups

On Mon, 2 Jun 2014, David Rientjes wrote:
> On Mon, 2 Jun 2014, Hugh Dickins wrote:
> 
> > Why not just move that cond_resched() down below the if/else?
> > No need to test need_resched() separately, and this page is not busy.
> > 
> 
> Would you like to propose your version from our kernel instead?

If that's how you prefer to work it, sure.  The patch would indeed
look uncannily like what I put in our kernel a week ago; but I think
the description might owe a lot to you!  I'll get to it later, while
secretly hoping you get to it sooner :)

Hugh

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [patch v2] mm, memcg: periodically schedule when emptying page list
  2014-06-02 23:13 [patch] mm, memcg: periodically schedule when emptying page list David Rientjes
  2014-06-02 23:44 ` Hugh Dickins
@ 2014-06-03  0:51 ` David Rientjes
  2014-06-03  1:15   ` Johannes Weiner
  2014-06-03  6:27   ` Michal Hocko
  1 sibling, 2 replies; 7+ messages in thread
From: David Rientjes @ 2014-06-03  0:51 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Hugh Dickins, Johannes Weiner, Michal Hocko, linux-kernel,
	linux-mm, cgroups

From: Hugh Dickins <hughd@google.com>

mem_cgroup_force_empty_list() can iterate a large number of pages on an lru and 
mem_cgroup_move_parent() doesn't return an errno unless certain criteria, none 
of which indicate that the iteration may be taking too long, is met.

We have encountered the following stack trace many times indicating
"need_resched set for > 51000020 ns (51 ticks) without schedule", for example:

	scheduler_tick()
	<timer irq>
	mem_cgroup_move_account+0x4d/0x1d5
	mem_cgroup_move_parent+0x8d/0x109
	mem_cgroup_reparent_charges+0x149/0x2ba
	mem_cgroup_css_offline+0xeb/0x11b
	cgroup_offline_fn+0x68/0x16b
	process_one_work+0x129/0x350

If this iteration is taking too long, we still need to do cond_resched() even 
when an individual page is not busy.

[rientjes@google.com: changelog]
Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: David Rientjes <rientjes@google.com>
---
 v2: always reschedule if needed, "page" itself may not have a pc mismatch
     or been unable to isolate.

 mm/memcontrol.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -4784,9 +4784,9 @@ static void mem_cgroup_force_empty_list(struct mem_cgroup *memcg,
 		if (mem_cgroup_move_parent(page, pc, memcg)) {
 			/* found lock contention or "pc" is obsolete. */
 			busy = page;
-			cond_resched();
 		} else
 			busy = NULL;
+		cond_resched();
 	} while (!list_empty(list));
 }
 

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [patch v2] mm, memcg: periodically schedule when emptying page list
  2014-06-03  0:51 ` [patch v2] " David Rientjes
@ 2014-06-03  1:15   ` Johannes Weiner
  2014-06-03  6:27   ` Michal Hocko
  1 sibling, 0 replies; 7+ messages in thread
From: Johannes Weiner @ 2014-06-03  1:15 UTC (permalink / raw)
  To: David Rientjes
  Cc: Andrew Morton, Hugh Dickins, Michal Hocko, linux-kernel,
	linux-mm, cgroups

On Mon, Jun 02, 2014 at 05:51:25PM -0700, David Rientjes wrote:
> From: Hugh Dickins <hughd@google.com>
> 
> mem_cgroup_force_empty_list() can iterate a large number of pages on an lru and 
> mem_cgroup_move_parent() doesn't return an errno unless certain criteria, none 
> of which indicate that the iteration may be taking too long, is met.
> 
> We have encountered the following stack trace many times indicating
> "need_resched set for > 51000020 ns (51 ticks) without schedule", for example:
> 
> 	scheduler_tick()
> 	<timer irq>
> 	mem_cgroup_move_account+0x4d/0x1d5
> 	mem_cgroup_move_parent+0x8d/0x109
> 	mem_cgroup_reparent_charges+0x149/0x2ba
> 	mem_cgroup_css_offline+0xeb/0x11b
> 	cgroup_offline_fn+0x68/0x16b
> 	process_one_work+0x129/0x350
> 
> If this iteration is taking too long, we still need to do cond_resched() even 
> when an individual page is not busy.
> 
> [rientjes@google.com: changelog]
> Signed-off-by: Hugh Dickins <hughd@google.com>
> Signed-off-by: David Rientjes <rientjes@google.com>

Acked-by: Johannes Weiner <hannes@cmpxchg.org>

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [patch v2] mm, memcg: periodically schedule when emptying page list
  2014-06-03  0:51 ` [patch v2] " David Rientjes
  2014-06-03  1:15   ` Johannes Weiner
@ 2014-06-03  6:27   ` Michal Hocko
  1 sibling, 0 replies; 7+ messages in thread
From: Michal Hocko @ 2014-06-03  6:27 UTC (permalink / raw)
  To: David Rientjes
  Cc: Andrew Morton, Hugh Dickins, Johannes Weiner, linux-kernel,
	linux-mm, cgroups

On Mon 02-06-14 17:51:25, David Rientjes wrote:
> From: Hugh Dickins <hughd@google.com>
> 
> mem_cgroup_force_empty_list() can iterate a large number of pages on an lru and 
> mem_cgroup_move_parent() doesn't return an errno unless certain criteria, none 
> of which indicate that the iteration may be taking too long, is met.
> 
> We have encountered the following stack trace many times indicating
> "need_resched set for > 51000020 ns (51 ticks) without schedule", for example:
> 
> 	scheduler_tick()
> 	<timer irq>
> 	mem_cgroup_move_account+0x4d/0x1d5
> 	mem_cgroup_move_parent+0x8d/0x109
> 	mem_cgroup_reparent_charges+0x149/0x2ba
> 	mem_cgroup_css_offline+0xeb/0x11b
> 	cgroup_offline_fn+0x68/0x16b
> 	process_one_work+0x129/0x350
> 
> If this iteration is taking too long, we still need to do cond_resched() even 
> when an individual page is not busy.
> 
> [rientjes@google.com: changelog]
> Signed-off-by: Hugh Dickins <hughd@google.com>
> Signed-off-by: David Rientjes <rientjes@google.com>

Acked-by: Michal Hocko <mhocko@suse.cz>

Thanks!

> ---
>  v2: always reschedule if needed, "page" itself may not have a pc mismatch
>      or been unable to isolate.
> 
>  mm/memcontrol.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -4784,9 +4784,9 @@ static void mem_cgroup_force_empty_list(struct mem_cgroup *memcg,
>  		if (mem_cgroup_move_parent(page, pc, memcg)) {
>  			/* found lock contention or "pc" is obsolete. */
>  			busy = page;
> -			cond_resched();
>  		} else
>  			busy = NULL;
> +		cond_resched();
>  	} while (!list_empty(list));
>  }
>  

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2014-06-03  6:27 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-06-02 23:13 [patch] mm, memcg: periodically schedule when emptying page list David Rientjes
2014-06-02 23:44 ` Hugh Dickins
2014-06-02 23:48   ` David Rientjes
2014-06-03  0:01     ` Hugh Dickins
2014-06-03  0:51 ` [patch v2] " David Rientjes
2014-06-03  1:15   ` Johannes Weiner
2014-06-03  6:27   ` Michal Hocko

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).