linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] memcg: reset to root_mem_cgroup at bypassing
@ 2011-12-19  7:51 KAMEZAWA Hiroyuki
  2011-12-19 20:49 ` Hugh Dickins
  2011-12-21  8:24 ` [PATCH v2] memcg: return -EINTR at bypassing try_charge() KAMEZAWA Hiroyuki
  0 siblings, 2 replies; 11+ messages in thread
From: KAMEZAWA Hiroyuki @ 2011-12-19  7:51 UTC (permalink / raw)
  To: akpm; +Cc: linux-mm, cgroups, linux-kernel, hannes, Michal Hocko, Hugh Dickins

>From d620ff605a3a592c2b1de3a046498ce5cd3d3c50 Mon Sep 17 00:00:00 2001
From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Date: Mon, 19 Dec 2011 16:55:10 +0900
Subject: [PATCH 2/2] memcg: reset lru to root_mem_cgroup in special cases.

This patch is a fix for memcg-simplify-lru-handling-by-new-rule.patch

After the patch, all pages which will be onto LRU must have sane
pc->mem_cgroup. But, in special case, it's not set.

If task->mm is NULL or task is TIF_MEMDIE or fatal_signal_pending(),
try_charge() is bypassed and the new charge will not be charged. And
pc->mem_cgroup is unset even if the page will be used/mapped and added
to LRU. To avoid this,  this patch charges such pages to root_mem_cgroup,
then, pc->mem_cgroup will be handled correctly.

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
---
 mm/memcontrol.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 0d6d21c..9268e8e 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2324,7 +2324,7 @@ nomem:
 	*ptr = NULL;
 	return -ENOMEM;
 bypass:
-	*ptr = NULL;
+	*ptr = root_mem_cgroup;
 	return 0;
 }
 
-- 
1.7.4.1



^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH] memcg: reset to root_mem_cgroup at bypassing
  2011-12-19  7:51 [PATCH] memcg: reset to root_mem_cgroup at bypassing KAMEZAWA Hiroyuki
@ 2011-12-19 20:49 ` Hugh Dickins
  2011-12-20  0:24   ` Hiroyuki Kamezawa
  2011-12-21  8:24 ` [PATCH v2] memcg: return -EINTR at bypassing try_charge() KAMEZAWA Hiroyuki
  1 sibling, 1 reply; 11+ messages in thread
From: Hugh Dickins @ 2011-12-19 20:49 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: akpm, linux-mm, cgroups, linux-kernel, hannes, Michal Hocko

On Mon, 19 Dec 2011, KAMEZAWA Hiroyuki wrote:
> From d620ff605a3a592c2b1de3a046498ce5cd3d3c50 Mon Sep 17 00:00:00 2001
> From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> Date: Mon, 19 Dec 2011 16:55:10 +0900
> Subject: [PATCH 2/2] memcg: reset lru to root_mem_cgroup in special cases.
> 
> This patch is a fix for memcg-simplify-lru-handling-by-new-rule.patch
> 
> After the patch, all pages which will be onto LRU must have sane
> pc->mem_cgroup. But, in special case, it's not set.
> 
> If task->mm is NULL or task is TIF_MEMDIE or fatal_signal_pending(),
> try_charge() is bypassed and the new charge will not be charged. And
> pc->mem_cgroup is unset even if the page will be used/mapped and added
> to LRU. To avoid this,  this patch charges such pages to root_mem_cgroup,
> then, pc->mem_cgroup will be handled correctly.
> 
> Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> ---
>  mm/memcontrol.c |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
> 
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 0d6d21c..9268e8e 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -2324,7 +2324,7 @@ nomem:
>  	*ptr = NULL;
>  	return -ENOMEM;
>  bypass:
> -	*ptr = NULL;
> +	*ptr = root_mem_cgroup;
>  	return 0;
>  }
>  
> -- 

I'm dubious about this patch: certainly you have not fully justified it.

I speak from experience: I did *exactly* the same at "bypass" when
I introduced our mem_cgroup_reset_page(), which corresponds to your
mem_cgroup_reset_owner(); it seemed right to me that a successful
(return 0) call to try_charge() should provide a good *ptr.

But others (Ying and Greg) pointed out that it changes the semantics
of __mem_cgroup_try_charge() in this case, so you need to justify the
change to all those places which do something like "if (ret || !memcg)"
after calling it.  Perhaps it is a good change everywhere, but that's
not obvious, so we chose caution.

Doesn't it lead to bypass pages being marked as charged to root, so
they don't get charged to the right owner next time they're touched?

In our internal kernel, I restored "bypass" to set *ptr = NULL as
before, but routed those callers that need it to continue on to
__mem_cgroup_commit_charge() when it's NULL, and let that do a
quick little mem_cgroup_reset_page() to root_mem_cgroup for this.

But I was growing tired of mem_cgroup_reset_page() when I prepared
the rollup I posted two weeks ago, it adds overhead where we don't
want it, so I found a way to avoid it completely.

What you're doing with mem_cgroup_reset_owner() seems reasonable to
me as a phase to go through (though there's probably more callsites
to be found - sorry to be unhelpfully mysterious about that, but
just because per-memcg-lru-locking needed them doesn't imply that
your patchset needs them), but I expect to (offer a patch to) remove
it later.

I am intending to rebase upon your patches, or at least the ones
which akpm has already taken in (I've not studied the pcg flag ones,
more noise than I want at the moment).  I'm waiting for those to
appear in a linux-next, disappointed that they weren't in today's.

(But I'm afraid my patches will then clash with Mel's new lru work.)

I have been running successfully on several machines with an
approximation to what I expect linux-next to be when it has your
patches in.  Ran very stably on two, but one hangs in reclaim after
a few hours, that's high on my list to investigate (you made no
change to vmscan.c, maybe the problem comes from Hannes's earlier
patches, but I hadn't noticed it with those alone).

Hugh

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] memcg: reset to root_mem_cgroup at bypassing
  2011-12-19 20:49 ` Hugh Dickins
@ 2011-12-20  0:24   ` Hiroyuki Kamezawa
  2011-12-21  0:13     ` KAMEZAWA Hiroyuki
  0 siblings, 1 reply; 11+ messages in thread
From: Hiroyuki Kamezawa @ 2011-12-20  0:24 UTC (permalink / raw)
  To: Hugh Dickins
  Cc: KAMEZAWA Hiroyuki, akpm, linux-mm, cgroups, linux-kernel, hannes,
	Michal Hocko

2011/12/20 Hugh Dickins <hughd@google.com>:
> On Mon, 19 Dec 2011, KAMEZAWA Hiroyuki wrote:
>> From d620ff605a3a592c2b1de3a046498ce5cd3d3c50 Mon Sep 17 00:00:00 2001
>> From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
>> Date: Mon, 19 Dec 2011 16:55:10 +0900
>> Subject: [PATCH 2/2] memcg: reset lru to root_mem_cgroup in special cases.
>>
>> This patch is a fix for memcg-simplify-lru-handling-by-new-rule.patch
>>
>> After the patch, all pages which will be onto LRU must have sane
>> pc->mem_cgroup. But, in special case, it's not set.
>>
>> If task->mm is NULL or task is TIF_MEMDIE or fatal_signal_pending(),
>> try_charge() is bypassed and the new charge will not be charged. And
>> pc->mem_cgroup is unset even if the page will be used/mapped and added
>> to LRU. To avoid this,  this patch charges such pages to root_mem_cgroup,
>> then, pc->mem_cgroup will be handled correctly.
>>
>> Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
>> ---
>>  mm/memcontrol.c |    2 +-
>>  1 files changed, 1 insertions(+), 1 deletions(-)
>>
>> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
>> index 0d6d21c..9268e8e 100644
>> --- a/mm/memcontrol.c
>> +++ b/mm/memcontrol.c
>> @@ -2324,7 +2324,7 @@ nomem:
>>       *ptr = NULL;
>>       return -ENOMEM;
>>  bypass:
>> -     *ptr = NULL;
>> +     *ptr = root_mem_cgroup;
>>       return 0;
>>  }
>>
>> --
>
Thank you for review.

> I'm dubious about this patch: certainly you have not fully justified it.
>
I sometimes see panics (in !pc->mem_cgroup check in lru code)
when I stops test programs by Ctrl-C or some. That was because
of this path. I checked this by adding a debug code to make
pc->mem_cgroup = NULL in prep_new_page.

> I speak from experience: I did *exactly* the same at "bypass" when
> I introduced our mem_cgroup_reset_page(), which corresponds to your
> mem_cgroup_reset_owner(); it seemed right to me that a successful
> (return 0) call to try_charge() should provide a good *ptr.
>
ok.

> But others (Ying and Greg) pointed out that it changes the semantics
> of __mem_cgroup_try_charge() in this case, so you need to justify the
> change to all those places which do something like "if (ret || !memcg)"
> after calling it.  Perhaps it is a good change everywhere, but that's
> not obvious, so we chose caution.
>

> Doesn't it lead to bypass pages being marked as charged to root, so
> they don't get charged to the right owner next time they're touched?
>
Yes. You're right.
Hm. So, it seems I should add reset_owner() to the !memcg path
rather than here.

> In our internal kernel, I restored "bypass" to set *ptr = NULL as
> before, but routed those callers that need it to continue on to
> __mem_cgroup_commit_charge() when it's NULL, and let that do a
> quick little mem_cgroup_reset_page() to root_mem_cgroup for this.
>
Yes, I'll prepare v2.

> But I was growing tired of mem_cgroup_reset_page() when I prepared
> the rollup I posted two weeks ago, it adds overhead where we don't
> want it, so I found a way to avoid it completely.
>
Hmm.

> What you're doing with mem_cgroup_reset_owner() seems reasonable to
> me as a phase to go through (though there's probably more callsites
> to be found - sorry to be unhelpfully mysterious about that, but
> just because per-memcg-lru-locking needed them doesn't imply that
> your patchset needs them), but I expect to (offer a patch to) remove
> it later.
>
Sure. I'm now considering, finally, after removing pc->flags,
we'll have chance to merge page_cgroup to struct page. If so,
reseting pc->mem_cgroup in prep_new_page() will be a choice.

> I am intending to rebase upon your patches, or at least the ones
> which akpm has already taken in (I've not studied the pcg flag ones,
> more noise than I want at the moment).  I'm waiting for those to
> appear in a linux-next, disappointed that they weren't in today's.
>
> (But I'm afraid my patches will then clash with Mel's new lru work.)
>
> I have been running successfully on several machines with an
> approximation to what I expect linux-next to be when it has your
> patches in.  Ran very stably on two, but one hangs in reclaim after
> a few hours, that's high on my list to investigate (you made no
> change to vmscan.c, maybe the problem comes from Hannes's earlier
> patches, but I hadn't noticed it with those alone).
>

I saw file caches are not reclaimed at all by force_empty...only once.
I'm now digging it.

Thank you.
-Kame

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] memcg: reset to root_mem_cgroup at bypassing
  2011-12-20  0:24   ` Hiroyuki Kamezawa
@ 2011-12-21  0:13     ` KAMEZAWA Hiroyuki
  2011-12-21  3:25       ` Hugh Dickins
  0 siblings, 1 reply; 11+ messages in thread
From: KAMEZAWA Hiroyuki @ 2011-12-21  0:13 UTC (permalink / raw)
  To: Hiroyuki Kamezawa
  Cc: Hugh Dickins, akpm, linux-mm, cgroups, linux-kernel, hannes,
	Michal Hocko

On Tue, 20 Dec 2011 09:24:47 +0900
Hiroyuki Kamezawa <kamezawa.hiroyuki@gmail.com> wrote:

> 2011/12/20 Hugh Dickins <hughd@google.com>:
> > On Mon, 19 Dec 2011, KAMEZAWA Hiroyuki wrote:
> >> From d620ff605a3a592c2b1de3a046498ce5cd3d3c50 Mon Sep 17 00:00:00 2001
> >> From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> >> Date: Mon, 19 Dec 2011 16:55:10 +0900
> >> Subject: [PATCH 2/2] memcg: reset lru to root_mem_cgroup in special cases.
> >>
> >> This patch is a fix for memcg-simplify-lru-handling-by-new-rule.patch
> >>
> >> After the patch, all pages which will be onto LRU must have sane
> >> pc->mem_cgroup. But, in special case, it's not set.
> >>
> >> If task->mm is NULL or task is TIF_MEMDIE or fatal_signal_pending(),
> >> try_charge() is bypassed and the new charge will not be charged. And
> >> pc->mem_cgroup is unset even if the page will be used/mapped and added
> >> to LRU. To avoid this,  this patch charges such pages to root_mem_cgroup,
> >> then, pc->mem_cgroup will be handled correctly.
> >>
> >> Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> >> ---
> >>  mm/memcontrol.c |    2 +-
> >>  1 files changed, 1 insertions(+), 1 deletions(-)
> >>
> >> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> >> index 0d6d21c..9268e8e 100644
> >> --- a/mm/memcontrol.c
> >> +++ b/mm/memcontrol.c
> >> @@ -2324,7 +2324,7 @@ nomem:
> >>       *ptr = NULL;
> >>       return -ENOMEM;
> >>  bypass:
> >> -     *ptr = NULL;
> >> +     *ptr = root_mem_cgroup;
> >>       return 0;
> >>  }
> >>
> >> --
> >
> Thank you for review.
> 
> > I'm dubious about this patch: certainly you have not fully justified it.
> >
> I sometimes see panics (in !pc->mem_cgroup check in lru code)
> when I stops test programs by Ctrl-C or some. That was because
> of this path. I checked this by adding a debug code to make
> pc->mem_cgroup = NULL in prep_new_page.
> 
> > I speak from experience: I did *exactly* the same at "bypass" when
> > I introduced our mem_cgroup_reset_page(), which corresponds to your
> > mem_cgroup_reset_owner(); it seemed right to me that a successful
> > (return 0) call to try_charge() should provide a good *ptr.
> >
> ok.
> 
> > But others (Ying and Greg) pointed out that it changes the semantics
> > of __mem_cgroup_try_charge() in this case, so you need to justify the
> > change to all those places which do something like "if (ret || !memcg)"
> > after calling it.  Perhaps it is a good change everywhere, but that's
> > not obvious, so we chose caution.
> >
> 
> > Doesn't it lead to bypass pages being marked as charged to root, so
> > they don't get charged to the right owner next time they're touched?
> >
> Yes. You're right.
> Hm. So, it seems I should add reset_owner() to the !memcg path
> rather than here.
> 
Considering this again..

Now, we catch 'charge' event only once in lifetime of anon/file page.
So, it doesn't depend on that it's marked as PCG_USED or not.




> > In our internal kernel, I restored "bypass" to set *ptr = NULL as
> > before, but routed those callers that need it to continue on to
> > __mem_cgroup_commit_charge() when it's NULL, and let that do a
> > quick little mem_cgroup_reset_page() to root_mem_cgroup for this.
> >
> Yes, I'll prepare v2.
> 

But ok, I'll go this way with some more description.

Thanks,
-Kame


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] memcg: reset to root_mem_cgroup at bypassing
  2011-12-21  0:13     ` KAMEZAWA Hiroyuki
@ 2011-12-21  3:25       ` Hugh Dickins
  2011-12-21  4:00         ` KAMEZAWA Hiroyuki
  0 siblings, 1 reply; 11+ messages in thread
From: Hugh Dickins @ 2011-12-21  3:25 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: Hiroyuki Kamezawa, akpm, linux-mm, cgroups, linux-kernel, hannes,
	Michal Hocko

[-- Attachment #1: Type: TEXT/PLAIN, Size: 1655 bytes --]

On Wed, 21 Dec 2011, KAMEZAWA Hiroyuki wrote:
> On Tue, 20 Dec 2011 09:24:47 +0900
> Hiroyuki Kamezawa <kamezawa.hiroyuki@gmail.com> wrote:
> > 2011/12/20 Hugh Dickins <hughd@google.com>:
> > 
> > > I speak from experience: I did *exactly* the same at "bypass" when
> > > I introduced our mem_cgroup_reset_page(), which corresponds to your
> > > mem_cgroup_reset_owner(); it seemed right to me that a successful
> > > (return 0) call to try_charge() should provide a good *ptr.
> > >
> > ok.
> > 
> > > But others (Ying and Greg) pointed out that it changes the semantics
> > > of __mem_cgroup_try_charge() in this case, so you need to justify the
> > > change to all those places which do something like "if (ret || !memcg)"
> > > after calling it.  Perhaps it is a good change everywhere, but that's
> > > not obvious, so we chose caution.
> > >
> > > Doesn't it lead to bypass pages being marked as charged to root, so
> > > they don't get charged to the right owner next time they're touched?
> > >
> > Yes. You're right.
> > Hm. So, it seems I should add reset_owner() to the !memcg path
> > rather than here.
> > 
> Considering this again..
> 
> Now, we catch 'charge' event only once in lifetime of anon/file page.
> So, it doesn't depend on that it's marked as PCG_USED or not.

That's an interesting argument, I hadn't been looking at it that way.
It's not true of swapcache, but I guess we don't need to preserve its
peculiarities in this case.

I've not checked the (ret || !memcg) cases yet to see if any change
needed there.

I certainly like that the success return guarantees that memcg is set.

Hugh

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] memcg: reset to root_mem_cgroup at bypassing
  2011-12-21  3:25       ` Hugh Dickins
@ 2011-12-21  4:00         ` KAMEZAWA Hiroyuki
  0 siblings, 0 replies; 11+ messages in thread
From: KAMEZAWA Hiroyuki @ 2011-12-21  4:00 UTC (permalink / raw)
  To: Hugh Dickins
  Cc: Hiroyuki Kamezawa, akpm, linux-mm, cgroups, linux-kernel, hannes,
	Michal Hocko

On Tue, 20 Dec 2011 19:25:04 -0800 (PST)
Hugh Dickins <hughd@google.com> wrote:

> On Wed, 21 Dec 2011, KAMEZAWA Hiroyuki wrote:
> > On Tue, 20 Dec 2011 09:24:47 +0900
> > Hiroyuki Kamezawa <kamezawa.hiroyuki@gmail.com> wrote:
> > > 2011/12/20 Hugh Dickins <hughd@google.com>:
> > > 
> > > > I speak from experience: I did *exactly* the same at "bypass" when
> > > > I introduced our mem_cgroup_reset_page(), which corresponds to your
> > > > mem_cgroup_reset_owner(); it seemed right to me that a successful
> > > > (return 0) call to try_charge() should provide a good *ptr.
> > > >
> > > ok.
> > > 
> > > > But others (Ying and Greg) pointed out that it changes the semantics
> > > > of __mem_cgroup_try_charge() in this case, so you need to justify the
> > > > change to all those places which do something like "if (ret || !memcg)"
> > > > after calling it.  Perhaps it is a good change everywhere, but that's
> > > > not obvious, so we chose caution.
> > > >
> > > > Doesn't it lead to bypass pages being marked as charged to root, so
> > > > they don't get charged to the right owner next time they're touched?
> > > >
> > > Yes. You're right.
> > > Hm. So, it seems I should add reset_owner() to the !memcg path
> > > rather than here.
> > > 
> > Considering this again..
> > 
> > Now, we catch 'charge' event only once in lifetime of anon/file page.
> > So, it doesn't depend on that it's marked as PCG_USED or not.
> 
> That's an interesting argument, I hadn't been looking at it that way.
> It's not true of swapcache, but I guess we don't need to preserve its
> peculiarities in this case.
> 
> I've not checked the (ret || !memcg) cases yet to see if any change
> needed there.
> 
> I certainly like that the success return guarantees that memcg is set.
> 
Hmm. Ok, then....I'll update patch description to be precise.
And check I can remove !memcg case in the same patch. Then, repost v2.

Thanks,
-Kame


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH v2] memcg: return -EINTR at bypassing try_charge().
  2011-12-19  7:51 [PATCH] memcg: reset to root_mem_cgroup at bypassing KAMEZAWA Hiroyuki
  2011-12-19 20:49 ` Hugh Dickins
@ 2011-12-21  8:24 ` KAMEZAWA Hiroyuki
  2011-12-21  9:57   ` Johannes Weiner
  2011-12-21 10:29   ` [PATCH v3] " KAMEZAWA Hiroyuki
  1 sibling, 2 replies; 11+ messages in thread
From: KAMEZAWA Hiroyuki @ 2011-12-21  8:24 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: akpm, linux-mm, cgroups, linux-kernel, hannes, Michal Hocko,
	Hugh Dickins

How about this ?
--
>From 6076425613f594d442c58a5d463c09f8309236aa Mon Sep 17 00:00:00 2001
From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Date: Wed, 21 Dec 2011 16:27:25 +0900
Subject: [PATCH] memcg: return -EINTR at bypassing try_charge().

This patch is a fix for memcg-simplify-lru-handling-by-new-rule.patch
When running testprogram and stop it by Ctrl-C, add_lru/del_lru
will find pc->mem_cgroup is NULL and get panic. The reason
is bypass code in try_charge().

At try_charge(), it checks the thread is fatal or not as..
fatal_signal_pending() or TIF_MEMDIE. In this case, __try_charge()
returns 0(success) with setting *ptr as NULL.

Now, lruvec are deteremined by pc->mem_cgroup. So, it's better
to reset pc->mem_cgroup as root_mem_cgroup. This patch does
following change in try_charge()
  1. return -EINTR at bypassing.
  2. set *ptr = root_mem_cgroup at bypassing.

By this change, in page fault / radix-tree-insert path,
the page will be charged against root_mem_cgroup and the thread's
operations will go ahead without trouble. In other path,
migration or move_account etc..., -EINTR will stop the operation.
(may need some cleanup later..)

After this change, pc->mem_cgroup will have valid pointer if
the page is used.

Changelog: v1 -> v2
 - returns -EINTR at bypassing.
 - change error code handling at callers.
 - changed the name of patch.

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
---
 mm/memcontrol.c |   53 +++++++++++++++++++++++++++++++++++++++++------------
 1 files changed, 41 insertions(+), 12 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 9175097..3c6eb7e 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2185,6 +2185,23 @@ static int mem_cgroup_do_charge(struct mem_cgroup *memcg, gfp_t gfp_mask,
 }
 
 /*
+ * __mem_cgroup_try_charge() does
+ * 1. detect memcg to be charged against from passed *mm and *ptr,
+ * 2. update res_counter
+ * 3. call memory reclaim if necessary.
+ *
+ * In some special case, if the task is fatal, fatal_signal_pending() or
+ * TIF_MEMDIE or ->mm is NULL, this functoion returns -EINTR with filling
+ * *ptr as root_mem_cgroup. There are 2 reasons for this. 1st is that
+ * fatal threads should quit as soon as possible without any hazards.
+ * 2nd is that all page should have valid pc->mem_cgroup if it will be
+ * used.
+ *
+ * So, try_charge will return
+ *  0       ...  at success. filling *ptr with a valid memcg pointer.
+ *  -ENOMEM ...  charge failure because of resource limits.
+ *  -EINTR  ...  if thread is fatal. *ptr is filled with root_mem_cgroup.
+ *
  * Unlike exported interface, "oom" parameter is added. if oom==true,
  * oom-killer can be invoked.
  */
@@ -2316,8 +2333,8 @@ nomem:
 	*ptr = NULL;
 	return -ENOMEM;
 bypass:
-	*ptr = NULL;
-	return 0;
+	*ptr = root_mem_cgroup;
+	return -EINTR;
 }
 
 /*
@@ -2564,7 +2581,7 @@ static int mem_cgroup_move_parent(struct page *page,
 {
 	struct cgroup *cg = child->css.cgroup;
 	struct cgroup *pcg = cg->parent;
-	struct mem_cgroup *parent;
+	struct mem_cgroup *parent, *ptr;
 	unsigned int nr_pages;
 	unsigned long uninitialized_var(flags);
 	int ret;
@@ -2582,8 +2599,8 @@ static int mem_cgroup_move_parent(struct page *page,
 	nr_pages = hpage_nr_pages(page);
 
 	parent = mem_cgroup_from_cont(pcg);
-	ret = __mem_cgroup_try_charge(NULL, gfp_mask, nr_pages, &parent, false);
-	if (ret || !parent)
+	ret = __mem_cgroup_try_charge(NULL, gfp_mask, nr_pages, &ptr, false);
+	if (ret)
 		goto put_back;
 
 	if (nr_pages > 1)
@@ -2630,9 +2647,9 @@ static int mem_cgroup_charge_common(struct page *page, struct mm_struct *mm,
 
 	pc = lookup_page_cgroup(page);
 	ret = __mem_cgroup_try_charge(mm, gfp_mask, nr_pages, &memcg, oom);
-	if (ret || !memcg)
+	if (ret == -ENOMEM)
 		return ret;
-
+	/* we'll bypass -EINTR case and charge this page to root memcg */
 	__mem_cgroup_commit_charge(memcg, page, nr_pages, pc, ctype);
 	return 0;
 }
@@ -2703,6 +2720,7 @@ int mem_cgroup_cache_charge(struct page *page, struct mm_struct *mm,
 		ret = mem_cgroup_charge_common(page, mm, gfp_mask, type);
 	else { /* page is swapcache/shmem */
 		ret = mem_cgroup_try_charge_swapin(mm, page, gfp_mask, &memcg);
+		/* see try_charge_swapi() for -EINTR case */
 		if (!ret)
 			__mem_cgroup_commit_charge_swapin(page, memcg, type);
 	}
@@ -2743,11 +2761,21 @@ int mem_cgroup_try_charge_swapin(struct mm_struct *mm,
 	*memcgp = memcg;
 	ret = __mem_cgroup_try_charge(NULL, mask, 1, memcgp, true);
 	css_put(&memcg->css);
+	/*
+	 * If this thread is fatal, charge against root cgroup and allow
+	 * this thread to exit in quick manner. EINTR is not handled
+	 * in page fault path. So, just bypass this.
+	 */
+	if (ret == -EINTR)
+		ret = 0;
 	return ret;
 charge_cur_mm:
 	if (unlikely(!mm))
 		mm = &init_mm;
-	return __mem_cgroup_try_charge(mm, mask, 1, memcgp, true);
+	ret = __mem_cgroup_try_charge(mm, mask, 1, memcgp, true);
+	if (ret == -EINTR)
+		ret = 0;
+	return ret;
 }
 
 static void
@@ -3205,7 +3233,7 @@ int mem_cgroup_prepare_migration(struct page *page,
 	*memcgp = memcg;
 	ret = __mem_cgroup_try_charge(NULL, gfp_mask, 1, memcgp, false);
 	css_put(&memcg->css);/* drop extra refcnt */
-	if (ret || *memcgp == NULL) {
+	if (ret) {
 		if (PageAnon(page)) {
 			lock_page_cgroup(pc);
 			ClearPageCgroupMigration(pc);
@@ -3215,6 +3243,7 @@ int mem_cgroup_prepare_migration(struct page *page,
 			 */
 			mem_cgroup_uncharge_page(page);
 		}
+		/* we'll need to revisit this error code (we have -EINTR) */
 		return -ENOMEM;
 	}
 	/*
@@ -3633,7 +3662,7 @@ static int mem_cgroup_force_empty_list(struct mem_cgroup *memcg,
 		pc = lookup_page_cgroup(page);
 
 		ret = mem_cgroup_move_parent(page, pc, memcg, GFP_KERNEL);
-		if (ret == -ENOMEM)
+		if (ret == -ENOMEM || ret == -EINTR)
 			break;
 
 		if (ret == -EBUSY || ret == -EINVAL) {
@@ -5091,9 +5120,9 @@ one_by_one:
 		}
 		ret = __mem_cgroup_try_charge(NULL,
 					GFP_KERNEL, 1, &memcg, false);
-		if (ret || !memcg)
+		if (ret)
 			/* mem_cgroup_clear_mc() will do uncharge later */
-			return -ENOMEM;
+			return ret;
 		mc.precharge++;
 	}
 	return ret;
-- 
1.7.4.1



^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH v2] memcg: return -EINTR at bypassing try_charge().
  2011-12-21  8:24 ` [PATCH v2] memcg: return -EINTR at bypassing try_charge() KAMEZAWA Hiroyuki
@ 2011-12-21  9:57   ` Johannes Weiner
  2011-12-21 10:08     ` KAMEZAWA Hiroyuki
  2011-12-21 10:29   ` [PATCH v3] " KAMEZAWA Hiroyuki
  1 sibling, 1 reply; 11+ messages in thread
From: Johannes Weiner @ 2011-12-21  9:57 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: akpm, linux-mm, cgroups, linux-kernel, Michal Hocko, Hugh Dickins

On Wed, Dec 21, 2011 at 05:24:23PM +0900, KAMEZAWA Hiroyuki wrote:
> How about this ?
> --
> >From 6076425613f594d442c58a5d463c09f8309236aa Mon Sep 17 00:00:00 2001
> From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> Date: Wed, 21 Dec 2011 16:27:25 +0900
> Subject: [PATCH] memcg: return -EINTR at bypassing try_charge().
> 
> This patch is a fix for memcg-simplify-lru-handling-by-new-rule.patch
> When running testprogram and stop it by Ctrl-C, add_lru/del_lru
> will find pc->mem_cgroup is NULL and get panic. The reason
> is bypass code in try_charge().
> 
> At try_charge(), it checks the thread is fatal or not as..
> fatal_signal_pending() or TIF_MEMDIE. In this case, __try_charge()
> returns 0(success) with setting *ptr as NULL.
> 
> Now, lruvec are deteremined by pc->mem_cgroup. So, it's better
> to reset pc->mem_cgroup as root_mem_cgroup. This patch does
> following change in try_charge()
>   1. return -EINTR at bypassing.
>   2. set *ptr = root_mem_cgroup at bypassing.
> 
> By this change, in page fault / radix-tree-insert path,
> the page will be charged against root_mem_cgroup and the thread's
> operations will go ahead without trouble. In other path,
> migration or move_account etc..., -EINTR will stop the operation.
> (may need some cleanup later..)
> 
> After this change, pc->mem_cgroup will have valid pointer if
> the page is used.
> 
> Changelog: v1 -> v2
>  - returns -EINTR at bypassing.
>  - change error code handling at callers.
>  - changed the name of patch.
> 
> Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> ---
>  mm/memcontrol.c |   53 +++++++++++++++++++++++++++++++++++++++++------------
>  1 files changed, 41 insertions(+), 12 deletions(-)
> 
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 9175097..3c6eb7e 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -2185,6 +2185,23 @@ static int mem_cgroup_do_charge(struct mem_cgroup *memcg, gfp_t gfp_mask,
>  }
>  
>  /*
> + * __mem_cgroup_try_charge() does
> + * 1. detect memcg to be charged against from passed *mm and *ptr,
> + * 2. update res_counter
> + * 3. call memory reclaim if necessary.
> + *
> + * In some special case, if the task is fatal, fatal_signal_pending() or
> + * TIF_MEMDIE or ->mm is NULL, this functoion returns -EINTR with filling
> + * *ptr as root_mem_cgroup. There are 2 reasons for this. 1st is that
> + * fatal threads should quit as soon as possible without any hazards.
> + * 2nd is that all page should have valid pc->mem_cgroup if it will be
> + * used.
> + *
> + * So, try_charge will return
> + *  0       ...  at success. filling *ptr with a valid memcg pointer.
> + *  -ENOMEM ...  charge failure because of resource limits.
> + *  -EINTR  ...  if thread is fatal. *ptr is filled with root_mem_cgroup.
> + *
>   * Unlike exported interface, "oom" parameter is added. if oom==true,
>   * oom-killer can be invoked.
>   */
> @@ -2316,8 +2333,8 @@ nomem:
>  	*ptr = NULL;
>  	return -ENOMEM;
>  bypass:
> -	*ptr = NULL;
> -	return 0;
> +	*ptr = root_mem_cgroup;
> +	return -EINTR;
>  }

What about this case:

	/*
	 * We always charge the cgroup the mm_struct belongs to.
	 * The mm_struct's mem_cgroup changes on task migration if the
	 * thread group leader migrates. It's possible that mm is not
	 * set, if so charge the init_mm (happens for pagecache usage).
	 */
	if (!*ptr && !mm)
		goto bypass;

> @@ -2564,7 +2581,7 @@ static int mem_cgroup_move_parent(struct page *page,
>  {
>  	struct cgroup *cg = child->css.cgroup;
>  	struct cgroup *pcg = cg->parent;
> -	struct mem_cgroup *parent;
> +	struct mem_cgroup *parent, *ptr;
>  	unsigned int nr_pages;
>  	unsigned long uninitialized_var(flags);
>  	int ret;
> @@ -2582,8 +2599,8 @@ static int mem_cgroup_move_parent(struct page *page,
>  	nr_pages = hpage_nr_pages(page);
>  
>  	parent = mem_cgroup_from_cont(pcg);
> -	ret = __mem_cgroup_try_charge(NULL, gfp_mask, nr_pages, &parent, false);
> -	if (ret || !parent)
> +	ret = __mem_cgroup_try_charge(NULL, gfp_mask, nr_pages, &ptr, false);
> +	if (ret)
>  		goto put_back;

That doesn't seem right.  That unitilialized ptr is used in
try_charge(), so this may crash, and it should really charge against
parent.

> @@ -2630,9 +2647,9 @@ static int mem_cgroup_charge_common(struct page *page, struct mm_struct *mm,
>  
>  	pc = lookup_page_cgroup(page);
>  	ret = __mem_cgroup_try_charge(mm, gfp_mask, nr_pages, &memcg, oom);
> -	if (ret || !memcg)
> +	if (ret == -ENOMEM)
>  		return ret;
> -
> +	/* we'll bypass -EINTR case and charge this page to root memcg */
>  	__mem_cgroup_commit_charge(memcg, page, nr_pages, pc, ctype);
>  	return 0;
>  }

This comment is not very useful.  WHY do we do this?  Maybe just copy
the comment from try_charge_swapin()?

> @@ -2703,6 +2720,7 @@ int mem_cgroup_cache_charge(struct page *page, struct mm_struct *mm,
>  		ret = mem_cgroup_charge_common(page, mm, gfp_mask, type);
>  	else { /* page is swapcache/shmem */
>  		ret = mem_cgroup_try_charge_swapin(mm, page, gfp_mask, &memcg);
> +		/* see try_charge_swapi() for -EINTR case */
>  		if (!ret)
>  			__mem_cgroup_commit_charge_swapin(page, memcg, type);
>  	}

Missing n in the comment.

> @@ -2743,11 +2761,21 @@ int mem_cgroup_try_charge_swapin(struct mm_struct *mm,
>  	*memcgp = memcg;
>  	ret = __mem_cgroup_try_charge(NULL, mask, 1, memcgp, true);
>  	css_put(&memcg->css);
> +	/*
> +	 * If this thread is fatal, charge against root cgroup and allow
> +	 * this thread to exit in quick manner. EINTR is not handled
> +	 * in page fault path. So, just bypass this.
> +	 */
> +	if (ret == -EINTR)
> +		ret = 0;
>  	return ret;
>  charge_cur_mm:
>  	if (unlikely(!mm))
>  		mm = &init_mm;
> -	return __mem_cgroup_try_charge(mm, mask, 1, memcgp, true);
> +	ret = __mem_cgroup_try_charge(mm, mask, 1, memcgp, true);
> +	if (ret == -EINTR)
> +		ret = 0;
> +	return ret;
>  }
>  
>  static void

> @@ -3633,7 +3662,7 @@ static int mem_cgroup_force_empty_list(struct mem_cgroup *memcg,
>  		pc = lookup_page_cgroup(page);
>  
>  		ret = mem_cgroup_move_parent(page, pc, memcg, GFP_KERNEL);
> -		if (ret == -ENOMEM)
> +		if (ret == -ENOMEM || ret == -EINTR)
>  			break;

if (ret)

?

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2] memcg: return -EINTR at bypassing try_charge().
  2011-12-21  9:57   ` Johannes Weiner
@ 2011-12-21 10:08     ` KAMEZAWA Hiroyuki
  0 siblings, 0 replies; 11+ messages in thread
From: KAMEZAWA Hiroyuki @ 2011-12-21 10:08 UTC (permalink / raw)
  To: Johannes Weiner
  Cc: akpm, linux-mm, cgroups, linux-kernel, Michal Hocko, Hugh Dickins

On Wed, 21 Dec 2011 10:57:08 +0100
Johannes Weiner <hannes@cmpxchg.org> wrote:

> On Wed, Dec 21, 2011 at 05:24:23PM +0900, KAMEZAWA Hiroyuki wrote:
> > How about this ?
> > --
> > >From 6076425613f594d442c58a5d463c09f8309236aa Mon Sep 17 00:00:00 2001
> > From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> > Date: Wed, 21 Dec 2011 16:27:25 +0900
> > Subject: [PATCH] memcg: return -EINTR at bypassing try_charge().
> > 
> > This patch is a fix for memcg-simplify-lru-handling-by-new-rule.patch
> > When running testprogram and stop it by Ctrl-C, add_lru/del_lru
> > will find pc->mem_cgroup is NULL and get panic. The reason
> > is bypass code in try_charge().
> > 
> > At try_charge(), it checks the thread is fatal or not as..
> > fatal_signal_pending() or TIF_MEMDIE. In this case, __try_charge()
> > returns 0(success) with setting *ptr as NULL.
> > 
> > Now, lruvec are deteremined by pc->mem_cgroup. So, it's better
> > to reset pc->mem_cgroup as root_mem_cgroup. This patch does
> > following change in try_charge()
> >   1. return -EINTR at bypassing.
> >   2. set *ptr = root_mem_cgroup at bypassing.
> > 
> > By this change, in page fault / radix-tree-insert path,
> > the page will be charged against root_mem_cgroup and the thread's
> > operations will go ahead without trouble. In other path,
> > migration or move_account etc..., -EINTR will stop the operation.
> > (may need some cleanup later..)
> > 
> > After this change, pc->mem_cgroup will have valid pointer if
> > the page is used.
> > 
> > Changelog: v1 -> v2
> >  - returns -EINTR at bypassing.
> >  - change error code handling at callers.
> >  - changed the name of patch.
> > 
> > Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> > ---
> >  mm/memcontrol.c |   53 +++++++++++++++++++++++++++++++++++++++++------------
> >  1 files changed, 41 insertions(+), 12 deletions(-)
> > 
> > diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> > index 9175097..3c6eb7e 100644
> > --- a/mm/memcontrol.c
> > +++ b/mm/memcontrol.c
> > @@ -2185,6 +2185,23 @@ static int mem_cgroup_do_charge(struct mem_cgroup *memcg, gfp_t gfp_mask,
> >  }
> >  
> >  /*
> > + * __mem_cgroup_try_charge() does
> > + * 1. detect memcg to be charged against from passed *mm and *ptr,
> > + * 2. update res_counter
> > + * 3. call memory reclaim if necessary.
> > + *
> > + * In some special case, if the task is fatal, fatal_signal_pending() or
> > + * TIF_MEMDIE or ->mm is NULL, this functoion returns -EINTR with filling
> > + * *ptr as root_mem_cgroup. There are 2 reasons for this. 1st is that
> > + * fatal threads should quit as soon as possible without any hazards.
> > + * 2nd is that all page should have valid pc->mem_cgroup if it will be
> > + * used.
> > + *
> > + * So, try_charge will return
> > + *  0       ...  at success. filling *ptr with a valid memcg pointer.
> > + *  -ENOMEM ...  charge failure because of resource limits.
> > + *  -EINTR  ...  if thread is fatal. *ptr is filled with root_mem_cgroup.
> > + *
> >   * Unlike exported interface, "oom" parameter is added. if oom==true,
> >   * oom-killer can be invoked.
> >   */
> > @@ -2316,8 +2333,8 @@ nomem:
> >  	*ptr = NULL;
> >  	return -ENOMEM;
> >  bypass:
> > -	*ptr = NULL;
> > -	return 0;
> > +	*ptr = root_mem_cgroup;
> > +	return -EINTR;
> >  }
> 
> What about this case:
> 
> 	/*
> 	 * We always charge the cgroup the mm_struct belongs to.
> 	 * The mm_struct's mem_cgroup changes on task migration if the
> 	 * thread group leader migrates. It's possible that mm is not
> 	 * set, if so charge the init_mm (happens for pagecache usage).
> 	 */
> 	if (!*ptr && !mm)
> 		goto bypass;
> 

IIUC, task->mm is NULL when the task is about to exit.
So, charge to root_mem_cgroup will be enough good.
I'm not sure how 'mm_struct's mem_cgroup changes on.....' cases affect
this path...




> > @@ -2564,7 +2581,7 @@ static int mem_cgroup_move_parent(struct page *page,
> >  {
> >  	struct cgroup *cg = child->css.cgroup;
> >  	struct cgroup *pcg = cg->parent;
> > -	struct mem_cgroup *parent;
> > +	struct mem_cgroup *parent, *ptr;
> >  	unsigned int nr_pages;
> >  	unsigned long uninitialized_var(flags);
> >  	int ret;
> > @@ -2582,8 +2599,8 @@ static int mem_cgroup_move_parent(struct page *page,
> >  	nr_pages = hpage_nr_pages(page);
> >  
> >  	parent = mem_cgroup_from_cont(pcg);
> > -	ret = __mem_cgroup_try_charge(NULL, gfp_mask, nr_pages, &parent, false);
> > -	if (ret || !parent)
> > +	ret = __mem_cgroup_try_charge(NULL, gfp_mask, nr_pages, &ptr, false);
> > +	if (ret)
> >  		goto put_back;
> 
> That doesn't seem right.  That unitilialized ptr is used in
> try_charge(), so this may crash, and it should really charge against
> parent.

Ah, yes. my mistake.



> > @@ -2630,9 +2647,9 @@ static int mem_cgroup_charge_common(struct page *page, struct mm_struct *mm,
> >  
> >  	pc = lookup_page_cgroup(page);
> >  	ret = __mem_cgroup_try_charge(mm, gfp_mask, nr_pages, &memcg, oom);
> > -	if (ret || !memcg)
> > +	if (ret == -ENOMEM)
> >  		return ret;
> > -
> > +	/* we'll bypass -EINTR case and charge this page to root memcg */
> >  	__mem_cgroup_commit_charge(memcg, page, nr_pages, pc, ctype);
> >  	return 0;
> >  }
> 
> This comment is not very useful.  WHY do we do this?  Maybe just copy
> the comment from try_charge_swapin()?
> 
Hmm, ok. I'll remove this comment.


> > @@ -2703,6 +2720,7 @@ int mem_cgroup_cache_charge(struct page *page, struct mm_struct *mm,
> >  		ret = mem_cgroup_charge_common(page, mm, gfp_mask, type);
> >  	else { /* page is swapcache/shmem */
> >  		ret = mem_cgroup_try_charge_swapin(mm, page, gfp_mask, &memcg);
> > +		/* see try_charge_swapi() for -EINTR case */
> >  		if (!ret)
> >  			__mem_cgroup_commit_charge_swapin(page, memcg, type);
> >  	}
> 
> Missing n in the comment.
> 

will remove this, too.


> > @@ -2743,11 +2761,21 @@ int mem_cgroup_try_charge_swapin(struct mm_struct *mm,
> >  	*memcgp = memcg;
> >  	ret = __mem_cgroup_try_charge(NULL, mask, 1, memcgp, true);
> >  	css_put(&memcg->css);
> > +	/*
> > +	 * If this thread is fatal, charge against root cgroup and allow
> > +	 * this thread to exit in quick manner. EINTR is not handled
> > +	 * in page fault path. So, just bypass this.
> > +	 */
> > +	if (ret == -EINTR)
> > +		ret = 0;
> >  	return ret;
> >  charge_cur_mm:
> >  	if (unlikely(!mm))
> >  		mm = &init_mm;
> > -	return __mem_cgroup_try_charge(mm, mask, 1, memcgp, true);
> > +	ret = __mem_cgroup_try_charge(mm, mask, 1, memcgp, true);
> > +	if (ret == -EINTR)
> > +		ret = 0;
> > +	return ret;
> >  }
> >  
> >  static void
> 
> > @@ -3633,7 +3662,7 @@ static int mem_cgroup_force_empty_list(struct mem_cgroup *memcg,
> >  		pc = lookup_page_cgroup(page);
> >  
> >  		ret = mem_cgroup_move_parent(page, pc, memcg, GFP_KERNEL);
> > -		if (ret == -ENOMEM)
> > +		if (ret == -ENOMEM || ret == -EINTR)
> >  			break;
> 
> if (ret)
> 

-EBUSY check is below this code.

I'll prepare v3. 

Thanks,
-Kame


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH v3] memcg: return -EINTR at bypassing try_charge().
  2011-12-21  8:24 ` [PATCH v2] memcg: return -EINTR at bypassing try_charge() KAMEZAWA Hiroyuki
  2011-12-21  9:57   ` Johannes Weiner
@ 2011-12-21 10:29   ` KAMEZAWA Hiroyuki
  2011-12-21 12:28     ` Johannes Weiner
  1 sibling, 1 reply; 11+ messages in thread
From: KAMEZAWA Hiroyuki @ 2011-12-21 10:29 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: akpm, linux-mm, cgroups, linux-kernel, hannes, Michal Hocko,
	Hugh Dickins

Thank you for review.
I'm sorry if my response is delayed.
==
>From 1e8c917c64b3947d2e54c6e5073d53d80bd97c30 Mon Sep 17 00:00:00 2001
From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Date: Wed, 21 Dec 2011 16:27:25 +0900
Subject: [PATCH] memcg: return -EINTR at bypassing try_charge().

This patch is a fix for memcg-simplify-lru-handling-by-new-rule.patch
When running testprogram and stop it by Ctrl-C, add_lru/del_lru
will find pc->mem_cgroup is NULL and get panic. The reason
is bypass code in try_charge().

At try_charge(), it checks the thread is fatal or not as..
fatal_signal_pending() or TIF_MEMDIE. In this case, __try_charge()
returns 0(success) with setting *ptr as NULL.

Now, lruvec are deteremined by pc->mem_cgroup. So, it's better
to reset pc->mem_cgroup as root_mem_cgroup. This patch does
following change in try_charge()
  1. return -EINTR at bypassing.
  2. set *ptr = root_mem_cgroup at bypassing.

By this change, in page fault / radix-tree-insert path,
the page will be charged against root_mem_cgroup and the thread's
operations will go ahead without trouble. In other path,
migration or move_account etc..., -EINTR will stop the operation.
(may need some cleanup later..)

After this change, pc->mem_cgroup will have valid pointer if
the page is used.

Changelog: v2 -> v3
 - handle !mm case in another way.
 - removed redundant commments
 - fixed move_parent bug of uninitialized pointer
Changelog: v1 -> v2
 - returns -EINTR at bypassing.
 - change error code handling at callers.
 - changed the name of patch.

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
---
 mm/memcontrol.c |   45 ++++++++++++++++++++++++++++++++++-----------
 1 files changed, 34 insertions(+), 11 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 9175097..5f78c99 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2185,6 +2185,24 @@ static int mem_cgroup_do_charge(struct mem_cgroup *memcg, gfp_t gfp_mask,
 }
 
 /*
+ * __mem_cgroup_try_charge() does
+ * 1. detect memcg to be charged against from passed *mm and *ptr,
+ * 2. update res_counter
+ * 3. call memory reclaim if necessary.
+ *
+ * In some special case, if the task is fatal, fatal_signal_pending() or
+ * TIF_MEMDIE, this functoion returns -EINTR with filling *ptr as
+ * root_mem_cgroup. There are 2 reasons for this. 1st is that
+ * fatal threads should quit as soon as possible without any hazards.
+ * 2nd is that all page should have valid pc->mem_cgroup if it will be
+ * used. If mm is NULL and the caller doesn't pass valid memcg pointer,
+ * that's treated as charge to root_mem_cgroup.
+ *
+ * So, try_charge will return
+ *  0       ...  at success. filling *ptr with a valid memcg pointer.
+ *  -ENOMEM ...  charge failure because of resource limits.
+ *  -EINTR  ...  if thread is fatal. *ptr is filled with root_mem_cgroup.
+ *
  * Unlike exported interface, "oom" parameter is added. if oom==true,
  * oom-killer can be invoked.
  */
@@ -2215,7 +2233,7 @@ static int __mem_cgroup_try_charge(struct mm_struct *mm,
 	 * set, if so charge the init_mm (happens for pagecache usage).
 	 */
 	if (!*ptr && !mm)
-		goto bypass;
+		*ptr = root_mem_cgroup;
 again:
 	if (*ptr) { /* css should be a valid one */
 		memcg = *ptr;
@@ -2316,8 +2334,8 @@ nomem:
 	*ptr = NULL;
 	return -ENOMEM;
 bypass:
-	*ptr = NULL;
-	return 0;
+	*ptr = root_mem_cgroup;
+	return -EINTR;
 }
 
 /*
@@ -2583,7 +2601,7 @@ static int mem_cgroup_move_parent(struct page *page,
 
 	parent = mem_cgroup_from_cont(pcg);
 	ret = __mem_cgroup_try_charge(NULL, gfp_mask, nr_pages, &parent, false);
-	if (ret || !parent)
+	if (ret)
 		goto put_back;
 
 	if (nr_pages > 1)
@@ -2630,9 +2648,8 @@ static int mem_cgroup_charge_common(struct page *page, struct mm_struct *mm,
 
 	pc = lookup_page_cgroup(page);
 	ret = __mem_cgroup_try_charge(mm, gfp_mask, nr_pages, &memcg, oom);
-	if (ret || !memcg)
+	if (ret == -ENOMEM)
 		return ret;
-
 	__mem_cgroup_commit_charge(memcg, page, nr_pages, pc, ctype);
 	return 0;
 }
@@ -2743,11 +2760,16 @@ int mem_cgroup_try_charge_swapin(struct mm_struct *mm,
 	*memcgp = memcg;
 	ret = __mem_cgroup_try_charge(NULL, mask, 1, memcgp, true);
 	css_put(&memcg->css);
+	if (ret == -EINTR)
+		ret = 0;
 	return ret;
 charge_cur_mm:
 	if (unlikely(!mm))
 		mm = &init_mm;
-	return __mem_cgroup_try_charge(mm, mask, 1, memcgp, true);
+	ret = __mem_cgroup_try_charge(mm, mask, 1, memcgp, true);
+	if (ret == -EINTR)
+		ret = 0;
+	return ret;
 }
 
 static void
@@ -3205,7 +3227,7 @@ int mem_cgroup_prepare_migration(struct page *page,
 	*memcgp = memcg;
 	ret = __mem_cgroup_try_charge(NULL, gfp_mask, 1, memcgp, false);
 	css_put(&memcg->css);/* drop extra refcnt */
-	if (ret || *memcgp == NULL) {
+	if (ret) {
 		if (PageAnon(page)) {
 			lock_page_cgroup(pc);
 			ClearPageCgroupMigration(pc);
@@ -3215,6 +3237,7 @@ int mem_cgroup_prepare_migration(struct page *page,
 			 */
 			mem_cgroup_uncharge_page(page);
 		}
+		/* we'll need to revisit this error code (we have -EINTR) */
 		return -ENOMEM;
 	}
 	/*
@@ -3633,7 +3656,7 @@ static int mem_cgroup_force_empty_list(struct mem_cgroup *memcg,
 		pc = lookup_page_cgroup(page);
 
 		ret = mem_cgroup_move_parent(page, pc, memcg, GFP_KERNEL);
-		if (ret == -ENOMEM)
+		if (ret == -ENOMEM || ret == -EINTR)
 			break;
 
 		if (ret == -EBUSY || ret == -EINVAL) {
@@ -5091,9 +5114,9 @@ one_by_one:
 		}
 		ret = __mem_cgroup_try_charge(NULL,
 					GFP_KERNEL, 1, &memcg, false);
-		if (ret || !memcg)
+		if (ret)
 			/* mem_cgroup_clear_mc() will do uncharge later */
-			return -ENOMEM;
+			return ret;
 		mc.precharge++;
 	}
 	return ret;
-- 
1.7.4.1



^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH v3] memcg: return -EINTR at bypassing try_charge().
  2011-12-21 10:29   ` [PATCH v3] " KAMEZAWA Hiroyuki
@ 2011-12-21 12:28     ` Johannes Weiner
  0 siblings, 0 replies; 11+ messages in thread
From: Johannes Weiner @ 2011-12-21 12:28 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: akpm, linux-mm, cgroups, linux-kernel, Michal Hocko, Hugh Dickins

On Wed, Dec 21, 2011 at 07:29:34PM +0900, KAMEZAWA Hiroyuki wrote:
> Thank you for review.
> I'm sorry if my response is delayed.
> ==
> >From 1e8c917c64b3947d2e54c6e5073d53d80bd97c30 Mon Sep 17 00:00:00 2001
> From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> Date: Wed, 21 Dec 2011 16:27:25 +0900
> Subject: [PATCH] memcg: return -EINTR at bypassing try_charge().
> 
> This patch is a fix for memcg-simplify-lru-handling-by-new-rule.patch
> When running testprogram and stop it by Ctrl-C, add_lru/del_lru
> will find pc->mem_cgroup is NULL and get panic. The reason
> is bypass code in try_charge().
> 
> At try_charge(), it checks the thread is fatal or not as..
> fatal_signal_pending() or TIF_MEMDIE. In this case, __try_charge()
> returns 0(success) with setting *ptr as NULL.
> 
> Now, lruvec are deteremined by pc->mem_cgroup. So, it's better
> to reset pc->mem_cgroup as root_mem_cgroup. This patch does
> following change in try_charge()
>   1. return -EINTR at bypassing.
>   2. set *ptr = root_mem_cgroup at bypassing.
> 
> By this change, in page fault / radix-tree-insert path,
> the page will be charged against root_mem_cgroup and the thread's
> operations will go ahead without trouble. In other path,
> migration or move_account etc..., -EINTR will stop the operation.
> (may need some cleanup later..)
> 
> After this change, pc->mem_cgroup will have valid pointer if
> the page is used.
> 
> Changelog: v2 -> v3
>  - handle !mm case in another way.
>  - removed redundant commments
>  - fixed move_parent bug of uninitialized pointer
> Changelog: v1 -> v2
>  - returns -EINTR at bypassing.
>  - change error code handling at callers.
>  - changed the name of patch.
> 
> Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>

Looks good now, thanks.

Acked-by: Johannes Weiner <hannes@cmpxchg.org>

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2011-12-21 12:28 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-12-19  7:51 [PATCH] memcg: reset to root_mem_cgroup at bypassing KAMEZAWA Hiroyuki
2011-12-19 20:49 ` Hugh Dickins
2011-12-20  0:24   ` Hiroyuki Kamezawa
2011-12-21  0:13     ` KAMEZAWA Hiroyuki
2011-12-21  3:25       ` Hugh Dickins
2011-12-21  4:00         ` KAMEZAWA Hiroyuki
2011-12-21  8:24 ` [PATCH v2] memcg: return -EINTR at bypassing try_charge() KAMEZAWA Hiroyuki
2011-12-21  9:57   ` Johannes Weiner
2011-12-21 10:08     ` KAMEZAWA Hiroyuki
2011-12-21 10:29   ` [PATCH v3] " KAMEZAWA Hiroyuki
2011-12-21 12:28     ` Johannes Weiner

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).