* [BUGFIX][PATCH] memcg: charge swapcache to proper memcg
@ 2009-03-10  1:07 Daisuke Nishimura
  2009-03-10  2:35 ` KAMEZAWA Hiroyuki
                   ` (2 more replies)
  0 siblings, 3 replies; 21+ messages in thread
From: Daisuke Nishimura @ 2009-03-10  1:07 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, Balbir Singh, KAMEZAWA Hiroyuki, Li Zefan,
	Hugh Dickins, Daisuke Nishimura

From: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>

memcg_test.txt says at 4.1:

	This swap-in is one of the most complicated work. In do_swap_page(),
	following events occur when pte is unchanged.

	(1) the page (SwapCache) is looked up.
	(2) lock_page()
	(3) try_charge_swapin()
	(4) reuse_swap_page() (may call delete_swap_cache())
	(5) commit_charge_swapin()
	(6) swap_free().

	Considering following situation for example.

	(A) The page has not been charged before (2) and reuse_swap_page()
	    doesn't call delete_from_swap_cache().
	(B) The page has not been charged before (2) and reuse_swap_page()
	    calls delete_from_swap_cache().
	(C) The page has been charged before (2) and reuse_swap_page() doesn't
	    call delete_from_swap_cache().
	(D) The page has been charged before (2) and reuse_swap_page() calls
	    delete_from_swap_cache().

	    memory.usage/memsw.usage changes to this page/swp_entry will be
	 Case          (A)      (B)       (C)     (D)
         Event
       Before (2)     0/ 1     0/ 1      1/ 1    1/ 1
          ===========================================
          (3)        +1/+1    +1/+1     +1/+1   +1/+1
          (4)          -       0/ 0       -     -1/ 0
          (5)         0/-1     0/ 0     -1/-1    0/ 0
          (6)          -       0/-1       -      0/-1
          ===========================================
       Result         1/ 1     1/ 1      1/ 1    1/ 1

       In any cases, charges to this page should be 1/ 1.

In case (D), try_get_mem_cgroup_from_swapcache() returns NULL
(because lookup_swap_cgroup() returns NULL), so the "+1/+1" at (3) is a
charge to the memcg ("foo") to which "current" belongs.
OTOH, the "-1/0" at (4) and the "0/-1" at (6) are uncharges from the
memcg ("baa") to which the page had been charged.

So, if "foo" and "baa" are different (for example, because of a task move),
this charge is effectively moved from "baa" to "foo".

I think this is unexpected behavior.

This patch fixes it by modifying try_get_mem_cgroup_from_swapcache() to
return the memcg to which the swapcache page has been charged when its
PCG_USED bit is set.
IIUC, checking the PCG_USED bit of a swapcache page is safe under the
page lock.


Signed-off-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
---
 mm/memcontrol.c |   15 +++++++++++++--
 1 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 73c51c8..f2efbc0 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -909,13 +909,24 @@ nomem:
 static struct mem_cgroup *try_get_mem_cgroup_from_swapcache(struct page *page)
 {
 	struct mem_cgroup *mem;
+	struct page_cgroup *pc;
 	swp_entry_t ent;
 
+	VM_BUG_ON(!PageLocked(page));
+
 	if (!PageSwapCache(page))
 		return NULL;
 
-	ent.val = page_private(page);
-	mem = lookup_swap_cgroup(ent);
+	pc = lookup_page_cgroup(page);
+	/*
+	 * Used bit of swapcache is solid under page lock.
+	 */
+	if (PageCgroupUsed(pc))
+		mem = pc->mem_cgroup;
+	else {
+		ent.val = page_private(page);
+		mem = lookup_swap_cgroup(ent);
+	}
 	if (!mem)
 		return NULL;
 	if (!css_tryget(&mem->css))
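
For reference, a condensed sketch of the swap-in sequence (1)-(6) described
above. This is not the real do_swap_page(): pte revalidation, error paths
and the write-fault check are omitted, and the memcg hook signatures are
recalled from the 2.6.29-era code, so treat names and arguments as
approximate.

#include <linux/mm.h>
#include <linux/swap.h>
#include <linux/pagemap.h>
#include <linux/memcontrol.h>

/* Sketch only: events (1)-(6) for a locked, pte-unchanged swap-in. */
static int swapin_sketch(struct mm_struct *mm, swp_entry_t entry)
{
	struct mem_cgroup *ptr = NULL;
	struct page *page;

	page = lookup_swap_cache(entry);	/* (1) find the SwapCache page */
	if (!page)
		return -ENOENT;			/* readahead path omitted */
	lock_page(page);			/* (2) */

	/* (3) charge current's memcg, or the one looked up from swapcache */
	if (mem_cgroup_try_charge_swapin(mm, page, GFP_KERNEL, &ptr)) {
		unlock_page(page);
		return -ENOMEM;
	}

	/*
	 * (4) on a write fault; may call delete_from_swap_cache(), which
	 * in case (D) uncharges the page (the "-1/ 0" in the table above).
	 */
	reuse_swap_page(page);

	mem_cgroup_commit_charge_swapin(page, ptr);	/* (5) */
	swap_free(entry);				/* (6) drops memsw */

	unlock_page(page);
	return 0;
}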


* Re: [BUGFIX][PATCH] memcg: charge swapcache to proper memcg
  2009-03-10  1:07 [BUGFIX][PATCH] memcg: charge swapcache to proper memcg Daisuke Nishimura
@ 2009-03-10  2:35 ` KAMEZAWA Hiroyuki
  2009-03-10  3:18   ` Daisuke Nishimura
  2009-03-10  4:33 ` KAMEZAWA Hiroyuki
  2009-03-10 23:08 ` Andrew Morton
  2 siblings, 1 reply; 21+ messages in thread
From: KAMEZAWA Hiroyuki @ 2009-03-10  2:35 UTC (permalink / raw)
  To: Daisuke Nishimura
  Cc: Andrew Morton, linux-mm, Balbir Singh, Li Zefan, Hugh Dickins

On Tue, 10 Mar 2009 10:07:07 +0900
Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> wrote:

> From: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
> 
> memcg_test.txt says at 4.1:
> 
> 	This swap-in is one of the most complicated work. In do_swap_page(),
> 	following events occur when pte is unchanged.
> 
> 	(1) the page (SwapCache) is looked up.
> 	(2) lock_page()
> 	(3) try_charge_swapin()
> 	(4) reuse_swap_page() (may call delete_swap_cache())
> 	(5) commit_charge_swapin()
> 	(6) swap_free().
> 
> 	Considering following situation for example.
> 
> 	(A) The page has not been charged before (2) and reuse_swap_page()
> 	    doesn't call delete_from_swap_cache().
> 	(B) The page has not been charged before (2) and reuse_swap_page()
> 	    calls delete_from_swap_cache().
> 	(C) The page has been charged before (2) and reuse_swap_page() doesn't
> 	    call delete_from_swap_cache().
> 	(D) The page has been charged before (2) and reuse_swap_page() calls
> 	    delete_from_swap_cache().
> 
> 	    memory.usage/memsw.usage changes to this page/swp_entry will be
> 	 Case          (A)      (B)       (C)     (D)
>          Event
>        Before (2)     0/ 1     0/ 1      1/ 1    1/ 1
>           ===========================================
>           (3)        +1/+1    +1/+1     +1/+1   +1/+1
>           (4)          -       0/ 0       -     -1/ 0
>           (5)         0/-1     0/ 0     -1/-1    0/ 0
>           (6)          -       0/-1       -      0/-1
>           ===========================================
>        Result         1/ 1     1/ 1      1/ 1    1/ 1
> 
>        In any cases, charges to this page should be 1/ 1.
> 
> In case of (D), mem_cgroup_try_get_from_swapcache() returns NULL
> (because lookup_swap_cgroup() returns NULL), so "+1/+1" at (3) means
> charges to the memcg("foo") to which the "current" belongs.

Hmm... in try_charge_swapin(), if !PageSwapCache(), it seems no charge
is made and NULL is returned (meaning the commit will not occur).
Could you clarify?

Regards,
-Kame


* Re: [BUGFIX][PATCH] memcg: charge swapcache to proper memcg
  2009-03-10  2:35 ` KAMEZAWA Hiroyuki
@ 2009-03-10  3:18   ` Daisuke Nishimura
  2009-03-10  3:42     ` KAMEZAWA Hiroyuki
  0 siblings, 1 reply; 21+ messages in thread
From: Daisuke Nishimura @ 2009-03-10  3:18 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: nishimura, Andrew Morton, linux-mm, Balbir Singh, Li Zefan, Hugh Dickins

On Tue, 10 Mar 2009 11:35:02 +0900, KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> wrote:
> On Tue, 10 Mar 2009 10:07:07 +0900
> Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> wrote:
> 
> > From: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
> > 
> > memcg_test.txt says at 4.1:
> > 
> > 	This swap-in is one of the most complicated work. In do_swap_page(),
> > 	following events occur when pte is unchanged.
> > 
> > 	(1) the page (SwapCache) is looked up.
> > 	(2) lock_page()
> > 	(3) try_charge_swapin()
> > 	(4) reuse_swap_page() (may call delete_swap_cache())
> > 	(5) commit_charge_swapin()
> > 	(6) swap_free().
> > 
> > 	Considering following situation for example.
> > 
> > 	(A) The page has not been charged before (2) and reuse_swap_page()
> > 	    doesn't call delete_from_swap_cache().
> > 	(B) The page has not been charged before (2) and reuse_swap_page()
> > 	    calls delete_from_swap_cache().
> > 	(C) The page has been charged before (2) and reuse_swap_page() doesn't
> > 	    call delete_from_swap_cache().
> > 	(D) The page has been charged before (2) and reuse_swap_page() calls
> > 	    delete_from_swap_cache().
> > 
> > 	    memory.usage/memsw.usage changes to this page/swp_entry will be
> > 	 Case          (A)      (B)       (C)     (D)
> >          Event
> >        Before (2)     0/ 1     0/ 1      1/ 1    1/ 1
> >           ===========================================
> >           (3)        +1/+1    +1/+1     +1/+1   +1/+1
> >           (4)          -       0/ 0       -     -1/ 0
> >           (5)         0/-1     0/ 0     -1/-1    0/ 0
> >           (6)          -       0/-1       -      0/-1
> >           ===========================================
> >        Result         1/ 1     1/ 1      1/ 1    1/ 1
> > 
> >        In any cases, charges to this page should be 1/ 1.
> > 
> > In case of (D), mem_cgroup_try_get_from_swapcache() returns NULL
> > (because lookup_swap_cgroup() returns NULL), so "+1/+1" at (3) means
> > charges to the memcg("foo") to which the "current" belongs.
> 
> Hmm...in try_charge_swapin(), if !PageSwapCache(),
> it seems no charges and returns NULL...(means commit will not occur.)
> Could you clarify ?
> 
I'm talking about the PageSwapCache() case.
IIUC, a swapcache page that has been unmapped on the swap-out path but
has not yet been removed from swapcache (and so has not been uncharged
yet) can go through this (D) path.


Thanks,
Daisuke Nishimura.


* Re: [BUGFIX][PATCH] memcg: charge swapcache to proper memcg
  2009-03-10  3:18   ` Daisuke Nishimura
@ 2009-03-10  3:42     ` KAMEZAWA Hiroyuki
  0 siblings, 0 replies; 21+ messages in thread
From: KAMEZAWA Hiroyuki @ 2009-03-10  3:42 UTC (permalink / raw)
  To: Daisuke Nishimura
  Cc: Andrew Morton, linux-mm, Balbir Singh, Li Zefan, Hugh Dickins

On Tue, 10 Mar 2009 12:18:56 +0900
Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> wrote:

> On Tue, 10 Mar 2009 11:35:02 +0900, KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> wrote:
> > On Tue, 10 Mar 2009 10:07:07 +0900
> > Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> wrote:
> > 
> > > From: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
> > > 
> > > memcg_test.txt says at 4.1:
> > > 
> > > 	This swap-in is one of the most complicated work. In do_swap_page(),
> > > 	following events occur when pte is unchanged.
> > > 
> > > 	(1) the page (SwapCache) is looked up.
> > > 	(2) lock_page()
> > > 	(3) try_charge_swapin()
> > > 	(4) reuse_swap_page() (may call delete_swap_cache())
> > > 	(5) commit_charge_swapin()
> > > 	(6) swap_free().
> > > 
> > > 	Considering following situation for example.
> > > 
> > > 	(A) The page has not been charged before (2) and reuse_swap_page()
> > > 	    doesn't call delete_from_swap_cache().
> > > 	(B) The page has not been charged before (2) and reuse_swap_page()
> > > 	    calls delete_from_swap_cache().
> > > 	(C) The page has been charged before (2) and reuse_swap_page() doesn't
> > > 	    call delete_from_swap_cache().
> > > 	(D) The page has been charged before (2) and reuse_swap_page() calls
> > > 	    delete_from_swap_cache().
> > > 
> > > 	    memory.usage/memsw.usage changes to this page/swp_entry will be
> > > 	 Case          (A)      (B)       (C)     (D)
> > >          Event
> > >        Before (2)     0/ 1     0/ 1      1/ 1    1/ 1
> > >           ===========================================
> > >           (3)        +1/+1    +1/+1     +1/+1   +1/+1
> > >           (4)          -       0/ 0       -     -1/ 0
> > >           (5)         0/-1     0/ 0     -1/-1    0/ 0
> > >           (6)          -       0/-1       -      0/-1
> > >           ===========================================
> > >        Result         1/ 1     1/ 1      1/ 1    1/ 1
> > > 
> > >        In any cases, charges to this page should be 1/ 1.
> > > 
> > > In case of (D), mem_cgroup_try_get_from_swapcache() returns NULL
> > > (because lookup_swap_cgroup() returns NULL), so "+1/+1" at (3) means
> > > charges to the memcg("foo") to which the "current" belongs.
> > 
> > Hmm...in try_charge_swapin(), if !PageSwapCache(),
> > it seems no charges and returns NULL...(means commit will not occur.)
> > Could you clarify ?
> > 
> I'm saying about PageSwapCache() case.
> IIUC, swapcache which has been unmapped but has not been removed from
> swapcache yet on swap-out path (so not uncharged yet) can go through
> this (D) path.
> 
Ok, then, lookup_swap_cgroup() can return NULL when swap_cgroup is
cleared, i.e. when PageCgroupUsed(pc) == true.

Thank you for your continuous effort!

Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>



> 
> Thanks,
> Daisuke Nishimura.
> 


* Re: [BUGFIX][PATCH] memcg: charge swapcache to proper memcg
  2009-03-10  1:07 [BUGFIX][PATCH] memcg: charge swapcache to proper memcg Daisuke Nishimura
  2009-03-10  2:35 ` KAMEZAWA Hiroyuki
@ 2009-03-10  4:33 ` KAMEZAWA Hiroyuki
  2009-03-10  4:47   ` Daisuke Nishimura
  2009-03-10 23:08 ` Andrew Morton
  2 siblings, 1 reply; 21+ messages in thread
From: KAMEZAWA Hiroyuki @ 2009-03-10  4:33 UTC (permalink / raw)
  To: Daisuke Nishimura
  Cc: Andrew Morton, linux-mm, Balbir Singh, Li Zefan, Hugh Dickins

On Tue, 10 Mar 2009 10:07:07 +0900
Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> wrote:

> From: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
> 
> memcg_test.txt says at 4.1:
> 
> 	This swap-in is one of the most complicated work. In do_swap_page(),
> 	following events occur when pte is unchanged.
> 
> 	(1) the page (SwapCache) is looked up.
> 	(2) lock_page()
> 	(3) try_charge_swapin()
> 	(4) reuse_swap_page() (may call delete_swap_cache())
> 	(5) commit_charge_swapin()
> 	(6) swap_free().
> 
> 	Considering following situation for example.
> 
> 	(A) The page has not been charged before (2) and reuse_swap_page()
> 	    doesn't call delete_from_swap_cache().
> 	(B) The page has not been charged before (2) and reuse_swap_page()
> 	    calls delete_from_swap_cache().
> 	(C) The page has been charged before (2) and reuse_swap_page() doesn't
> 	    call delete_from_swap_cache().
> 	(D) The page has been charged before (2) and reuse_swap_page() calls
> 	    delete_from_swap_cache().
> 
> 	    memory.usage/memsw.usage changes to this page/swp_entry will be
> 	 Case          (A)      (B)       (C)     (D)
>          Event
>        Before (2)     0/ 1     0/ 1      1/ 1    1/ 1
>           ===========================================
>           (3)        +1/+1    +1/+1     +1/+1   +1/+1
>           (4)          -       0/ 0       -     -1/ 0
>           (5)         0/-1     0/ 0     -1/-1    0/ 0
>           (6)          -       0/-1       -      0/-1
>           ===========================================
>        Result         1/ 1     1/ 1      1/ 1    1/ 1
> 
>        In any cases, charges to this page should be 1/ 1.
> 
> In case of (D), mem_cgroup_try_get_from_swapcache() returns NULL
> (because lookup_swap_cgroup() returns NULL), so "+1/+1" at (3) means
> charges to the memcg("foo") to which the "current" belongs.
> OTOH, "-1/0" at (4) and "0/-1" at (6) means uncharges from the memcg("baa")
> to which the page has been charged.
> 
> So, if the "foo" and "baa" is different(for example because of task move),
> this charge will be moved from "baa" to "foo".
> 
> I think this is an unexpected behavior.
> 
> This patch fixes this by modifying mem_cgroup_try_get_from_swapcache()
> to return the memcg to which the swapcache has been charged if PCG_USED bit
> is set.
> IIUC, checking PCG_USED bit of swapcache is safe under page lock.
> 
> 
> Signed-off-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
> ---
>  mm/memcontrol.c |   15 +++++++++++++--
>  1 files changed, 13 insertions(+), 2 deletions(-)
> 
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 73c51c8..f2efbc0 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -909,13 +909,24 @@ nomem:
>  static struct mem_cgroup *try_get_mem_cgroup_from_swapcache(struct page *page)
>  {
>  	struct mem_cgroup *mem;
> +	struct page_cgroup *pc;
>  	swp_entry_t ent;
>  
> +	VM_BUG_ON(!PageLocked(page));
> +
>  	if (!PageSwapCache(page))
>  		return NULL;
>  
> -	ent.val = page_private(page);
> -	mem = lookup_swap_cgroup(ent);
> +	pc = lookup_page_cgroup(page);
> +	/*
> +	 * Used bit of swapcache is solid under page lock.
> +	 */
> +	if (PageCgroupUsed(pc))
> +		mem = pc->mem_cgroup;

I've already acked this, but how about returning NULL here?

Thanks
-Kame

> +	else {
> +		ent.val = page_private(page);
> +		mem = lookup_swap_cgroup(ent);
> +	}
>  	if (!mem)
>  		return NULL;
>  	if (!css_tryget(&mem->css))
> 


* Re: [BUGFIX][PATCH] memcg: charge swapcache to proper memcg
  2009-03-10  4:33 ` KAMEZAWA Hiroyuki
@ 2009-03-10  4:47   ` Daisuke Nishimura
  2009-03-10  5:04     ` KAMEZAWA Hiroyuki
  0 siblings, 1 reply; 21+ messages in thread
From: Daisuke Nishimura @ 2009-03-10  4:47 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: nishimura, Andrew Morton, linux-mm, Balbir Singh, Li Zefan, Hugh Dickins

On Tue, 10 Mar 2009 13:33:16 +0900, KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> wrote:
> On Tue, 10 Mar 2009 10:07:07 +0900
> Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> wrote:
> 
> > From: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
> > 
> > memcg_test.txt says at 4.1:
> > 
> > 	This swap-in is one of the most complicated work. In do_swap_page(),
> > 	following events occur when pte is unchanged.
> > 
> > 	(1) the page (SwapCache) is looked up.
> > 	(2) lock_page()
> > 	(3) try_charge_swapin()
> > 	(4) reuse_swap_page() (may call delete_swap_cache())
> > 	(5) commit_charge_swapin()
> > 	(6) swap_free().
> > 
> > 	Considering following situation for example.
> > 
> > 	(A) The page has not been charged before (2) and reuse_swap_page()
> > 	    doesn't call delete_from_swap_cache().
> > 	(B) The page has not been charged before (2) and reuse_swap_page()
> > 	    calls delete_from_swap_cache().
> > 	(C) The page has been charged before (2) and reuse_swap_page() doesn't
> > 	    call delete_from_swap_cache().
> > 	(D) The page has been charged before (2) and reuse_swap_page() calls
> > 	    delete_from_swap_cache().
> > 
> > 	    memory.usage/memsw.usage changes to this page/swp_entry will be
> > 	 Case          (A)      (B)       (C)     (D)
> >          Event
> >        Before (2)     0/ 1     0/ 1      1/ 1    1/ 1
> >           ===========================================
> >           (3)        +1/+1    +1/+1     +1/+1   +1/+1
> >           (4)          -       0/ 0       -     -1/ 0
> >           (5)         0/-1     0/ 0     -1/-1    0/ 0
> >           (6)          -       0/-1       -      0/-1
> >           ===========================================
> >        Result         1/ 1     1/ 1      1/ 1    1/ 1
> > 
> >        In any cases, charges to this page should be 1/ 1.
> > 
> > In case of (D), mem_cgroup_try_get_from_swapcache() returns NULL
> > (because lookup_swap_cgroup() returns NULL), so "+1/+1" at (3) means
> > charges to the memcg("foo") to which the "current" belongs.
> > OTOH, "-1/0" at (4) and "0/-1" at (6) means uncharges from the memcg("baa")
> > to which the page has been charged.
> > 
> > So, if the "foo" and "baa" is different(for example because of task move),
> > this charge will be moved from "baa" to "foo".
> > 
> > I think this is an unexpected behavior.
> > 
> > This patch fixes this by modifying mem_cgroup_try_get_from_swapcache()
> > to return the memcg to which the swapcache has been charged if PCG_USED bit
> > is set.
> > IIUC, checking PCG_USED bit of swapcache is safe under page lock.
> > 
> > 
> > Signed-off-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
> > ---
> >  mm/memcontrol.c |   15 +++++++++++++--
> >  1 files changed, 13 insertions(+), 2 deletions(-)
> > 
> > diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> > index 73c51c8..f2efbc0 100644
> > --- a/mm/memcontrol.c
> > +++ b/mm/memcontrol.c
> > @@ -909,13 +909,24 @@ nomem:
> >  static struct mem_cgroup *try_get_mem_cgroup_from_swapcache(struct page *page)
> >  {
> >  	struct mem_cgroup *mem;
> > +	struct page_cgroup *pc;
> >  	swp_entry_t ent;
> >  
> > +	VM_BUG_ON(!PageLocked(page));
> > +
> >  	if (!PageSwapCache(page))
> >  		return NULL;
> >  
> > -	ent.val = page_private(page);
> > -	mem = lookup_swap_cgroup(ent);
> > +	pc = lookup_page_cgroup(page);
> > +	/*
> > +	 * Used bit of swapcache is solid under page lock.
> > +	 */
> > +	if (PageCgroupUsed(pc))
> > +		mem = pc->mem_cgroup;
> 
> I've already acked but how about returning NULL here ?
> 
Returning NULL here would mean try_charge_swapin() charges the "current"
memcg ("foo" in the patch description above).
So it doesn't change the current behavior at all.


Thanks,
Daisuke Nishimura.

> THanks
> -Kame
> 
> > +	else {
> > +		ent.val = page_private(page);
> > +		mem = lookup_swap_cgroup(ent);
> > +	}
> >  	if (!mem)
> >  		return NULL;
> >  	if (!css_tryget(&mem->css))
> > 
> 


* Re: [BUGFIX][PATCH] memcg: charge swapcache to proper memcg
  2009-03-10  4:47   ` Daisuke Nishimura
@ 2009-03-10  5:04     ` KAMEZAWA Hiroyuki
  2009-03-10  6:38       ` Daisuke Nishimura
  0 siblings, 1 reply; 21+ messages in thread
From: KAMEZAWA Hiroyuki @ 2009-03-10  5:04 UTC (permalink / raw)
  To: Daisuke Nishimura
  Cc: Andrew Morton, linux-mm, Balbir Singh, Li Zefan, Hugh Dickins

On Tue, 10 Mar 2009 13:47:57 +0900
Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> wrote:

> On Tue, 10 Mar 2009 13:33:16 +0900, KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> wrote:
> > On Tue, 10 Mar 2009 10:07:07 +0900
> > Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> wrote:
> > 
> > > From: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
> > > 
> > > memcg_test.txt says at 4.1:
> > > 
> > > 	This swap-in is one of the most complicated work. In do_swap_page(),
> > > 	following events occur when pte is unchanged.
> > > 
> > > 	(1) the page (SwapCache) is looked up.
> > > 	(2) lock_page()
> > > 	(3) try_charge_swapin()
> > > 	(4) reuse_swap_page() (may call delete_swap_cache())
> > > 	(5) commit_charge_swapin()
> > > 	(6) swap_free().
> > > 
> > > 	Considering following situation for example.
> > > 
> > > 	(A) The page has not been charged before (2) and reuse_swap_page()
> > > 	    doesn't call delete_from_swap_cache().
> > > 	(B) The page has not been charged before (2) and reuse_swap_page()
> > > 	    calls delete_from_swap_cache().
> > > 	(C) The page has been charged before (2) and reuse_swap_page() doesn't
> > > 	    call delete_from_swap_cache().
> > > 	(D) The page has been charged before (2) and reuse_swap_page() calls
> > > 	    delete_from_swap_cache().
> > > 
> > > 	    memory.usage/memsw.usage changes to this page/swp_entry will be
> > > 	 Case          (A)      (B)       (C)     (D)
> > >          Event
> > >        Before (2)     0/ 1     0/ 1      1/ 1    1/ 1
> > >           ===========================================
> > >           (3)        +1/+1    +1/+1     +1/+1   +1/+1
> > >           (4)          -       0/ 0       -     -1/ 0
> > >           (5)         0/-1     0/ 0     -1/-1    0/ 0
> > >           (6)          -       0/-1       -      0/-1
> > >           ===========================================
> > >        Result         1/ 1     1/ 1      1/ 1    1/ 1
> > > 
> > >        In any cases, charges to this page should be 1/ 1.
> > > 
> > > In case of (D), mem_cgroup_try_get_from_swapcache() returns NULL
> > > (because lookup_swap_cgroup() returns NULL), so "+1/+1" at (3) means
> > > charges to the memcg("foo") to which the "current" belongs.
> > > OTOH, "-1/0" at (4) and "0/-1" at (6) means uncharges from the memcg("baa")
> > > to which the page has been charged.
> > > 
> > > So, if the "foo" and "baa" is different(for example because of task move),
> > > this charge will be moved from "baa" to "foo".
> > > 
> > > I think this is an unexpected behavior.
> > > 
> > > This patch fixes this by modifying mem_cgroup_try_get_from_swapcache()
> > > to return the memcg to which the swapcache has been charged if PCG_USED bit
> > > is set.
> > > IIUC, checking PCG_USED bit of swapcache is safe under page lock.
> > > 
> > > 
> > > Signed-off-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
> > > ---
> > >  mm/memcontrol.c |   15 +++++++++++++--
> > >  1 files changed, 13 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> > > index 73c51c8..f2efbc0 100644
> > > --- a/mm/memcontrol.c
> > > +++ b/mm/memcontrol.c
> > > @@ -909,13 +909,24 @@ nomem:
> > >  static struct mem_cgroup *try_get_mem_cgroup_from_swapcache(struct page *page)
> > >  {
> > >  	struct mem_cgroup *mem;
> > > +	struct page_cgroup *pc;
> > >  	swp_entry_t ent;
> > >  
> > > +	VM_BUG_ON(!PageLocked(page));
> > > +
> > >  	if (!PageSwapCache(page))
> > >  		return NULL;
> > >  
> > > -	ent.val = page_private(page);
> > > -	mem = lookup_swap_cgroup(ent);
> > > +	pc = lookup_page_cgroup(page);
> > > +	/*
> > > +	 * Used bit of swapcache is solid under page lock.
> > > +	 */
> > > +	if (PageCgroupUsed(pc))
> > > +		mem = pc->mem_cgroup;
> > 
> > I've already acked but how about returning NULL here ?
> > 
> Returning NULL here means try_charge_swapin charges "current" memcg("foo"
> in the patch description above).
> So, it doesn't change current behavior at all.
> 

ok, maybe try_charge_swapin() should check the Used bit... and set
ptr = NULL before reaching here.

Can't we move this
+ pc = lookup_page_cgroup(page);
+ if (PageCgroupUsed(pc))

check into try_charge_swapin()? (I think this is safe because the page
is locked.)

By this, we can avoid more work in commit_charge().
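
A rough sketch of what is being suggested here, assuming the 2.6.29-era
mem_cgroup_try_charge_swapin() signature; the helper name below is made up
for illustration and this is not the actual proposal:

#include <linux/page_cgroup.h>

/*
 * Sketch only: skip the charge in try_charge_swapin() when the locked
 * swapcache page is already charged, leaving *ptr == NULL so that the
 * later commit has nothing extra to do.
 */
static bool swapcache_already_charged(struct page *page)
{
	struct page_cgroup *pc;

	if (!PageSwapCache(page))
		return false;
	/* the Used bit of a swapcache page is stable under the page lock */
	pc = lookup_page_cgroup(page);
	return PageCgroupUsed(pc);
}

/*
 * ...and early in mem_cgroup_try_charge_swapin():
 *
 *	if (swapcache_already_charged(page)) {
 *		*ptr = NULL;
 *		return 0;
 *	}
 */

As the reply below shows, though, case (D) would then end up with nothing
charged once reuse_swap_page() deletes the page from swapcache.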

-Kame

> Thanks,
> Daisuke Nishimura.
> 
> > THanks
> > -Kame
> > 
> > > +	else {
> > > +		ent.val = page_private(page);
> > > +		mem = lookup_swap_cgroup(ent);
> > > +	}
> > >  	if (!mem)
> > >  		return NULL;
> > >  	if (!css_tryget(&mem->css))
> > > 
> > 
> 


* Re: [BUGFIX][PATCH] memcg: charge swapcache to proper memcg
  2009-03-10  5:04     ` KAMEZAWA Hiroyuki
@ 2009-03-10  6:38       ` Daisuke Nishimura
  2009-03-10  6:56         ` KAMEZAWA Hiroyuki
  0 siblings, 1 reply; 21+ messages in thread
From: Daisuke Nishimura @ 2009-03-10  6:38 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: nishimura, Andrew Morton, linux-mm, Balbir Singh, Li Zefan, Hugh Dickins

On Tue, 10 Mar 2009 14:04:16 +0900, KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> wrote:
> On Tue, 10 Mar 2009 13:47:57 +0900
> Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> wrote:
> 
> > On Tue, 10 Mar 2009 13:33:16 +0900, KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> wrote:
> > > On Tue, 10 Mar 2009 10:07:07 +0900
> > > Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> wrote:
> > > 
> > > > From: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
> > > > 
> > > > memcg_test.txt says at 4.1:
> > > > 
> > > > 	This swap-in is one of the most complicated work. In do_swap_page(),
> > > > 	following events occur when pte is unchanged.
> > > > 
> > > > 	(1) the page (SwapCache) is looked up.
> > > > 	(2) lock_page()
> > > > 	(3) try_charge_swapin()
> > > > 	(4) reuse_swap_page() (may call delete_swap_cache())
> > > > 	(5) commit_charge_swapin()
> > > > 	(6) swap_free().
> > > > 
> > > > 	Considering following situation for example.
> > > > 
> > > > 	(A) The page has not been charged before (2) and reuse_swap_page()
> > > > 	    doesn't call delete_from_swap_cache().
> > > > 	(B) The page has not been charged before (2) and reuse_swap_page()
> > > > 	    calls delete_from_swap_cache().
> > > > 	(C) The page has been charged before (2) and reuse_swap_page() doesn't
> > > > 	    call delete_from_swap_cache().
> > > > 	(D) The page has been charged before (2) and reuse_swap_page() calls
> > > > 	    delete_from_swap_cache().
> > > > 
> > > > 	    memory.usage/memsw.usage changes to this page/swp_entry will be
> > > > 	 Case          (A)      (B)       (C)     (D)
> > > >          Event
> > > >        Before (2)     0/ 1     0/ 1      1/ 1    1/ 1
> > > >           ===========================================
> > > >           (3)        +1/+1    +1/+1     +1/+1   +1/+1
> > > >           (4)          -       0/ 0       -     -1/ 0
> > > >           (5)         0/-1     0/ 0     -1/-1    0/ 0
> > > >           (6)          -       0/-1       -      0/-1
> > > >           ===========================================
> > > >        Result         1/ 1     1/ 1      1/ 1    1/ 1
> > > > 
> > > >        In any cases, charges to this page should be 1/ 1.
> > > > 
> > > > In case of (D), mem_cgroup_try_get_from_swapcache() returns NULL
> > > > (because lookup_swap_cgroup() returns NULL), so "+1/+1" at (3) means
> > > > charges to the memcg("foo") to which the "current" belongs.
> > > > OTOH, "-1/0" at (4) and "0/-1" at (6) means uncharges from the memcg("baa")
> > > > to which the page has been charged.
> > > > 
> > > > So, if the "foo" and "baa" is different(for example because of task move),
> > > > this charge will be moved from "baa" to "foo".
> > > > 
> > > > I think this is an unexpected behavior.
> > > > 
> > > > This patch fixes this by modifying mem_cgroup_try_get_from_swapcache()
> > > > to return the memcg to which the swapcache has been charged if PCG_USED bit
> > > > is set.
> > > > IIUC, checking PCG_USED bit of swapcache is safe under page lock.
> > > > 
> > > > 
> > > > Signed-off-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
> > > > ---
> > > >  mm/memcontrol.c |   15 +++++++++++++--
> > > >  1 files changed, 13 insertions(+), 2 deletions(-)
> > > > 
> > > > diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> > > > index 73c51c8..f2efbc0 100644
> > > > --- a/mm/memcontrol.c
> > > > +++ b/mm/memcontrol.c
> > > > @@ -909,13 +909,24 @@ nomem:
> > > >  static struct mem_cgroup *try_get_mem_cgroup_from_swapcache(struct page *page)
> > > >  {
> > > >  	struct mem_cgroup *mem;
> > > > +	struct page_cgroup *pc;
> > > >  	swp_entry_t ent;
> > > >  
> > > > +	VM_BUG_ON(!PageLocked(page));
> > > > +
> > > >  	if (!PageSwapCache(page))
> > > >  		return NULL;
> > > >  
> > > > -	ent.val = page_private(page);
> > > > -	mem = lookup_swap_cgroup(ent);
> > > > +	pc = lookup_page_cgroup(page);
> > > > +	/*
> > > > +	 * Used bit of swapcache is solid under page lock.
> > > > +	 */
> > > > +	if (PageCgroupUsed(pc))
> > > > +		mem = pc->mem_cgroup;
> > > 
> > > I've already acked but how about returning NULL here ?
> > > 
> > Returning NULL here means try_charge_swapin charges "current" memcg("foo"
> > in the patch description above).
> > So, it doesn't change current behavior at all.
> > 
> 
> ok, maybe try_charge_swapin() should check Used bit...and set ptr=NULL 
> before reaching here.
> 
> Can't we move this 
> + pc = lookup_page_cgroup(page);
> + if (PageCgroupUsed(pc))
> 
> check to try_charge_swapin() ? (I think this is safe because the page is locked.)
> 
> By this, we can avoid more works in commit_charge().
> 
hmm, I thought the same thing at first, but that would mean:

     Case       (D)
     Event
   Before (2)   1/ 1
      ===============
      (3)       0/ 0
      (4)      -1/ 0
      (5)       0/ 0
      (6)       0/-1
      ===============
   Result       0/ 0

Instead of handling this case properly there, I chose the current
direction (changing only the memcg code).


Thanks,
Daisuke Nishimura.


* Re: [BUGFIX][PATCH] memcg: charge swapcache to proper memcg
  2009-03-10  6:38       ` Daisuke Nishimura
@ 2009-03-10  6:56         ` KAMEZAWA Hiroyuki
  0 siblings, 0 replies; 21+ messages in thread
From: KAMEZAWA Hiroyuki @ 2009-03-10  6:56 UTC (permalink / raw)
  To: Daisuke Nishimura
  Cc: Andrew Morton, linux-mm, Balbir Singh, Li Zefan, Hugh Dickins

On Tue, 10 Mar 2009 15:38:59 +0900
Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> wrote:

> On Tue, 10 Mar 2009 14:04:16 +0900, KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> wrote:
> > On Tue, 10 Mar 2009 13:47:57 +0900
> > Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> wrote:
> > 
> > > On Tue, 10 Mar 2009 13:33:16 +0900, KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> wrote:
> > > > On Tue, 10 Mar 2009 10:07:07 +0900
> > > > Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> wrote:
> > > > 
> > > > > From: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
> > > > > 
> > > > > memcg_test.txt says at 4.1:
> > > > > 
> > > > > 	This swap-in is one of the most complicated work. In do_swap_page(),
> > > > > 	following events occur when pte is unchanged.
> > > > > 
> > > > > 	(1) the page (SwapCache) is looked up.
> > > > > 	(2) lock_page()
> > > > > 	(3) try_charge_swapin()
> > > > > 	(4) reuse_swap_page() (may call delete_swap_cache())
> > > > > 	(5) commit_charge_swapin()
> > > > > 	(6) swap_free().
> > > > > 
> > > > > 	Considering following situation for example.
> > > > > 
> > > > > 	(A) The page has not been charged before (2) and reuse_swap_page()
> > > > > 	    doesn't call delete_from_swap_cache().
> > > > > 	(B) The page has not been charged before (2) and reuse_swap_page()
> > > > > 	    calls delete_from_swap_cache().
> > > > > 	(C) The page has been charged before (2) and reuse_swap_page() doesn't
> > > > > 	    call delete_from_swap_cache().
> > > > > 	(D) The page has been charged before (2) and reuse_swap_page() calls
> > > > > 	    delete_from_swap_cache().
> > > > > 
> > > > > 	    memory.usage/memsw.usage changes to this page/swp_entry will be
> > > > > 	 Case          (A)      (B)       (C)     (D)
> > > > >          Event
> > > > >        Before (2)     0/ 1     0/ 1      1/ 1    1/ 1
> > > > >           ===========================================
> > > > >           (3)        +1/+1    +1/+1     +1/+1   +1/+1
> > > > >           (4)          -       0/ 0       -     -1/ 0
> > > > >           (5)         0/-1     0/ 0     -1/-1    0/ 0
> > > > >           (6)          -       0/-1       -      0/-1
> > > > >           ===========================================
> > > > >        Result         1/ 1     1/ 1      1/ 1    1/ 1
> > > > > 
> > > > >        In any cases, charges to this page should be 1/ 1.
> > > > > 
> > > > > In case of (D), mem_cgroup_try_get_from_swapcache() returns NULL
> > > > > (because lookup_swap_cgroup() returns NULL), so "+1/+1" at (3) means
> > > > > charges to the memcg("foo") to which the "current" belongs.
> > > > > OTOH, "-1/0" at (4) and "0/-1" at (6) means uncharges from the memcg("baa")
> > > > > to which the page has been charged.
> > > > > 
> > > > > So, if the "foo" and "baa" is different(for example because of task move),
> > > > > this charge will be moved from "baa" to "foo".
> > > > > 
> > > > > I think this is an unexpected behavior.
> > > > > 
> > > > > This patch fixes this by modifying mem_cgroup_try_get_from_swapcache()
> > > > > to return the memcg to which the swapcache has been charged if PCG_USED bit
> > > > > is set.
> > > > > IIUC, checking PCG_USED bit of swapcache is safe under page lock.
> > > > > 
> > > > > 
> > > > > Signed-off-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
> > > > > ---
> > > > >  mm/memcontrol.c |   15 +++++++++++++--
> > > > >  1 files changed, 13 insertions(+), 2 deletions(-)
> > > > > 
> > > > > diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> > > > > index 73c51c8..f2efbc0 100644
> > > > > --- a/mm/memcontrol.c
> > > > > +++ b/mm/memcontrol.c
> > > > > @@ -909,13 +909,24 @@ nomem:
> > > > >  static struct mem_cgroup *try_get_mem_cgroup_from_swapcache(struct page *page)
> > > > >  {
> > > > >  	struct mem_cgroup *mem;
> > > > > +	struct page_cgroup *pc;
> > > > >  	swp_entry_t ent;
> > > > >  
> > > > > +	VM_BUG_ON(!PageLocked(page));
> > > > > +
> > > > >  	if (!PageSwapCache(page))
> > > > >  		return NULL;
> > > > >  
> > > > > -	ent.val = page_private(page);
> > > > > -	mem = lookup_swap_cgroup(ent);
> > > > > +	pc = lookup_page_cgroup(page);
> > > > > +	/*
> > > > > +	 * Used bit of swapcache is solid under page lock.
> > > > > +	 */
> > > > > +	if (PageCgroupUsed(pc))
> > > > > +		mem = pc->mem_cgroup;
> > > > 
> > > > I've already acked but how about returning NULL here ?
> > > > 
> > > Returning NULL here means try_charge_swapin charges "current" memcg("foo"
> > > in the patch description above).
> > > So, it doesn't change current behavior at all.
> > > 
> > 
> > ok, maybe try_charge_swapin() should check Used bit...and set ptr=NULL 
> > before reaching here.
> > 
> > Can't we move this 
> > + pc = lookup_page_cgroup(page);
> > + if (PageCgroupUsed(pc))
> > 
> > check to try_charge_swapin() ? (I think this is safe because the page is locked.)
> > 
> > By this, we can avoid more works in commit_charge().
> > 
> hmm, I thought the same thing at first, but this means:
> 
>      Case       (D)
>      Event
>    Before (2)   1/ 1
>       ===============
>       (3)       0/ 0
>       (4)      -1/ 0
>       (5)       0/ 0
>       (6)       0/-1
>       ===============
>    Result       0/ 0
> 
> Instead of handling this case properly, I selected
> the current derection(changed memcg codes only).
> 
> 
Ok, I see. page->mapcount is 0 at reuse_swap_page(), so the above
sequence will happen. Then we have to call commit_charge() anyway.
Thank you for the clarification.

Thanks,
-Kame

> Thanks,
> Daisuke Nishimura.
> 


* Re: [BUGFIX][PATCH] memcg: charge swapcache to proper memcg
  2009-03-10  1:07 [BUGFIX][PATCH] memcg: charge swapcache to proper memcg Daisuke Nishimura
  2009-03-10  2:35 ` KAMEZAWA Hiroyuki
  2009-03-10  4:33 ` KAMEZAWA Hiroyuki
@ 2009-03-10 23:08 ` Andrew Morton
  2009-03-10 23:53   ` KAMEZAWA Hiroyuki
  2 siblings, 1 reply; 21+ messages in thread
From: Andrew Morton @ 2009-03-10 23:08 UTC (permalink / raw)
  To: Daisuke Nishimura; +Cc: linux-mm, balbir, kamezawa.hiroyu, lizf, hugh

On Tue, 10 Mar 2009 10:07:07 +0900
Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> wrote:

> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -909,13 +909,24 @@ nomem:
>  static struct mem_cgroup *try_get_mem_cgroup_from_swapcache(struct page *page)
>  {
>  	struct mem_cgroup *mem;
> +	struct page_cgroup *pc;
>  	swp_entry_t ent;
>  
> +	VM_BUG_ON(!PageLocked(page));
> +
>  	if (!PageSwapCache(page))
>  		return NULL;
>  
> -	ent.val = page_private(page);
> -	mem = lookup_swap_cgroup(ent);
> +	pc = lookup_page_cgroup(page);
> +	/*
> +	 * Used bit of swapcache is solid under page lock.
> +	 */
> +	if (PageCgroupUsed(pc))
> +		mem = pc->mem_cgroup;
> +	else {
> +		ent.val = page_private(page);
> +		mem = lookup_swap_cgroup(ent);
> +	}
>  	if (!mem)
>  		return NULL;
>  	if (!css_tryget(&mem->css))

This patch made rather a mess of
use-css-id-in-swap_cgroup-for-saving-memory-v4.patch.

I temporarily dropped
use-css-id-in-swap_cgroup-for-saving-memory-v4.patch.  Could I have a
fixed version please?

Do we think that this patch
(memcg-charge-swapcache-to-proper-memcg.patch) should be in 2.6.29?


* Re: [BUGFIX][PATCH] memcg: charge swapcache to proper memcg
  2009-03-10 23:08 ` Andrew Morton
@ 2009-03-10 23:53   ` KAMEZAWA Hiroyuki
  2009-03-11  0:43     ` nishimura
  0 siblings, 1 reply; 21+ messages in thread
From: KAMEZAWA Hiroyuki @ 2009-03-10 23:53 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Daisuke Nishimura, linux-mm, balbir, lizf, hugh

On Tue, 10 Mar 2009 16:08:56 -0700
Andrew Morton <akpm@linux-foundation.org> wrote:

> On Tue, 10 Mar 2009 10:07:07 +0900
> Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> wrote:
> 
> > --- a/mm/memcontrol.c
> > +++ b/mm/memcontrol.c
> > @@ -909,13 +909,24 @@ nomem:
> >  static struct mem_cgroup *try_get_mem_cgroup_from_swapcache(struct page *page)
> >  {
> >  	struct mem_cgroup *mem;
> > +	struct page_cgroup *pc;
> >  	swp_entry_t ent;
> >  
> > +	VM_BUG_ON(!PageLocked(page));
> > +
> >  	if (!PageSwapCache(page))
> >  		return NULL;
> >  
> > -	ent.val = page_private(page);
> > -	mem = lookup_swap_cgroup(ent);
> > +	pc = lookup_page_cgroup(page);
> > +	/*
> > +	 * Used bit of swapcache is solid under page lock.
> > +	 */
> > +	if (PageCgroupUsed(pc))
> > +		mem = pc->mem_cgroup;
> > +	else {
> > +		ent.val = page_private(page);
> > +		mem = lookup_swap_cgroup(ent);
> > +	}
> >  	if (!mem)
> >  		return NULL;
> >  	if (!css_tryget(&mem->css))
> 
> This patch made rather a mess of
> use-css-id-in-swap_cgroup-for-saving-memory-v4.patch.
> 
> I temporarily dropped
> use-css-id-in-swap_cgroup-for-saving-memory-v4.patch.  Could I have a
> fixed version please?
Okay.

> 
> Do we think that this patch
> (memcg-charge-swapcache-to-proper-memcg.patch) shouild be in 2.6.29?
> 
please.

Thanks,
-Kame


* Re: [BUGFIX][PATCH] memcg: charge swapcache to proper memcg
  2009-03-10 23:53   ` KAMEZAWA Hiroyuki
@ 2009-03-11  0:43     ` nishimura
  2009-03-11  0:47       ` KAMEZAWA Hiroyuki
  0 siblings, 1 reply; 21+ messages in thread
From: nishimura @ 2009-03-11  0:43 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki, Andrew Morton
  Cc: Daisuke Nishimura, linux-mm, balbir, lizf, hugh

> On Tue, 10 Mar 2009 16:08:56 -0700
> Andrew Morton <akpm@linux-foundation.org> wrote:
> 
>> On Tue, 10 Mar 2009 10:07:07 +0900
>> Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> wrote:
>> 
>> > --- a/mm/memcontrol.c
>> > +++ b/mm/memcontrol.c
>> > @@ -909,13 +909,24 @@ nomem:
>> >  static struct mem_cgroup *try_get_mem_cgroup_from_swapcache(struct page *page)
>> >  {
>> >  	struct mem_cgroup *mem;
>> > +	struct page_cgroup *pc;
>> >  	swp_entry_t ent;
>> >  
>> > +	VM_BUG_ON(!PageLocked(page));
>> > +
>> >  	if (!PageSwapCache(page))
>> >  		return NULL;
>> >  
>> > -	ent.val = page_private(page);
>> > -	mem = lookup_swap_cgroup(ent);
>> > +	pc = lookup_page_cgroup(page);
>> > +	/*
>> > +	 * Used bit of swapcache is solid under page lock.
>> > +	 */
>> > +	if (PageCgroupUsed(pc))
>> > +		mem = pc->mem_cgroup;
>> > +	else {
>> > +		ent.val = page_private(page);
>> > +		mem = lookup_swap_cgroup(ent);
>> > +	}
>> >  	if (!mem)
>> >  		return NULL;
>> >  	if (!css_tryget(&mem->css))
>> 
>> This patch made rather a mess of
>> use-css-id-in-swap_cgroup-for-saving-memory-v4.patch.
>> 
Ah.. I found this bug while testing rc6, so I made this patch
based on rc6 and forgot to rebase it on mmotm.

>> I temporarily dropped
>> use-css-id-in-swap_cgroup-for-saving-memory-v4.patch.  Could I have a
>> fixed version please?
> Okay.
> 
I'm sorry for bothering you.


Daisuke Nishimura.

>> 
>> Do we think that this patch
>> (memcg-charge-swapcache-to-proper-memcg.patch) shouild be in 2.6.29?
>> 
> please.
> 
> Thanks,
> -Kame
> 

* Re: [BUGFIX][PATCH] memcg: charge swapcache to proper memcg
  2009-03-11  0:43     ` nishimura
@ 2009-03-11  0:47       ` KAMEZAWA Hiroyuki
  2009-03-11  3:04         ` [PATCH] use css id in swap cgroup for saving memory v5 KAMEZAWA Hiroyuki
  0 siblings, 1 reply; 21+ messages in thread
From: KAMEZAWA Hiroyuki @ 2009-03-11  0:47 UTC (permalink / raw)
  To: nishimura; +Cc: Andrew Morton, linux-mm, balbir, lizf, hugh

On Wed, 11 Mar 2009 09:43:23 +0900
nishimura@mxp.nes.nec.co.jp wrote:

> >> I temporarily dropped
> >> use-css-id-in-swap_cgroup-for-saving-memory-v4.patch.  Could I have a
> >> fixed version please?
> > Okay.
> > 
> I'm sorry for bothering you.
> 
No problem :) Thank you for digging out the bug.

-Kame


* [PATCH] use css id in swap cgroup for saving memory v5
  2009-03-11  0:47       ` KAMEZAWA Hiroyuki
@ 2009-03-11  3:04         ` KAMEZAWA Hiroyuki
  2009-03-11 11:05           ` Hugh Dickins
  0 siblings, 1 reply; 21+ messages in thread
From: KAMEZAWA Hiroyuki @ 2009-03-11  3:04 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki; +Cc: nishimura, Andrew Morton, linux-mm, balbir, lizf, hugh

This patch is against the mm-of-the-moment snapshot 2009-03-10-16-39;
it is a rework of use-css-id-in-swap_cgroup-for-saving-memory-v4.patch.
Tested on my x86-64 box.

Thanks,
-Kame
==
From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>

Use CSS IDs for the records in swap_cgroup.  With this, on a 64-bit
machine, the size of a swap_cgroup record goes down from 8 bytes to 2 bytes.

This means, with 2GB of swap configured (assuming a 4096-byte page size):

	From: size of swap_cgroup = 2G/4k * 8 = 4 Mbytes.
	To:   size of swap_cgroup = 2G/4k * 2 = 1 Mbytes.

The reduction is large.  Of course, there are trade-offs: the CSS ID lookup
adds overhead to swap-in/swap-out/swap-free.

But in general,
  - swap is a resource users tend to avoid using.
  - If swap is never used, the swap_cgroup area is not used either.
  - Traditional manuals say swap size should be proportional to memory
    size, and machine memory sizes keep increasing.

I think reducing the size of swap_cgroup makes sense.
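
A quick sanity check of the arithmetic as a standalone user-space program;
the two struct definitions mirror the old and new swap_cgroup records from
the page_cgroup.c hunk below, and the sizes assume a 64-bit build:

#include <stdio.h>

/* old record: one mem_cgroup pointer per swap slot (8 bytes on 64-bit) */
struct swap_cgroup_old { void *val; };
/* new record: one CSS ID per swap slot (2 bytes) */
struct swap_cgroup_new { unsigned short id; };

int main(void)
{
	unsigned long slots = (2UL << 30) / 4096;	/* 2GB swap, 4KB pages */

	printf("old: %lu bytes\n",
	       (unsigned long)(slots * sizeof(struct swap_cgroup_old)));
	printf("new: %lu bytes\n",
	       (unsigned long)(slots * sizeof(struct swap_cgroup_new)));
	/* prints 4194304 (4 Mbytes) and 1048576 (1 Mbyte) */
	return 0;
}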

Note:
  - The ID->CSS lookup routine takes no locks; it runs under the RCU
    read side.
  - A memcg can become obsolete at rmdir() but is not freed while a
    refcnt from swap_cgroup is still held.

Changelog v4->v5:
 - reworked on top of memcg-charge-swapcache-to-proper-memcg.patch
Changelog ->v4:
 - fixed not configured case.
 - deleted unnecessary comments.
 - fixed NULL pointer bug.
 - fixed message in dmesg.

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Cc: Balbir Singh <balbir@in.ibm.com>
Cc: Paul Menage <menage@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
 include/linux/page_cgroup.h |   13 ++++----
 mm/memcontrol.c             |   64 +++++++++++++++++++++++++++++++++++++++-----
 mm/page_cgroup.c            |   32 +++++++++-------------
 3 files changed, 78 insertions(+), 31 deletions(-)

Index: mmotm-2.6.29-Mar10/include/linux/page_cgroup.h
===================================================================
--- mmotm-2.6.29-Mar10.orig/include/linux/page_cgroup.h
+++ mmotm-2.6.29-Mar10/include/linux/page_cgroup.h
@@ -91,24 +91,23 @@ static inline void page_cgroup_init(void
 
 #ifdef CONFIG_CGROUP_MEM_RES_CTLR_SWAP
 #include <linux/swap.h>
-extern struct mem_cgroup *
-swap_cgroup_record(swp_entry_t ent, struct mem_cgroup *mem);
-extern struct mem_cgroup *lookup_swap_cgroup(swp_entry_t ent);
+extern unsigned short swap_cgroup_record(swp_entry_t ent, unsigned short id);
+extern unsigned short lookup_swap_cgroup(swp_entry_t ent);
 extern int swap_cgroup_swapon(int type, unsigned long max_pages);
 extern void swap_cgroup_swapoff(int type);
 #else
 #include <linux/swap.h>
 
 static inline
-struct mem_cgroup *swap_cgroup_record(swp_entry_t ent, struct mem_cgroup *mem)
+unsigned short swap_cgroup_record(swp_entry_t ent, unsigned short id)
 {
-	return NULL;
+	return 0;
 }
 
 static inline
-struct mem_cgroup *lookup_swap_cgroup(swp_entry_t ent)
+unsigned short lookup_swap_cgroup(swp_entry_t ent)
 {
-	return NULL;
+	return 0;
 }
 
 static inline int
Index: mmotm-2.6.29-Mar10/mm/memcontrol.c
===================================================================
--- mmotm-2.6.29-Mar10.orig/mm/memcontrol.c
+++ mmotm-2.6.29-Mar10/mm/memcontrol.c
@@ -991,10 +991,31 @@ nomem:
 	return -ENOMEM;
 }
 
+
+/*
+ * A helper function to get mem_cgroup from ID. must be called under
+ * rcu_read_lock(). The caller must check css_is_removed() or some if
+ * it's concern. (dropping refcnt from swap can be called against removed
+ * memcg.)
+ */
+static struct mem_cgroup *mem_cgroup_lookup(unsigned short id)
+{
+	struct cgroup_subsys_state *css;
+
+	/* ID 0 is unused ID */
+	if (!id)
+		return NULL;
+	css = css_lookup(&mem_cgroup_subsys, id);
+	if (!css)
+		return NULL;
+	return container_of(css, struct mem_cgroup, css);
+}
+
 static struct mem_cgroup *try_get_mem_cgroup_from_swapcache(struct page *page)
 {
 	struct mem_cgroup *mem;
 	struct page_cgroup *pc;
+	unsigned short id;
 	swp_entry_t ent;
 
 	VM_BUG_ON(!PageLocked(page));
@@ -1010,7 +1031,12 @@ static struct mem_cgroup *try_get_mem_cg
 		mem = pc->mem_cgroup;
 	else {
 		ent.val = page_private(page);
-		mem = lookup_swap_cgroup(ent);
+		id = lookup_swap_cgroup(ent);
+		rcu_read_lock();
+		mem = mem_cgroup_lookup(id);
+		if (mem && !css_tryget(&mem->css))
+			mem = NULL;
+		rcu_read_unlock();
 	}
 	if (!mem)
 		return NULL;
@@ -1276,12 +1302,22 @@ int mem_cgroup_cache_charge(struct page 
 
 	if (do_swap_account && !ret && PageSwapCache(page)) {
 		swp_entry_t ent = {.val = page_private(page)};
+		unsigned short id;
 		/* avoid double counting */
-		mem = swap_cgroup_record(ent, NULL);
+		id = swap_cgroup_record(ent, 0);
+		rcu_read_lock();
+		mem = mem_cgroup_lookup(id);
 		if (mem) {
+			/*
+			 * We did swap-in. Then, this entry is doubly counted
+			 * both in mem and memsw. We uncharge it, here.
+			 * Recorded ID can be obsolete. We avoid calling
+			 * css_tryget()
+			 */
 			res_counter_uncharge(&mem->memsw, PAGE_SIZE);
 			mem_cgroup_put(mem);
 		}
+		rcu_read_unlock();
 	}
 	return ret;
 }
@@ -1346,13 +1382,21 @@ void mem_cgroup_commit_charge_swapin(str
 	 */
 	if (do_swap_account && PageSwapCache(page)) {
 		swp_entry_t ent = {.val = page_private(page)};
+		unsigned short id;
 		struct mem_cgroup *memcg;
-		memcg = swap_cgroup_record(ent, NULL);
+
+		id = swap_cgroup_record(ent, 0);
+		rcu_read_lock();
+		memcg = mem_cgroup_lookup(id);
 		if (memcg) {
+			/*
+			 * This recorded memcg can be obsolete one. So, avoid
+			 * calling css_tryget
+			 */
 			res_counter_uncharge(&memcg->memsw, PAGE_SIZE);
 			mem_cgroup_put(memcg);
 		}
-
+		rcu_read_unlock();
 	}
 	/* add this page(page_cgroup) to the LRU we want. */
 
@@ -1473,7 +1517,7 @@ void mem_cgroup_uncharge_swapcache(struc
 					MEM_CGROUP_CHARGE_TYPE_SWAPOUT);
 	/* record memcg information */
 	if (do_swap_account && memcg) {
-		swap_cgroup_record(ent, memcg);
+		swap_cgroup_record(ent, css_id(&memcg->css));
 		mem_cgroup_get(memcg);
 	}
 	if (memcg)
@@ -1488,15 +1532,23 @@ void mem_cgroup_uncharge_swapcache(struc
 void mem_cgroup_uncharge_swap(swp_entry_t ent)
 {
 	struct mem_cgroup *memcg;
+	unsigned short id;
 
 	if (!do_swap_account)
 		return;
 
-	memcg = swap_cgroup_record(ent, NULL);
+	id = swap_cgroup_record(ent, 0);
+	rcu_read_lock();
+	memcg = mem_cgroup_lookup(id);
 	if (memcg) {
+		/*
+		 * We uncharge this because swap is freed.
+		 * This memcg can be obsolete one. We avoid calling css_tryget
+		 */
 		res_counter_uncharge(&memcg->memsw, PAGE_SIZE);
 		mem_cgroup_put(memcg);
 	}
+	rcu_read_unlock();
 }
 #endif
 
Index: mmotm-2.6.29-Mar10/mm/page_cgroup.c
===================================================================
--- mmotm-2.6.29-Mar10.orig/mm/page_cgroup.c
+++ mmotm-2.6.29-Mar10/mm/page_cgroup.c
@@ -285,12 +285,8 @@ struct swap_cgroup_ctrl {
 
 struct swap_cgroup_ctrl swap_cgroup_ctrl[MAX_SWAPFILES];
 
-/*
- * This 8bytes seems big..maybe we can reduce this when we can use "id" for
- * cgroup rather than pointer.
- */
 struct swap_cgroup {
-	struct mem_cgroup	*val;
+	unsigned short		id;
 };
 #define SC_PER_PAGE	(PAGE_SIZE/sizeof(struct swap_cgroup))
 #define SC_POS_MASK	(SC_PER_PAGE - 1)
@@ -342,10 +338,10 @@ not_enough_page:
  * @ent: swap entry to be recorded into
  * @mem: mem_cgroup to be recorded
  *
- * Returns old value at success, NULL at failure.
- * (Of course, old value can be NULL.)
+ * Returns old value at success, 0 at failure.
+ * (Of course, old value can be 0.)
  */
-struct mem_cgroup *swap_cgroup_record(swp_entry_t ent, struct mem_cgroup *mem)
+unsigned short swap_cgroup_record(swp_entry_t ent, unsigned short id)
 {
 	int type = swp_type(ent);
 	unsigned long offset = swp_offset(ent);
@@ -354,18 +350,18 @@ struct mem_cgroup *swap_cgroup_record(sw
 	struct swap_cgroup_ctrl *ctrl;
 	struct page *mappage;
 	struct swap_cgroup *sc;
-	struct mem_cgroup *old;
+	unsigned short old;
 
 	if (!do_swap_account)
-		return NULL;
+		return 0;
 
 	ctrl = &swap_cgroup_ctrl[type];
 
 	mappage = ctrl->map[idx];
 	sc = page_address(mappage);
 	sc += pos;
-	old = sc->val;
-	sc->val = mem;
+	old = sc->id;
+	sc->id = id;
 
 	return old;
 }
@@ -374,9 +370,9 @@ struct mem_cgroup *swap_cgroup_record(sw
  * lookup_swap_cgroup - lookup mem_cgroup tied to swap entry
  * @ent: swap entry to be looked up.
  *
- * Returns pointer to mem_cgroup at success. NULL at failure.
+ * Returns CSS ID of mem_cgroup at success. 0 at failure. (0 is invalid ID)
  */
-struct mem_cgroup *lookup_swap_cgroup(swp_entry_t ent)
+unsigned short lookup_swap_cgroup(swp_entry_t ent)
 {
 	int type = swp_type(ent);
 	unsigned long offset = swp_offset(ent);
@@ -385,16 +381,16 @@ struct mem_cgroup *lookup_swap_cgroup(sw
 	struct swap_cgroup_ctrl *ctrl;
 	struct page *mappage;
 	struct swap_cgroup *sc;
-	struct mem_cgroup *ret;
+	unsigned short ret;
 
 	if (!do_swap_account)
-		return NULL;
+		return 0;
 
 	ctrl = &swap_cgroup_ctrl[type];
 	mappage = ctrl->map[idx];
 	sc = page_address(mappage);
 	sc += pos;
-	ret = sc->val;
+	ret = sc->id;
 	return ret;
 }
 
@@ -432,7 +428,7 @@ int swap_cgroup_swapon(int type, unsigne
 
 	printk(KERN_INFO
 		"swap_cgroup: uses %ld bytes of vmalloc for pointer array space"
-		" and %ld bytes to hold mem_cgroup pointers on swap\n",
+		" and %ld bytes to hold mem_cgroup information per swap ents\n",
 		array_size, length * PAGE_SIZE);
 	printk(KERN_INFO
 	"swap_cgroup can be disabled by noswapaccount boot option.\n");

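
To summarize the round trip the patch above sets up: the owning memcg is
recorded by CSS ID at swap-out time and resolved back under RCU at
swap-in/swap-free time. The helper names below are invented for
illustration; the calls inside them (swap_cgroup_record(), css_id(),
mem_cgroup_lookup(), mem_cgroup_get()/mem_cgroup_put()) are the ones
visible in the diff, with the do_swap_account checks and error handling
omitted.

/* swap-out (mem_cgroup_uncharge_swapcache): remember the owner by CSS ID */
static void record_swap_owner(swp_entry_t ent, struct mem_cgroup *memcg)
{
	swap_cgroup_record(ent, css_id(&memcg->css));
	mem_cgroup_get(memcg);	/* pin the memcg while the record exists */
}

/* swap-free (mem_cgroup_uncharge_swap): resolve the ID, drop the charge */
static void drop_swap_owner(swp_entry_t ent)
{
	unsigned short id = swap_cgroup_record(ent, 0);	/* read and clear */
	struct mem_cgroup *memcg;

	rcu_read_lock();
	memcg = mem_cgroup_lookup(id);	/* NULL for id 0 or a removed css */
	if (memcg) {
		/* safe without css_tryget(): mem_cgroup_get() pinned it */
		res_counter_uncharge(&memcg->memsw, PAGE_SIZE);
		mem_cgroup_put(memcg);
	}
	rcu_read_unlock();
}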

* Re: [PATCH] use css id in swap cgroup for saving memory v5
  2009-03-11  3:04         ` [PATCH] use css id in swap cgroup for saving memory v5 KAMEZAWA Hiroyuki
@ 2009-03-11 11:05           ` Hugh Dickins
  2009-03-11 17:16             ` Balbir Singh
  2009-03-11 23:46             ` KAMEZAWA Hiroyuki
  0 siblings, 2 replies; 21+ messages in thread
From: Hugh Dickins @ 2009-03-11 11:05 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki; +Cc: nishimura, Andrew Morton, linux-mm, balbir, lizf

Sorry, you'd prefer my input on the rest, which I've not studied, but ...

On Wed, 11 Mar 2009, KAMEZAWA Hiroyuki wrote:
...
>  
> @@ -432,7 +428,7 @@ int swap_cgroup_swapon(int type, unsigne
>  
>  	printk(KERN_INFO
>  		"swap_cgroup: uses %ld bytes of vmalloc for pointer array space"
> -		" and %ld bytes to hold mem_cgroup pointers on swap\n",
> +		" and %ld bytes to hold mem_cgroup information per swap ents\n",
>  		array_size, length * PAGE_SIZE);
>  	printk(KERN_INFO
>  	"swap_cgroup can be disabled by noswapaccount boot option.\n");

... I do get very irritated by all the screenspace these messages take
up every time I swapon.  I can see that you're following a page_cgroup
precedent, one which never bothered me because it got buried in dmesg;
and most other people wouldn't be doing swapon very often, and wouldn't
be logging to a visible screen ... but is there any chance of putting an
approximation to this info in the CGROUP_MEM_RES_CTRL_SWAP Kconfig help
text and removing these runtime messages?  How do other people feel?

I'm also disappointed that we invented such a tortuously generic boot
option as "cgroup_disable=memory", then departed from it when the very
first extension "noswapaccount" was required.  "cgroup_disable=swap"?
Probably too late.

Hugh


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] use css id in swap cgroup for saving memory v5
  2009-03-11 11:05           ` Hugh Dickins
@ 2009-03-11 17:16             ` Balbir Singh
  2009-03-11 23:46             ` KAMEZAWA Hiroyuki
  1 sibling, 0 replies; 21+ messages in thread
From: Balbir Singh @ 2009-03-11 17:16 UTC (permalink / raw)
  To: Hugh Dickins; +Cc: KAMEZAWA Hiroyuki, nishimura, Andrew Morton, linux-mm, lizf

* Hugh Dickins <hugh@veritas.com> [2009-03-11 11:05:55]:

> Sorry, you'd prefer my input on the rest, which I've not studied, but ...
> 
> On Wed, 11 Mar 2009, KAMEZAWA Hiroyuki wrote:
> ...
> >  
> > @@ -432,7 +428,7 @@ int swap_cgroup_swapon(int type, unsigne
> >  
> >  	printk(KERN_INFO
> >  		"swap_cgroup: uses %ld bytes of vmalloc for pointer array space"
> > -		" and %ld bytes to hold mem_cgroup pointers on swap\n",
> > +		" and %ld bytes to hold mem_cgroup information per swap ents\n",
> >  		array_size, length * PAGE_SIZE);
> >  	printk(KERN_INFO
> >  	"swap_cgroup can be disabled by noswapaccount boot option.\n");
> 
> ... I do get very irritated by all the screenspace these messages take
> up every time I swapon.  I can see that you're following a page_cgroup
> precedent, one which never bothered me because it got buried in dmesg;
> and most other people wouldn't be doing swapon very often, and wouldn't
> be logging to a visible screen ... but is there any chance of putting an
> approximation to this info in the CGROUP_MEM_RES_CTRL_SWAP Kconfig help
> text and removing these runtime messages?  How do other people feel?
>

We could just print out the numbers and point to the Documentation for
the user to look at.
 
> I'm also disappointed that we invented such a tortuously generic boot
> option as "cgroup_disable=memory", then departed from it when the very
> first extension "noswapaccount" was required.  "cgroup_disable=swap"?
> Probably too late.
>

Good suggestion and it makes things more consistent for the end user.
It might be too early to start obsoleting noswapaccount?

-- 
	Balbir


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] use css id in swap cgroup for saving memory v5
  2009-03-11 11:05           ` Hugh Dickins
  2009-03-11 17:16             ` Balbir Singh
@ 2009-03-11 23:46             ` KAMEZAWA Hiroyuki
  2009-03-11 23:50               ` KAMEZAWA Hiroyuki
  2009-03-16 22:25               ` Hugh Dickins
  1 sibling, 2 replies; 21+ messages in thread
From: KAMEZAWA Hiroyuki @ 2009-03-11 23:46 UTC (permalink / raw)
  To: Hugh Dickins; +Cc: nishimura, Andrew Morton, linux-mm, balbir, lizf

On Wed, 11 Mar 2009 11:05:55 +0000 (GMT)
Hugh Dickins <hugh@veritas.com> wrote:

> Sorry, you'd prefer my input on the rest, which I've not studied, but ...
> 
> On Wed, 11 Mar 2009, KAMEZAWA Hiroyuki wrote:
> ...
> >  
> > @@ -432,7 +428,7 @@ int swap_cgroup_swapon(int type, unsigne
> >  
> >  	printk(KERN_INFO
> >  		"swap_cgroup: uses %ld bytes of vmalloc for pointer array space"
> > -		" and %ld bytes to hold mem_cgroup pointers on swap\n",
> > +		" and %ld bytes to hold mem_cgroup information per swap ents\n",
> >  		array_size, length * PAGE_SIZE);
> >  	printk(KERN_INFO
> >  	"swap_cgroup can be disabled by noswapaccount boot option.\n");
> 
> ... I do get very irritated by all the screenspace these messages take
> up every time I swapon.  I can see that you're following a page_cgroup
> precedent, one which never bothered me because it got buried in dmesg;
> and most other people wouldn't be doing swapon very often, and wouldn't
> be logging to a visible screen ... but is there any chance of putting an
> approximation to this info in the CGROUP_MEM_RES_CTRL_SWAP Kconfig help
> text and removing these runtime messages?  How do other people feel?
> 
OK, I will remove this (in another patch).

> I'm also disappointed that we invented such a tortuously generic boot
> option as "cgroup_disable=memory", then departed from it when the very
> first extension "noswapaccount" was required.  "cgroup_disable=swap"?
> Probably too late.
> 
Hmm, cgroup_disable=memory is an option to disable the memory cgroup, and "memory"
is the subsystem name of that cgroup. But "swap" isn't a subsystem name.
Is just removing the "noswapaccount" option ok? Anyway, there is the config option.
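
For context, a rough sketch of how the two options differ on the implementation
side: "cgroup_disable=" is parsed generically by the cgroup core, while
"noswapaccount" is just an early __setup() hook in mm/memcontrol.c, roughly
along these lines (a sketch from memory; the variable name may differ):

	#include <linux/init.h>

	/* sketch of the existing noswapaccount hook, not new code */
	static int really_do_swap_account __initdata = 1;

	static int __init disable_swap_account(char *s)
	{
		really_do_swap_account = 0;
		return 1;
	}
	__setup("noswapaccount", disable_swap_account);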

Thanks,
-Kame


> Hugh


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] use css id in swap cgroup for saving memory v5
  2009-03-11 23:46             ` KAMEZAWA Hiroyuki
@ 2009-03-11 23:50               ` KAMEZAWA Hiroyuki
  2009-03-16 22:25               ` Hugh Dickins
  1 sibling, 0 replies; 21+ messages in thread
From: KAMEZAWA Hiroyuki @ 2009-03-11 23:50 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: Hugh Dickins, nishimura, Andrew Morton, linux-mm, balbir, lizf

On Thu, 12 Mar 2009 08:46:23 +0900
KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> wrote:

> On Wed, 11 Mar 2009 11:05:55 +0000 (GMT)
> Hugh Dickins <hugh@veritas.com> wrote:
> 
> > Sorry, you'd prefer my input on the rest, which I've not studied, but ...
> > 
> > On Wed, 11 Mar 2009, KAMEZAWA Hiroyuki wrote:
> > ...
> > >  
> > > @@ -432,7 +428,7 @@ int swap_cgroup_swapon(int type, unsigne
> > >  
> > >  	printk(KERN_INFO
> > >  		"swap_cgroup: uses %ld bytes of vmalloc for pointer array space"
> > > -		" and %ld bytes to hold mem_cgroup pointers on swap\n",
> > > +		" and %ld bytes to hold mem_cgroup information per swap ents\n",
> > >  		array_size, length * PAGE_SIZE);
> > >  	printk(KERN_INFO
> > >  	"swap_cgroup can be disabled by noswapaccount boot option.\n");
> > 
> > ... I do get very irritated by all the screenspace these messages take
> > up every time I swapon.  I can see that you're following a page_cgroup
> > precedent, one which never bothered me because it got buried in dmesg;
> > and most other people wouldn't be doing swapon very often, and wouldn't
> > be logging to a visible screen ... but is there any chance of putting an
> > approximation to this info in the CGROUP_MEM_RES_CTRL_SWAP Kconfig help
> > text and removing these runtime messages?  How do other people feel?
> > 
> OK, I will remove this (in another patch).
> 
> > I'm also disappointed that we invented such a tortuously generic boot
> > option as "cgroup_disable=memory", then departed from it when the very
> > first extension "noswapaccount" was required.  "cgroup_disable=swap"?
> > Probably too late.
> > 
> Hmm, cgroup_disable=memory is an option to disable the memory cgroup, and "memory"
> is the subsystem name of that cgroup. But "swap" isn't a subsystem name.
> Is just removing the "noswapaccount" option ok? Anyway, there is the config option.

Ah.. after this patch, it's 2 bytes per swap entry. That doubles the current
memory usage for swap management.
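
To spell the numbers out, a back-of-envelope check assuming 4096-byte swap
pages (plain userspace C, just for the arithmetic):

	#include <stdio.h>

	int main(void)
	{
		unsigned long entries = (1UL << 30) / 4096; /* swap entries in 1GB */

		/* before: 8-byte mem_cgroup pointer (64bit); after: 2-byte CSS ID */
		printf("entries per 1GB of swap: %lu\n", entries);      /* 262144 */
		printf("before: %luKB, after: %luKB\n",
		       entries * 8 / 1024, entries * 2 / 1024);         /* 2048 vs 512 */
		return 0;
	}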

Thanks,
-Kame

> 
> Thanks,
> -Kame
> 
> 
> > Hugh
> 


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] use css id in swap cgroup for saving memory v5
  2009-03-11 23:46             ` KAMEZAWA Hiroyuki
  2009-03-11 23:50               ` KAMEZAWA Hiroyuki
@ 2009-03-16 22:25               ` Hugh Dickins
  2009-03-19  0:44                   ` KAMEZAWA Hiroyuki
  1 sibling, 1 reply; 21+ messages in thread
From: Hugh Dickins @ 2009-03-16 22:25 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki; +Cc: nishimura, Andrew Morton, linux-mm, balbir, lizf

On Thu, 12 Mar 2009, KAMEZAWA Hiroyuki wrote:
> On Wed, 11 Mar 2009 11:05:55 +0000 (GMT)
> Hugh Dickins <hugh@veritas.com> wrote:
> > > @@ -432,7 +428,7 @@ int swap_cgroup_swapon(int type, unsigne
> > >  
> > >  	printk(KERN_INFO
> > >  		"swap_cgroup: uses %ld bytes of vmalloc for pointer array space"
> > > -		" and %ld bytes to hold mem_cgroup pointers on swap\n",
> > > +		" and %ld bytes to hold mem_cgroup information per swap ents\n",
> > >  		array_size, length * PAGE_SIZE);
> > >  	printk(KERN_INFO
> > >  	"swap_cgroup can be disabled by noswapaccount boot option.\n");
> > 
> > ... I do get very irritated by all the screenspace these messages take
> > up every time I swapon.  I can see that you're following a page_cgroup
> > precedent, one which never bothered me because it got buried in dmesg;
> > and most other people wouldn't be doing swapon very often, and wouldn't
> > be logging to a visible screen ... but is there any chance of putting an
> > approximation to this info in the CGROUP_MEM_RES_CTRL_SWAP Kconfig help
> > text and removing these runtime messages?  How do other people feel?
> > 
> OK, I will remove this (in another patch).

Thanks!

> 
> > I'm also disappointed that we invented such a tortuously generic boot
> > option as "cgroup_disable=memory", then departed from it when the very
> > first extension "noswapaccount" was required.  "cgroup_disable=swap"?
> > Probably too late.
> > 
> Hmm, cgroup_disable=memory is an option to disable the memory cgroup, and "memory"
> is the subsystem name of that cgroup. But "swap" isn't a subsystem name.

Yes, "swap" was always going to be a more awkward case than a cgroup.

> Is just removing the "noswapaccount" option ok? Anyway, there is the config option.

Removing "noswapaccount" would be okay by me, but I don't think you
should remove it without wider agreement of mem_cgroup folks.

Hugh


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH] memcg remove redundant message at swapon
  2009-03-16 22:25               ` Hugh Dickins
@ 2009-03-19  0:44                   ` KAMEZAWA Hiroyuki
  0 siblings, 0 replies; 21+ messages in thread
From: KAMEZAWA Hiroyuki @ 2009-03-19  0:44 UTC (permalink / raw)
  To: Hugh Dickins
  Cc: nishimura, Andrew Morton, linux-mm, balbir, lizf, linux-kernel

This is based on this thread http://marc.info/?l=linux-mm&m=123724233418715&w=2
against the mmotm.
=
From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>

It has been pointed out that swap_cgroup's message at swapon() is redundant,
because:
  * It can be calculated very easily if all the necessary information is
    written in the Kconfig help text.
  * There is no need to annoy people at every swapon().

From another point of view, memory usage per swp_entry is now reduced to
2 bytes from 8 bytes (on 64-bit), which I think is reasonably small.

Reported-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
---
Index: mmotm-2.6.29-Mar11/mm/page_cgroup.c
===================================================================
--- mmotm-2.6.29-Mar11.orig/mm/page_cgroup.c
+++ mmotm-2.6.29-Mar11/mm/page_cgroup.c
@@ -426,13 +426,6 @@ int swap_cgroup_swapon(int type, unsigne
 	}
 	mutex_unlock(&swap_cgroup_mutex);
 
-	printk(KERN_INFO
-		"swap_cgroup: uses %ld bytes of vmalloc for pointer array space"
-		" and %ld bytes to hold mem_cgroup information per swap ents\n",
-		array_size, length * PAGE_SIZE);
-	printk(KERN_INFO
-	"swap_cgroup can be disabled by noswapaccount boot option.\n");
-
 	return 0;
 nomem:
 	printk(KERN_INFO "couldn't allocate enough memory for swap_cgroup.\n");
Index: mmotm-2.6.29-Mar11/init/Kconfig
===================================================================
--- mmotm-2.6.29-Mar11.orig/init/Kconfig
+++ mmotm-2.6.29-Mar11/init/Kconfig
@@ -603,6 +603,8 @@ config CGROUP_MEM_RES_CTLR_SWAP
 	  is disabled by boot option, this will be automatically disabled and
 	  there will be no overhead from this. Even when you set this config=y,
 	  if boot option "noswapaccount" is set, swap will not be accounted.
+	  Now, memory usage of swap_cgroup is 2 bytes per entry. If the swap
+	  page size is 4096 bytes, that is 512KB per 1GB of swap.
 
 endif # CGROUPS
 


^ permalink raw reply	[flat|nested] 21+ messages in thread


end of thread, other threads:[~2009-03-19  0:46 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-03-10  1:07 [BUGFIX][PATCH] memcg: charge swapcache to proper memcg Daisuke Nishimura
2009-03-10  2:35 ` KAMEZAWA Hiroyuki
2009-03-10  3:18   ` Daisuke Nishimura
2009-03-10  3:42     ` KAMEZAWA Hiroyuki
2009-03-10  4:33 ` KAMEZAWA Hiroyuki
2009-03-10  4:47   ` Daisuke Nishimura
2009-03-10  5:04     ` KAMEZAWA Hiroyuki
2009-03-10  6:38       ` Daisuke Nishimura
2009-03-10  6:56         ` KAMEZAWA Hiroyuki
2009-03-10 23:08 ` Andrew Morton
2009-03-10 23:53   ` KAMEZAWA Hiroyuki
2009-03-11  0:43     ` nishimura
2009-03-11  0:47       ` KAMEZAWA Hiroyuki
2009-03-11  3:04         ` [PATCH] use css id in swap cgroup for saving memory v5 KAMEZAWA Hiroyuki
2009-03-11 11:05           ` Hugh Dickins
2009-03-11 17:16             ` Balbir Singh
2009-03-11 23:46             ` KAMEZAWA Hiroyuki
2009-03-11 23:50               ` KAMEZAWA Hiroyuki
2009-03-16 22:25               ` Hugh Dickins
2009-03-19  0:44                 ` [PATCH] memcg remove redundant message at swapon KAMEZAWA Hiroyuki
