From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
To: Hugh Dickins <hughd@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Konstantin Khlebnikov <khlebnikov@openvz.org>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Ying Han <yinghan@google.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 8/10] mm/memcg: nest lru_lock inside page_cgroup lock
Date: Tue, 21 Feb 2012 18:48:29 +0900
Message-ID: <20120221184829.78d523a8.kamezawa.hiroyu@jp.fujitsu.com>
In-Reply-To: <alpine.LSU.2.00.1202201535460.23274@eggly.anvils>

On Mon, 20 Feb 2012 15:36:55 -0800 (PST)
Hugh Dickins <hughd@google.com> wrote:

> Cut back on some of the overhead we've added, particularly the lruvec
> locking added to every __mem_cgroup_uncharge_common(), and the page
> cgroup locking in mem_cgroup_reset_uncharged_to_root().
> 
> Our hands were tied by the lock ordering (page cgroup inside lruvec)
> defined by __mem_cgroup_commit_charge_lrucare().  There is no strong
> reason for why that nesting needs to be one way or the other, and if
> we invert it, then some optimizations become possible.
> 
> So delete __mem_cgroup_commit_charge_lrucare(), passing a bool lrucare
> to __mem_cgroup_commit_charge() instead, using page_lock_lruvec() there
> inside lock_page_cgroup() in the lrucare case.  (I'd prefer to work it
> out internally, than rely upon an lrucare argument: but that is hard -
> certainly PageLRU is not enough, racing with pages on pagevec about to
> be flushed to lru.)  Use page_relock_lruvec() after setting mem_cgroup,
> before adding to the appropriate new lruvec: so that (if lock depends
> on memcg) old lock is held across change in ownership while off lru.
> 
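
So in the lrucare case the ordering becomes (condensing the __mem_cgroup_commit_charge() hunk below; error paths and the non-lrucare branches are omitted, so this is only an outline, not compilable code):

	lock_page_cgroup(pc);                   /* page_cgroup lock taken first */
	lruvec = page_lock_lruvec(page);        /* lru_lock now nests inside it */
	/* if PageLRU: ClearPageLRU, del_page_from_lru_list from the old lruvec */
	pc->mem_cgroup = memcg;                 /* ownership changes while the page is off lru */
	SetPageCgroupUsed(pc);
	page_relock_lruvec(page, &lruvec);      /* old lock held across the change, then swap to new */
	/* SetPageLRU, add_page_to_lru_list onto the new memcg's lruvec */
	unlock_lruvec(lruvec);
	unlock_page_cgroup(pc);
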
> Delete the lruvec locking on entry to __mem_cgroup_uncharge_common();
> but if the page being uncharged is not on lru, then we do need to
> reset its ownership, and must dance very carefully with mem_cgroup_
> reset_uncharged_to_root(), to make sure that when there's a race
> between uncharging and removing from lru, one side or the other
> will see it - smp_mb__after_clear_bit() at both ends.
> 
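That dance, condensed from the hunks below into its two sides (the pc->mem_cgroup != root_mem_cgroup checks and the conditional lru locking omitted):

	/* __mem_cgroup_uncharge_common() */      /* isolation path, then                 */
	                                          /* mem_cgroup_reset_uncharged_to_root() */
	ClearPageCgroupUsed(pc);                  ClearPageLRU(page);     /* by the caller */
	smp_mb__after_clear_bit();                smp_mb__after_clear_bit();
	if (!PageLRU(page))                       if (!PageCgroupUsed(pc))
	        pc->mem_cgroup = root_mem_cgroup;         pc->mem_cgroup = root_mem_cgroup;

Each side clears its bit, issues the barrier, then tests the other side's bit, so whichever runs second is guaranteed to see the other's clear: the reset to root cannot be missed by both.
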

> Avoid overhead of calls to mem_cgroup_reset_uncharged_to_root() from
> release_pages() and __page_cache_release(), by doing its work inside
> page_relock_lruvec() when the page_count is 0 i.e. the page is frozen
> from other references and about to be freed.  That was not possible
> with the old lock ordering, since __mem_cgroup_uncharge_common()'s
> lock then changed ownership too soon.
> 
> Signed-off-by: Hugh Dickins <hughd@google.com>
> ---
>  mm/memcontrol.c |  142 ++++++++++++++++++++++++----------------------
>  mm/swap.c       |    2 
>  2 files changed, 75 insertions(+), 69 deletions(-)
> 
> --- mmotm.orig/mm/memcontrol.c	2012-02-18 11:57:55.551524898 -0800
> +++ mmotm/mm/memcontrol.c	2012-02-18 11:58:02.451525062 -0800
> @@ -1059,6 +1059,14 @@ void page_relock_lruvec(struct page *pag
>  		 */
>  		if (unlikely(!memcg))
>  			memcg = pc->mem_cgroup = root_mem_cgroup;
> +		/*
> +		 * We must reset pc->mem_cgroup back to root before freeing
> +		 * a page: avoid additional callouts from hot paths by doing
> +		 * it here when we see the page is frozen (can safely be done
> +		 * before taking lru_lock because the page is frozen).
> +		 */
> +		if (memcg != root_mem_cgroup && !page_count(page))
> +			pc->mem_cgroup = root_mem_cgroup;
>  		mz = page_cgroup_zoneinfo(memcg, page);
>  		lruvec = &mz->lruvec;
>  	}
> @@ -1083,23 +1091,20 @@ void mem_cgroup_reset_uncharged_to_root(
>  		return;
>  
>  	VM_BUG_ON(PageLRU(page));
> +	/*
> +	 * Caller just did ClearPageLRU():
> +	 * make sure that __mem_cgroup_uncharge_common()
> +	 * can see that before we test PageCgroupUsed(pc).
> +	 */
> +	smp_mb__after_clear_bit();
>  
>  	/*
>  	 * Once an uncharged page is isolated from the mem_cgroup's lru,
>  	 * it no longer protects that mem_cgroup from rmdir: reset to root.
> -	 *
> -	 * __page_cache_release() and release_pages() may be called at
> -	 * interrupt time: we cannot lock_page_cgroup() then (we might
> -	 * have interrupted a section with page_cgroup already locked),
> -	 * nor do we need to since the page is frozen and about to be freed.
>  	 */
>  	pc = lookup_page_cgroup(page);
> -	if (page_count(page))
> -		lock_page_cgroup(pc);
>  	if (!PageCgroupUsed(pc) && pc->mem_cgroup != root_mem_cgroup)
>  		pc->mem_cgroup = root_mem_cgroup;
> -	if (page_count(page))
> -		unlock_page_cgroup(pc);
>  }
>  
>  /**
> @@ -2422,9 +2427,11 @@ static void __mem_cgroup_commit_charge(s
>  				       struct page *page,
>  				       unsigned int nr_pages,
>  				       struct page_cgroup *pc,
> -				       enum charge_type ctype)
> +				       enum charge_type ctype,
> +				       bool lrucare)
>  {
> -	bool anon;
> +	struct lruvec *lruvec;
> +	bool was_on_lru = false;
>  
>  	lock_page_cgroup(pc);
>  	if (unlikely(PageCgroupUsed(pc))) {
> @@ -2433,28 +2440,41 @@ static void __mem_cgroup_commit_charge(s
>  		return;
>  	}
>  	/*
> -	 * we don't need page_cgroup_lock about tail pages, becase they are not
> -	 * accessed by any other context at this point.
> +	 * We don't need lock_page_cgroup on tail pages, because they are not
> +	 * accessible to any other context at this point.
>  	 */
> -	pc->mem_cgroup = memcg;
> +
>  	/*
> -	 * We access a page_cgroup asynchronously without lock_page_cgroup().
> -	 * Especially when a page_cgroup is taken from a page, pc->mem_cgroup
> -	 * is accessed after testing USED bit. To make pc->mem_cgroup visible
> -	 * before USED bit, we need memory barrier here.
> -	 * See mem_cgroup_add_lru_list(), etc.
> - 	 */
> -	smp_wmb();
> +	 * In some cases, SwapCache and FUSE(splice_buf->radixtree), the page
> +	 * may already be on some other page_cgroup's LRU.  Take care of it.
> +	 */
> +	if (lrucare) {
> +		lruvec = page_lock_lruvec(page);
> +		if (PageLRU(page)) {
> +			ClearPageLRU(page);
> +			del_page_from_lru_list(page, lruvec, page_lru(page));
> +			was_on_lru = true;
> +		}
> +	}
>  
> +	pc->mem_cgroup = memcg;
>  	SetPageCgroupUsed(pc);
> -	if (ctype == MEM_CGROUP_CHARGE_TYPE_MAPPED)
> -		anon = true;
> -	else
> -		anon = false;
>  
> -	mem_cgroup_charge_statistics(memcg, anon, nr_pages);
> +	if (lrucare) {
> +		if (was_on_lru) {
> +			page_relock_lruvec(page, &lruvec);
> +			if (!PageLRU(page)) {
> +				SetPageLRU(page);
> +				add_page_to_lru_list(page, lruvec, page_lru(page));
> +			}
> +		}
> +		unlock_lruvec(lruvec);
> +	}
> +
> +	mem_cgroup_charge_statistics(memcg,
> +			ctype == MEM_CGROUP_CHARGE_TYPE_MAPPED, nr_pages);
>  	unlock_page_cgroup(pc);
> -	WARN_ON_ONCE(PageLRU(page));
> +
>  	/*
>  	 * "charge_statistics" updated event counter. Then, check it.
>  	 * Insert ancestor (and ancestor's ancestors), to softlimit RB-tree.
> @@ -2652,7 +2672,7 @@ static int mem_cgroup_charge_common(stru
>  	ret = __mem_cgroup_try_charge(mm, gfp_mask, nr_pages, &memcg, oom);
>  	if (ret == -ENOMEM)
>  		return ret;
> -	__mem_cgroup_commit_charge(memcg, page, nr_pages, pc, ctype);
> +	__mem_cgroup_commit_charge(memcg, page, nr_pages, pc, ctype, false);
>  	return 0;
>  }
>  
> @@ -2672,34 +2692,6 @@ static void
>  __mem_cgroup_commit_charge_swapin(struct page *page, struct mem_cgroup *ptr,
>  					enum charge_type ctype);
>  
> -static void
> -__mem_cgroup_commit_charge_lrucare(struct page *page, struct mem_cgroup *memcg,
> -					enum charge_type ctype)
> -{
> -	struct page_cgroup *pc = lookup_page_cgroup(page);
> -	struct lruvec *lruvec;
> -	bool removed = false;
> -
> -	/*
> -	 * In some case, SwapCache, FUSE(splice_buf->radixtree), the page
> -	 * is already on LRU. It means the page may on some other page_cgroup's
> -	 * LRU. Take care of it.
> -	 */
> -	lruvec = page_lock_lruvec(page);
> -	if (PageLRU(page)) {
> -		del_page_from_lru_list(page, lruvec, page_lru(page));
> -		ClearPageLRU(page);
> -		removed = true;
> -	}
> -	__mem_cgroup_commit_charge(memcg, page, 1, pc, ctype);
> -	if (removed) {
> -		page_relock_lruvec(page, &lruvec);
> -		add_page_to_lru_list(page, lruvec, page_lru(page));
> -		SetPageLRU(page);
> -	}
> -	unlock_lruvec(lruvec);
> -}
> -
>  int mem_cgroup_cache_charge(struct page *page, struct mm_struct *mm,
>  				gfp_t gfp_mask)
>  {
> @@ -2777,13 +2769,16 @@ static void
>  __mem_cgroup_commit_charge_swapin(struct page *page, struct mem_cgroup *memcg,
>  					enum charge_type ctype)
>  {
> +	struct page_cgroup *pc;
> +
>  	if (mem_cgroup_disabled())
>  		return;
>  	if (!memcg)
>  		return;
>  	cgroup_exclude_rmdir(&memcg->css);
>  
> -	__mem_cgroup_commit_charge_lrucare(page, memcg, ctype);
> +	pc = lookup_page_cgroup(page);
> +	__mem_cgroup_commit_charge(memcg, page, 1, pc, ctype, true);
>  	/*
>  	 * Now swap is on-memory. This means this page may be
>  	 * counted both as mem and swap....double count.
> @@ -2898,7 +2893,6 @@ __mem_cgroup_uncharge_common(struct page
>  	struct mem_cgroup *memcg = NULL;
>  	unsigned int nr_pages = 1;
>  	struct page_cgroup *pc;
> -	struct lruvec *lruvec;
>  	bool anon;
>  
>  	if (mem_cgroup_disabled())
> @@ -2918,7 +2912,6 @@ __mem_cgroup_uncharge_common(struct page
>  	if (unlikely(!PageCgroupUsed(pc)))
>  		return NULL;
>  
> -	lruvec = page_lock_lruvec(page);
>  	lock_page_cgroup(pc);
>  
>  	memcg = pc->mem_cgroup;
> @@ -2950,16 +2943,31 @@ __mem_cgroup_uncharge_common(struct page
>  	mem_cgroup_charge_statistics(memcg, anon, -nr_pages);
>  
>  	ClearPageCgroupUsed(pc);
> +	/*
> +	 * Make sure that mem_cgroup_reset_uncharged_to_root()
> +	 * can see that before we test PageLRU(page).
> +	 */
> +	smp_mb__after_clear_bit();
>  
>  	/*
>  	 * Once an uncharged page is isolated from the mem_cgroup's lru,
>  	 * it no longer protects that mem_cgroup from rmdir: reset to root.
> -	 */
> -	if (!PageLRU(page) && pc->mem_cgroup != root_mem_cgroup)
> -		pc->mem_cgroup = root_mem_cgroup;
> -
> +	 *
> +	 * The page_count() test avoids the lock in the common case when
> +	 * shrink_page_list()'s __remove_mapping() has frozen references
> +	 * to 0 and the page is on its way to freedom.
> +	 */
> +	if (!PageLRU(page) && pc->mem_cgroup != root_mem_cgroup) {
> +		struct lruvec *lruvec = NULL;
> +
> +		if (page_count(page))
> +			lruvec = page_lock_lruvec(page);
> +		if (!PageLRU(page))
> +			pc->mem_cgroup = root_mem_cgroup;
> +		if (lruvec)
> +			unlock_lruvec(lruvec);
> +	}

Hmm, OK. isolate_lru_page() et al. take care of all the problems if PageLRU() == true,
right?

I wonder which is better: delaying the freeing of the lruvec, or this locking scheme...

Thanks,
-Kame




