From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
To: Minchan Kim <minchan.kim@gmail.com>
Cc: Andrea Righi <arighi@develer.com>,
	Vivek Goyal <vgoyal@redhat.com>,
	David Rientjes <rientjes@google.com>,
	Balbir Singh <balbir@linux.vnet.ibm.com>,
	Suleiman Souhlal <suleiman@google.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	containers@lists.linux-foundation.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH 1/2] memcg: dirty pages accounting and limiting infrastructure
Date: Fri, 26 Feb 2010 15:15:06 +0900
Message-ID: <20100226151506.c78b4312.kamezawa.hiroyu@jp.fujitsu.com>
In-Reply-To: <28c262361002252153s587b70ecxf89eda9a642e527c@mail.gmail.com>

On Fri, 26 Feb 2010 14:53:39 +0900
Minchan Kim <minchan.kim@gmail.com> wrote:

> On Fri, Feb 26, 2010 at 2:01 PM, KAMEZAWA Hiroyuki
> <kamezawa.hiroyu@jp.fujitsu.com> wrote:
> > Hi,
> >
> > On Fri, 26 Feb 2010 13:50:04 +0900
> > Minchan Kim <minchan.kim@gmail.com> wrote:
> >
> >> > Hm? I haven't read the whole thread, but can_attach() is called under
> >> > cgroup_mutex, so it doesn't need to use RCU.
> >>
> >> Vivek mentioned that memcg is protected by RCU, if I understand his intention right.
> >> So I commented on that without enough knowledge of memcg.
> >> After your comment, I dove into the code.
> >>
> >> Just out of curiosity:
> >>
> >> Is memcg really protected by RCU?
> > Yes. All cgroup subsystems are protected by RCU.
> >
> >> I think most of the RCU around memcg is for protecting task_struct and
> >> cgroup_subsys_state.
> >> The memcg is protected by cgroup_mutex, as you mentioned.
> >> Am I missing something?
> >
> > There are several levels of protection.
> >
> > A cgroup subsystem's ->destroy() callback is finally called by
> >
> > As this.
> >
> >        synchronize_rcu();
> >
> >        mutex_lock(&cgroup_mutex);
> >        /*
> >         * Release the subsystem state objects.
> >         */
> >        for_each_subsys(cgrp->root, ss)
> >                ss->destroy(ss, cgrp);
> >
> >        cgrp->root->number_of_cgroups--;
> >        mutex_unlock(&cgroup_mutex);
> >
> > Before here,
> >        - there are no tasks under this cgroup (cgroup's refcnt is 0)
> >          && cgroup is marked as REMOVED.
> >
> > Then, this access
> >        rcu_read_lock();
> >        mem = mem_cgroup_from_task(task);
> >        if (css_tryget(&mem->css))   <===============checks cgroup refcnt
> 
> If it is, do we always need css_tryget after mem_cgroup_from_task
> without cgroup_mutex to make sure the css is valid?
> 
It depends on the case.

> But I found several cases that don't call css_tryget:
>
> 1. mm_match_cgroup
> It's used by page_referenced_xxx, so I think we don't hold
> cgroup_mutex at that time.
> 
Yes, but all checks are done under RCU, and this function never
accesses the contents of the memory that the pointer refers to.
And please consider the whole context.

	mem_cgroup_try_charge()
		mem_cont = ..... (refcnt +1)
		....
		try_to_free_mem_cgroup_pages(mem_cont)
		....
		shrink_xxx_list()
		....
			page_referenced_anon(page, mem_cont)
				mm_match_cgroup(mm, mem_cont)

Then, this mem_cont (2nd argument to mm_match_cgroup) is always valid.
	rcu_read_lock();
	memcg = mem_cgroup_from_task(rcu_dereference(mm->owner));
	rcu_read_unlock();
	return memcg == mem_cont;

Here,
	1. mem_cont is never reused. (because refcnt+1)
	2. we don't access memcg's contents.

I think even rcu_read_lock()/unlock() is unnecessary.
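
A minimal userspace sketch of the argument above, not kernel code. The names
`struct group`, `struct owner` and `owner_matches_group()` are hypothetical
stand-ins for the memcg, the mm owner and mm_match_cgroup(). It only shows
that the callee compares pointer values and never dereferences the group it
looks up, so the one object that has to stay alive is the group the caller
already holds a reference on.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a memcg; only its address matters here. */
struct group {
	int refcnt;
};

/* Hypothetical stand-in for the task that owns an mm. */
struct owner {
	struct group *grp;
};

/*
 * Models the comparison step: the looked-up pointer is compared by value
 * and never dereferenced, so it would be harmless even if it were stale.
 * 'held' is the group the caller pinned earlier (refcnt +1).
 */
static bool owner_matches_group(const struct owner *o, const struct group *held)
{
	const struct group *looked_up;

	assert(o != NULL && held != NULL);
	looked_up = o->grp;	/* read the pointer, do not follow it */

	return looked_up == held;
}
```

If the callee needed to read fields of the looked-up group, this reasoning
would no longer hold and a reference on that group would have to be taken
first.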



> 2. mem_cgroup_oom_called
> I think in here we don't grab cgroup_mutex, too.
> 
In the OOM-killer context, the memcg that caused the OOM has refcnt +1,
so css_tryget() is not necessary there.


> I guess some design would cover those problems.
Maybe.

> Could you tell me if you don't mind?
> Sorry for bothering you.
> 

From my point of view, the most terrible problem is the heavy cost of
css_tryget() when you run a heavily multi-threaded program.
So I want to see some innovation, but I haven't found any yet.

I admit this RCU+refcnt scheme tends to be hard to review. But taking a
refcnt is a costly operation, so it's worth handling each piece of logic
in the way that suits it best, case by case, for now.
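
To make the cost concrete, here is a minimal userspace model of a "tryget"
style reference count using C11 atomics. It is a sketch under assumptions,
not the kernel's css_tryget(): `struct ref_obj`, `ref_tryget()` and
`ref_put()` are hypothetical names. Every successful get is an atomic
read-modify-write on one shared counter, which is the kind of operation that
gets expensive when many threads hit the same object.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical refcounted object; a count of 0 means "being destroyed". */
struct ref_obj {
	atomic_int refcnt;
};

/* Take a reference only if the count has not already dropped to zero. */
static bool ref_tryget(struct ref_obj *obj)
{
	int old = atomic_load(&obj->refcnt);

	while (old > 0) {
		/* On failure, 'old' is refreshed with the current value. */
		if (atomic_compare_exchange_weak(&obj->refcnt, &old, old + 1))
			return true;
	}
	return false;	/* object is going away; caller must not use it */
}

/* Drop a reference previously obtained (or the initial one). */
static void ref_put(struct ref_obj *obj)
{
	int old = atomic_fetch_sub(&obj->refcnt, 1);

	assert(old > 0);
	(void)old;	/* keep builds with NDEBUG warning-free */
}
```

Once the count reaches zero, `ref_tryget()` keeps failing, which is why a
caller that already holds a reference higher up the call chain can skip the
get entirely.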

Thanks,
-Kame



