From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
To: Minchan Kim <minchan.kim@gmail.com>
Cc: Andrea Righi <arighi@develer.com>,
Vivek Goyal <vgoyal@redhat.com>,
David Rientjes <rientjes@google.com>,
Balbir Singh <balbir@linux.vnet.ibm.com>,
Suleiman Souhlal <suleiman@google.com>,
Andrew Morton <akpm@linux-foundation.org>,
containers@lists.linux-foundation.org,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH 1/2] memcg: dirty pages accounting and limiting infrastructure
Date: Fri, 26 Feb 2010 15:15:06 +0900
Message-ID: <20100226151506.c78b4312.kamezawa.hiroyu@jp.fujitsu.com>
In-Reply-To: <28c262361002252153s587b70ecxf89eda9a642e527c@mail.gmail.com>
On Fri, 26 Feb 2010 14:53:39 +0900
Minchan Kim <minchan.kim@gmail.com> wrote:
> On Fri, Feb 26, 2010 at 2:01 PM, KAMEZAWA Hiroyuki
> <kamezawa.hiroyu@jp.fujitsu.com> wrote:
> > Hi,
> >
> > On Fri, 26 Feb 2010 13:50:04 +0900
> > Minchan Kim <minchan.kim@gmail.com> wrote:
> >
> >> > Hm ? I don't read the whole thread but can_attach() is called under
> >> > cgroup_mutex(). So, it doesn't need to use RCU.
> >>
> >> Vivek mentioned memcg is protected by RCU if I understand his intention right.
> >> So I commented that without enough knowledge of memcg.
> >> After your comment, I dive into the code.
> >>
> >> Just out of curiosity.
> >>
> >> Really, memcg is protected by RCU?
> > yes. All cgroup subsystem is protected by RCU.
> >
> >> I think most of RCU around memcg is for protecting task_struct and
> >> cgroup_subsys_state.
> >> The memcg is protected by cgroup_mutex as you mentioned.
> >> Am I missing something?
> >
> > There are several levels of protections.
> >
> > cgroup subsystem's ->destroy() call back is finally called by
> >
> > As this.
> >
> >         synchronize_rcu();
> >
> >         mutex_lock(&cgroup_mutex);
> >         /*
> >          * Release the subsystem state objects.
> >          */
> >         for_each_subsys(cgrp->root, ss)
> >                 ss->destroy(ss, cgrp);
> >
> >         cgrp->root->number_of_cgroups--;
> >         mutex_unlock(&cgroup_mutex);
> >
> > Before here,
> > - there are no tasks under this cgroup (cgroup's refcnt is 0)
> > && cgroup is marked as REMOVED.
> >
> > Then, this access
> > rcu_read_lock();
> > mem = mem_cgroup_from_task(task);
> > if (css_tryget(mem->css)) <===============checks cgroup refcnt
>
> If it it, do we always need css_tryget after mem_cgroup_from_task
> without cgroup_mutex to make sure css is vaild?
>
It depends on the case.
> But I found several cases that don't call css_tryget
>
> 1. mm_match_cgroup
> It's used by page_referenced_xxx. so we I think we don't grab
> cgroup_mutex at that time.
>
Yes, but every check is done under RCU, and this function never
accesses the contents of the memory that the pointer refers to.
And please consider the whole context.
        mem_cgroup_try_charge()
                mem_cont = ..... (refcnt +1)
                ....
                try_to_free_mem_cgroup_pages(mem_cont)
                ....
                shrink_xxx_list()
                ....
                        page_referenced_anon(page, mem_cont)
                                mm_match_cgroup(mm, mem_cont)
Then, this mem_cont (2nd argument to mm_match_cgroup) is always valid.
        rcu_read_lock();
        memcg = mem_cgroup_from_task(rcu_dereference(mm->owner));
        rcu_read_unlock();
        return memcg == mem_cont;
Here,
 1. mem_cont is never freed or reused while we hold it (because of refcnt +1).
 2. we never access memcg's contents.
So I think even rcu_read_lock()/unlock() is unnecessary here.
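
The argument above can be modelled in a few lines of userspace C. This is only a sketch under stated assumptions: struct group_model, struct mm_model and mm_matches_group() are hypothetical stand-ins, not the kernel's types or functions. It shows that the check compares the looked-up pointer by identity and never dereferences it, so the only thing that has to stay valid is the target, which the caller has pinned with its own reference.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a memcg; not the kernel definition. */
struct group_model {
	int refcnt;	/* the caller holds +1 on the target for the whole path */
};

/* Hypothetical stand-in for an mm and the group its owner belongs to. */
struct mm_model {
	struct group_model *owner_group;
};

/*
 * Identity check only: the looked-up pointer is compared with the target
 * and never dereferenced. Because the caller pins the target, its address
 * cannot be recycled, so an equal pointer really is the target.
 */
static bool mm_matches_group(const struct mm_model *mm,
			     const struct group_model *target)
{
	assert(mm != NULL && target != NULL);
	return mm->owner_group == target;
}
```

In this model nothing reads through owner_group, which is the property point 2 relies on.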
> 2. mem_cgroup_oom_called
> I think in here we don't grab cgroup_mutex, too.
>
In the OOM-killer context, the memcg that caused the OOM has refcnt +1,
so css_tryget() is not necessary there.
> I guess some design would cover that problems.
Maybe.
> Could you tell me if you don't mind?
> Sorry for bothering you.
>
From my point of view, the most serious problem is the heavy cost of
css_tryget() when you run a heavily multi-threaded program.
So I would like to see some innovation here, but I haven't found any yet.
I admit that this RCU+refcnt scheme tends to be hard to review. But taking a
refcnt is a costly operation, so for now it is worth handling each piece of
logic in the way that suits it best, case by case.
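
To make the cost concrete, here is a rough userspace model of a "get a reference unless the count is already zero" operation, written with C11 atomics. It is a sketch, not the kernel's css_tryget(): struct css_model and both helpers are hypothetical names, and it assumes the reference count is a single atomic counter shared by all threads. Under that assumption every successful call is an atomic read-modify-write on the same memory location, which is what makes it expensive when many threads hit it at once.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace model of a css-like object; not kernel code. */
struct css_model {
	atomic_int refcnt;	/* 0 means the object is being torn down */
};

/* Take a reference only if the count is still non-zero. */
static bool css_model_tryget(struct css_model *css)
{
	int old = atomic_load(&css->refcnt);

	while (old != 0) {
		/* On failure, 'old' is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&css->refcnt, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference taken by css_model_tryget(). */
static void css_model_put(struct css_model *css)
{
	int prev = atomic_fetch_sub(&css->refcnt, 1);

	assert(prev > 0);	/* a put without a matching get is a bug */
}
```

A path that already holds a reference further up the call chain, as in the two cases above, can skip this pair entirely.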
Thanks,
-Kame