All of lore.kernel.org
 help / color / mirror / Atom feed
From: ebiederm@xmission.com (Eric W. Biederman)
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>,
	Michal Hocko <mhocko@kernel.org>,
	Kirill Tkhai <ktkhai@virtuozzo.com>,
	peterz@infradead.org, oleg@redhat.com, viro@zeniv.linux.org.uk,
	mingo@kernel.org, paulmck@linux.vnet.ibm.com,
	keescook@chromium.org, riel@redhat.com, tglx@linutronix.de,
	kirill.shutemov@linux.intel.com, marcos.souza.org@gmail.com,
	hoeun.ryu@gmail.com, pasha.tatashin@oracle.com,
	gs051095@gmail.com, dhowells@redhat.com, rppt@linux.vnet.ibm.com,
	linux-kernel@vger.kernel.org,
	Balbir Singh <balbir@linux.vnet.ibm.com>,
	Tejun Heo <tj@kernel.org>
Subject: Re: [PATCH] memcg: Replace mm->owner with mm->memcg
Date: Wed, 02 May 2018 16:35:03 -0500	[thread overview]
Message-ID: <877eolzqpk.fsf@xmission.com> (raw)
In-Reply-To: <20180502140453.086f862f94496197cfa7d813@linux-foundation.org> (Andrew Morton's message of "Wed, 2 May 2018 14:04:53 -0700")

Andrew Morton <akpm@linux-foundation.org> writes:

> On Wed, 02 May 2018 14:21:35 -0500 ebiederm@xmission.com (Eric W. Biederman) wrote:
>
>> Recently it was reported that mm_update_next_owner could get into
>> cases where it was executing it's fallback for_each_process part of
>> the loop and thus taking up a lot of time.
>> 
>> To deal with this replace mm->owner with mm->memcg.  This just reduces
>> the complexity of everything.  As much as possible I have maintained
>> the current semantics.  There are two siginificant exceptions.  During
>> fork the memcg of the process calling fork is charged rather than
>> init_css_set.  During memory cgroup migration the charges are migrated
>> not if the process is the owner of the mm, but if the process being
>> migrated has the same memory cgroup as the mm.
>> 
>> I believe it was a bug if init_css_set is charged for memory activity
>> during fork, and the old behavior was simply a consequence of the new
>> task not having tsk->cgroup not initialized to it's proper cgroup.
>> 
>> Durhing cgroup migration only thread group leaders are allowed to
>> migrate.  Which means in practice there should only be one.  Linux
>> tasks created with CLONE_VM are the only exception, but the common
>> cases are already ruled out.  Processes created with vfork have a
>> suspended parent and can do nothing but call exec so they should never
>> show up.  Threads of the same cgroup are not the thread group leader
>> so also should not show up.  That leaves the old LinuxThreads library
>> which is probably out of use by now, and someone doing something very
>> creative with cgroups, and rolling their own threads with CLONE_VM.
>> So in practice I don't think the difference charge migration will
>> affect anyone.
>> 
>> To ensure that mm->memcg is updated appropriately I have implemented
>> cgroup "attach" and "fork" methods.  This ensures that at those
>> points the mm pointed to the task has the appropriate memory cgroup.
>> 
>> For simplicity instead of introducing a new mm lock I simply use
>> exchange on the pointer where the mm->memcg is updated to get
>> atomic updates.
>> 
>> Looking at the history effectively this change is a revert.  The
>> reason given for adding mm->owner is so that multiple cgroups can be
>> attached to the same mm.  In the last 8 years a second user of
>> mm->owner has not appeared.  A feature that has never used, makes the
>> code more complicated and has horrible worst case performance should
>> go.
>
> Cleanliness nit: I'm not sure that the removal and open-coding of
> mem_cgroup_from_task() actually improved things.  Should we restore
> it?

While writing the patch itself removing mem_cgroup_from_task forced
thinking about which places should use mm->memcg and which places
should use an alternative.

If we want to add it back afterwards with a second patch I don't mind.

I just don't want to have that in the same patch as opportunities get
lost to look at how the memory cgroup should be derived.

Eric

> --- a/mm/memcontrol.c~memcg-replace-mm-owner-with-mm-memcg-fix
> +++ a/mm/memcontrol.c
> @@ -664,6 +664,11 @@ static void memcg_check_events(struct me
>  	}
>  }
>  
> +static inline struct mem_cgroup *mem_cgroup_from_task(struct task_struct *p)
> +{
> +	return mem_cgroup_from_css(task_css(p, memory_cgrp_id));
> +}
> +
>  struct mem_cgroup *get_mem_cgroup_from_mm(struct mm_struct *mm)
>  {
>  	struct mem_cgroup *memcg = NULL;
> @@ -1011,7 +1016,7 @@ bool task_in_mem_cgroup(struct task_stru
>  		 * killed to prevent needlessly killing additional tasks.
>  		 */
>  		rcu_read_lock();
> -		task_memcg = mem_cgroup_from_css(task_css(task, memory_cgrp_id));
> +		task_memcg = mem_cgroup_from_task(task);
>  		css_get(&task_memcg->css);
>  		rcu_read_unlock();
>  	}
> @@ -4829,7 +4834,7 @@ static int mem_cgroup_can_attach(struct
>  	if (!move_flags)
>  		return 0;
>  
> -	from = mem_cgroup_from_css(task_css(p, memory_cgrp_id));
> +	from = mem_cgroup_from_task(p);
>  
>  	VM_BUG_ON(from == memcg);
>  
> @@ -5887,7 +5892,7 @@ void mem_cgroup_sk_alloc(struct sock *sk
>  	}
>  
>  	rcu_read_lock();
> -	memcg = mem_cgroup_from_css(task_css(current, memory_cgrp_id));
> +	memcg = mem_cgroup_from_task(current);
>  	if (memcg == root_mem_cgroup)
>  		goto out;
>  	if (!cgroup_subsys_on_dfl(memory_cgrp_subsys) && !memcg->tcpmem_active)
> _

  reply	other threads:[~2018-05-02 21:42 UTC|newest]

Thread overview: 96+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-04-26 11:00 [PATCH 0/4] exit: Make unlikely case in mm_update_next_owner() more scalable Kirill Tkhai
2018-04-26 11:00 ` [PATCH 1/4] exit: Move read_unlock() up in mm_update_next_owner() Kirill Tkhai
2018-04-26 15:01   ` Oleg Nesterov
2018-04-26 11:00 ` [PATCH 2/4] exit: Use rcu instead of get_task_struct() " Kirill Tkhai
2018-04-26 11:00 ` [PATCH 3/4] exit: Rename assign_new_owner label " Kirill Tkhai
2018-04-26 11:01 ` [PATCH 4/4] exit: Lockless iteration over task list " Kirill Tkhai
2018-04-26 12:35   ` Andrea Parri
2018-04-26 13:52     ` Kirill Tkhai
2018-04-26 15:20       ` Peter Zijlstra
2018-04-26 15:56         ` Kirill Tkhai
2018-04-26 15:20       ` Peter Zijlstra
2018-04-26 16:04         ` Kirill Tkhai
2018-04-26 15:29       ` Andrea Parri
2018-04-26 16:11         ` Kirill Tkhai
2018-04-26 13:07 ` [PATCH 0/4] exit: Make unlikely case in mm_update_next_owner() more scalable Michal Hocko
2018-04-26 13:52   ` Oleg Nesterov
2018-04-26 14:07   ` Kirill Tkhai
2018-04-26 15:10     ` Oleg Nesterov
2018-04-26 16:19   ` Eric W. Biederman
2018-04-26 19:28     ` Michal Hocko
2018-04-27  7:08       ` Michal Hocko
2018-04-27 18:05         ` Eric W. Biederman
2018-05-01 17:22           ` Eric W. Biederman
2018-05-01 17:35             ` [RFC][PATCH] memcg: Replace mm->owner with mm->memcg Eric W. Biederman
2018-05-02  8:47               ` Michal Hocko
2018-05-02 13:20                 ` Johannes Weiner
2018-05-02 14:05                   ` Eric W. Biederman
2018-05-02 19:21                   ` [PATCH] " Eric W. Biederman
2018-05-02 21:04                     ` Andrew Morton
2018-05-02 21:35                       ` Eric W. Biederman [this message]
2018-05-03 13:33                     ` Oleg Nesterov
2018-05-03 14:39                       ` Eric W. Biederman
2018-05-04 14:20                         ` Oleg Nesterov
2018-05-04 14:36                           ` Eric W. Biederman
2018-05-04 14:54                             ` Oleg Nesterov
2018-05-04 15:49                               ` Eric W. Biederman
2018-05-04 16:22                                 ` Oleg Nesterov
2018-05-04 16:40                                   ` Eric W. Biederman
2018-05-04 17:26                                     ` [PATCH 0/2] mm->owner to mm->memcg fixes Eric W. Biederman
2018-05-04 17:26                                       ` [PATCH 1/2] memcg: Update the mm->memcg maintenance to work when !CONFIG_MMU Eric W. Biederman
2018-05-04 17:27                                       ` [PATCH 2/2] memcg: Close the race between migration and installing bprm->mm as mm Eric W. Biederman
2018-05-09 14:51                                         ` Oleg Nesterov
2018-05-10  3:00                                           ` Eric W. Biederman
2018-05-10 12:14                                       ` [PATCH 0/2] mm->owner to mm->memcg fixes Michal Hocko
2018-05-10 12:18                                         ` Michal Hocko
2018-05-22 12:57                                         ` Michal Hocko
2018-05-23 19:46                                           ` Eric W. Biederman
2018-05-24 11:10                                             ` Michal Hocko
2018-05-24 21:16                                               ` Andrew Morton
2018-05-24 23:37                                                 ` Andrea Parri
2018-05-30 12:17                                                 ` Michal Hocko
2018-05-31 18:41                                                   ` Eric W. Biederman
2018-06-01  1:57                                                     ` [PATCH] memcg: Replace mm->owner with mm->memcg Eric W. Biederman
2018-06-01 14:52                                                       ` [RFC][PATCH 0/2] memcg: Require every task that uses an mm to migrate together Eric W. Biederman
2018-06-01 14:53                                                         ` [RFC][PATCH 1/2] memcg: Ensure every task that uses an mm is in the same memory cgroup Eric W. Biederman
2018-06-01 16:50                                                           ` Tejun Heo
2018-06-01 18:11                                                             ` Eric W. Biederman
2018-06-01 19:16                                                               ` Tejun Heo
2018-06-04 13:01                                                                 ` Michal Hocko
2018-06-04 18:47                                                                   ` Tejun Heo
2018-06-04 19:11                                                                     ` Eric W. Biederman
2018-06-06 11:13                                                           ` Michal Hocko
2018-06-07 11:42                                                             ` Eric W. Biederman
2018-06-07 12:19                                                               ` Michal Hocko
2018-06-01 14:53                                                         ` [RFC][PATCH 2/2] memcgl: Remove dead code now that all tasks of an mm share a memcg Eric W. Biederman
2018-06-01 14:07                                                     ` [PATCH 0/2] mm->owner to mm->memcg fixes Michal Hocko
2018-05-24 21:17                                               ` Andrew Morton
2018-05-30 11:52                                             ` Michal Hocko
2018-05-31 17:43                                               ` Eric W. Biederman
2018-05-07 14:33                                     ` [PATCH] memcg: Replace mm->owner with mm->memcg Oleg Nesterov
2018-05-08  3:15                                       ` Eric W. Biederman
2018-05-09 14:40                                         ` Oleg Nesterov
2018-05-10  3:09                                           ` Eric W. Biederman
2018-05-10  4:03                                             ` [RFC][PATCH] cgroup: Don't mess with tasks in exec Eric W. Biederman
2018-05-10 12:15                                               ` Oleg Nesterov
2018-05-10 12:35                                                 ` Tejun Heo
2018-05-10 12:38                                             ` [PATCH] memcg: Replace mm->owner with mm->memcg Oleg Nesterov
2018-05-04 11:07                     ` Michal Hocko
2018-05-05 16:54                     ` kbuild test robot
2018-05-07 23:18                       ` Andrew Morton
2018-05-08  2:17                         ` Eric W. Biederman
2018-05-09 21:00                         ` Michal Hocko
2018-05-02 23:59               ` [RFC][PATCH] " Balbir Singh
2018-05-03 15:11                 ` Eric W. Biederman
2018-05-04  4:59                   ` Balbir Singh
2018-05-03 10:52           ` [PATCH 0/4] exit: Make unlikely case in mm_update_next_owner() more scalable Kirill Tkhai
2018-06-01  1:07   ` Eric W. Biederman
2018-06-01 13:57     ` Michal Hocko
2018-06-01 14:32       ` Eric W. Biederman
2018-06-01 15:02         ` Michal Hocko
2018-06-01 15:25           ` Eric W. Biederman
2018-06-04  6:54             ` Michal Hocko
2018-06-04 14:31               ` Eric W. Biederman
2018-06-05  8:15                 ` Michal Hocko
2018-06-05  8:48             ` Kirill Tkhai
2018-06-05 15:36               ` Eric W. Biederman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=877eolzqpk.fsf@xmission.com \
    --to=ebiederm@xmission.com \
    --cc=akpm@linux-foundation.org \
    --cc=balbir@linux.vnet.ibm.com \
    --cc=dhowells@redhat.com \
    --cc=gs051095@gmail.com \
    --cc=hannes@cmpxchg.org \
    --cc=hoeun.ryu@gmail.com \
    --cc=keescook@chromium.org \
    --cc=kirill.shutemov@linux.intel.com \
    --cc=ktkhai@virtuozzo.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=marcos.souza.org@gmail.com \
    --cc=mhocko@kernel.org \
    --cc=mingo@kernel.org \
    --cc=oleg@redhat.com \
    --cc=pasha.tatashin@oracle.com \
    --cc=paulmck@linux.vnet.ibm.com \
    --cc=peterz@infradead.org \
    --cc=riel@redhat.com \
    --cc=rppt@linux.vnet.ibm.com \
    --cc=tglx@linutronix.de \
    --cc=tj@kernel.org \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.