linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Michal Hocko <mhocko@kernel.org>
To: Johannes Weiner <hannes@cmpxchg.org>, Tejun Heo <tj@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>,
	Vladimir Davydov <vdavydov@parallels.com>,
	Greg Thelen <gthelen@google.com>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
	KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
	linux-mm@kvack.org, LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 7/8] memcg: get rid of mm_struct::owner
Date: Tue, 14 Jul 2015 17:18:23 +0200	[thread overview]
Message-ID: <20150714151823.GG17660@dhcp22.suse.cz> (raw)
In-Reply-To: <20150710140533.GB29540@dhcp22.suse.cz>

On Fri 10-07-15 16:05:33, Michal Hocko wrote:
> JFYI: I've found some more issues while hamerring this more.

OK so the main issue is quite simple but I have completely missed it when
thinking about the patch before. clone(CLONE_VM) without CLONE_THREAD is
really nasty and it will easily lockup the machine with preempt. disabled
for ever. It goes like this:
taskA (in memcg A)
  taskB = clone(CLONE_VM)
				taskB
				  A -> B	# Both tasks charge to B now
				  exit()	# No tasks in B -> can be
				  		# offlined now
				css_offline()
  mem_cgroup_try_charge
    get_mem_cgroup_from_mm
      rcu_read_lock()
      do {
      } while css_tryget_online(mm->memcg)	# will never succeed
      rcu_read_unlock()

taskA and taskB are basically independent entities wrt. the life
cycle (unlike threads which are bound to the group leader). The
previous code handles this by re-ownering during exit by the monster
mm_update_next_owner.

I can see the following options without reintroducing reintroducing
some form of mm_update_next_owner:

1) Do not allow offlining a cgroup if we have active users in it.  This
would require a callback from the cgroup core to the subsystem called if
there are no active tasks tracked by the cgroup core. Tracking on the memcg
side doesn't sound terribly hard - just mark a mm_struct as an alien and
count the number of aliens during the move in mem_cgroup. mm_drop_memcg
then drops the counter. We could end up with EBUSY cgroup without any
visible tasks which is a bit awkward.

2) update get_mem_cgroup_from_mm and others to fallback to the parent
memcg if the current one is offline. This would be in line with charge
reparenting we used to do. I cannot say I would like this because it
allows for easy runaway to the root memcg if the hierarchy is not
configured cautiously. The code would be also quite tricky because each
direct consumer of mm->memcg would have to be aware of this. This is
awkward.

3) fail mem_cgroup_can_attach if we are trying to migrate a task sharing
mm_struct with a process outside of the tset. If I understand the
tset properly this would require all the sharing tasks to be migrated
together and we would never end up with task_css != &task->mm->css.
__cgroup_procs_write doesn't seem to support multi pid move currently
AFAICS, though. cgroup_migrate_add_src, however, seems to be intended
for this purpose so this should be doable. Without that support we would
basically disallow migrating these tasks - I wouldn't object if you ask
me.

Do you see other options? From the above three options the 3rd one
sounds the most sane to me and the 1st quite easy to implement. Both will
require some cgroup core work though. But maybe we would be good enough
with 3rd option without supporting moving schizophrenic tasks and that
would be reduced to memcg code.

Or we can, of course, stay with the current state but I think it would
be much saner to get rid of the schizophrenia.

What do you think?
-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2015-07-14 15:18 UTC|newest]

Thread overview: 34+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-07-08 12:27 [PATCH 0/8 -v3] memcg cleanups + get rid of mm_struct::owner Michal Hocko
2015-07-08 12:27 ` [PATCH 1/8] memcg: export struct mem_cgroup Michal Hocko
2015-07-08 15:39   ` Vladimir Davydov
2015-07-09 11:22     ` Michal Hocko
2015-07-09 11:51       ` Vladimir Davydov
2015-07-08 12:27 ` [PATCH 2/8] memcg: get rid of mem_cgroup_root_css for !CONFIG_MEMCG Michal Hocko
2015-07-08 15:41   ` Vladimir Davydov
2015-07-08 12:27 ` [PATCH 3/8] memcg: get rid of extern for functions in memcontrol.h Michal Hocko
2015-07-08 15:43   ` Vladimir Davydov
2015-07-08 12:27 ` [PATCH 4/8] memcg, mm: move mem_cgroup_select_victim_node into vmscan Michal Hocko
2015-07-08 16:01   ` Vladimir Davydov
2015-07-09 12:08     ` Michal Hocko
2015-07-08 12:27 ` [PATCH 5/8] memcg: restructure mem_cgroup_can_attach() Michal Hocko
2015-07-08 16:05   ` Vladimir Davydov
2015-07-08 12:27 ` [PATCH 6/8] memcg, tcp_kmem: check for cg_proto in sock_update_memcg Michal Hocko
2015-07-08 16:11   ` Vladimir Davydov
2015-07-08 12:27 ` [PATCH 7/8] memcg: get rid of mm_struct::owner Michal Hocko
2015-07-08 17:32   ` Vladimir Davydov
2015-07-09 14:09     ` Michal Hocko
2015-07-10  7:54       ` Vladimir Davydov
2015-07-10 12:45         ` Michal Hocko
2015-07-11  7:09           ` Vladimir Davydov
2015-07-14 15:32             ` Michal Hocko
2015-07-10 14:05   ` Michal Hocko
2015-07-14 15:18     ` Michal Hocko [this message]
2015-07-29 11:58       ` Michal Hocko
2015-07-29 13:14       ` Johannes Weiner
2015-07-29 15:05         ` Michal Hocko
2015-07-29 16:42           ` Johannes Weiner
2015-07-08 12:27 ` [PATCH 8/8] memcg: get rid of mem_cgroup_from_task Michal Hocko
2015-07-08 17:43   ` Vladimir Davydov
2015-07-09 14:13     ` Michal Hocko
2015-07-09 14:32       ` Vladimir Davydov
2015-07-09 16:33         ` Michal Hocko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150714151823.GG17660@dhcp22.suse.cz \
    --to=mhocko@kernel.org \
    --cc=akpm@linux-foundation.org \
    --cc=gthelen@google.com \
    --cc=hannes@cmpxchg.org \
    --cc=kamezawa.hiroyu@jp.fujitsu.com \
    --cc=kosaki.motohiro@jp.fujitsu.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=oleg@redhat.com \
    --cc=tj@kernel.org \
    --cc=vdavydov@parallels.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).