From: "Michal Koutný" <mkoutny@suse.com>
To: Christian Brauner <christian.brauner@ubuntu.com>
Cc: linux-api@vger.kernel.org, linux-kernel@vger.kernel.org,
	Tejun Heo <tj@kernel.org>, Oleg Nesterov <oleg@redhat.com>,
	Ingo Molnar <mingo@redhat.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Li Zefan <lizefan@huawei.com>,
	Peter Zijlstra <peterz@infradead.org>,
	cgroups@vger.kernel.org
Subject: Re: [PATCH v5 5/6] clone3: allow spawning processes into cgroups
Date: Wed, 29 Jan 2020 14:27:19 +0100
Message-ID: <20200129132719.GD11384@blackbody.suse.cz>
In-Reply-To: <20200121154844.411-6-christian.brauner@ubuntu.com>

Hello.

On Tue, Jan 21, 2020 at 04:48:43PM +0100, Christian Brauner <christian.brauner@ubuntu.com> wrote:
> +static int cgroup_css_set_fork(struct kernel_clone_args *kargs)
> +	__acquires(&cgroup_mutex) __acquires(&cgroup_threadgroup_rwsem)
> +{
> +	int ret;
> +	struct cgroup *dst_cgrp = NULL;
> +	struct css_set *cset;
> +	struct super_block *sb;
> +	struct file *f;
> +
> +	if (kargs->flags & CLONE_INTO_CGROUP)
> +		mutex_lock(&cgroup_mutex);
> +
> +	cgroup_threadgroup_change_begin(current);
> +
> +	spin_lock_irq(&css_set_lock);
> +	cset = task_css_set(current);
> +	get_css_set(cset);
> +	spin_unlock_irq(&css_set_lock);
> +
> +	if (!(kargs->flags & CLONE_INTO_CGROUP)) {
> +		kargs->cset = cset;
Where is this css_set put when CLONE_INTO_CGROUP isn't used?
(Aha, it's passed to child's tsk->cgroups but see my other note below.)

> +	dst_cgrp = cgroup_get_from_file(f);
> +	if (IS_ERR(dst_cgrp)) {
> +		ret = PTR_ERR(dst_cgrp);
> +		dst_cgrp = NULL;
> +		goto err;
> +	}
> +
> +	/*
> +	 * Verify that we the target cgroup is writable for us. This is
> +	 * usually done by the vfs layer but since we're not going through
> +	 * the vfs layer here we need to do it "manually".
> +	 */
> +	ret = cgroup_may_write(dst_cgrp, sb);
> +	if (ret)
> +		goto err;
> +
> +	ret = cgroup_attach_permissions(cset->dfl_cgrp, dst_cgrp, sb,
> +					!!(kargs->flags & CLONE_THREAD));
> +	if (ret)
> +		goto err;
> +
> +	kargs->cset = find_css_set(cset, dst_cgrp);
> +	if (!kargs->cset) {
> +		ret = -ENOMEM;
> +		goto err;
> +	}
> +
> +	if (cgroup_is_dead(dst_cgrp)) {
> +		ret = -ENODEV;
> +		goto err;
> +	}
I'd move this check right after cgroup_get_from_file. The fork-migration
path is synchronized with cgroup_destroy_locked via cgroup_mutex, and
there's no need to check permissions on a cgroup that's going away anyway.
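
To illustrate the suggested ordering, here is a minimal userspace sketch.
All names in it (model_cgroup, model_may_write, model_prepare_fork) are
made up for illustration and are not kernel APIs; in the kernel the
corresponding steps would be cgroup_is_dead(), cgroup_may_write() and
cgroup_attach_permissions(), all under cgroup_mutex.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for struct cgroup. */
struct model_cgroup {
	bool dead;        /* models the result of cgroup_is_dead() */
	bool writable;    /* models the outcome of the permission checks */
	int perm_checks;  /* counts how often permissions were evaluated */
};

/* Hypothetical stand-in for the permission checks. */
static int model_may_write(struct model_cgroup *cgrp)
{
	cgrp->perm_checks++;
	return cgrp->writable ? 0 : -EACCES;
}

/*
 * Liveness is checked first: a dying cgroup is rejected with -ENODEV
 * before any permission work is done.
 */
static int model_prepare_fork(struct model_cgroup *dst)
{
	if (dst->dead)
		return -ENODEV;

	return model_may_write(dst);
}
```

With this ordering a dead target never reaches the permission helpers,
while a live but unwritable target still fails with the permission error.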


> +static void cgroup_css_set_put_fork(struct kernel_clone_args *kargs)
> +	__releases(&cgroup_threadgroup_rwsem) __releases(&cgroup_mutex)
> +{
> +	cgroup_threadgroup_change_end(current);
> +
> +	if (kargs->flags & CLONE_INTO_CGROUP) {
> +		struct cgroup *cgrp = kargs->cgrp;
> +		struct css_set *cset = kargs->cset;
> +
> +		mutex_unlock(&cgroup_mutex);
> +
> +		if (cset) {
> +			put_css_set(cset);
> +			kargs->cset = NULL;
> +		}
> +
> +		if (cgrp) {
> +			cgroup_put(cgrp);
> +			kargs->cgrp = NULL;
> +		}
> +	}
I don't see any functional problem with this ordering; however, I'd
prefer symmetry with the "allocation" path (in cgroup_css_set_fork),
i.e. cgroup_put, put_css_set and lastly mutex_unlock.
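
As a rough illustration of the symmetry meant here, the userspace sketch
below only records the order of the steps. The trace buffer and the
model_* functions are hypothetical, and the acquire side is simplified to
three steps; it is not the kernel code.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical trace buffer used only to observe call order. */
struct model_trace {
	char ops[8];   /* one letter per step, stays NUL-terminated */
	int n;
};

static void model_record(struct model_trace *t, char op)
{
	if (t->n < (int)sizeof(t->ops) - 1)
		t->ops[t->n++] = op;
}

/* Acquire side (simplified): lock first, then take the references. */
static void model_css_set_fork(struct model_trace *t)
{
	model_record(t, 'L');   /* mutex_lock(&cgroup_mutex) */
	model_record(t, 'S');   /* css_set reference taken */
	model_record(t, 'C');   /* cgroup reference taken */
}

/* Release side in the order proposed in this review. */
static void model_css_set_put_fork(struct model_trace *t)
{
	model_record(t, 'c');   /* cgroup_put() */
	model_record(t, 's');   /* put_css_set() */
	model_record(t, 'l');   /* mutex_unlock(&cgroup_mutex) */
}
```

Read left to right, the recorded trace mirrors itself: the mutex is the
first thing taken and the last thing released.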

> +void cgroup_post_fork(struct task_struct *child,
> +		      struct kernel_clone_args *kargs)
> +	__releases(&cgroup_threadgroup_rwsem) __releases(&cgroup_mutex)
>  {
>  	struct cgroup_subsys *ss;
> -	struct css_set *cset;
> +	struct css_set *cset = kargs->cset;
>  	int i;
>  
>  	spin_lock_irq(&css_set_lock);
>  
>  	WARN_ON_ONCE(!list_empty(&child->cg_list));
> -	cset = task_css_set(current); /* current is @child's parent */
> -	get_css_set(cset);
>  	cset->nr_tasks++;
>  	css_set_move_task(child, NULL, cset, false);
So, the reference is passed over from kargs->cset to task->cgroups. I
think it's necessary to zero kargs->cset in order to prevent dropping the
reference in cgroup_css_set_put_fork.
Perhaps a general comment about the css_set's whereabouts during fork and
kargs passing would be useful.
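
The handover concern can be shown with a small userspace model. The
types and helpers below (model_cset, model_task, model_kargs, model_*)
are hypothetical stand-ins, not kernel structures, and the sketch only
assumes that the put path drops whatever kargs->cset still points at.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of these are kernel types. */
struct model_cset  { int refcount; };
struct model_task  { struct model_cset *cgroups; };
struct model_kargs { struct model_cset *cset; };

static void model_put(struct model_cset *cset)
{
	cset->refcount--;
}

/* Models the post-fork step: the child takes over the kargs reference. */
static void model_post_fork(struct model_task *child,
			    struct model_kargs *kargs)
{
	child->cgroups = kargs->cset;
	kargs->cset = NULL;   /* ownership moved; kargs must not drop it */
}

/* Models the cleanup path: drops whatever reference kargs still owns. */
static void model_put_fork(struct model_kargs *kargs)
{
	if (kargs->cset) {
		model_put(kargs->cset);
		kargs->cset = NULL;
	}
}
```

Because the pointer is cleared at the handover, a later call to the
cleanup helper leaves the child's reference intact. Without the clearing,
the same call would decrement a count that now belongs to the child.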

> @@ -6016,6 +6146,17 @@ void cgroup_post_fork(struct task_struct *child)
>  	} while_each_subsys_mask();
>  
>  	cgroup_threadgroup_change_end(current);
> +
> +	if (kargs->flags & CLONE_INTO_CGROUP) {
> +		mutex_unlock(&cgroup_mutex);
> +
> +		cgroup_put(kargs->cgrp);
> +		kargs->cgrp = NULL;
> +	}
> +
> +	/* Make the new cset the root_cset of the new cgroup namespace. */
> +	if (kargs->flags & CLONE_NEWCGROUP)
> +		child->nsproxy->cgroup_ns->root_cset = cset;
The root_cset reference (from copy_cgroup_ns) seems to be leaked here, and
where is the additional reference to the new cset obtained?
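
One way the reference counts could be kept balanced when a namespace root
is replaced is sketched below as a userspace model. This is only an
assumption about a possible fix, not what the patch does; model_cset,
model_ns and model_set_root are invented names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of these are kernel types. */
struct model_cset { int refcount; };
struct model_ns   { struct model_cset *root_cset; };

static void model_get(struct model_cset *cset) { cset->refcount++; }
static void model_put(struct model_cset *cset) { cset->refcount--; }

/*
 * Replace the namespace root: take a reference on the new css_set for
 * the namespace, then drop the reference the namespace held on the old.
 */
static void model_set_root(struct model_ns *ns, struct model_cset *new_root)
{
	struct model_cset *old = ns->root_cset;

	model_get(new_root);
	ns->root_cset = new_root;
	if (old)
		model_put(old);
}
```

In this model the old root loses exactly the reference the namespace held
and the new root gains one, so neither count drifts.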

Thanks,
Michal
