LKML Archive on lore.kernel.org
From: Andrei Vagin <avagin@virtuozzo.com>
To: Li Zefan <lizefan@huawei.com>
Cc: Tejun Heo <tj@kernel.org>, <dvyukov@google.com>,
	LKML <linux-kernel@vger.kernel.org>,
	Cgroups <cgroups@vger.kernel.org>
Subject: Re: cgroup: avoid attaching a cgroup root to two different superblocks
Date: Fri, 14 Apr 2017 16:27:38 -0700
Message-ID: <20170414232737.GC20350@outlook.office365.com>
In-Reply-To: <58E7532B.4030505@huawei.com>

Hello,

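One of our CRIU tests hangs with this patch applied. The first run of
the reproducer below succeeds, but the second run never gets past
mount(): the syscall is restarted over and over with ERESTARTNOINTR.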

Steps to reproduce:
curl -o cgroupns.c https://gist.githubusercontent.com/avagin/f87c8a8bd2a0de9afcc74976327786bc/raw/5843701ef3679f50dd2427cf57a80871082eb28c/gistfile1.txt
gcc cgroupns.c -o cgroupns
./cgroupns
./cgroupns

[root@fc24 ~]# strace -s 256 -fe clone,unshare,setns,mount ./cgroupns 
mount("none", "/tmp/cgroupns.test/zdtmtst", "cgroup", 0, "none,name=zdtmtst") = 0
unshare(CLONE_NEWCGROUP)                = 0
clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fe5da0b89d0) = 529
strace: Process 529 attached
[pid   529] setns(3, CLONE_NEWCGROUP)   = 0
[pid   529] +++ exited with 0 +++
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=529, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
+++ exited with 0 +++
[root@fc24 ~]# strace -s 256 -fe clone,unshare,setns,mount ./cgroupns 
mount("none", "/tmp/cgroupns.test/zdtmtst", "cgroup", 0, "none,name=zdtmtst") = ? ERESTARTNOINTR (To be restarted)
mount("none", "/tmp/cgroupns.test/zdtmtst", "cgroup", 0, "none,name=zdtmtst") = ? ERESTARTNOINTR (To be restarted)
mount("none", "/tmp/cgroupns.test/zdtmtst", "cgroup", 0, "none,name=zdtmtst") = ? ERESTARTNOINTR (To be restarted)
mount("none", "/tmp/cgroupns.test/zdtmtst", "cgroup", 0, "none,name=zdtmtst") = ? ERESTARTNOINTR (To be restarted)
mount("none", "/tmp/cgroupns.test/zdtmtst", "cgroup", 0, "none,name=zdtmtst") = ? ERESTARTNOINTR (To be restarted)
mount("none", "/tmp/cgroupns.test/zdtmtst", "cgroup", 0, "none,name=zdtmtst") = ? ERESTARTNOINTR (To be restarted)
....
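
To make the suspected loop easier to see, here is a minimal userspace
model of the retry check touched by the patch. It is only a sketch, not
kernel code: struct retry_root, model_pin_sb(), must_retry_old(),
must_retry_new() and model_mount() are hypothetical names, and it assumes
that after the first run the cgroup root stays alive while its superblock
is already gone (I have not confirmed what keeps it alive).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel objects; this is not kernel code. */
struct retry_sb { int id; };

struct retry_root {
	struct retry_sb *sb;	/* NULL when the root has no superblock */
	bool live;		/* models percpu_ref_tryget_live() succeeding */
};

/* Models kernfs_pin_sb(): the superblock if one exists, NULL otherwise. */
static struct retry_sb *model_pin_sb(const struct retry_root *root)
{
	return root->sb;
}

/* Pre-patch check: IS_ERR(pinned_sb) || !tryget_live. A NULL sb passes. */
static bool must_retry_old(const struct retry_root *root)
{
	(void)model_pin_sb(root);	/* NULL is not an error pointer */
	return !root->live;
}

/* Patched check: IS_ERR_OR_NULL(pinned_sb) || !tryget_live. */
static bool must_retry_new(const struct retry_root *root)
{
	return model_pin_sb(root) == NULL || !root->live;
}

/*
 * Returns how many restarts were needed before the mount could proceed,
 * or -1 if the check still asked for a retry after max_restarts restarts.
 * Each iteration stands for one msleep() + restart_syscall() round trip.
 */
static int model_mount(const struct retry_root *root, bool patched,
		       int max_restarts)
{
	assert(max_restarts >= 0);
	for (int restarts = 0; restarts <= max_restarts; restarts++) {
		bool retry = patched ? must_retry_new(root)
				     : must_retry_old(root);
		if (!retry)
			return restarts;
	}
	return -1;
}
```

In this model a live root without a superblock mounts immediately with the
old check and never with the patched one, because nothing in the retry
path changes the state it is waiting on.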

Thanks,
Andrei

On Fri, Apr 07, 2017 at 04:51:55PM +0800, Li Zefan wrote:
> Run this:
> 
>     touch file0
>     for ((; ;))
>     {
>         mount -t cpuset xxx file0
>     }
> 
> And this concurrently:
> 
>     touch file1
>     for ((; ;))
>     {
>         mount -t cpuset xxx file1
>     }
> 
> We'll trigger a warning like this:
> 
>  ------------[ cut here ]------------
>  WARNING: CPU: 1 PID: 4675 at lib/percpu-refcount.c:317 percpu_ref_kill_and_confirm+0x92/0xb0
>  percpu_ref_kill_and_confirm called more than once on css_release!
>  CPU: 1 PID: 4675 Comm: mount Not tainted 4.11.0-rc5+ #5
>  Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
>  Call Trace:
>   dump_stack+0x63/0x84
>   __warn+0xd1/0xf0
>   warn_slowpath_fmt+0x5f/0x80
>   percpu_ref_kill_and_confirm+0x92/0xb0
>   cgroup_kill_sb+0x95/0xb0
>   deactivate_locked_super+0x43/0x70
>   deactivate_super+0x46/0x60
>  ...
>  ---[ end trace a79f61c2a2633700 ]---
> 
> Here's a race:
> 
>   Thread A				Thread B
> 
>   cgroup1_mount()
>     # alloc a new cgroup root
>     cgroup_setup_root()
> 					cgroup1_mount()
> 					  # no sb yet, returns NULL
> 					  kernfs_pin_sb()
> 
> 					  # but succeeds in getting the refcnt,
> 					  # so re-use cgroup root
> 					  percpu_ref_tryget_live()
>     # alloc sb with cgroup root
>     cgroup_do_mount()
> 
>   cgroup_kill_sb()
> 					  # alloc another sb with same root
> 					  cgroup_do_mount()
> 
> 					cgroup_kill_sb()
> 
> We end up using the same cgroup root for two different superblocks,
> so percpu_ref_kill() will be called twice on the same root when the
> two superblocks are destroyed.
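
The interleaving above can be replayed in a small single-threaded model.
This is a simplified sketch under my own assumptions, not the kernel's
logic: struct race_root and the model_*() helpers are hypothetical, the
two threads are replayed in the order shown in the diagram, and every
cgroup_kill_sb() is assumed to kill the root's refcount.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a cgroup root; not the kernel structure. */
struct race_root {
	bool has_sb;	/* a superblock has been allocated for this root */
	bool live;	/* models percpu_ref_tryget_live() succeeding */
	int sb_count;	/* superblocks currently attached to the root */
	int kill_calls;	/* how often the refcount kill was requested */
};

/* Thread B's decision whether to re-use the existing root. */
static bool model_reuse_root(const struct race_root *r, bool patched)
{
	bool pinned = r->has_sb;	/* kernfs_pin_sb() != NULL */

	if (patched && !pinned)
		return false;		/* IS_ERR_OR_NULL(): retry instead */
	return r->live;			/* percpu_ref_tryget_live() */
}

/* Models cgroup_do_mount(): attach one more superblock to the root. */
static void model_do_mount(struct race_root *r)
{
	r->has_sb = true;
	r->sb_count++;
}

/* Models cgroup_kill_sb(); returns true if this is a repeated kill. */
static bool model_kill_sb(struct race_root *r)
{
	assert(r->sb_count > 0);
	r->sb_count--;
	r->live = false;
	return ++r->kill_calls > 1;
}

/* Replays the diagram and returns the number of double-kill warnings. */
static int model_race(bool patched)
{
	/* Thread A: cgroup_setup_root(), no superblock yet. */
	struct race_root root = { .has_sb = false, .live = true };
	int warnings = 0;

	/* Thread B looks at the root before A has allocated the sb. */
	bool b_reuses = model_reuse_root(&root, patched);

	model_do_mount(&root);			/* A: cgroup_do_mount() */
	warnings += model_kill_sb(&root);	/* A: cgroup_kill_sb() */

	if (b_reuses) {
		model_do_mount(&root);		/* B: second sb, same root */
		warnings += model_kill_sb(&root); /* B: cgroup_kill_sb() */
	}
	return warnings;
}
```

With the old check the replay produces one repeated kill, which
corresponds to the warning in the trace; with the NULL check thread B
backs off and no second superblock is attached to the root.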
> 
> We should fix this by making sure the superblock pinning really succeeded.
> 
> Cc: stable@vger.kernel.org # 3.16+
> Reported-by: Dmitry Vyukov <dvyukov@google.com>
> Signed-off-by: Zefan Li <lizefan@huawei.com>
> ---
>  kernel/cgroup/cgroup-v1.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/kernel/cgroup/cgroup-v1.c b/kernel/cgroup/cgroup-v1.c
> index 1dc22f6..12e19f0 100644
> --- a/kernel/cgroup/cgroup-v1.c
> +++ b/kernel/cgroup/cgroup-v1.c
> @@ -1146,7 +1146,7 @@ struct dentry *cgroup1_mount(struct file_system_type *fs_type, int flags,
>  		 * path is super cold.  Let's just sleep a bit and retry.
>  		 */
>  		pinned_sb = kernfs_pin_sb(root->kf_root, NULL);
> -		if (IS_ERR(pinned_sb) ||
> +		if (IS_ERR_OR_NULL(pinned_sb) ||
>  		    !percpu_ref_tryget_live(&root->cgrp.self.refcnt)) {
>  			mutex_unlock(&cgroup_mutex);
>  			if (!IS_ERR_OR_NULL(pinned_sb))

Thread overview: 8+ messages
2017-04-07  8:51 [PATCH] " Zefan Li
2017-04-11  0:01 ` Tejun Heo
2017-04-14 23:27 ` Andrei Vagin [this message]
2017-04-14 23:32   ` Andrei Vagin
2017-04-17 10:41     ` Zefan Li
2017-04-18  4:09       ` Andrei Vagin
2017-04-18  6:39       ` Tejun Heo
2017-04-16 15:24   ` Tejun Heo