All of lore.kernel.org
 help / color / mirror / Atom feed
From: Andrei Vagin <avagin@gmail.com>
To: David Howells <dhowells@redhat.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>,
	linux-fsdevel@vger.kernel.org, cgroups@vger.kernel.org,
	Li Zefan <lizefan@huawei.com>
Subject: Re: [PATCH vfs/for-next v2] cgroup: fix top cgroup refcnt leak
Date: Wed, 2 Jan 2019 15:31:19 -0800	[thread overview]
Message-ID: <20190102233118.GA19202@gmail.com> (raw)
In-Reply-To: <20190102230655.GA8616@gmail.com>

On Wed, Jan 02, 2019 at 03:06:55PM -0800, Andrei Vagin wrote:
> On Wed, Jan 02, 2019 at 10:26:20PM +0000, David Howells wrote:
> > Andrei Vagin <avagin@gmail.com> wrote:
> > 
> > > It looks like the c6b3d5bcd67c ("cgroup: fix top cgroup refcnt leak")
> > > commit was reverted by mistake.
> > > 
> > > $ mkdir /tmp/cgroup
> > > $ mkdir /tmp/cgroup2
> > > $ mount -t cgroup -o none,name=test test /tmp/cgroup
> > > $ mount -t cgroup -o none,name=test test /tmp/cgroup2
> > > $ umount /tmp/cgroup
> > > $ umount /tmp/cgroup2
> > > $ cat /proc/self/cgroup | grep test
> > > 12:name=test:/
> > > 
> > > You can see the test cgroup was not freed.
> > > 
> > > Cc: Li Zefan <lizefan@huawei.com>
> > > Fixes: aea3f2676c83 ("kernfs, sysfs, cgroup, intel_rdt: Support fs_context")
> > > Signed-off-by: Andrei Vagin <avagin@gmail.com>
> > 
> > The kernel (Al's for-next branch, that is) seems to work fine without this
> > patch;
> 
> It doesn't work for me. The mount system call stucks in cgroup1_get_tree:

Here is a reproducer for this problem:

[root@fc24 ~]# cat fs-vs-cg 
set -m
d=$(mktemp -d /tmp/cg.XXXXXX)
for i in `seq 2`; do 
	mkdir $d/a$i
	mount -t cgroup -o none,name=xxxxy xxx $d/a$i
done
mkdir -p $d/a1/test
for i in `seq 2`; do 
	umount $d/a$i
done
d=$(mktemp -d /tmp/cg.XXXXXX)
for i in `seq 2`; do 
	mkdir $d/a$i
	mount -t cgroup -o none,name=xxxxy xxx $d/a$i
done
rmdir $d/a1/test
for i in `seq 2`; do 
	umount $d/a$i
done

You need to execute this script twice and it will stuck on a second
attmept.

I tested it on Al's vfs-next without my patch.

You will see this warning in the kernel log:

[   47.043634] ------------[ cut here ]------------
[   47.044522] percpu ref (css_release) <= 0 (0) after switching to atomic
[   47.044544] WARNING: CPU: 1 PID: 0 at lib/percpu-refcount.c:155 percpu_ref_switch_to_atomic_rcu+0x1dc/0x210
[   47.047369] Modules linked in:
[   47.047911] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.20.0-rc1-00071-g1fab5fff0a7a #4

So my patch is probably correct and the problem is in another place.

> 
> [root@fc24 ~]# uname -a
> Linux fc24 4.20.0-rc1-00071-g1fab5fff0a7a #4 SMP Wed Jan 2 14:59:36 PST 2019 x86_64 x86_64 x86_64 GNU/Linux
> 
> [avagin@laptop linux]$ git log | head -n 6
> commit 1fab5fff0a7ae1fa3b78383a78f7a56f03a3d673
> Merge: ea5751ccd665 fd6261f4322c 4addd2640fca a40612ef0ee1 f91528955d00
> Author: Al Viro <viro@zeniv.linux.org.uk>
> Date:   Fri Dec 28 02:05:08 2018 -0500
> 
>     Merge branches 'work.mount', 'work.misc', 'misc.misc' and 'work.iov_iter' into for-next
> 
> 
> $ ps axf
> ...
>   636 ?        S      0:00      \_ sshd: root@pts/0
>   643 pts/0    Ss     0:00          \_ -bash
>   710 pts/0    S      0:00              \_ python test/zdtm.py run -p 4 --keep-going --report report -T .*cgroup.* --ignore-taint
>  1449 pts/0    S      0:00              |   \_ flock zdtm_mount_cgroups.lock ./zdtm_umount_cgroups
>  1450 pts/0    S      0:00              |       \_ /bin/sh ./zdtm_umount_cgroups
>  1456 pts/0    D      0:00              |           \_ mount -t cgroup -o none,name=zdtmtst.defaultroot zdtm zdtm.EW1qB8
> 
> 
> [root@fc24 criu]# cat /proc/1456/stack 
> [<0>] msleep+0x38/0x40
> [<0>] cgroup1_get_tree+0x47b/0x76b
> [<0>] vfs_get_tree+0x3d/0x100
> [<0>] do_mount+0x2d8/0xde0
> [<0>] ksys_mount+0xba/0xd0
> [<0>] __x64_sys_mount+0x21/0x30
> [<0>] do_syscall_64+0x5a/0x200
> [<0>] entry_SYSCALL_64_after_hwframe+0x49/0xbe
> [<0>] 0xffffffffffffffff
> 
> > this patch causes the kernel to go bang (trace below).
> > 
> > David
> > ---
> > percpu ref (css_release) <= 0 (0) after switching to atomic
> 
> I have updated to Al's vfs-next and now I see this error too.
> 
> I tested by patch on 40effd960bec ("Merge branches 'work.mount', 'work.misc', 'misc.misc' and 'work.iov_iter' into for-next")
> and it works fine there...
> 
> 
> > WARNING: CPU: 1 PID: 0 at lib/percpu-refcount.c:155 percpu_ref_switch_to_atomic_rcu+0x90/0x1a0
> > Modules linked in:
> > CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.20.0-rc1-fscache+ #1256
> > Hardware name: ASUS All Series/H97-PLUS, BIOS 2306 10/09/2014
> > RIP: 0010:percpu_ref_switch_to_atomic_rcu+0x90/0x1a0
> > Code: d8 48 85 c0 7f 26 80 3d 29 60 ff 00 00 75 1d 48 8b 53 d8 48 c7 c7 00 3c 16 82 c6 05 15 60 ff 00 01 48 8b 73 e8 e8 26 1f ac ff <0f> 0b 48 8d 43 d8 48 89 c7 49 89 c6 ff 53 f0 48 c7 43 f0 00 00 00
> > RSP: 0018:ffff8800c6c83ed8 EFLAGS: 00010286
> > RAX: 0000000000000000 RBX: ffff8800d3a6e048 RCX: ffff8800c6c83dc4
> > RDX: 0000000000000003 RSI: ffff8800c5c9cb08 RDI: ffff8800c5c9c340
> > RBP: ffff8800c6c83ef8 R08: 000000000000003b R09: 0000000000021900
> > R10: ffff8800c6c83b00 R11: 0000004ac8b230c2 R12: 000060ff39019cd0
> > R13: ffffffff825b0be8 R14: ffffffffffffffff R15: 0000000000000009
> > FS:  0000000000000000(0000) GS:ffff8800c6c80000(0000) knlGS:0000000000000000
> > CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > CR2: 00007f33b5ad1f54 CR3: 0000000002410001 CR4: 00000000001606e0
> > Call Trace:
> >  <IRQ>
> >  ? rcu_process_callbacks+0x469/0x6de
> >  ? percpu_ref_exit+0x26/0x26
> >  rcu_process_callbacks+0x4d7/0x6de
> >  __do_softirq+0x1a5/0x38f
> >  irq_exit+0x63/0xd1
> >  smp_apic_timer_interrupt+0x1cd/0x1e0
> >  apic_timer_interrupt+0xf/0x20
> >  </IRQ>
> > RIP: 0010:cpuidle_enter_state+0x24e/0x2b1
> > Code: ff e8 7b be 8a ff 45 84 ed 74 17 9c 58 0f ba e0 09 73 08 0f 0b fa e8 63 8d 94 ff 31 ff e8 14 af 8f ff e8 cd 8b 94 ff fb 85 db <78> 47 48 8b 14 24 b8 ff ff ff 7f 48 b9 ff ff ff ff f3 01 00 00 48
> > RSP: 0018:ffff8800c5d23ea0 EFLAGS: 00000206 ORIG_RAX: ffffffffffffff13
> > RAX: ffff8800c5c9c340 RBX: 0000000000000005 RCX: 000000000000001f
> > RDX: 0000000000000046 RSI: 0000000000000001 RDI: ffff8800c5c9c340
> > RBP: 0000000000000005 R08: 0000000000000002 R09: 0000000000021900
> > R10: 071c71c71c71c71c R11: 0000004ac7f2c3ed R12: ffffe8ffffc89ed0
> > R13: 0000000000000000 R14: ffffffff8251d478 R15: 0000000000000000
> >  do_idle+0x163/0x1ea
> >  cpu_startup_entry+0x1d/0x1f
> >  start_secondary+0x175/0x190
> >  secondary_startup_64+0xa4/0xb0
> > irq event stamp: 99625
> > hardirqs last  enabled at (99624): [<ffffffff810b18a5>] vprintk_emit+0xe6/0x24a
> > hardirqs last disabled at (99625): [<ffffffff81001639>] trace_hardirqs_off_thunk+0x1a/0x1c
> > softirqs last  enabled at (99590): [<ffffffff8105eba9>] irq_enter+0x42/0x5d
> > softirqs last disabled at (99591): [<ffffffff8105ec27>] irq_exit+0x63/0xd1
> > ---[ end trace f59fd95ebc091779 ]---

  reply	other threads:[~2019-01-02 23:31 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-12-28 23:59 [PATCH] cgroup: fix top cgroup refcnt leak Andrei Vagin
2018-12-29  0:04 ` [PATCH vfs/for-next v2] " Andrei Vagin
2018-12-30 19:41   ` Andrei Vagin
2019-01-02  2:28   ` Al Viro
2019-01-02 18:14     ` [PATCH vfs/for-next v3] " Andrei Vagin
2019-01-02 19:37     ` [PATCH vfs/for-next v2] " Andrei Vagin
2019-01-02 19:37     ` [PATCH vfs/for-next v4] " Andrei Vagin
2019-01-02 20:02       ` Al Viro
2019-01-02 21:06         ` Andrei Vagin
2019-01-03  0:26         ` David Howells
2019-01-03  0:43           ` Andrei Vagin
2019-01-03  1:00             ` Andrei Vagin
2019-01-03  3:54               ` [PATCH vfs/for-next v6] " Andrei Vagin
2019-01-03  8:32                 ` Al Viro
2019-01-03 17:34                   ` Andrei Vagin
2019-01-03 21:54                   ` David Howells
2019-01-02 22:26 ` [PATCH vfs/for-next v2] " David Howells
2019-01-02 23:06   ` Andrei Vagin
2019-01-02 23:31     ` Andrei Vagin [this message]
2019-01-03  0:33     ` David Howells
2019-01-03 13:41     ` David Howells
2019-01-03 13:42     ` David Howells
2019-01-03 15:27     ` David Howells
2019-01-03 15:44     ` David Howells

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190102233118.GA19202@gmail.com \
    --to=avagin@gmail.com \
    --cc=cgroups@vger.kernel.org \
    --cc=dhowells@redhat.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=lizefan@huawei.com \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.