From: Al Viro <viro@zeniv.linux.org.uk>
To: syzbot <syzbot+5b1e53987f858500ec00@syzkaller.appspotmail.com>
Cc: hdanton@sina.com, linux-kernel@vger.kernel.org,
	syzkaller-bugs@googlegroups.com
Subject: Re: [syzbot] WARNING in mntput_no_expire (3)
Date: Wed, 18 May 2022 04:57:46 +0000
Message-ID: <YoR8yrgorv8QssX6@zeniv-ca.linux.org.uk>
In-Reply-To: <YoR4XSN2fn2BjkXw@zeniv-ca.linux.org.uk>

On Wed, May 18, 2022 at 04:38:53AM +0000, Al Viro wrote:
> On Wed, May 18, 2022 at 01:58:40AM +0000, Al Viro wrote:
> > On Wed, May 18, 2022 at 01:10:20AM +0000, Al Viro wrote:
> > > On Wed, May 18, 2022 at 12:59:46AM +0000, Al Viro wrote:
> > > > On Tue, May 17, 2022 at 10:58:15PM +0000, Al Viro wrote:
> > > > > On Tue, May 17, 2022 at 03:49:07PM -0700, syzbot wrote:
> > > > > > Hello,
> > > > > > 
> > > > > > syzbot has tested the proposed patch but the reproducer is still triggering an issue:
> > > > > > WARNING in mntput_no_expire
> > > > > 
> > > > > Obvious question: which filesystem it is?
> > > > 
> > > > FWIW, can't reproduce here - at least not with C reproducer +
> > > > -rc7^ kernel + .config from report + debian kvm image (bullseye,
> > > > with systemd shite replaced with sysvinit, which might be relevant).
> > > > 
> > > > In case systemd-specific braindamage is needed to reproduce it...
> > > > Hell knows; at least mount --make-rshared / doesn't seem to suffice.
> > > 
> > > ... doesn't reproduce with genuine systemd either.  FWIW, 4-way SMP
> > > setup here.
> > 
> > OK, reproduced...
> 
> FWIW, it smells like something (cgroup?) fucking up percpu allocation/freeing.
> Note that struct mount has both refcount and writers count held in percpu;
> replacing the refcount with atomic_t gets rid of seeing negative refcount
> in mntput_no_expire(), but leaves negative writers count caught in
> cleanup_mnt(); turn that from WARN_ON into printk and we get past that,
> only to see
> 	percpu ref (css_release) <= 0 (-4294967294)
> immediately afterwards.
> 
> IOW, it looks like we are getting not messed refcounting on either side,
> but same refcount physically shared by unrelated objects.

Gotcha.
percpu_ref_init():
        ref->percpu_count_ptr = (unsigned long)
                __alloc_percpu_gfp(sizeof(unsigned long), align, gfp);
        if (!ref->percpu_count_ptr)
                return -ENOMEM;
        data = kzalloc(sizeof(*ref->data), gfp);
        if (!data) {
                free_percpu((void __percpu *)ref->percpu_count_ptr);
                return -ENOMEM;
        }

cgroup_create():
        err = percpu_ref_init(&css->refcnt, css_release, 0, GFP_KERNEL);
        if (err)
                goto err_free_css;

        err = cgroup_idr_alloc(&ss->css_idr, NULL, 2, 0, GFP_KERNEL);
        if (err < 0)
                goto err_free_css;

Now note that we end up on the same path (err_free_css) both when
percpu_ref_init() has failed and when it has succeeded but
cgroup_idr_alloc() has failed afterwards, with no way to tell whether
css->refcnt.percpu_count_ptr points to an already freed object or to one
that still needs to be freed.  And sure enough, we have

err_free_css:
        list_del_rcu(&css->rstat_css_node);
        INIT_RCU_WORK(&css->destroy_rwork, css_free_rwork_fn);
        queue_rcu_work(cgroup_destroy_wq, &css->destroy_rwork);

with css_free_rwork_fn() starting with
        percpu_ref_exit(&css->refcnt);

which will give that double free whenever percpu_ref_init() has failed:
once in its own error path, once more in percpu_ref_exit().  That might
not be the only cause of trouble, but this looks like a bug and a
plausible source of the symptoms observed here.  Let's see if this helps:

diff --git a/lib/percpu-refcount.c b/lib/percpu-refcount.c
index af9302141bcf..e5c5315da274 100644
--- a/lib/percpu-refcount.c
+++ b/lib/percpu-refcount.c
@@ -76,6 +76,7 @@ int percpu_ref_init(struct percpu_ref *ref, percpu_ref_func_t *release,
 	data = kzalloc(sizeof(*ref->data), gfp);
 	if (!data) {
 		free_percpu((void __percpu *)ref->percpu_count_ptr);
+		ref->percpu_count_ptr = 0;
 		return -ENOMEM;
 	}
 

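For illustration only, here is a minimal userspace sketch of the failure
mode described above; it is not kernel code.  Every toy_* name, the slot
allocator and the clear_on_error switch are hypothetical stand-ins.  It
also assumes that the exit path skips the free when the recorded pointer
is zero, which is what the one-line patch relies on.  The sketch shows how
a stale pointer left behind by a failed init lets the deferred cleanup
free a counter that an unrelated object has since been given.

```c
#include <assert.h>

/* Toy first-fit allocator: handle h (1-based) is live iff toy_live[h - 1]. */
#define TOY_SLOTS 4
static int toy_live[TOY_SLOTS];

static int toy_alloc(void)
{
	for (int i = 0; i < TOY_SLOTS; i++) {
		if (!toy_live[i]) {
			toy_live[i] = 1;
			return i + 1;
		}
	}
	return 0;	/* out of slots */
}

static void toy_free(int handle)
{
	assert(handle >= 1 && handle <= TOY_SLOTS);
	toy_live[handle - 1] = 0;
}

/* Hypothetical stand-in for struct percpu_ref; counter == 0 means "none". */
struct toy_ref {
	int counter;
};

/* Shaped like percpu_ref_init() with the second allocation forced to fail. */
static int toy_ref_init_failing(struct toy_ref *ref, int clear_on_error)
{
	ref->counter = toy_alloc();
	if (!ref->counter)
		return -1;
	/* pretend the allocation of ref->data failed here */
	toy_free(ref->counter);
	if (clear_on_error)
		ref->counter = 0;	/* analogue of the one-line fix */
	return -1;
}

/* Shaped like the exit path: release the counter only if one is recorded. */
static void toy_ref_exit(struct toy_ref *ref)
{
	if (!ref->counter)
		return;
	toy_free(ref->counter);
	ref->counter = 0;
}

/*
 * Returns 1 if an unrelated owner still holds its counter after the
 * deferred cleanup of a failed ref has run, 0 if the cleanup freed it.
 */
static int toy_bystander_survives(int clear_on_error)
{
	struct toy_ref failed;
	int bystander;

	for (int i = 0; i < TOY_SLOTS; i++)
		toy_live[i] = 0;

	toy_ref_init_failing(&failed, clear_on_error);
	bystander = toy_alloc();	/* reuses the slot just freed */
	toy_ref_exit(&failed);		/* deferred cleanup, as in err_free_css */
	return toy_live[bystander - 1];
}
```

Under these assumptions toy_bystander_survives(0) returns 0: the stale
handle makes the cleanup release the bystander's slot, so the next
allocation would hand the same counter to a third owner.  With the
pointer cleared on error, toy_bystander_survives(1) returns 1.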