All of lore.kernel.org
 help / color / mirror / Atom feed
From: Al Viro <viro@zeniv.linux.org.uk>
To: Christian Brauner <christian.brauner@ubuntu.com>
Cc: Jens Axboe <axboe@kernel.dk>,
	syzbot <syzbot+c88a7030da47945a3cc3@syzkaller.appspotmail.com>,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	syzkaller-bugs@googlegroups.com, io-uring@vger.kernel.org
Subject: Re: [syzbot] WARNING in mntput_no_expire (2)
Date: Mon, 5 Apr 2021 16:18:58 +0000	[thread overview]
Message-ID: <YGs4clcRhyoXX8D0@zeniv-ca.linux.org.uk> (raw)
In-Reply-To: <20210405114437.hjcojekyp5zt6huu@wittgenstein>

On Mon, Apr 05, 2021 at 01:44:37PM +0200, Christian Brauner wrote:
> On Sun, Apr 04, 2021 at 08:17:21PM +0000, Al Viro wrote:
> > On Sun, Apr 04, 2021 at 06:50:10PM +0000, Al Viro wrote:
> > 
> > > > Yeah, I have at least namei.o
> > > > 
> > > > https://drive.google.com/file/d/1AvO1St0YltIrA86DXjp1Xg3ojtS9owGh/view?usp=sharing
> > > 
> > > *grumble*
> > > 
> > > Is it reproducible without KASAN?  Would be much easier to follow the produced
> > > asm...
> > 
> > 	Looks like inode_permission(_, NULL, _) from may_lookup(nd).  I.e.
> > nd->inode == NULL.
> 
> Yeah, I already saw that.
> 
> > 
> > 	Mind slapping BUG_ON(!nd->inode) right before may_lookup() call in
> > link_path_walk() and trying to reproduce that oops?
> 
> Yep, no problem. If you run the reproducer in a loop for a little while
> you eventually trigger the BUG_ON() and then you get the following splat
> (and then an endless loop) in [1] with nd->inode NULL.
> 
> _But_ I managed to debug this further and was able to trigger the BUG_ON()
> directly in path_init() in the AT_FDCWD branch (after all its AT_FDCWD(./file0)
> with the patch in [3] (it's in LOOKUP_RCU) the corresponding splat is in [2].
> So the crash happens for a PF_IO_WORKER thread with a NULL nd->inode for the
> PF_IO_WORKER's pwd (The PF_IO_WORKER seems to be in async context.).

So we find current->fs->pwd.dentry negative, with current->fs->seq sampled
equal before and after that?  Lovely...  The only places where we assign
anything to ->pwd.dentry are
void set_fs_pwd(struct fs_struct *fs, const struct path *path)
{
        struct path old_pwd; 

        path_get(path);
        spin_lock(&fs->lock);
        write_seqcount_begin(&fs->seq);
        old_pwd = fs->pwd;
        fs->pwd = *path;
        write_seqcount_end(&fs->seq);
        spin_unlock(&fs->lock);

        if (old_pwd.dentry)
                path_put(&old_pwd);
}
where we have ->seq bumped between dget new/assignment/ dput old,
copy_fs_struct() where we have
                spin_lock(&old->lock);
                fs->root = old->root;
                path_get(&fs->root);
                fs->pwd = old->pwd;
                path_get(&fs->pwd);
                spin_unlock(&old->lock);
fs being freshly allocated instance that couldn't have been observed
by anyone and chroot_fs_refs(), where we have
                        spin_lock(&fs->lock);
                        write_seqcount_begin(&fs->seq);
                        hits += replace_path(&fs->root, old_root, new_root);
                        hits += replace_path(&fs->pwd, old_root, new_root);
                        write_seqcount_end(&fs->seq);
                        while (hits--) {
                                count++;
                                path_get(new_root);
                        }
                        spin_unlock(&fs->lock);
...
static inline int replace_path(struct path *p, const struct path *old, const struct path *new)
{
        if (likely(p->dentry != old->dentry || p->mnt != old->mnt))
                return 0;
        *p = *new;
        return 1;
}
Here we have new_root->dentry pinned from the very beginning,
and assignments are wrapped into bumps of ->seq.  Moreover,
we are holding ->lock through that sequence (as all writers
do), so these references can't be dropped before path_get()
bumps new_root->dentry refcount.

chroot_fs_refs() is called only by pivot_root(2):
        chroot_fs_refs(&root, &new);
and there new is set by
        error = user_path_at(AT_FDCWD, new_root,
                             LOOKUP_FOLLOW | LOOKUP_DIRECTORY, &new);
        if (error)
                goto out0;
which pins new.dentry *and* verifies that it's positive and a directory,
at that.  Since pinned positive dentry can't be made negative by anybody
else, we know it will remain in that state until
	path_put(&new);
well downstream of chroot_fs_refs().  In copy_fs_struct() we are
copying someone's ->pwd, so it's also pinned positive.  And it
won't be dropped outside of old->lock, so by the time somebody
manages to drop the reference in old, path_get() effects will be
visible (old->lock serving as a barrier).

That leaves set_fs_pwd() calls:
fs/init.c:54:           set_fs_pwd(current->fs, &path);
	init_chdir(), path set by LOOKUP_DIRECTORY patwalk.  Pinned positive.
fs/namespace.c:4207:    set_fs_pwd(current->fs, &root);
	init_mount_tree(), root.dentry being ->mnt_root of rootfs.  Pinned
positive (and it would've oopsed much earlier had that been it)
fs/namespace.c:4485:    set_fs_pwd(fs, &root);
	mntns_install(), root filled by successful LOOKUP_DOWN for "/"
from mnt_ns->root.  Should be pinned positive.
fs/open.c:501:  set_fs_pwd(current->fs, &path);
	chdir(2), path set by LOOKUP_DIRECTORY pathwalk.  Pinned positive.
fs/open.c:528:          set_fs_pwd(current->fs, &f.file->f_path);
	fchdir(2), file->f_path of any opened file.  Pinned positive.
kernel/usermode_driver.c:130:   set_fs_pwd(current->fs, &umd_info->wd);
	umd_setup(), ->wd.dentry equal to ->wd.mnt->mnt_root, should be pinned positive.
kernel/nsproxy.c:509:           set_fs_pwd(me->fs, &nsset->fs->pwd);
	commit_nsset().  Let's see what's going on there...

        if ((flags & CLONE_NEWNS) && (flags & ~CLONE_NEWNS)) {
                set_fs_root(me->fs, &nsset->fs->root);
                set_fs_pwd(me->fs, &nsset->fs->pwd);
        }
In those conditions nsset.fs has come from copy_fs_struct() done in
prepare_nsset().  And the only thing that might've been done to it
would be those set_fs_pwd() in mntns_install() (I'm not fond of the
entire nsset->fs thing - looks like papering over bad calling
conventions, but anyway)

Now, I might've missed some insanity (direct assignments to ->pwd.dentry,
etc. - wouldn't be the first time io_uring folks went "layering? wassat?
we'll just poke in whatever we can reach"), but I don't see anything
obvious of that sort in the area...

OK, how about this: in path_init(), right after
                        do {
                                seq = read_seqcount_begin(&fs->seq);
                                nd->path = fs->pwd;
                                nd->inode = nd->path.dentry->d_inode;
                                nd->seq = __read_seqcount_begin(&nd->path.dentry->d_seq);
                        } while (read_seqcount_retry(&fs->seq, seq));
slap
			if (!nd->inode) {
				// should never, ever happen
				struct dentry *fucked = nd->path.dentry;
				printk(KERN_ERR "%pd4 %d %x %p %d %d", fucked, d_count(fucked),
							fucked->d_flags, fs, fs->users, seq);
				BUG_ON(1);
				return ERR_PTR(-EINVAL);
			}
and see what it catches?

  reply	other threads:[~2021-04-05 16:19 UTC|newest]

Thread overview: 34+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-04-01  9:09 [syzbot] WARNING in mntput_no_expire (2) syzbot
2021-04-01 12:16 ` syzbot
2021-04-01 18:32   ` Christian Brauner
2021-04-01 15:45 ` Christian Brauner
2021-04-01 16:09   ` Jens Axboe
2021-04-01 17:46     ` Christian Brauner
2021-04-01 17:59       ` Christian Brauner
2021-04-01 19:11         ` Al Viro
2021-04-04  2:34           ` Al Viro
2021-04-04  2:38             ` Al Viro
2021-04-04 11:34             ` Christian Brauner
2021-04-04 15:56               ` Al Viro
2021-04-04 16:40                 ` Christian Brauner
2021-04-04 16:44                   ` Al Viro
2021-04-04 17:05                     ` Christian Brauner
2021-04-04 18:50                       ` Al Viro
2021-04-04 20:17                         ` Al Viro
2021-04-05 11:44                           ` Christian Brauner
2021-04-05 16:18                             ` Al Viro [this message]
2021-04-05 17:08                               ` Christian Brauner
2021-04-05 18:23                                 ` Al Viro
2021-04-05 18:28                                   ` Al Viro
2021-04-05 20:07                                     ` Christian Brauner
2021-04-06  1:38                                       ` Al Viro
2021-04-06  2:24                                         ` Al Viro
2021-04-06 12:35                                         ` Christian Brauner
2021-04-06 13:13                                           ` Al Viro
2021-04-06 13:22                                             ` Christian Brauner
2021-04-06 14:15                                               ` Al Viro
2021-04-06 14:23                                                 ` Al Viro
2021-04-06 15:37                                                   ` Jens Axboe
2021-04-06 14:46                                                 ` Christian Brauner
2021-04-04 16:52                   ` Christian Brauner
2021-04-04 16:55                     ` Christian Brauner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YGs4clcRhyoXX8D0@zeniv-ca.linux.org.uk \
    --to=viro@zeniv.linux.org.uk \
    --cc=axboe@kernel.dk \
    --cc=christian.brauner@ubuntu.com \
    --cc=io-uring@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=syzbot+c88a7030da47945a3cc3@syzkaller.appspotmail.com \
    --cc=syzkaller-bugs@googlegroups.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.