From: Al Viro <viro@ZenIV.linux.org.uk>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Patrick McLean <chutzpah@gentoo.org>,
	Bruce Fields <bfields@redhat.com>,
	"Darrick J. Wong" <darrick.wong@oracle.com>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Linux NFS Mailing List <linux-nfs@vger.kernel.org>,
	stable <stable@vger.kernel.org>,
	Thorsten Leemhuis <regressions@leemhuis.info>
Subject: Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11
Date: Thu, 9 Nov 2017 19:37:15 +0000
Message-ID: <20171109193715.GB21978@ZenIV.linux.org.uk>
In-Reply-To: <CA+55aFzGDyeJctD5Y3paBnysWXbA0cMF1_7mvvzG3n2OAnNhHw@mail.gmail.com>

On Wed, Nov 08, 2017 at 06:40:22PM -0800, Linus Torvalds wrote:

> > Here is the BUG we are getting:
> >> [   58.962528] BUG: unable to handle kernel NULL pointer dereference at 0000000000000230
> >> [   58.963918] IP: vfs_statfs+0x73/0xb0
> 
> The code disassembles to

>   2a:* 48 8b b7 30 02 00 00 mov    0x230(%rdi),%rsi <-- trapping instruction

> that matters (and that traps) but I'm almost certain that it's the
> "mnt->mnt_sb->s_flags" loading that is part of calculate_f_flags()
> when it then does
> 
>      flags_by_sb(mnt->mnt_sb->s_flags);
> 
> and I think mnt->mnt_sb is NULL. We know it's not 'mnt' itself that is
> NULL, because we wouldn't have gotten this far if it was.
> 
> Now, afaik, mnt->mnt_sb should never be NULL in the first place for a
> proper path. And the vfs_statfs() code itself hasn't changed in a
> while.
> 
> Which does seem to implicate nfsd as having passed in a bad path to
> vfs_statfs(). But I'm not seeing any changes in nfsd either.

It definitely is NULL mnt->mnt_sb and that should never happen.  All
struct mount instances are allocated by alloc_vfsmnt().  Its callers
are
	* vfs_kern_mount().  Assigns ->mnt_sb to root->d_sb before
anyone else sees the address of that object.
	* clone_mnt().  Assigns ->mnt_sb to that of preexisting instance
before anyone else sees the address of that object.

No other callers exist and no other places ever modify the value of that
field.

All instances of struct dentry are created by __d_alloc()[*], which assigns
->d_sb (never to be modified afterwards) *and* dereferences the pointer
it has stored in ->d_sb before the created struct dentry becomes visible
to anyone else.  No struct dentry should ever be observed with NULL ->d_sb;
the only way to get that is memory corruption or looking at freed instance
after its memory has been reused for something else and zeroed.

In other words, we should never observe a struct mount with NULL ->mnt.mnt_sb -
not without memory corruption or looking at freed instance.

The pointer in that case should've come from exp->ex_path.mnt, exp being
the argument of nfsd4_encode_fattr().  Sure, it might have been a dangling
reference.  However, it looks a lot more like a memory corruptor *OR*
miscompiled kernel.

What kind of load do the reproducer boxen have and how fast does that
bug trigger?  Would it be possible to slap something like
	if (unlikely(!exp->ex_path.mnt->mnt_sb)) {
		struct mount *m = real_mount(exp->ex_path.mnt);
		printk(KERN_ERR "mnt: %p\n", exp->ex_path.mnt);
		printk(KERN_ERR "name: [%s]\n", m->mnt_devname);
		printk(KERN_ERR "ns: [%p]\n", m->mnt_ns);
		printk(KERN_ERR "parent: [%p]\n", m->mnt_parent);
		WARN_ON(1);
		err = -EINVAL;
		goto out_nfserr;
	}
in the beginning of nfsd4_encode_fattr() (with include of ../mount.h added
in fs/nfsd/nfs4xdr.c) and see what it catches?

Both with and without randomized structs, if possible - I might be barking
up the wrong tree, but IMO the very first step in localizing that crap is
to find out whether it's toolchain-related or not.

[*] strictly speaking, there is one exception - lib/test_printf.c has
four static struct dentry instances.  No chance of those being returned
by any ->mount() instance, though.

