From: Al Viro <viro@ZenIV.linux.org.uk>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Patrick McLean <chutzpah@gentoo.org>,
    Bruce Fields <bfields@redhat.com>,
    "Darrick J. Wong" <darrick.wong@oracle.com>,
    Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
    Linux NFS Mailing List <linux-nfs@vger.kernel.org>,
    stable <stable@vger.kernel.org>,
    Thorsten Leemhuis <regressions@leemhuis.info>
Subject: Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11
Date: Thu, 9 Nov 2017 03:45:07 +0000
Message-ID: <20171109034507.GZ21978@ZenIV.linux.org.uk>
In-Reply-To: <CA+55aFzGDyeJctD5Y3paBnysWXbA0cMF1_7mvvzG3n2OAnNhHw@mail.gmail.com>
On Wed, Nov 08, 2017 at 06:40:22PM -0800, Linus Torvalds wrote:
> > Here is the BUG we are getting:
> >> [ 58.962528] BUG: unable to handle kernel NULL pointer dereference at 0000000000000230
> >> [ 58.963918] IP: vfs_statfs+0x73/0xb0
>
> The code disassembles to
>
>    0:   83 c9 08                or     $0x8,%ecx
>    3:   40 f6 c6 04             test   $0x4,%sil
>    7:   0f 45 d1                cmovne %ecx,%edx
>    a:   89 d1                   mov    %edx,%ecx
>    c:   80 cd 04                or     $0x4,%ch
>    f:   40 f6 c6 08             test   $0x8,%sil
>   13:   0f 45 d1                cmovne %ecx,%edx
>   16:   89 d1                   mov    %edx,%ecx
>   18:   80 cd 08                or     $0x8,%ch
>   1b:   40 f6 c6 10             test   $0x10,%sil
>   1f:   0f 45 d1                cmovne %ecx,%edx
>   22:   89 d1                   mov    %edx,%ecx
>   24:   80 cd 10                or     $0x10,%ch
>   27:   83 e6 20                and    $0x20,%esi
>   2a:*  48 8b b7 30 02 00 00    mov    0x230(%rdi),%rsi    <-- trapping instruction
>   31:   0f 45 d1                cmovne %ecx,%edx
>   34:   83 ca 20                or     $0x20,%edx
>   37:   89 f1                   mov    %esi,%ecx
>   39:   83 e1 10                and    $0x10,%ecx
>   3c:   89 cf                   mov    %ecx,%edi
>
> and all those odd cmovne and bit-ops are just the bit selection code
> in flags_by_mnt(), which is inlined through calculate_f_flags (which
> is _also_ inlined) into vfs_statfs().
>
> Sadly, gcc makes a mess of it and actually generates code that looks
> like the original C. I would have hoped that gcc could have turned
>
>         if (x & BIT)
>                 y |= OTHER_BIT;
>
> into
>
>         y |= (x & BIT) shifted-by-the-bit-difference-between BIT/OTHER_BIT;
>
> but that doesn't happen. We actually do it by hand in some other more
> critical places, but it's painful to do by hand (because the shift
> direction/amount is not trivial to do in C).
>
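
For reference, the by-hand form described in the quoted text can be
sketched roughly as below. This is only an illustration under stated
assumptions: SRC_BIT, DST_BIT and MOVE_BIT are hypothetical names and
values, not the flags or helpers actually used in flags_by_mnt(). The
multiply/divide by the ratio of two single-bit constants stands in for
the shift, which sidesteps having to pick the shift direction by hand.

```c
#include <assert.h>

/* Hypothetical flag values, chosen only for illustration. */
#define SRC_BIT 0x04u	/* bit 2 of the input word */
#define DST_BIT 0x400u	/* bit 10 of the output word */

/* Branchy form, matching the shape of the original C. */
static unsigned int translate_branchy(unsigned int x)
{
	unsigned int y = 0;

	if (x & SRC_BIT)
		y |= DST_BIT;
	return y;
}

/*
 * Branchless form: isolate the source bit, then scale it to the
 * destination position. Both arguments must be single-bit constants,
 * so the multiply or divide by their ratio amounts to a constant shift.
 */
#define MOVE_BIT(x, from, to) \
	((from) <= (to) ? ((x) & (from)) * ((to) / (from)) \
			: ((x) & (from)) / ((from) / (to)))

static unsigned int translate_branchless(unsigned int x)
{
	return MOVE_BIT(x, SRC_BIT, DST_BIT);
}
```

Both functions should agree for every input; with several flags the
branchless results can simply be OR-ed together.
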
> Anyway, that cmovne noise makes it a bit hard to see the actual part
> that matters (and that traps) but I'm almost certain that it's the
> "mnt->mnt_sb->s_flags" loading that is part of calculate_f_flags()
> when it then does
>
>         flags_by_sb(mnt->mnt_sb->s_flags);
>
> and I think mnt->mnt_sb is NULL. We know it's not 'mnt' itself that is
Interesting...
struct super_block {
	struct list_head	s_list;		/* Keep this first */
	dev_t			s_dev;		/* search index; _not_ kdev_t */
	unsigned char		s_blocksize_bits;
	unsigned long		s_blocksize;
	loff_t			s_maxbytes;	/* Max file size */
	struct file_system_type	*s_type;
	const struct super_operations	*s_op;
	const struct dquot_operations	*dq_op;
	const struct quotactl_ops	*s_qcop;
	const struct export_operations	*s_export_op;
	unsigned long		s_flags;
	...
s_flags is preceded by list_head, u32, unsigned char, 2 u64 and 5 pointers.
IOW, 10 64bit words (list_head takes 2; u32 and unsigned char share 1).
And sure enough, amd64 builds here have
	mov 0x50(%rdi),%rsi
in the corresponding place. What config and toolchain had produced that?
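
The word count above can be checked with offsetof() on a cut-down copy
of the structure. A minimal sketch, assuming an LP64 ABI such as x86-64;
the *_sketch types are stand-ins that only reproduce the sizes and
alignment of the kernel types, not their real definitions:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for struct list_head: two pointers. */
struct list_head_sketch {
	void *next, *prev;
};

/* Only the leading fields, in their declared (unrandomized) order. */
struct super_block_sketch {
	struct list_head_sketch s_list;	/* 2 words */
	uint32_t s_dev;			/* dev_t is a u32 */
	unsigned char s_blocksize_bits;	/* shares a word with s_dev */
	unsigned long s_blocksize;
	int64_t s_maxbytes;		/* loff_t */
	void *s_type;
	const void *s_op;
	const void *dq_op;
	const void *s_qcop;
	const void *s_export_op;
	unsigned long s_flags;
};

/* Byte offset of s_flags when the declared field order is kept. */
static size_t s_flags_offset(void)
{
	return offsetof(struct super_block_sketch, s_flags);
}
```

On such a target this comes out as 0x50, so a displacement of 0x230 in
the trapping instruction means the field order in that build differs
from the declared one.
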
I would definitely start with turning the randomize crap off, just to
exclude the compiler weirdness. Incidentally, randomizing anything that
contains a hash chain and key... super_block is not the worst here -
struct dentry is the clear "winner". Anything in
struct dentry {
	/* RCU lookup touched fields */
	unsigned int d_flags;		/* protected by d_lock */
	seqcount_t d_seq;		/* per dentry seqlock */
	struct hlist_bl_node d_hash;	/* lookup hash list */
	struct dentry *d_parent;	/* parent directory */
	struct qstr d_name;
	struct inode *d_inode;		/* Where the name belongs to - NULL is
					 * negative */
moving into a separate cache line and we've just doubled the cache
footprint of hash chain traversal.
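
To make the cache line point concrete, here is a rough sketch of the
declared layout of those fields. It assumes an LP64 target, 64-byte
cache lines, an object that starts on a line boundary, and no debug
options (so seqcount_t is a bare counter); the *_sketch types are
simplified stand-ins, not the kernel definitions:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE 64	/* assumed line size, typical for x86-64 */

/* Stand-in for struct hlist_bl_node: two pointers. */
struct hlist_bl_node_sketch {
	void *next, **pprev;
};

/* Stand-in for struct qstr: packed hash/length plus name pointer. */
struct qstr_sketch {
	uint64_t hash_len;
	const unsigned char *name;
};

struct dentry_sketch {
	unsigned int d_flags;
	unsigned int d_seq;	/* seqcount_t without debug fields */
	struct hlist_bl_node_sketch d_hash;
	struct dentry_sketch *d_parent;
	struct qstr_sketch d_name;
	void *d_inode;
	/* ... remaining fields omitted ... */
};

/* Index of the cache line holding the byte at the given offset. */
static size_t cache_line_of(size_t offset)
{
	return offset / CACHE_LINE;
}

/* Offset of the last byte of the lookup fields, in declared order. */
static size_t lookup_fields_end(void)
{
	return offsetof(struct dentry_sketch, d_inode) + sizeof(void *) - 1;
}
```

Under those assumptions the lookup fields end at byte 55, inside the
first line. Move any one of them past byte 63 and every step of the
chain walk touches two lines instead of one.
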
How much reordering does that gcc misfeature do and why do we enable
that in the first place?