From: Arnd Bergmann <arnd@arndb.de>
To: Patrick McLean <chutzpah@gentoo.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	Al Viro <viro@zeniv.linux.org.uk>,
	Bruce Fields <bfields@redhat.com>,
	"Darrick J. Wong" <darrick.wong@oracle.com>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Linux NFS Mailing List <linux-nfs@vger.kernel.org>,
	stable <stable@vger.kernel.org>,
	Thorsten Leemhuis <regressions@leemhuis.info>
Subject: Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11
Date: Fri, 10 Nov 2017 14:53:09 +0100
Message-ID: <CAK8P3a0GabFsSw5Kq4HKOFPjCuziokkh4mTv-FYS-oZ-Zq7tQQ@mail.gmail.com>
In-Reply-To: <23f7da04-95f7-24e7-ee70-ce40c5b8fee3@gentoo.org>

On Fri, Nov 10, 2017 at 2:58 AM, Patrick McLean <chutzpah@gentoo.org> wrote:
> On 2017-11-09 12:04 PM, Linus Torvalds wrote:
>> On Thu, Nov 9, 2017 at 11:51 AM, Patrick McLean <chutzpah@gentoo.org> wrote:

>
> We will check our fork against the in-kernel cp210x driver to make sure
> we didn't miss anything, but it seems odd we would be hitting the issue
> so consistently in the NFS code path, rather than somewhere in USB,
> serial, or GPIO paths.
>
>> So since you seem to be able to reproduce this _reasonably_ easily,
>> it's definitely worth checking that it still reproduces even without
>> the gcc plugins.
>
> I haven't been able to reproduce it with RANDSTRUCT disabled (and
> structleak enabled). I will keep trying for a little while more, but
> evidence seems to be pointing to that.
>
> Something must have changed since 4.13.8 to trigger this though. This
> did not crop up at all until we tried 4.13.11, where we saw it pretty
> quickly. We have a pretty large number of machines running 4.13.6 with
> RANDSTRUCT enabled and running the same workload with many more
> clients, and have not seen this bug at all.

I couldn't find anything overly suspicious between 4.13.8 and 4.13.11;
see the full list of commits since 4.13.6 at https://pastebin.com/AcxBZR7H

The ones I couldn't immediately rule out (but no smoking gun either) would be:

9970679f497a x86/cpu/AMD: Apply the Erratum 688 fix when the BIOS doesn't
ca6711747c5a assoc_array: Fix a buggy node-splitting case
2fbb8bf749b5 xfs: move two more RT specific functions into CONFIG_XFS_RT
1e1427356d8d xfs: trim writepage mapping to within eof
9df9b634f637 xfs: cancel dirty pages on invalidation
cd3f0bee1b94 xfs: handle error if xfs_btree_get_bufs fails
58cfca25f540 xfs: reinit btree pointer on attr tree inactivation walk
659a9989b68b xfs: don't change inode mode if ACL update fails
88ccd3b6884a xfs: move more RT specific code under CONFIG_XFS_RT
5733ebee586c xfs: Don't log uninitialised fields in inode structures
199a7448c097 xfs: handle racy AIO in xfs_reflink_end_cow
ee5d69c908a1 xfs: always swap the cow forks when swapping extents
2888145444f1 xfs: Capture state of the right inode in xfs_iflush_done
d0fa252b207f xfs: perag initialization should only touch m_ag_max_usable for AG 0
8da6f7fbe43c xfs: update i_size after unwritten conversion in dio completion
a9eac76e958b xfs: report zeroed or not correctly in xfs_zero_range()
67d51bdcc9f4 fs/xfs: Use %pS printk format for direct addresses
2bf3122f2130 xfs: evict CoW fork extents when performing finsert/fcollapse
a58a0826656d xfs: don't unconditionally clear the reflink flag on zero-block files
c61e905e0ee2 iomap_dio_rw: Allocate AIO completion queue before submitting dio
7610595830bb pkcs7: Prevent NULL pointer dereference, since sinfo is not always set.
24a33a0c96f3 KEYS: don't let add_key() update an uninstantiated key
ad4aa448c9b2 FS-Cache: fix dereference of NULL user_key_payload
f45b8fe12221 KEYS: Fix race between updating and finding a negative key
e56be12012c2 ecryptfs: fix dereference of NULL user_key_payload
363ce0b01fe0 fscrypt: fix dereference of NULL user_key_payload
cc757d55c903 lib/digsig: fix dereference of NULL user_key_payload
f5e97214207f x86/microcode/intel: Disable late loading on model 79
7b5e405b7878 Revert "tools/power turbostat: stop migrating, unless '-m'"
8b1e10789c84 KEYS: encrypted: fix dereference of NULL user_key_payload
a258a35a9930 mm: page_vma_mapped: ensure pmd is loaded with READ_ONCE outside of lock
e47a56cbf519 usb: xhci: Handle error condition in xhci_stop_device()
d53911e63388 usb: xhci: Reset halted endpoint if trb is noop
d1120fe38b3f xhci: Cleanup current_cmd in xhci_cleanup_command_queue()
301d332138d2 xhci: Identify USB 3.1 capable hosts by their port protocol capability
015e94ead900 usb: hub: Allow reset retry for USB2 devices on connect bounce
1916547b28bd usb: quirks: add quirk for WORLDE MINI MIDI keyboard
e3a038930502 usb: cdc_acm: Add quirk for Elatec TWN3
c2110c8dea7a USB: serial: metro-usb: add MS7820 device id
775462fd5c53 USB: core: fix out-of-bounds access bug in usb_get_bos_descriptor()
a9fdf6354267 USB: devio: Revert "USB: devio: Don't corrupt user memory"

However, you mentioned cp210x, and I noticed related changes in 4.13.8:

e21045a22395 USB: serial: console: fix use-after-free after failed setup
6c7cb458405e USB: serial: console: fix use-after-free on disconnect
4b3e3c7282d6 USB: serial: qcserial: add Dell DW5818, DW5819
c796da1d110f USB: serial: option: add support for TP-Link LTE module
e7e0b4b39663 USB: serial: cp210x: add support for ELV TFD500
1ae2c690f967 USB: serial: cp210x: fix partnum regression
78a02c93648e USB: serial: ftdi_sio: add id for Cypress WICED dev board

You could try reverting those seven; if that makes a difference, it
would point to your forked driver.
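
In case it helps, here is a rough sketch of how the seven reverts could be
scripted. It assumes the abbreviated IDs above resolve in your 4.13.y tree
and that the list is in newest-first order (as `git log --oneline` would
print it), so reverting top to bottom undoes the most recent change first.
The helper names (`revert_commands`, `run`) are made up for illustration,
and the script only prints the commands unless `dry_run=False` is passed.

```python
import subprocess

# Abbreviated commit IDs copied from the list above, in the order given.
# Assumption: they are listed newest first and resolve in the local tree.
COMMITS = [
    "e21045a22395",  # USB: serial: console: fix use-after-free after failed setup
    "6c7cb458405e",  # USB: serial: console: fix use-after-free on disconnect
    "4b3e3c7282d6",  # USB: serial: qcserial: add Dell DW5818, DW5819
    "c796da1d110f",  # USB: serial: option: add support for TP-Link LTE module
    "e7e0b4b39663",  # USB: serial: cp210x: add support for ELV TFD500
    "1ae2c690f967",  # USB: serial: cp210x: fix partnum regression
    "78a02c93648e",  # USB: serial: ftdi_sio: add id for Cypress WICED dev board
]


def revert_commands(commits):
    """Build one `git revert --no-edit <id>` argument list per commit."""
    return [["git", "revert", "--no-edit", c] for c in commits]


def run(commits, repo=".", dry_run=True):
    """Print the revert commands, or execute them in `repo` if dry_run is False."""
    for argv in revert_commands(commits):
        if dry_run:
            print(" ".join(argv))
        else:
            # check=True stops at the first revert that fails (e.g. a conflict).
            subprocess.run(argv, cwd=repo, check=True)


if __name__ == "__main__":
    run(COMMITS)
```

If one of the reverts conflicts against the forked driver, that alone may
be a hint about where the fork and the in-kernel cp210x code have diverged.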

       Arnd
