linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 4.14 000/146] 4.14.86-stable review
@ 2018-12-04 10:48 Greg Kroah-Hartman
  2018-12-04 10:48 ` [PATCH 4.14 001/146] mm/huge_memory: rename freeze_page() to unmap_page() Greg Kroah-Hartman
                   ` (150 more replies)
  0 siblings, 151 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:48 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, torvalds, akpm, linux, shuah, patches,
	ben.hutchings, lkft-triage, stable

This is the start of the stable review cycle for the 4.14.86 release.
There are 146 patches in this series, all will be posted as a response
to this one.  If anyone has any issues with these being applied, please
let me know.

Responses should be made by Thu Dec  6 10:36:52 UTC 2018.
Anything received after that time might be too late.

The whole patch series can be found in one patch at:
	https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.14.86-rc1.gz
or in the git tree and branch at:
	git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.14.y
and the diffstat can be found below.

thanks,

greg k-h

-------------
Pseudo-Shortlog of commits:

Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    Linux 4.14.86-rc1

Todd Kjos <tkjos@android.com>
    binder: fix race that allows malicious free of live buffer

YueHaibing <yuehaibing@huawei.com>
    misc: mic/scif: fix copy-paste error in scif_create_remote_lookup

Dexuan Cui <decui@microsoft.com>
    Drivers: hv: vmbus: check the creation_status in vmbus_establish_gpadl()

Yu Zhao <yuzhao@google.com>
    mm: use swp_offset as key in shmem_replace_page()

Luis Chamberlain <mcgrof@kernel.org>
    lib/test_kmod.c: fix rmmod double free

Martin Kelly <martin@martingkelly.com>
    iio:st_magn: Fix enable device after trigger

Felipe Balbi <felipe.balbi@linux.intel.com>
    Revert "usb: dwc3: gadget: skip Set/Clear Halt when invalid"

Michael Niewöhner <linux@mniewoehner.de>
    usb: core: quirks: add RESET_RESUME quirk for Cherry G230 Stream series

Kai-Heng Feng <kai.heng.feng@canonical.com>
    USB: usb-storage: Add new IDs to ums-realtek

Larry Finger <Larry.Finger@lwfinger.net>
    staging: rtl8723bs: Add missing return for cfg80211_rtw_get_station

Ben Wolsieffer <benwolsieffer@gmail.com>
    staging: vchiq_arm: fix compat VCHIQ_IOC_AWAIT_COMPLETION

Josef Bacik <josef@toxicpanda.com>
    btrfs: release metadata before running delayed refs

Richard Genoud <richard.genoud@gmail.com>
    dmaengine: at_hdmac: fix module unloading

Richard Genoud <richard.genoud@gmail.com>
    dmaengine: at_hdmac: fix memory leak in at_dma_xlate()

Heiko Stuebner <heiko@sntech.de>
    ARM: dts: rockchip: Remove @0 from the veyron memory node

Pan Bian <bianpan2016@163.com>
    ext2: fix potential use after free

Anisse Astier <anisse@astier.eu>
    ALSA: hda/realtek - fix headset mic detection for MSI MS-B171

Kailang Yang <kailang@realtek.com>
    ALSA: hda/realtek - Support ALC300

Takashi Iwai <tiwai@suse.de>
    ALSA: sparc: Fix invalid snd_free_pages() at error path

Takashi Iwai <tiwai@suse.de>
    ALSA: control: Fix race between adding and removing a user element

Takashi Iwai <tiwai@suse.de>
    ALSA: ac97: Fix incorrect bit shift at AC97-SPSA control write

Takashi Iwai <tiwai@suse.de>
    ALSA: wss: Fix invalid snd_free_pages() at error path

Maximilian Heyne <mheyne@amazon.de>
    fs: fix lost error code in dio_complete

Jiri Olsa <jolsa@kernel.org>
    perf/x86/intel: Add generic branch tracing check to intel_pmu_has_bts()

Jiri Olsa <jolsa@kernel.org>
    perf/x86/intel: Move branch tracing setup to the Intel-specific source file

Sebastian Andrzej Siewior <bigeasy@linutronix.de>
    x86/fpu: Disable bottom halves while loading FPU registers

Borislav Petkov <bp@suse.de>
    x86/MCE/AMD: Fix the thresholding machinery initialization order

Christoph Muellner <christoph.muellner@theobroma-systems.com>
    arm64: dts: rockchip: Fix PCIe reset polarity for rk3399-puma-haikou.

Hou Zhiqiang <Zhiqiang.Hou@nxp.com>
    PCI: layerscape: Fix wrong invocation of outbound window disable accessor

Pan Bian <bianpan2016@163.com>
    btrfs: relocation: set trans to be NULL after ending transaction

Filipe Manana <fdmanana@suse.com>
    Btrfs: ensure path name is null terminated at btrfs_control_ioctl

Max Filippov <jcmvbkbc@gmail.com>
    xtensa: fix coprocessor part of ptrace_{get,set}xregs

Max Filippov <jcmvbkbc@gmail.com>
    xtensa: fix coprocessor context offset definitions

Max Filippov <jcmvbkbc@gmail.com>
    xtensa: enable coprocessors that are being flushed

Wanpeng Li <wanpengli@tencent.com>
    KVM: X86: Fix scan ioapic use-before-initialization

Liran Alon <liran.alon@oracle.com>
    KVM: x86: Fix kernel info-leak in KVM_HC_CLOCK_PAIRING hypercall

Jim Mattson <jmattson@google.com>
    kvm: svm: Ensure an IBPB on all affected CPUs when freeing a vmcb

Junaid Shahid <junaids@google.com>
    kvm: mmu: Fix race in emulated page table writes

Thomas Gleixner <tglx@linutronix.de>
    x86/speculation: Provide IBPB always command line options

Thomas Gleixner <tglx@linutronix.de>
    x86/speculation: Add seccomp Spectre v2 user space protection mode

Thomas Gleixner <tglx@linutronix.de>
    x86/speculation: Enable prctl mode for spectre_v2_user

Thomas Gleixner <tglx@linutronix.de>
    x86/speculation: Add prctl() control for indirect branch speculation

Thomas Gleixner <tglx@linutronix.de>
    x86/speculation: Prepare arch_smt_update() for PRCTL mode

Thomas Gleixner <tglx@linutronix.de>
    x86/speculation: Prevent stale SPEC_CTRL msr content

Thomas Gleixner <tglx@linutronix.de>
    x86/speculation: Split out TIF update

Thomas Gleixner <tglx@linutronix.de>
    ptrace: Remove unused ptrace_may_access_sched() and MODE_IBRS

Thomas Gleixner <tglx@linutronix.de>
    x86/speculation: Prepare for conditional IBPB in switch_mm()

Thomas Gleixner <tglx@linutronix.de>
    x86/speculation: Avoid __switch_to_xtra() calls

Thomas Gleixner <tglx@linutronix.de>
    x86/process: Consolidate and simplify switch_to_xtra() code

Tim Chen <tim.c.chen@linux.intel.com>
    x86/speculation: Prepare for per task indirect branch speculation control

Thomas Gleixner <tglx@linutronix.de>
    x86/speculation: Add command line control for indirect branch speculation

Thomas Gleixner <tglx@linutronix.de>
    x86/speculation: Unify conditional spectre v2 print functions

Thomas Gleixner <tglx@linutronix.de>
    x86/speculataion: Mark command line parser data __initdata

Thomas Gleixner <tglx@linutronix.de>
    x86/speculation: Mark string arrays const correctly

Thomas Gleixner <tglx@linutronix.de>
    x86/speculation: Reorder the spec_v2 code

Thomas Gleixner <tglx@linutronix.de>
    x86/l1tf: Show actual SMT state

Thomas Gleixner <tglx@linutronix.de>
    x86/speculation: Rework SMT state change

Thomas Gleixner <tglx@linutronix.de>
    sched/smt: Expose sched_smt_present static key

Thomas Gleixner <tglx@linutronix.de>
    x86/Kconfig: Select SCHED_SMT if SMP enabled

Peter Zijlstra (Intel) <peterz@infradead.org>
    sched/smt: Make sched_smt_present track topology

Tim Chen <tim.c.chen@linux.intel.com>
    x86/speculation: Reorganize speculation control MSRs update

Thomas Gleixner <tglx@linutronix.de>
    x86/speculation: Rename SSBD update functions

Tim Chen <tim.c.chen@linux.intel.com>
    x86/speculation: Disable STIBP when enhanced IBRS is in use

Tim Chen <tim.c.chen@linux.intel.com>
    x86/speculation: Move STIPB/IBPB string conditionals out of cpu_show_common()

Tim Chen <tim.c.chen@linux.intel.com>
    x86/speculation: Remove unnecessary ret variable in cpu_show_common()

Tim Chen <tim.c.chen@linux.intel.com>
    x86/speculation: Clean up spectre_v2_parse_cmdline()

Tim Chen <tim.c.chen@linux.intel.com>
    x86/speculation: Update the TIF_SSBD comment

Zhenzhong Duan <zhenzhong.duan@oracle.com>
    x86/retpoline: Remove minimal retpoline support

Zhenzhong Duan <zhenzhong.duan@oracle.com>
    x86/retpoline: Make CONFIG_RETPOLINE depend on compiler support

Zhenzhong Duan <zhenzhong.duan@oracle.com>
    x86/speculation: Add RETPOLINE_AMD support to the inline asm CALL_NOSPEC variant

Jiri Kosina <jkosina@suse.cz>
    x86/speculation: Propagate information about RSB filling mitigation to sysfs

Jiri Kosina <jkosina@suse.cz>
    x86/speculation: Apply IBPB more strictly to avoid cross-process data leak

Jiri Kosina <jkosina@suse.cz>
    x86/speculation: Enable cross-hyperthread spectre v2 STIBP mitigation

Tom Lendacky <thomas.lendacky@amd.com>
    x86/bugs: Fix the AMD SSBD usage of the SPEC_CTRL MSR

Tom Lendacky <thomas.lendacky@amd.com>
    x86/bugs: Update when to check for the LS_CFG SSBD mitigation

Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
    x86/bugs: Switch the selection of mitigation from CPU vendor to CPU features

Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
    x86/bugs: Add AMD's SPEC_CTRL MSR usage

Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
    x86/bugs: Add AMD's variant of SSB_NO

Peter Zijlstra <peterz@infradead.org>
    sched/core: Fix cpu.max vs. cpuhotplug deadlock

Bernd Eckstein <3erndeckstein@gmail.com>
    usbnet: ipheth: fix potential recvmsg bug and recvmsg bug 2

Julian Wiedmann <jwi@linux.ibm.com>
    s390/qeth: fix length check in SNMP processing

Pan Bian <bianpan2016@163.com>
    rapidio/rionet: do not free skb before reading its length

Willem de Bruijn <willemb@google.com>
    packet: copy user buffers before orphan or clone

Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
    net: thunderx: set tso_hdrs pointer to NULL in nicvf_free_snd_queue

Jason Wang <jasowang@redhat.com>
    virtio-net: fail XDP set if guest csum is negotiated

Jason Wang <jasowang@redhat.com>
    virtio-net: disable guest csum during XDP set

Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
    net: thunderx: set xdp_prog to NULL if bpf_prog_add fails

Petr Machata <petrm@mellanox.com>
    net: skb_scrub_packet(): Scrub offload_fwd_mark

Sasha Levin <sashal@kernel.org>
    Revert "wlcore: Add missing PM call for wlcore_cmd_wait_for_event_or_timeout()"

Darrick J. Wong <darrick.wong@oracle.com>
    xfs: don't fail when converting shortform attr to long form during ATTR_REPLACE

Chao Yu <yuchao0@huawei.com>
    f2fs: fix to do sanity check with cp_pack_start_sum

Chao Yu <yuchao0@huawei.com>
    f2fs: fix to do sanity check with i_extra_isize

Chao Yu <yuchao0@huawei.com>
    f2fs: fix to do sanity check with block address in main area

Chao Yu <yuchao0@huawei.com>
    f2fs: fix to do sanity check with node footer and iblocks

Chao Yu <yuchao0@huawei.com>
    f2fs: fix to do sanity check with user_block_count

Chao Yu <yuchao0@huawei.com>
    f2fs: fix to do sanity check with extra_attr feature

Ben Hutchings <ben.hutchings@codethink.co.uk>
    f2fs: Add sanity_check_inode() function

Chao Yu <yuchao0@huawei.com>
    f2fs: fix to do sanity check with secs_per_zone

Chao Yu <yuchao0@huawei.com>
    f2fs: introduce and spread verify_blkaddr

Chao Yu <yuchao0@huawei.com>
    f2fs: clean up with is_valid_blkaddr()

Jaegeuk Kim <jaegeuk@kernel.org>
    f2fs: enhance sanity_check_raw_super() to avoid potential overflow

Jaegeuk Kim <jaegeuk@kernel.org>
    f2fs: sanity check on sit entry

Yunlei He <heyunlei@huawei.com>
    f2fs: check blkaddr more accuratly before issue a bio

Shaokun Zhang <zhangshaokun@hisilicon.com>
    btrfs: tree-checker: Fix misleading group system information

Qu Wenruo <wqu@suse.com>
    btrfs: tree-checker: Check level for leaves and nodes

Qu Wenruo <wqu@suse.com>
    btrfs: Check that each block group has corresponding chunk at mount time

Qu Wenruo <wqu@suse.com>
    btrfs: tree-checker: Detect invalid and empty essential trees

Qu Wenruo <wqu@suse.com>
    btrfs: tree-checker: Verify block_group_item

David Sterba <dsterba@suse.com>
    btrfs: tree-check: reduce stack consumption in check_dir_item

Arnd Bergmann <arnd@arndb.de>
    btrfs: tree-checker: use %zu format string for size_t

Qu Wenruo <wqu@suse.com>
    btrfs: tree-checker: Add checker for dir item

Qu Wenruo <wqu@suse.com>
    btrfs: tree-checker: Fix false panic for sanity test

Qu Wenruo <quwenruo.btrfs@gmx.com>
    btrfs: tree-checker: Enhance btrfs_check_node output

Qu Wenruo <quwenruo.btrfs@gmx.com>
    btrfs: Move leaf and node validation checker to tree-checker.c

Qu Wenruo <quwenruo.btrfs@gmx.com>
    btrfs: Add checker for EXTENT_CSUM

Qu Wenruo <quwenruo.btrfs@gmx.com>
    btrfs: Add sanity check for EXTENT_DATA when reading out leaf

Qu Wenruo <quwenruo.btrfs@gmx.com>
    btrfs: Check if item pointer overlaps with the item itself

Qu Wenruo <quwenruo.btrfs@gmx.com>
    btrfs: Refactor check_leaf function for later expansion

Qu Wenruo <wqu@suse.com>
    btrfs: Verify that every chunk has corresponding block group at mount time

Gu Jinxiang <gujx@cn.fujitsu.com>
    btrfs: validate type when reading a chunk

Lior David <qca_liord@qca.qualcomm.com>
    wil6210: missing length check in wmi_set_ie

Vakul Garg <vakul.garg@nxp.com>
    net/tls: Fixed return value when tls_complete_pending_work() fails

Boris Pismenny <borisp@mellanox.com>
    tls: Use correct sk->sk_prot for IPV6

Ilya Lesokhin <ilyal@mellanox.com>
    tls: don't override sk_write_space if tls_set_sw_offload fails.

Ilya Lesokhin <ilyal@mellanox.com>
    tls: Avoid copying crypto_info again after cipher_type check.

Ilya Lesokhin <ilyal@mellanox.com>
    tls: Fix TLS ulp context leak, when TLS_TX setsockopt is not used.

Ilya Lesokhin <ilyal@mellanox.com>
    tls: Add function to update the TLS socket configuration

Alexei Starovoitov <ast@kernel.org>
    bpf: Prevent memory disambiguation attack

Ilya Dryomov <idryomov@gmail.com>
    libceph: implement CEPHX_V2 calculation mode

Ilya Dryomov <idryomov@gmail.com>
    libceph: add authorizer challenge

Ilya Dryomov <idryomov@gmail.com>
    libceph: factor out encrypt_authorizer()

Ilya Dryomov <idryomov@gmail.com>
    libceph: factor out __ceph_x_decrypt()

Ilya Dryomov <idryomov@gmail.com>
    libceph: factor out __prepare_write_connect()

Ilya Dryomov <idryomov@gmail.com>
    libceph: store ceph_auth_handshake pointer in ceph_connection

Richard Weinberger <richard@nod.at>
    ubi: Initialize Fastmap checkmapping correctly

Matthias Schwarzott <zzam@gentoo.org>
    media: em28xx: Fix use-after-free when disconnecting

Hugh Dickins <hughd@google.com>
    mm/khugepaged: collapse_shmem() do not crash on Compound

Hugh Dickins <hughd@google.com>
    mm/khugepaged: collapse_shmem() without freezing new_page

Hugh Dickins <hughd@google.com>
    mm/khugepaged: minor reorderings in collapse_shmem()

Hugh Dickins <hughd@google.com>
    mm/khugepaged: collapse_shmem() remember to clear holes

Hugh Dickins <hughd@google.com>
    mm/khugepaged: fix crashes due to misaccounted holes

Hugh Dickins <hughd@google.com>
    mm/khugepaged: collapse_shmem() stop if punched or truncated

Hugh Dickins <hughd@google.com>
    mm/huge_memory: fix lockdep complaint on 32-bit i_size_read()

Hugh Dickins <hughd@google.com>
    mm/huge_memory: splitting set mapping+index before unfreeze

Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
    mm/huge_memory.c: reorder operations in __split_huge_page_tail()

Hugh Dickins <hughd@google.com>
    mm/huge_memory: rename freeze_page() to unmap_page()


-------------

Diffstat:

 Documentation/admin-guide/kernel-parameters.txt    |  56 +-
 Documentation/userspace-api/spec_ctrl.rst          |   9 +
 Makefile                                           |   4 +-
 arch/arm/boot/dts/rk3288-veyron.dtsi               |   6 +-
 .../arm64/boot/dts/rockchip/rk3399-puma-haikou.dts |   2 +-
 arch/x86/Kconfig                                   |  12 +-
 arch/x86/Makefile                                  |   5 +-
 arch/x86/events/core.c                             |  20 -
 arch/x86/events/intel/core.c                       |  52 +-
 arch/x86/events/perf_event.h                       |  13 +-
 arch/x86/include/asm/cpufeatures.h                 |   2 +
 arch/x86/include/asm/msr-index.h                   |   5 +-
 arch/x86/include/asm/nospec-branch.h               |  44 +-
 arch/x86/include/asm/spec-ctrl.h                   |  20 +-
 arch/x86/include/asm/switch_to.h                   |   3 -
 arch/x86/include/asm/thread_info.h                 |  20 +-
 arch/x86/include/asm/tlbflush.h                    |   8 +-
 arch/x86/kernel/cpu/amd.c                          |   4 +-
 arch/x86/kernel/cpu/bugs.c                         | 510 ++++++++++++----
 arch/x86/kernel/cpu/common.c                       |   9 +-
 arch/x86/kernel/cpu/mcheck/mce_amd.c               |  19 +-
 arch/x86/kernel/fpu/signal.c                       |   4 +-
 arch/x86/kernel/process.c                          | 101 +++-
 arch/x86/kernel/process.h                          |  39 ++
 arch/x86/kernel/process_32.c                       |   8 +-
 arch/x86/kernel/process_64.c                       |  10 +-
 arch/x86/kvm/cpuid.c                               |  10 +-
 arch/x86/kvm/mmu.c                                 |  27 +-
 arch/x86/kvm/svm.c                                 |  28 +-
 arch/x86/kvm/x86.c                                 |   4 +-
 arch/x86/mm/tlb.c                                  | 115 +++-
 arch/xtensa/kernel/asm-offsets.c                   |  16 +-
 arch/xtensa/kernel/process.c                       |   5 +-
 arch/xtensa/kernel/ptrace.c                        |  42 +-
 drivers/android/binder.c                           |  21 +-
 drivers/android/binder_alloc.c                     |  14 +-
 drivers/android/binder_alloc.h                     |   3 +-
 drivers/dma/at_hdmac.c                             |  10 +-
 drivers/hv/channel.c                               |   8 +
 drivers/iio/magnetometer/st_magn_buffer.c          |  12 +-
 drivers/media/usb/em28xx/em28xx-dvb.c              |   3 +-
 drivers/misc/mic/scif/scif_rma.c                   |   2 +-
 drivers/mtd/ubi/vtbl.c                             |  20 +-
 drivers/net/ethernet/cavium/thunder/nicvf_main.c   |   9 +-
 drivers/net/ethernet/cavium/thunder/nicvf_queues.c |   4 +-
 drivers/net/rionet.c                               |   2 +-
 drivers/net/usb/ipheth.c                           |  10 +-
 drivers/net/virtio_net.c                           |  13 +-
 drivers/net/wireless/ath/wil6210/wmi.c             |   8 +-
 drivers/net/wireless/ti/wlcore/cmd.c               |   6 -
 drivers/pci/dwc/pci-layerscape.c                   |   2 +-
 drivers/s390/net/qeth_core_main.c                  |  27 +-
 drivers/staging/rtl8723bs/os_dep/ioctl_cfg80211.c  |   2 +-
 .../vc04_services/interface/vchiq_arm/vchiq_arm.c  |   7 +-
 drivers/usb/core/quirks.c                          |   3 +
 drivers/usb/dwc3/gadget.c                          |   5 -
 drivers/usb/storage/unusual_realtek.h              |  10 +
 fs/btrfs/Makefile                                  |   2 +-
 fs/btrfs/disk-io.c                                 | 153 +----
 fs/btrfs/extent-tree.c                             |  86 ++-
 fs/btrfs/relocation.c                              |   1 +
 fs/btrfs/super.c                                   |   1 +
 fs/btrfs/transaction.c                             |   6 +-
 fs/btrfs/tree-checker.c                            | 649 +++++++++++++++++++++
 fs/btrfs/tree-checker.h                            |  38 ++
 fs/btrfs/volumes.c                                 |  30 +-
 fs/btrfs/volumes.h                                 |   2 +
 fs/ceph/mds_client.c                               |  11 +
 fs/direct-io.c                                     |   4 +-
 fs/ext2/xattr.c                                    |   2 +-
 fs/f2fs/checkpoint.c                               |  43 +-
 fs/f2fs/data.c                                     |  52 +-
 fs/f2fs/f2fs.h                                     |  41 +-
 fs/f2fs/file.c                                     |  21 +-
 fs/f2fs/inode.c                                    |  78 ++-
 fs/f2fs/node.c                                     |   9 +-
 fs/f2fs/recovery.c                                 |   6 +-
 fs/f2fs/segment.c                                  |  13 +-
 fs/f2fs/segment.h                                  |  24 +-
 fs/f2fs/super.c                                    |  96 ++-
 fs/xfs/libxfs/xfs_attr.c                           |   9 +-
 include/linux/bpf_verifier.h                       |   1 +
 include/linux/ceph/auth.h                          |   8 +
 include/linux/ceph/ceph_features.h                 |   7 +-
 include/linux/ceph/messenger.h                     |   6 +-
 include/linux/ceph/msgr.h                          |   2 +-
 include/linux/jump_label.h                         |   7 +
 include/linux/ptrace.h                             |   4 +-
 include/linux/sched.h                              |   9 +
 include/linux/sched/smt.h                          |  20 +
 include/linux/skbuff.h                             |  18 +-
 include/net/tls.h                                  |   4 +-
 include/uapi/linux/btrfs_tree.h                    |   1 +
 include/uapi/linux/prctl.h                         |   1 +
 kernel/bpf/verifier.c                              |  62 +-
 kernel/cpu.c                                       |  14 +-
 kernel/jump_label.c                                |  12 +-
 kernel/sched/core.c                                |  19 +-
 kernel/sched/fair.c                                |   4 +-
 kernel/sched/sched.h                               |   4 +-
 lib/test_kmod.c                                    |   1 -
 mm/huge_memory.c                                   |  79 +--
 mm/khugepaged.c                                    | 129 ++--
 mm/shmem.c                                         |  12 +-
 net/ceph/auth.c                                    |  16 +
 net/ceph/auth_x.c                                  | 217 +++++--
 net/ceph/auth_x_protocol.h                         |   7 +
 net/ceph/messenger.c                               |  86 +--
 net/ceph/osd_client.c                              |  11 +
 net/core/skbuff.c                                  |   4 +
 net/packet/af_packet.c                             |   4 +-
 net/tls/tls_main.c                                 | 124 ++--
 net/tls/tls_sw.c                                   |  13 +-
 scripts/Makefile.build                             |   2 -
 sound/core/control.c                               |  80 +--
 sound/isa/wss/wss_lib.c                            |   2 -
 sound/pci/ac97/ac97_codec.c                        |   2 +-
 sound/pci/hda/patch_realtek.c                      |   9 +
 sound/sparc/cs4231.c                               |   8 +-
 119 files changed, 2912 insertions(+), 907 deletions(-)



^ permalink raw reply	[flat|nested] 166+ messages in thread

* [PATCH 4.14 001/146] mm/huge_memory: rename freeze_page() to unmap_page()
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
@ 2018-12-04 10:48 ` Greg Kroah-Hartman
  2018-12-04 10:48 ` [PATCH 4.14 002/146] mm/huge_memory.c: reorder operations in __split_huge_page_tail() Greg Kroah-Hartman
                   ` (149 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:48 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Hugh Dickins, Kirill A. Shutemov,
	Jerome Glisse, Konstantin Khlebnikov, Matthew Wilcox,
	Andrew Morton, Linus Torvalds, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

commit 906f9cdfc2a0800f13683f9e4ebdfd08c12ee81b upstream.

The term "freeze" is used in several ways in the kernel, and in mm it
has the particular meaning of forcing page refcount temporarily to 0.
freeze_page() is just too confusing a name for a function that unmaps a
page: rename it unmap_page(), and rename unfreeze_page() remap_page().

Went to change the mention of freeze_page() added later in mm/rmap.c,
but found it to be incorrect: ordinary page reclaim reaches there too;
but the substance of the comment still seems correct, so edit it down.

Link: http://lkml.kernel.org/r/alpine.LSU.2.11.1811261514080.2275@eggly.anvils
Fixes: e9b61f19858a5 ("thp: reintroduce split_huge_page()")
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: <stable@vger.kernel.org>	[4.8+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 mm/huge_memory.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index adacfe66cf3d..09d7e9c4de89 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2280,7 +2280,7 @@ void vma_adjust_trans_huge(struct vm_area_struct *vma,
 	}
 }
 
-static void freeze_page(struct page *page)
+static void unmap_page(struct page *page)
 {
 	enum ttu_flags ttu_flags = TTU_IGNORE_MLOCK | TTU_IGNORE_ACCESS |
 		TTU_RMAP_LOCKED | TTU_SPLIT_HUGE_PMD;
@@ -2295,7 +2295,7 @@ static void freeze_page(struct page *page)
 	VM_BUG_ON_PAGE(!unmap_success, page);
 }
 
-static void unfreeze_page(struct page *page)
+static void remap_page(struct page *page)
 {
 	int i;
 	if (PageTransHuge(page)) {
@@ -2412,7 +2412,7 @@ static void __split_huge_page(struct page *page, struct list_head *list,
 
 	spin_unlock_irqrestore(zone_lru_lock(page_zone(head)), flags);
 
-	unfreeze_page(head);
+	remap_page(head);
 
 	for (i = 0; i < HPAGE_PMD_NR; i++) {
 		struct page *subpage = head + i;
@@ -2593,7 +2593,7 @@ int split_huge_page_to_list(struct page *page, struct list_head *list)
 	}
 
 	/*
-	 * Racy check if we can split the page, before freeze_page() will
+	 * Racy check if we can split the page, before unmap_page() will
 	 * split PMDs
 	 */
 	if (!can_split_huge_page(head, &extra_pins)) {
@@ -2602,7 +2602,7 @@ int split_huge_page_to_list(struct page *page, struct list_head *list)
 	}
 
 	mlocked = PageMlocked(page);
-	freeze_page(head);
+	unmap_page(head);
 	VM_BUG_ON_PAGE(compound_mapcount(head), head);
 
 	/* Make sure the page is not on per-CPU pagevec as it takes pin */
@@ -2659,7 +2659,7 @@ int split_huge_page_to_list(struct page *page, struct list_head *list)
 fail:		if (mapping)
 			spin_unlock(&mapping->tree_lock);
 		spin_unlock_irqrestore(zone_lru_lock(page_zone(head)), flags);
-		unfreeze_page(head);
+		remap_page(head);
 		ret = -EBUSY;
 	}
 
-- 
2.17.1




^ permalink raw reply related	[flat|nested] 166+ messages in thread

* [PATCH 4.14 002/146] mm/huge_memory.c: reorder operations in __split_huge_page_tail()
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
  2018-12-04 10:48 ` [PATCH 4.14 001/146] mm/huge_memory: rename freeze_page() to unmap_page() Greg Kroah-Hartman
@ 2018-12-04 10:48 ` Greg Kroah-Hartman
  2018-12-04 10:48 ` [PATCH 4.14 003/146] mm/huge_memory: splitting set mapping+index before unfreeze Greg Kroah-Hartman
                   ` (148 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:48 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Konstantin Khlebnikov,
	Kirill A. Shutemov, Michal Hocko, Nicholas Piggin, Andrew Morton,
	Linus Torvalds, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

commit 605ca5ede7643a01f4c4a15913f9714ac297f8a6 upstream.

THP split makes non-atomic change of tail page flags.  This is almost ok
because tail pages are locked and isolated but this breaks recent
changes in page locking: non-atomic operation could clear bit
PG_waiters.

As a result concurrent sequence get_page_unless_zero() -> lock_page()
might block forever.  Especially if this page was truncated later.

Fix is trivial: clone flags before unfreezing page reference counter.

This race exists since commit 62906027091f ("mm: add PageWaiters
indicating tasks are waiting for a page bit") while unsave unfreeze
itself was added in commit 8df651c7059e ("thp: cleanup
split_huge_page()").

clear_compound_head() also must be called before unfreezing page
reference because after successful get_page_unless_zero() might follow
put_page() which needs correct compound_head().

And replace page_ref_inc()/page_ref_add() with page_ref_unfreeze() which
is made especially for that and has semantic of smp_store_release().

Link: http://lkml.kernel.org/r/151844393341.210639.13162088407980624477.stgit@buzz
Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 mm/huge_memory.c | 36 +++++++++++++++---------------------
 1 file changed, 15 insertions(+), 21 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 09d7e9c4de89..8969f5cf3174 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2312,26 +2312,13 @@ static void __split_huge_page_tail(struct page *head, int tail,
 	struct page *page_tail = head + tail;
 
 	VM_BUG_ON_PAGE(atomic_read(&page_tail->_mapcount) != -1, page_tail);
-	VM_BUG_ON_PAGE(page_ref_count(page_tail) != 0, page_tail);
 
 	/*
-	 * tail_page->_refcount is zero and not changing from under us. But
-	 * get_page_unless_zero() may be running from under us on the
-	 * tail_page. If we used atomic_set() below instead of atomic_inc() or
-	 * atomic_add(), we would then run atomic_set() concurrently with
-	 * get_page_unless_zero(), and atomic_set() is implemented in C not
-	 * using locked ops. spin_unlock on x86 sometime uses locked ops
-	 * because of PPro errata 66, 92, so unless somebody can guarantee
-	 * atomic_set() here would be safe on all archs (and not only on x86),
-	 * it's safer to use atomic_inc()/atomic_add().
+	 * Clone page flags before unfreezing refcount.
+	 *
+	 * After successful get_page_unless_zero() might follow flags change,
+	 * for exmaple lock_page() which set PG_waiters.
 	 */
-	if (PageAnon(head) && !PageSwapCache(head)) {
-		page_ref_inc(page_tail);
-	} else {
-		/* Additional pin to radix tree */
-		page_ref_add(page_tail, 2);
-	}
-
 	page_tail->flags &= ~PAGE_FLAGS_CHECK_AT_PREP;
 	page_tail->flags |= (head->flags &
 			((1L << PG_referenced) |
@@ -2344,14 +2331,21 @@ static void __split_huge_page_tail(struct page *head, int tail,
 			 (1L << PG_unevictable) |
 			 (1L << PG_dirty)));
 
-	/*
-	 * After clearing PageTail the gup refcount can be released.
-	 * Page flags also must be visible before we make the page non-compound.
-	 */
+	/* Page flags must be visible before we make the page non-compound. */
 	smp_wmb();
 
+	/*
+	 * Clear PageTail before unfreezing page refcount.
+	 *
+	 * After successful get_page_unless_zero() might follow put_page()
+	 * which needs correct compound_head().
+	 */
 	clear_compound_head(page_tail);
 
+	/* Finally unfreeze refcount. Additional reference from page cache. */
+	page_ref_unfreeze(page_tail, 1 + (!PageAnon(head) ||
+					  PageSwapCache(head)));
+
 	if (page_is_young(head))
 		set_page_young(page_tail);
 	if (page_is_idle(head))
-- 
2.17.1




^ permalink raw reply related	[flat|nested] 166+ messages in thread

* [PATCH 4.14 003/146] mm/huge_memory: splitting set mapping+index before unfreeze
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
  2018-12-04 10:48 ` [PATCH 4.14 001/146] mm/huge_memory: rename freeze_page() to unmap_page() Greg Kroah-Hartman
  2018-12-04 10:48 ` [PATCH 4.14 002/146] mm/huge_memory.c: reorder operations in __split_huge_page_tail() Greg Kroah-Hartman
@ 2018-12-04 10:48 ` Greg Kroah-Hartman
  2018-12-04 10:48 ` [PATCH 4.14 004/146] mm/huge_memory: fix lockdep complaint on 32-bit i_size_read() Greg Kroah-Hartman
                   ` (147 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:48 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Hugh Dickins, Kirill A. Shutemov,
	Konstantin Khlebnikov, Jerome Glisse, Matthew Wilcox,
	Andrew Morton, Linus Torvalds, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

commit 173d9d9fd3ddae84c110fea8aedf1f26af6be9ec upstream.

Huge tmpfs stress testing has occasionally hit shmem_undo_range()'s
VM_BUG_ON_PAGE(page_to_pgoff(page) != index, page).

Move the setting of mapping and index up before the page_ref_unfreeze()
in __split_huge_page_tail() to fix this: so that a page cache lookup
cannot get a reference while the tail's mapping and index are unstable.

In fact, might as well move them up before the smp_wmb(): I don't see an
actual need for that, but if I'm missing something, this way round is
safer than the other, and no less efficient.

You might argue that VM_BUG_ON_PAGE(page_to_pgoff(page) != index, page) is
misplaced, and should be left until after the trylock_page(); but left as
is has not crashed since, and gives more stringent assurance.

Link: http://lkml.kernel.org/r/alpine.LSU.2.11.1811261516380.2275@eggly.anvils
Fixes: e9b61f19858a5 ("thp: reintroduce split_huge_page()")
Requires: 605ca5ede764 ("mm/huge_memory.c: reorder operations in __split_huge_page_tail()")
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: <stable@vger.kernel.org>	[4.8+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 mm/huge_memory.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 8969f5cf3174..69ffeb439b0f 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2331,6 +2331,12 @@ static void __split_huge_page_tail(struct page *head, int tail,
 			 (1L << PG_unevictable) |
 			 (1L << PG_dirty)));
 
+	/* ->mapping in first tail page is compound_mapcount */
+	VM_BUG_ON_PAGE(tail > 2 && page_tail->mapping != TAIL_MAPPING,
+			page_tail);
+	page_tail->mapping = head->mapping;
+	page_tail->index = head->index + tail;
+
 	/* Page flags must be visible before we make the page non-compound. */
 	smp_wmb();
 
@@ -2351,12 +2357,6 @@ static void __split_huge_page_tail(struct page *head, int tail,
 	if (page_is_idle(head))
 		set_page_idle(page_tail);
 
-	/* ->mapping in first tail page is compound_mapcount */
-	VM_BUG_ON_PAGE(tail > 2 && page_tail->mapping != TAIL_MAPPING,
-			page_tail);
-	page_tail->mapping = head->mapping;
-
-	page_tail->index = head->index + tail;
 	page_cpupid_xchg_last(page_tail, page_cpupid_last(head));
 	lru_add_page_tail(head, page_tail, lruvec, list);
 }
-- 
2.17.1




^ permalink raw reply related	[flat|nested] 166+ messages in thread

* [PATCH 4.14 004/146] mm/huge_memory: fix lockdep complaint on 32-bit i_size_read()
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (2 preceding siblings ...)
  2018-12-04 10:48 ` [PATCH 4.14 003/146] mm/huge_memory: splitting set mapping+index before unfreeze Greg Kroah-Hartman
@ 2018-12-04 10:48 ` Greg Kroah-Hartman
  2018-12-04 10:48 ` [PATCH 4.14 005/146] mm/khugepaged: collapse_shmem() stop if punched or truncated Greg Kroah-Hartman
                   ` (146 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:48 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Hugh Dickins, Kirill A. Shutemov,
	Jerome Glisse, Konstantin Khlebnikov, Matthew Wilcox,
	Andrew Morton, Linus Torvalds, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

commit 006d3ff27e884f80bd7d306b041afc415f63598f upstream.

Huge tmpfs testing, on 32-bit kernel with lockdep enabled, showed that
__split_huge_page() was using i_size_read() while holding the irq-safe
lru_lock and page tree lock, but the 32-bit i_size_read() uses an
irq-unsafe seqlock which should not be nested inside them.

Instead, read the i_size earlier in split_huge_page_to_list(), and pass
the end offset down to __split_huge_page(): all while holding head page
lock, which is enough to prevent truncation of that extent before the
page tree lock has been taken.

Link: http://lkml.kernel.org/r/alpine.LSU.2.11.1811261520070.2275@eggly.anvils
Fixes: baa355fd33142 ("thp: file pages support for split_huge_page()")
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: <stable@vger.kernel.org>	[4.8+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 mm/huge_memory.c | 19 +++++++++++++------
 1 file changed, 13 insertions(+), 6 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 69ffeb439b0f..930f2aa3bb4d 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2362,12 +2362,11 @@ static void __split_huge_page_tail(struct page *head, int tail,
 }
 
 static void __split_huge_page(struct page *page, struct list_head *list,
-		unsigned long flags)
+		pgoff_t end, unsigned long flags)
 {
 	struct page *head = compound_head(page);
 	struct zone *zone = page_zone(head);
 	struct lruvec *lruvec;
-	pgoff_t end = -1;
 	int i;
 
 	lruvec = mem_cgroup_page_lruvec(head, zone->zone_pgdat);
@@ -2375,9 +2374,6 @@ static void __split_huge_page(struct page *page, struct list_head *list,
 	/* complete memcg works before add pages to LRU */
 	mem_cgroup_split_huge_fixup(head);
 
-	if (!PageAnon(page))
-		end = DIV_ROUND_UP(i_size_read(head->mapping->host), PAGE_SIZE);
-
 	for (i = HPAGE_PMD_NR - 1; i >= 1; i--) {
 		__split_huge_page_tail(head, i, lruvec, list);
 		/* Some pages can be beyond i_size: drop them from page cache */
@@ -2549,6 +2545,7 @@ int split_huge_page_to_list(struct page *page, struct list_head *list)
 	int count, mapcount, extra_pins, ret;
 	bool mlocked;
 	unsigned long flags;
+	pgoff_t end;
 
 	VM_BUG_ON_PAGE(is_huge_zero_page(page), page);
 	VM_BUG_ON_PAGE(!PageLocked(page), page);
@@ -2571,6 +2568,7 @@ int split_huge_page_to_list(struct page *page, struct list_head *list)
 			ret = -EBUSY;
 			goto out;
 		}
+		end = -1;
 		mapping = NULL;
 		anon_vma_lock_write(anon_vma);
 	} else {
@@ -2584,6 +2582,15 @@ int split_huge_page_to_list(struct page *page, struct list_head *list)
 
 		anon_vma = NULL;
 		i_mmap_lock_read(mapping);
+
+		/*
+		 *__split_huge_page() may need to trim off pages beyond EOF:
+		 * but on 32-bit, i_size_read() takes an irq-unsafe seqlock,
+		 * which cannot be nested inside the page tree lock. So note
+		 * end now: i_size itself may be changed at any moment, but
+		 * head page lock is good enough to serialize the trimming.
+		 */
+		end = DIV_ROUND_UP(i_size_read(mapping->host), PAGE_SIZE);
 	}
 
 	/*
@@ -2633,7 +2640,7 @@ int split_huge_page_to_list(struct page *page, struct list_head *list)
 		if (mapping)
 			__dec_node_page_state(page, NR_SHMEM_THPS);
 		spin_unlock(&pgdata->split_queue_lock);
-		__split_huge_page(page, list, flags);
+		__split_huge_page(page, list, end, flags);
 		if (PageSwapCache(head)) {
 			swp_entry_t entry = { .val = page_private(head) };
 
-- 
2.17.1




^ permalink raw reply related	[flat|nested] 166+ messages in thread

* [PATCH 4.14 005/146] mm/khugepaged: collapse_shmem() stop if punched or truncated
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (3 preceding siblings ...)
  2018-12-04 10:48 ` [PATCH 4.14 004/146] mm/huge_memory: fix lockdep complaint on 32-bit i_size_read() Greg Kroah-Hartman
@ 2018-12-04 10:48 ` Greg Kroah-Hartman
  2018-12-04 10:48 ` [PATCH 4.14 006/146] mm/khugepaged: fix crashes due to misaccounted holes Greg Kroah-Hartman
                   ` (145 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:48 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Hugh Dickins, Matthew Wilcox,
	Kirill A. Shutemov, Jerome Glisse, Konstantin Khlebnikov,
	Andrew Morton, Linus Torvalds, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

commit 701270fa193aadf00bdcf607738f64997275d4c7 upstream.

Huge tmpfs testing showed that although collapse_shmem() recognizes a
concurrently truncated or hole-punched page correctly, its handling of
holes was liable to refill an emptied extent.  Add check to stop that.

Link: http://lkml.kernel.org/r/alpine.LSU.2.11.1811261522040.2275@eggly.anvils
Fixes: f3f0e1d2150b2 ("khugepaged: add support of collapse for tmpfs/shmem pages")
Signed-off-by: Hugh Dickins <hughd@google.com>
Reviewed-by: Matthew Wilcox <willy@infradead.org>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Cc: <stable@vger.kernel.org>	[4.8+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 mm/khugepaged.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 0a5bb3e8a8a3..d4a06afbeda4 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1352,6 +1352,16 @@ static void collapse_shmem(struct mm_struct *mm,
 	radix_tree_for_each_slot(slot, &mapping->page_tree, &iter, start) {
 		int n = min(iter.index, end) - index;
 
+		/*
+		 * Stop if extent has been hole-punched, and is now completely
+		 * empty (the more obvious i_size_read() check would take an
+		 * irq-unsafe seqlock on 32-bit).
+		 */
+		if (n >= HPAGE_PMD_NR) {
+			result = SCAN_TRUNCATED;
+			goto tree_locked;
+		}
+
 		/*
 		 * Handle holes in the radix tree: charge it from shmem and
 		 * insert relevant subpage of new_page into the radix-tree.
@@ -1463,6 +1473,11 @@ static void collapse_shmem(struct mm_struct *mm,
 	if (result == SCAN_SUCCEED && index < end) {
 		int n = end - index;
 
+		/* Stop if extent has been truncated, and is now empty */
+		if (n >= HPAGE_PMD_NR) {
+			result = SCAN_TRUNCATED;
+			goto tree_locked;
+		}
 		if (!shmem_charge(mapping->host, n)) {
 			result = SCAN_FAIL;
 			goto tree_locked;
-- 
2.17.1




^ permalink raw reply related	[flat|nested] 166+ messages in thread

* [PATCH 4.14 006/146] mm/khugepaged: fix crashes due to misaccounted holes
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (4 preceding siblings ...)
  2018-12-04 10:48 ` [PATCH 4.14 005/146] mm/khugepaged: collapse_shmem() stop if punched or truncated Greg Kroah-Hartman
@ 2018-12-04 10:48 ` Greg Kroah-Hartman
  2018-12-04 10:48 ` [PATCH 4.14 007/146] mm/khugepaged: collapse_shmem() remember to clear holes Greg Kroah-Hartman
                   ` (144 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:48 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Hugh Dickins, Kirill A. Shutemov,
	Jerome Glisse, Konstantin Khlebnikov, Matthew Wilcox,
	Andrew Morton, Linus Torvalds, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

commit aaa52e340073b7f4593b3c4ddafcafa70cf838b5 upstream.

Huge tmpfs testing on a shortish file mapped into a pmd-rounded extent
hit shmem_evict_inode()'s WARN_ON(inode->i_blocks) followed by
clear_inode()'s BUG_ON(inode->i_data.nrpages) when the file was later
closed and unlinked.

khugepaged's collapse_shmem() was forgetting to update mapping->nrpages
on the rollback path, after it had added but then needs to undo some
holes.

There is indeed an irritating asymmetry between shmem_charge(), whose
callers want it to increment nrpages after successfully accounting
blocks, and shmem_uncharge(), when __delete_from_page_cache() already
decremented nrpages itself: oh well, just add a comment on that to them
both.

And shmem_recalc_inode() is supposed to be called when the accounting is
expected to be in balance (so it can deduce from imbalance that reclaim
discarded some pages): so change shmem_charge() to update nrpages
earlier (though it's rare for the difference to matter at all).

Link: http://lkml.kernel.org/r/alpine.LSU.2.11.1811261523450.2275@eggly.anvils
Fixes: 800d8c63b2e98 ("shmem: add huge pages support")
Fixes: f3f0e1d2150b2 ("khugepaged: add support of collapse for tmpfs/shmem pages")
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: <stable@vger.kernel.org>	[4.8+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 mm/khugepaged.c | 4 +++-
 mm/shmem.c      | 6 +++++-
 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index d4a06afbeda4..be7b0863e33c 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1539,8 +1539,10 @@ static void collapse_shmem(struct mm_struct *mm,
 		*hpage = NULL;
 	} else {
 		/* Something went wrong: rollback changes to the radix-tree */
-		shmem_uncharge(mapping->host, nr_none);
 		spin_lock_irq(&mapping->tree_lock);
+		mapping->nrpages -= nr_none;
+		shmem_uncharge(mapping->host, nr_none);
+
 		radix_tree_for_each_slot(slot, &mapping->page_tree, &iter,
 				start) {
 			if (iter.index >= end)
diff --git a/mm/shmem.c b/mm/shmem.c
index fa08f56fd5e5..177fef62cbbd 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -296,12 +296,14 @@ bool shmem_charge(struct inode *inode, long pages)
 	if (!shmem_inode_acct_block(inode, pages))
 		return false;
 
+	/* nrpages adjustment first, then shmem_recalc_inode() when balanced */
+	inode->i_mapping->nrpages += pages;
+
 	spin_lock_irqsave(&info->lock, flags);
 	info->alloced += pages;
 	inode->i_blocks += pages * BLOCKS_PER_PAGE;
 	shmem_recalc_inode(inode);
 	spin_unlock_irqrestore(&info->lock, flags);
-	inode->i_mapping->nrpages += pages;
 
 	return true;
 }
@@ -311,6 +313,8 @@ void shmem_uncharge(struct inode *inode, long pages)
 	struct shmem_inode_info *info = SHMEM_I(inode);
 	unsigned long flags;
 
+	/* nrpages adjustment done by __delete_from_page_cache() or caller */
+
 	spin_lock_irqsave(&info->lock, flags);
 	info->alloced -= pages;
 	inode->i_blocks -= pages * BLOCKS_PER_PAGE;
-- 
2.17.1




^ permalink raw reply related	[flat|nested] 166+ messages in thread

* [PATCH 4.14 007/146] mm/khugepaged: collapse_shmem() remember to clear holes
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (5 preceding siblings ...)
  2018-12-04 10:48 ` [PATCH 4.14 006/146] mm/khugepaged: fix crashes due to misaccounted holes Greg Kroah-Hartman
@ 2018-12-04 10:48 ` Greg Kroah-Hartman
  2018-12-04 10:48 ` [PATCH 4.14 008/146] mm/khugepaged: minor reorderings in collapse_shmem() Greg Kroah-Hartman
                   ` (143 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:48 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Hugh Dickins, Kirill A. Shutemov,
	Jerome Glisse, Konstantin Khlebnikov, Matthew Wilcox,
	Andrew Morton, Linus Torvalds, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

commit 2af8ff291848cc4b1cce24b6c943394eb2c761e8 upstream.

Huge tmpfs testing reminds us that there is no __GFP_ZERO in the gfp
flags khugepaged uses to allocate a huge page - in all common cases it
would just be a waste of effort - so collapse_shmem() must remember to
clear out any holes that it instantiates.

The obvious place to do so, where they are put into the page cache tree,
is not a good choice: because interrupts are disabled there.  Leave it
until further down, once success is assured, where the other pages are
copied (before setting PageUptodate).

Link: http://lkml.kernel.org/r/alpine.LSU.2.11.1811261525080.2275@eggly.anvils
Fixes: f3f0e1d2150b2 ("khugepaged: add support of collapse for tmpfs/shmem pages")
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: <stable@vger.kernel.org>	[4.8+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 mm/khugepaged.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index be7b0863e33c..f535dbe3f9f5 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1502,7 +1502,12 @@ static void collapse_shmem(struct mm_struct *mm,
 		 * Replacing old pages with new one has succeed, now we need to
 		 * copy the content and free old pages.
 		 */
+		index = start;
 		list_for_each_entry_safe(page, tmp, &pagelist, lru) {
+			while (index < page->index) {
+				clear_highpage(new_page + (index % HPAGE_PMD_NR));
+				index++;
+			}
 			copy_highpage(new_page + (page->index % HPAGE_PMD_NR),
 					page);
 			list_del(&page->lru);
@@ -1512,6 +1517,11 @@ static void collapse_shmem(struct mm_struct *mm,
 			ClearPageActive(page);
 			ClearPageUnevictable(page);
 			put_page(page);
+			index++;
+		}
+		while (index < end) {
+			clear_highpage(new_page + (index % HPAGE_PMD_NR));
+			index++;
 		}
 
 		local_irq_save(flags);
-- 
2.17.1




^ permalink raw reply related	[flat|nested] 166+ messages in thread

* [PATCH 4.14 008/146] mm/khugepaged: minor reorderings in collapse_shmem()
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (6 preceding siblings ...)
  2018-12-04 10:48 ` [PATCH 4.14 007/146] mm/khugepaged: collapse_shmem() remember to clear holes Greg Kroah-Hartman
@ 2018-12-04 10:48 ` Greg Kroah-Hartman
  2018-12-04 10:48 ` [PATCH 4.14 009/146] mm/khugepaged: collapse_shmem() without freezing new_page Greg Kroah-Hartman
                   ` (142 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:48 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Hugh Dickins, Kirill A. Shutemov,
	Jerome Glisse, Konstantin Khlebnikov, Matthew Wilcox,
	Andrew Morton, Linus Torvalds, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

commit 042a30824871fa3149b0127009074b75cc25863c upstream.

Several cleanups in collapse_shmem(): most of which probably do not
really matter, beyond doing things in a more familiar and reassuring
order.  Simplify the failure gotos in the main loop, and on success
update stats while interrupts still disabled from the last iteration.

Link: http://lkml.kernel.org/r/alpine.LSU.2.11.1811261526400.2275@eggly.anvils
Fixes: f3f0e1d2150b2 ("khugepaged: add support of collapse for tmpfs/shmem pages")
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: <stable@vger.kernel.org>	[4.8+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 mm/khugepaged.c | 73 ++++++++++++++++++++-----------------------------
 1 file changed, 30 insertions(+), 43 deletions(-)

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index f535dbe3f9f5..0eac4344477a 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1333,13 +1333,12 @@ static void collapse_shmem(struct mm_struct *mm,
 		goto out;
 	}
 
+	__SetPageLocked(new_page);
+	__SetPageSwapBacked(new_page);
 	new_page->index = start;
 	new_page->mapping = mapping;
-	__SetPageSwapBacked(new_page);
-	__SetPageLocked(new_page);
 	BUG_ON(!page_ref_freeze(new_page, 1));
 
-
 	/*
 	 * At this point the new_page is 'frozen' (page_count() is zero), locked
 	 * and not up-to-date. It's safe to insert it into radix tree, because
@@ -1368,13 +1367,13 @@ static void collapse_shmem(struct mm_struct *mm,
 		 */
 		if (n && !shmem_charge(mapping->host, n)) {
 			result = SCAN_FAIL;
-			break;
+			goto tree_locked;
 		}
-		nr_none += n;
 		for (; index < min(iter.index, end); index++) {
 			radix_tree_insert(&mapping->page_tree, index,
 					new_page + (index % HPAGE_PMD_NR));
 		}
+		nr_none += n;
 
 		/* We are done. */
 		if (index >= end)
@@ -1390,12 +1389,12 @@ static void collapse_shmem(struct mm_struct *mm,
 				result = SCAN_FAIL;
 				goto tree_unlocked;
 			}
-			spin_lock_irq(&mapping->tree_lock);
 		} else if (trylock_page(page)) {
 			get_page(page);
+			spin_unlock_irq(&mapping->tree_lock);
 		} else {
 			result = SCAN_PAGE_LOCK;
-			break;
+			goto tree_locked;
 		}
 
 		/*
@@ -1410,11 +1409,10 @@ static void collapse_shmem(struct mm_struct *mm,
 			result = SCAN_TRUNCATED;
 			goto out_unlock;
 		}
-		spin_unlock_irq(&mapping->tree_lock);
 
 		if (isolate_lru_page(page)) {
 			result = SCAN_DEL_PAGE_LRU;
-			goto out_isolate_failed;
+			goto out_unlock;
 		}
 
 		if (page_mapped(page))
@@ -1436,7 +1434,9 @@ static void collapse_shmem(struct mm_struct *mm,
 		 */
 		if (!page_ref_freeze(page, 3)) {
 			result = SCAN_PAGE_COUNT;
-			goto out_lru;
+			spin_unlock_irq(&mapping->tree_lock);
+			putback_lru_page(page);
+			goto out_unlock;
 		}
 
 		/*
@@ -1452,17 +1452,10 @@ static void collapse_shmem(struct mm_struct *mm,
 		slot = radix_tree_iter_resume(slot, &iter);
 		index++;
 		continue;
-out_lru:
-		spin_unlock_irq(&mapping->tree_lock);
-		putback_lru_page(page);
-out_isolate_failed:
-		unlock_page(page);
-		put_page(page);
-		goto tree_unlocked;
 out_unlock:
 		unlock_page(page);
 		put_page(page);
-		break;
+		goto tree_unlocked;
 	}
 
 	/*
@@ -1470,7 +1463,7 @@ static void collapse_shmem(struct mm_struct *mm,
 	 * This code only triggers if there's nothing in radix tree
 	 * beyond 'end'.
 	 */
-	if (result == SCAN_SUCCEED && index < end) {
+	if (index < end) {
 		int n = end - index;
 
 		/* Stop if extent has been truncated, and is now empty */
@@ -1482,7 +1475,6 @@ static void collapse_shmem(struct mm_struct *mm,
 			result = SCAN_FAIL;
 			goto tree_locked;
 		}
-
 		for (; index < end; index++) {
 			radix_tree_insert(&mapping->page_tree, index,
 					new_page + (index % HPAGE_PMD_NR));
@@ -1490,14 +1482,19 @@ static void collapse_shmem(struct mm_struct *mm,
 		nr_none += n;
 	}
 
+	__inc_node_page_state(new_page, NR_SHMEM_THPS);
+	if (nr_none) {
+		struct zone *zone = page_zone(new_page);
+
+		__mod_node_page_state(zone->zone_pgdat, NR_FILE_PAGES, nr_none);
+		__mod_node_page_state(zone->zone_pgdat, NR_SHMEM, nr_none);
+	}
+
 tree_locked:
 	spin_unlock_irq(&mapping->tree_lock);
 tree_unlocked:
 
 	if (result == SCAN_SUCCEED) {
-		unsigned long flags;
-		struct zone *zone = page_zone(new_page);
-
 		/*
 		 * Replacing old pages with new one has succeed, now we need to
 		 * copy the content and free old pages.
@@ -1511,11 +1508,11 @@ static void collapse_shmem(struct mm_struct *mm,
 			copy_highpage(new_page + (page->index % HPAGE_PMD_NR),
 					page);
 			list_del(&page->lru);
-			unlock_page(page);
-			page_ref_unfreeze(page, 1);
 			page->mapping = NULL;
+			page_ref_unfreeze(page, 1);
 			ClearPageActive(page);
 			ClearPageUnevictable(page);
+			unlock_page(page);
 			put_page(page);
 			index++;
 		}
@@ -1524,28 +1521,17 @@ static void collapse_shmem(struct mm_struct *mm,
 			index++;
 		}
 
-		local_irq_save(flags);
-		__inc_node_page_state(new_page, NR_SHMEM_THPS);
-		if (nr_none) {
-			__mod_node_page_state(zone->zone_pgdat, NR_FILE_PAGES, nr_none);
-			__mod_node_page_state(zone->zone_pgdat, NR_SHMEM, nr_none);
-		}
-		local_irq_restore(flags);
-
-		/*
-		 * Remove pte page tables, so we can re-faulti
-		 * the page as huge.
-		 */
-		retract_page_tables(mapping, start);
-
 		/* Everything is ready, let's unfreeze the new_page */
-		set_page_dirty(new_page);
 		SetPageUptodate(new_page);
 		page_ref_unfreeze(new_page, HPAGE_PMD_NR);
+		set_page_dirty(new_page);
 		mem_cgroup_commit_charge(new_page, memcg, false, true);
 		lru_cache_add_anon(new_page);
-		unlock_page(new_page);
 
+		/*
+		 * Remove pte page tables, so we can re-fault the page as huge.
+		 */
+		retract_page_tables(mapping, start);
 		*hpage = NULL;
 	} else {
 		/* Something went wrong: rollback changes to the radix-tree */
@@ -1578,8 +1564,8 @@ static void collapse_shmem(struct mm_struct *mm,
 						slot, page);
 			slot = radix_tree_iter_resume(slot, &iter);
 			spin_unlock_irq(&mapping->tree_lock);
-			putback_lru_page(page);
 			unlock_page(page);
+			putback_lru_page(page);
 			spin_lock_irq(&mapping->tree_lock);
 		}
 		VM_BUG_ON(nr_none);
@@ -1588,9 +1574,10 @@ static void collapse_shmem(struct mm_struct *mm,
 		/* Unfreeze new_page, caller would take care about freeing it */
 		page_ref_unfreeze(new_page, 1);
 		mem_cgroup_cancel_charge(new_page, memcg, true);
-		unlock_page(new_page);
 		new_page->mapping = NULL;
 	}
+
+	unlock_page(new_page);
 out:
 	VM_BUG_ON(!list_empty(&pagelist));
 	/* TODO: tracepoints */
-- 
2.17.1




^ permalink raw reply related	[flat|nested] 166+ messages in thread

* [PATCH 4.14 009/146] mm/khugepaged: collapse_shmem() without freezing new_page
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (7 preceding siblings ...)
  2018-12-04 10:48 ` [PATCH 4.14 008/146] mm/khugepaged: minor reorderings in collapse_shmem() Greg Kroah-Hartman
@ 2018-12-04 10:48 ` Greg Kroah-Hartman
  2018-12-04 10:48 ` [PATCH 4.14 010/146] mm/khugepaged: collapse_shmem() do not crash on Compound Greg Kroah-Hartman
                   ` (141 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:48 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Hugh Dickins, Kirill A. Shutemov,
	Jerome Glisse, Konstantin Khlebnikov, Matthew Wilcox,
	Andrew Morton, Linus Torvalds, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

commit 87c460a0bded56195b5eb497d44709777ef7b415 upstream.

khugepaged's collapse_shmem() does almost all of its work, to assemble
the huge new_page from 512 scattered old pages, with the new_page's
refcount frozen to 0 (and refcounts of all old pages so far also frozen
to 0).  Including shmem_getpage() to read in any which were out on swap,
memory reclaim if necessary to allocate their intermediate pages, and
copying over all the data from old to new.

Imagine the frozen refcount as a spinlock held, but without any lock
debugging to highlight the abuse: it's not good, and under serious load
heads into lockups - speculative getters of the page are not expecting
to spin while khugepaged is rescheduled.

One can get a little further under load by hacking around elsewhere; but
fortunately, freezing the new_page turns out to have been entirely
unnecessary, with no hacks needed elsewhere.

The huge new_page lock is already held throughout, and guards all its
subpages as they are brought one by one into the page cache tree; and
anything reading the data in that page, without the lock, before it has
been marked PageUptodate, would already be in the wrong.  So simply
eliminate the freezing of the new_page.

Each of the old pages remains frozen with refcount 0 after it has been
replaced by a new_page subpage in the page cache tree, until they are
all unfrozen on success or failure: just as before.  They could be
unfrozen sooner, but cause no problem once no longer visible to
find_get_entry(), filemap_map_pages() and other speculative lookups.

Link: http://lkml.kernel.org/r/alpine.LSU.2.11.1811261527570.2275@eggly.anvils
Fixes: f3f0e1d2150b2 ("khugepaged: add support of collapse for tmpfs/shmem pages")
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: <stable@vger.kernel.org>	[4.8+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 mm/khugepaged.c | 19 +++++++------------
 1 file changed, 7 insertions(+), 12 deletions(-)

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 0eac4344477a..45bf0e5775f7 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1288,7 +1288,7 @@ static void retract_page_tables(struct address_space *mapping, pgoff_t pgoff)
  * collapse_shmem - collapse small tmpfs/shmem pages into huge one.
  *
  * Basic scheme is simple, details are more complex:
- *  - allocate and freeze a new huge page;
+ *  - allocate and lock a new huge page;
  *  - scan over radix tree replacing old pages the new one
  *    + swap in pages if necessary;
  *    + fill in gaps;
@@ -1296,11 +1296,11 @@ static void retract_page_tables(struct address_space *mapping, pgoff_t pgoff)
  *  - if replacing succeed:
  *    + copy data over;
  *    + free old pages;
- *    + unfreeze huge page;
+ *    + unlock huge page;
  *  - if replacing failed;
  *    + put all pages back and unfreeze them;
  *    + restore gaps in the radix-tree;
- *    + free huge page;
+ *    + unlock and free huge page;
  */
 static void collapse_shmem(struct mm_struct *mm,
 		struct address_space *mapping, pgoff_t start,
@@ -1337,13 +1337,11 @@ static void collapse_shmem(struct mm_struct *mm,
 	__SetPageSwapBacked(new_page);
 	new_page->index = start;
 	new_page->mapping = mapping;
-	BUG_ON(!page_ref_freeze(new_page, 1));
 
 	/*
-	 * At this point the new_page is 'frozen' (page_count() is zero), locked
-	 * and not up-to-date. It's safe to insert it into radix tree, because
-	 * nobody would be able to map it or use it in other way until we
-	 * unfreeze it.
+	 * At this point the new_page is locked and not up-to-date.
+	 * It's safe to insert it into the page cache, because nobody would
+	 * be able to map it or use it in another way until we unlock it.
 	 */
 
 	index = start;
@@ -1521,9 +1519,8 @@ static void collapse_shmem(struct mm_struct *mm,
 			index++;
 		}
 
-		/* Everything is ready, let's unfreeze the new_page */
 		SetPageUptodate(new_page);
-		page_ref_unfreeze(new_page, HPAGE_PMD_NR);
+		page_ref_add(new_page, HPAGE_PMD_NR - 1);
 		set_page_dirty(new_page);
 		mem_cgroup_commit_charge(new_page, memcg, false, true);
 		lru_cache_add_anon(new_page);
@@ -1571,8 +1568,6 @@ static void collapse_shmem(struct mm_struct *mm,
 		VM_BUG_ON(nr_none);
 		spin_unlock_irq(&mapping->tree_lock);
 
-		/* Unfreeze new_page, caller would take care about freeing it */
-		page_ref_unfreeze(new_page, 1);
 		mem_cgroup_cancel_charge(new_page, memcg, true);
 		new_page->mapping = NULL;
 	}
-- 
2.17.1




^ permalink raw reply related	[flat|nested] 166+ messages in thread

* [PATCH 4.14 010/146] mm/khugepaged: collapse_shmem() do not crash on Compound
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (8 preceding siblings ...)
  2018-12-04 10:48 ` [PATCH 4.14 009/146] mm/khugepaged: collapse_shmem() without freezing new_page Greg Kroah-Hartman
@ 2018-12-04 10:48 ` Greg Kroah-Hartman
  2018-12-04 10:48 ` [PATCH 4.14 011/146] media: em28xx: Fix use-after-free when disconnecting Greg Kroah-Hartman
                   ` (140 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:48 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Hugh Dickins, Kirill A. Shutemov,
	Jerome Glisse, Konstantin Khlebnikov, Matthew Wilcox,
	Andrew Morton, Linus Torvalds, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

commit 06a5e1268a5fb9c2b346a3da6b97e85f2eba0f07 upstream.

collapse_shmem()'s VM_BUG_ON_PAGE(PageTransCompound) was unsafe: before
it holds page lock of the first page, racing truncation then extension
might conceivably have inserted a hugepage there already.  Fail with the
SCAN_PAGE_COMPOUND result, instead of crashing (CONFIG_DEBUG_VM=y) or
otherwise mishandling the unexpected hugepage - though later we might
code up a more constructive way of handling it, with SCAN_SUCCESS.

Link: http://lkml.kernel.org/r/alpine.LSU.2.11.1811261529310.2275@eggly.anvils
Fixes: f3f0e1d2150b2 ("khugepaged: add support of collapse for tmpfs/shmem pages")
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: <stable@vger.kernel.org>	[4.8+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 mm/khugepaged.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 45bf0e5775f7..d27a73737f1a 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1401,7 +1401,15 @@ static void collapse_shmem(struct mm_struct *mm,
 		 */
 		VM_BUG_ON_PAGE(!PageLocked(page), page);
 		VM_BUG_ON_PAGE(!PageUptodate(page), page);
-		VM_BUG_ON_PAGE(PageTransCompound(page), page);
+
+		/*
+		 * If file was truncated then extended, or hole-punched, before
+		 * we locked the first page, then a THP might be there already.
+		 */
+		if (PageTransCompound(page)) {
+			result = SCAN_PAGE_COMPOUND;
+			goto out_unlock;
+		}
 
 		if (page_mapping(page) != mapping) {
 			result = SCAN_TRUNCATED;
-- 
2.17.1




^ permalink raw reply related	[flat|nested] 166+ messages in thread

* [PATCH 4.14 011/146] media: em28xx: Fix use-after-free when disconnecting
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (9 preceding siblings ...)
  2018-12-04 10:48 ` [PATCH 4.14 010/146] mm/khugepaged: collapse_shmem() do not crash on Compound Greg Kroah-Hartman
@ 2018-12-04 10:48 ` Greg Kroah-Hartman
  2018-12-04 10:48 ` [PATCH 4.14 012/146] ubi: Initialize Fastmap checkmapping correctly Greg Kroah-Hartman
                   ` (139 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:48 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Matthias Schwarzott,
	Mauro Carvalho Chehab, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

[ Upstream commit 910b0797fa9e8af09c44a3fa36cb310ba7a7218d ]

Fix bug by moving the i2c_unregister_device calls after deregistration
of dvb frontend.

The new style i2c drivers already destroys the frontend object at
i2c_unregister_device time.
When the dvb frontend is unregistered afterwards it leads to this oops:

  [ 6058.866459] BUG: unable to handle kernel NULL pointer dereference at 00000000000001f8
  [ 6058.866578] IP: dvb_frontend_stop+0x30/0xd0 [dvb_core]
  [ 6058.866644] PGD 0
  [ 6058.866646] P4D 0

  [ 6058.866726] Oops: 0000 [#1] SMP
  [ 6058.866768] Modules linked in: rc_pinnacle_pctv_hd(O) em28xx_rc(O) si2157(O) si2168(O) em28xx_dvb(O) em28xx(O) si2165(O) a8293(O) tda10071(O) tea5767(O) tuner(O) cx23885(O) tda18271(O) videobuf2_dvb(O) videobuf2_dma_sg(O) m88ds3103(O) tveeprom(O) cx2341x(O) v4l2_common(O) dvb_core(O) rc_core(O) videobuf2_memops(O) videobuf2_v4l2(O) videobuf2_core(O) videodev(O) media(O) bluetooth ecdh_generic ums_realtek uas rtl8192cu rtl_usb rtl8192c_common rtlwifi usb_storage snd_hda_codec_realtek snd_hda_codec_hdmi snd_hda_codec_generic i2c_mux snd_hda_intel snd_hda_codec snd_hwdep x86_pkg_temp_thermal snd_hda_core kvm_intel kvm irqbypass [last unloaded: videobuf2_memops]
  [ 6058.867497] CPU: 2 PID: 7349 Comm: kworker/2:0 Tainted: G        W  O    4.13.9-gentoo #1
  [ 6058.867595] Hardware name: MEDION E2050 2391/H81H3-EM2, BIOS H81EM2W08.308 08/25/2014
  [ 6058.867692] Workqueue: usb_hub_wq hub_event
  [ 6058.867746] task: ffff88011a15e040 task.stack: ffffc90003074000
  [ 6058.867825] RIP: 0010:dvb_frontend_stop+0x30/0xd0 [dvb_core]
  [ 6058.867896] RSP: 0018:ffffc90003077b58 EFLAGS: 00010293
  [ 6058.867964] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 000000010040001f
  [ 6058.868056] RDX: ffff88011a15e040 RSI: ffffea000464e400 RDI: ffff88001cbe3028
  [ 6058.868150] RBP: ffffc90003077b68 R08: ffff880119390380 R09: 000000010040001f
  [ 6058.868241] R10: ffffc90003077b18 R11: 000000000001e200 R12: ffff88001cbe3028
  [ 6058.868330] R13: ffff88001cbe68d0 R14: ffff8800cf734000 R15: ffff8800cf734098
  [ 6058.868419] FS:  0000000000000000(0000) GS:ffff88011fb00000(0000) knlGS:0000000000000000
  [ 6058.868511] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  [ 6058.868578] CR2: 00000000000001f8 CR3: 00000001113c5000 CR4: 00000000001406e0
  [ 6058.868662] Call Trace:
  [ 6058.868705]  dvb_unregister_frontend+0x2a/0x80 [dvb_core]
  [ 6058.868774]  em28xx_dvb_fini+0x132/0x220 [em28xx_dvb]
  [ 6058.868840]  em28xx_close_extension+0x34/0x90 [em28xx]
  [ 6058.868902]  em28xx_usb_disconnect+0x4e/0x70 [em28xx]
  [ 6058.868968]  usb_unbind_interface+0x6d/0x260
  [ 6058.869025]  device_release_driver_internal+0x150/0x210
  [ 6058.869094]  device_release_driver+0xd/0x10
  [ 6058.869150]  bus_remove_device+0xe4/0x160
  [ 6058.869204]  device_del+0x1ce/0x2f0
  [ 6058.869253]  usb_disable_device+0x99/0x270
  [ 6058.869306]  usb_disconnect+0x8d/0x260
  [ 6058.869359]  hub_event+0x93d/0x1520
  [ 6058.869408]  ? dequeue_task_fair+0xae5/0xd20
  [ 6058.869467]  process_one_work+0x1d9/0x3e0
  [ 6058.869522]  worker_thread+0x43/0x3e0
  [ 6058.869576]  kthread+0x104/0x140
  [ 6058.869602]  ? trace_event_raw_event_workqueue_work+0x80/0x80
  [ 6058.869640]  ? kthread_create_on_node+0x40/0x40
  [ 6058.869673]  ret_from_fork+0x22/0x30
  [ 6058.869698] Code: 54 49 89 fc 53 48 8b 9f 18 03 00 00 0f 1f 44 00 00 41 83 bc 24 04 05 00 00 02 74 0c 41 c7 84 24 04 05 00 00 01 00 00 00 0f ae f0 <48> 8b bb f8 01 00 00 48 85 ff 74 5c e8 df 40 f0 e0 48 8b 93 f8
  [ 6058.869850] RIP: dvb_frontend_stop+0x30/0xd0 [dvb_core] RSP: ffffc90003077b58
  [ 6058.869894] CR2: 00000000000001f8
  [ 6058.875880] ---[ end trace 717eecf7193b3fc6 ]---

Signed-off-by: Matthias Schwarzott <zzam@gentoo.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/media/usb/em28xx/em28xx-dvb.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/media/usb/em28xx/em28xx-dvb.c b/drivers/media/usb/em28xx/em28xx-dvb.c
index 4a7db623fe29..29cdaaf1ed90 100644
--- a/drivers/media/usb/em28xx/em28xx-dvb.c
+++ b/drivers/media/usb/em28xx/em28xx-dvb.c
@@ -2105,6 +2105,8 @@ static int em28xx_dvb_fini(struct em28xx *dev)
 		}
 	}
 
+	em28xx_unregister_dvb(dvb);
+
 	/* remove I2C SEC */
 	client = dvb->i2c_client_sec;
 	if (client) {
@@ -2126,7 +2128,6 @@ static int em28xx_dvb_fini(struct em28xx *dev)
 		i2c_unregister_device(client);
 	}
 
-	em28xx_unregister_dvb(dvb);
 	kfree(dvb);
 	dev->dvb = NULL;
 	kref_put(&dev->ref, em28xx_free_device);
-- 
2.17.1




^ permalink raw reply related	[flat|nested] 166+ messages in thread

* [PATCH 4.14 012/146] ubi: Initialize Fastmap checkmapping correctly
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (10 preceding siblings ...)
  2018-12-04 10:48 ` [PATCH 4.14 011/146] media: em28xx: Fix use-after-free when disconnecting Greg Kroah-Hartman
@ 2018-12-04 10:48 ` Greg Kroah-Hartman
  2018-12-04 10:48 ` [PATCH 4.14 013/146] libceph: store ceph_auth_handshake pointer in ceph_connection Greg Kroah-Hartman
                   ` (138 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:48 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Lachmann, Juergen,
	Richard Weinberger, Sudip Mukherjee, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

commit 25677478474a91fa1b46f19a4a591a9848bca6fb upstream

We cannot do it last, otherwithse it will be skipped for dynamic
volumes.

Reported-by: Lachmann, Juergen <juergen.lachmann@harman.com>
Fixes: 34653fd8c46e ("ubi: fastmap: Check each mapping only once")
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/mtd/ubi/vtbl.c | 20 ++++++++++----------
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/drivers/mtd/ubi/vtbl.c b/drivers/mtd/ubi/vtbl.c
index 94d7a865b135..7504f430c011 100644
--- a/drivers/mtd/ubi/vtbl.c
+++ b/drivers/mtd/ubi/vtbl.c
@@ -578,6 +578,16 @@ static int init_volumes(struct ubi_device *ubi,
 		vol->ubi = ubi;
 		reserved_pebs += vol->reserved_pebs;
 
+		/*
+		 * We use ubi->peb_count and not vol->reserved_pebs because
+		 * we want to keep the code simple. Otherwise we'd have to
+		 * resize/check the bitmap upon volume resize too.
+		 * Allocating a few bytes more does not hurt.
+		 */
+		err = ubi_fastmap_init_checkmap(vol, ubi->peb_count);
+		if (err)
+			return err;
+
 		/*
 		 * In case of dynamic volume UBI knows nothing about how many
 		 * data is stored there. So assume the whole volume is used.
@@ -620,16 +630,6 @@ static int init_volumes(struct ubi_device *ubi,
 			(long long)(vol->used_ebs - 1) * vol->usable_leb_size;
 		vol->used_bytes += av->last_data_size;
 		vol->last_eb_bytes = av->last_data_size;
-
-		/*
-		 * We use ubi->peb_count and not vol->reserved_pebs because
-		 * we want to keep the code simple. Otherwise we'd have to
-		 * resize/check the bitmap upon volume resize too.
-		 * Allocating a few bytes more does not hurt.
-		 */
-		err = ubi_fastmap_init_checkmap(vol, ubi->peb_count);
-		if (err)
-			return err;
 	}
 
 	/* And add the layout volume */
-- 
2.17.1




^ permalink raw reply related	[flat|nested] 166+ messages in thread

* [PATCH 4.14 013/146] libceph: store ceph_auth_handshake pointer in ceph_connection
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (11 preceding siblings ...)
  2018-12-04 10:48 ` [PATCH 4.14 012/146] ubi: Initialize Fastmap checkmapping correctly Greg Kroah-Hartman
@ 2018-12-04 10:48 ` Greg Kroah-Hartman
  2018-12-04 10:48 ` [PATCH 4.14 014/146] libceph: factor out __prepare_write_connect() Greg Kroah-Hartman
                   ` (137 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:48 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Ilya Dryomov, Sage Weil,
	Ben Hutchings, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

commit 262614c4294d33b1f19e0d18c0091d9c329b544a upstream.

We already copy authorizer_reply_buf and authorizer_reply_buf_len into
ceph_connection.  Factoring out __prepare_write_connect() requires two
more: authorizer_buf and authorizer_buf_len.  Store the pointer to the
handshake in con->auth rather than piling on.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Sage Weil <sage@redhat.com>
Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 include/linux/ceph/messenger.h |  3 +-
 net/ceph/messenger.c           | 54 ++++++++++++++++------------------
 2 files changed, 27 insertions(+), 30 deletions(-)

diff --git a/include/linux/ceph/messenger.h b/include/linux/ceph/messenger.h
index ead9d85f1c11..9056077c023f 100644
--- a/include/linux/ceph/messenger.h
+++ b/include/linux/ceph/messenger.h
@@ -203,9 +203,8 @@ struct ceph_connection {
 				 attempt for this connection, client */
 	u32 peer_global_seq;  /* peer's global seq for this connection */
 
+	struct ceph_auth_handshake *auth;
 	int auth_retry;       /* true if we need a newer authorizer */
-	void *auth_reply_buf;   /* where to put the authorizer reply */
-	int auth_reply_buf_len;
 
 	struct mutex mutex;
 
diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
index 5281da82371a..3a82e6d2864b 100644
--- a/net/ceph/messenger.c
+++ b/net/ceph/messenger.c
@@ -1411,24 +1411,26 @@ static void prepare_write_keepalive(struct ceph_connection *con)
  * Connection negotiation.
  */
 
-static struct ceph_auth_handshake *get_connect_authorizer(struct ceph_connection *con,
-						int *auth_proto)
+static int get_connect_authorizer(struct ceph_connection *con)
 {
 	struct ceph_auth_handshake *auth;
+	int auth_proto;
 
 	if (!con->ops->get_authorizer) {
+		con->auth = NULL;
 		con->out_connect.authorizer_protocol = CEPH_AUTH_UNKNOWN;
 		con->out_connect.authorizer_len = 0;
-		return NULL;
+		return 0;
 	}
 
-	auth = con->ops->get_authorizer(con, auth_proto, con->auth_retry);
+	auth = con->ops->get_authorizer(con, &auth_proto, con->auth_retry);
 	if (IS_ERR(auth))
-		return auth;
+		return PTR_ERR(auth);
 
-	con->auth_reply_buf = auth->authorizer_reply_buf;
-	con->auth_reply_buf_len = auth->authorizer_reply_buf_len;
-	return auth;
+	con->auth = auth;
+	con->out_connect.authorizer_protocol = cpu_to_le32(auth_proto);
+	con->out_connect.authorizer_len = cpu_to_le32(auth->authorizer_buf_len);
+	return 0;
 }
 
 /*
@@ -1448,8 +1450,7 @@ static int prepare_write_connect(struct ceph_connection *con)
 {
 	unsigned int global_seq = get_global_seq(con->msgr, 0);
 	int proto;
-	int auth_proto;
-	struct ceph_auth_handshake *auth;
+	int ret;
 
 	switch (con->peer_name.type) {
 	case CEPH_ENTITY_TYPE_MON:
@@ -1476,20 +1477,15 @@ static int prepare_write_connect(struct ceph_connection *con)
 	con->out_connect.protocol_version = cpu_to_le32(proto);
 	con->out_connect.flags = 0;
 
-	auth_proto = CEPH_AUTH_UNKNOWN;
-	auth = get_connect_authorizer(con, &auth_proto);
-	if (IS_ERR(auth))
-		return PTR_ERR(auth);
-
-	con->out_connect.authorizer_protocol = cpu_to_le32(auth_proto);
-	con->out_connect.authorizer_len = auth ?
-		cpu_to_le32(auth->authorizer_buf_len) : 0;
+	ret = get_connect_authorizer(con);
+	if (ret)
+		return ret;
 
 	con_out_kvec_add(con, sizeof (con->out_connect),
 					&con->out_connect);
-	if (auth && auth->authorizer_buf_len)
-		con_out_kvec_add(con, auth->authorizer_buf_len,
-					auth->authorizer_buf);
+	if (con->auth)
+		con_out_kvec_add(con, con->auth->authorizer_buf_len,
+				 con->auth->authorizer_buf);
 
 	con->out_more = 0;
 	con_flag_set(con, CON_FLAG_WRITE_PENDING);
@@ -1753,11 +1749,14 @@ static int read_partial_connect(struct ceph_connection *con)
 	if (ret <= 0)
 		goto out;
 
-	size = le32_to_cpu(con->in_reply.authorizer_len);
-	end += size;
-	ret = read_partial(con, end, size, con->auth_reply_buf);
-	if (ret <= 0)
-		goto out;
+	if (con->auth) {
+		size = le32_to_cpu(con->in_reply.authorizer_len);
+		end += size;
+		ret = read_partial(con, end, size,
+				   con->auth->authorizer_reply_buf);
+		if (ret <= 0)
+			goto out;
+	}
 
 	dout("read_partial_connect %p tag %d, con_seq = %u, g_seq = %u\n",
 	     con, (int)con->in_reply.tag,
@@ -1765,7 +1764,6 @@ static int read_partial_connect(struct ceph_connection *con)
 	     le32_to_cpu(con->in_reply.global_seq));
 out:
 	return ret;
-
 }
 
 /*
@@ -2048,7 +2046,7 @@ static int process_connect(struct ceph_connection *con)
 
 	dout("process_connect on %p tag %d\n", con, (int)con->in_tag);
 
-	if (con->auth_reply_buf) {
+	if (con->auth) {
 		/*
 		 * Any connection that defines ->get_authorizer()
 		 * should also define ->verify_authorizer_reply().
-- 
2.17.1




^ permalink raw reply related	[flat|nested] 166+ messages in thread

* [PATCH 4.14 014/146] libceph: factor out __prepare_write_connect()
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (12 preceding siblings ...)
  2018-12-04 10:48 ` [PATCH 4.14 013/146] libceph: store ceph_auth_handshake pointer in ceph_connection Greg Kroah-Hartman
@ 2018-12-04 10:48 ` Greg Kroah-Hartman
  2018-12-04 10:48 ` [PATCH 4.14 015/146] libceph: factor out __ceph_x_decrypt() Greg Kroah-Hartman
                   ` (136 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:48 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Ilya Dryomov, Sage Weil,
	Ben Hutchings, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

commit c0f56b483aa09c99bfe97409a43ad786f33b8a5a upstream.

Will be used for sending ceph_msg_connect with an updated authorizer,
after the server challenges the initial authorizer.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Sage Weil <sage@redhat.com>
Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 net/ceph/messenger.c | 21 ++++++++++++---------
 1 file changed, 12 insertions(+), 9 deletions(-)

diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
index 3a82e6d2864b..0b121327d32f 100644
--- a/net/ceph/messenger.c
+++ b/net/ceph/messenger.c
@@ -1446,6 +1446,17 @@ static void prepare_write_banner(struct ceph_connection *con)
 	con_flag_set(con, CON_FLAG_WRITE_PENDING);
 }
 
+static void __prepare_write_connect(struct ceph_connection *con)
+{
+	con_out_kvec_add(con, sizeof(con->out_connect), &con->out_connect);
+	if (con->auth)
+		con_out_kvec_add(con, con->auth->authorizer_buf_len,
+				 con->auth->authorizer_buf);
+
+	con->out_more = 0;
+	con_flag_set(con, CON_FLAG_WRITE_PENDING);
+}
+
 static int prepare_write_connect(struct ceph_connection *con)
 {
 	unsigned int global_seq = get_global_seq(con->msgr, 0);
@@ -1481,15 +1492,7 @@ static int prepare_write_connect(struct ceph_connection *con)
 	if (ret)
 		return ret;
 
-	con_out_kvec_add(con, sizeof (con->out_connect),
-					&con->out_connect);
-	if (con->auth)
-		con_out_kvec_add(con, con->auth->authorizer_buf_len,
-				 con->auth->authorizer_buf);
-
-	con->out_more = 0;
-	con_flag_set(con, CON_FLAG_WRITE_PENDING);
-
+	__prepare_write_connect(con);
 	return 0;
 }
 
-- 
2.17.1




^ permalink raw reply related	[flat|nested] 166+ messages in thread

* [PATCH 4.14 015/146] libceph: factor out __ceph_x_decrypt()
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (13 preceding siblings ...)
  2018-12-04 10:48 ` [PATCH 4.14 014/146] libceph: factor out __prepare_write_connect() Greg Kroah-Hartman
@ 2018-12-04 10:48 ` Greg Kroah-Hartman
  2018-12-04 10:48 ` [PATCH 4.14 016/146] libceph: factor out encrypt_authorizer() Greg Kroah-Hartman
                   ` (135 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:48 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Ilya Dryomov, Sage Weil,
	Ben Hutchings, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

commit c571fe24d243bfe7017f0e67fe800b3cc2a1d1f7 upstream.

Will be used for decrypting the server challenge which is only preceded
by ceph_x_encrypt_header.

Drop struct_v check to allow for extending ceph_x_encrypt_header in the
future.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Sage Weil <sage@redhat.com>
Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 net/ceph/auth_x.c | 33 ++++++++++++++++++++++++---------
 1 file changed, 24 insertions(+), 9 deletions(-)

diff --git a/net/ceph/auth_x.c b/net/ceph/auth_x.c
index 2f4a1baf5f52..9cac05239346 100644
--- a/net/ceph/auth_x.c
+++ b/net/ceph/auth_x.c
@@ -70,25 +70,40 @@ static int ceph_x_encrypt(struct ceph_crypto_key *secret, void *buf,
 	return sizeof(u32) + ciphertext_len;
 }
 
+static int __ceph_x_decrypt(struct ceph_crypto_key *secret, void *p,
+			    int ciphertext_len)
+{
+	struct ceph_x_encrypt_header *hdr = p;
+	int plaintext_len;
+	int ret;
+
+	ret = ceph_crypt(secret, false, p, ciphertext_len, ciphertext_len,
+			 &plaintext_len);
+	if (ret)
+		return ret;
+
+	if (le64_to_cpu(hdr->magic) != CEPHX_ENC_MAGIC) {
+		pr_err("%s bad magic\n", __func__);
+		return -EINVAL;
+	}
+
+	return plaintext_len - sizeof(*hdr);
+}
+
 static int ceph_x_decrypt(struct ceph_crypto_key *secret, void **p, void *end)
 {
-	struct ceph_x_encrypt_header *hdr = *p + sizeof(u32);
-	int ciphertext_len, plaintext_len;
+	int ciphertext_len;
 	int ret;
 
 	ceph_decode_32_safe(p, end, ciphertext_len, e_inval);
 	ceph_decode_need(p, end, ciphertext_len, e_inval);
 
-	ret = ceph_crypt(secret, false, *p, end - *p, ciphertext_len,
-			 &plaintext_len);
-	if (ret)
+	ret = __ceph_x_decrypt(secret, *p, ciphertext_len);
+	if (ret < 0)
 		return ret;
 
-	if (hdr->struct_v != 1 || le64_to_cpu(hdr->magic) != CEPHX_ENC_MAGIC)
-		return -EPERM;
-
 	*p += ciphertext_len;
-	return plaintext_len - sizeof(struct ceph_x_encrypt_header);
+	return ret;
 
 e_inval:
 	return -EINVAL;
-- 
2.17.1




^ permalink raw reply related	[flat|nested] 166+ messages in thread

* [PATCH 4.14 016/146] libceph: factor out encrypt_authorizer()
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (14 preceding siblings ...)
  2018-12-04 10:48 ` [PATCH 4.14 015/146] libceph: factor out __ceph_x_decrypt() Greg Kroah-Hartman
@ 2018-12-04 10:48 ` Greg Kroah-Hartman
  2018-12-04 10:48 ` [PATCH 4.14 017/146] libceph: add authorizer challenge Greg Kroah-Hartman
                   ` (134 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:48 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Ilya Dryomov, Sage Weil,
	Ben Hutchings, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

commit 149cac4a50b0b4081b38b2f38de6ef71c27eaa85 upstream.

Will be used for encrypting both the initial and updated authorizers.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Sage Weil <sage@redhat.com>
Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 net/ceph/auth_x.c | 49 ++++++++++++++++++++++++++++++++++-------------
 1 file changed, 36 insertions(+), 13 deletions(-)

diff --git a/net/ceph/auth_x.c b/net/ceph/auth_x.c
index 9cac05239346..722791f45b2a 100644
--- a/net/ceph/auth_x.c
+++ b/net/ceph/auth_x.c
@@ -290,6 +290,38 @@ bad:
 	return -EINVAL;
 }
 
+/*
+ * Encode and encrypt the second part (ceph_x_authorize_b) of the
+ * authorizer.  The first part (ceph_x_authorize_a) should already be
+ * encoded.
+ */
+static int encrypt_authorizer(struct ceph_x_authorizer *au)
+{
+	struct ceph_x_authorize_a *msg_a;
+	struct ceph_x_authorize_b *msg_b;
+	void *p, *end;
+	int ret;
+
+	msg_a = au->buf->vec.iov_base;
+	WARN_ON(msg_a->ticket_blob.secret_id != cpu_to_le64(au->secret_id));
+	p = (void *)(msg_a + 1) + le32_to_cpu(msg_a->ticket_blob.blob_len);
+	end = au->buf->vec.iov_base + au->buf->vec.iov_len;
+
+	msg_b = p + ceph_x_encrypt_offset();
+	msg_b->struct_v = 1;
+	msg_b->nonce = cpu_to_le64(au->nonce);
+
+	ret = ceph_x_encrypt(&au->session_key, p, end - p, sizeof(*msg_b));
+	if (ret < 0)
+		return ret;
+
+	p += ret;
+	WARN_ON(p > end);
+	au->buf->vec.iov_len = p - au->buf->vec.iov_base;
+
+	return 0;
+}
+
 static void ceph_x_authorizer_cleanup(struct ceph_x_authorizer *au)
 {
 	ceph_crypto_key_destroy(&au->session_key);
@@ -306,7 +338,6 @@ static int ceph_x_build_authorizer(struct ceph_auth_client *ac,
 	int maxlen;
 	struct ceph_x_authorize_a *msg_a;
 	struct ceph_x_authorize_b *msg_b;
-	void *p, *end;
 	int ret;
 	int ticket_blob_len =
 		(th->ticket_blob ? th->ticket_blob->vec.iov_len : 0);
@@ -350,21 +381,13 @@ static int ceph_x_build_authorizer(struct ceph_auth_client *ac,
 	dout(" th %p secret_id %lld %lld\n", th, th->secret_id,
 	     le64_to_cpu(msg_a->ticket_blob.secret_id));
 
-	p = msg_a + 1;
-	p += ticket_blob_len;
-	end = au->buf->vec.iov_base + au->buf->vec.iov_len;
-
-	msg_b = p + ceph_x_encrypt_offset();
-	msg_b->struct_v = 1;
 	get_random_bytes(&au->nonce, sizeof(au->nonce));
-	msg_b->nonce = cpu_to_le64(au->nonce);
-	ret = ceph_x_encrypt(&au->session_key, p, end - p, sizeof(*msg_b));
-	if (ret < 0)
+	ret = encrypt_authorizer(au);
+	if (ret) {
+		pr_err("failed to encrypt authorizer: %d", ret);
 		goto out_au;
+	}
 
-	p += ret;
-	WARN_ON(p > end);
-	au->buf->vec.iov_len = p - au->buf->vec.iov_base;
 	dout(" built authorizer nonce %llx len %d\n", au->nonce,
 	     (int)au->buf->vec.iov_len);
 	return 0;
-- 
2.17.1




^ permalink raw reply related	[flat|nested] 166+ messages in thread

* [PATCH 4.14 017/146] libceph: add authorizer challenge
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (15 preceding siblings ...)
  2018-12-04 10:48 ` [PATCH 4.14 016/146] libceph: factor out encrypt_authorizer() Greg Kroah-Hartman
@ 2018-12-04 10:48 ` Greg Kroah-Hartman
  2018-12-04 10:48 ` [PATCH 4.14 018/146] libceph: implement CEPHX_V2 calculation mode Greg Kroah-Hartman
                   ` (133 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:48 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Ilya Dryomov, Sage Weil,
	Ben Hutchings, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

commit 6daca13d2e72bedaaacfc08f873114c9307d5aea upstream.

When a client authenticates with a service, an authorizer is sent with
a nonce to the service (ceph_x_authorize_[ab]) and the service responds
with a mutation of that nonce (ceph_x_authorize_reply).  This lets the
client verify the service is who it says it is but it doesn't protect
against a replay: someone can trivially capture the exchange and reuse
the same authorizer to authenticate themselves.

Allow the service to reject an initial authorizer with a random
challenge (ceph_x_authorize_challenge).  The client then has to respond
with an updated authorizer proving they are able to decrypt the
service's challenge and that the new authorizer was produced for this
specific connection instance.

The accepting side requires this challenge and response unconditionally
if the client side advertises they have CEPHX_V2 feature bit.

This addresses CVE-2018-1128.

Link: http://tracker.ceph.com/issues/24836
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Sage Weil <sage@redhat.com>
Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 fs/ceph/mds_client.c           | 11 ++++++
 include/linux/ceph/auth.h      |  8 ++++
 include/linux/ceph/messenger.h |  3 ++
 include/linux/ceph/msgr.h      |  2 +-
 net/ceph/auth.c                | 16 ++++++++
 net/ceph/auth_x.c              | 72 +++++++++++++++++++++++++++++++---
 net/ceph/auth_x_protocol.h     |  7 ++++
 net/ceph/messenger.c           | 17 +++++++-
 net/ceph/osd_client.c          | 11 ++++++
 9 files changed, 140 insertions(+), 7 deletions(-)

diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
index bf378ddca4db..a48984dd6426 100644
--- a/fs/ceph/mds_client.c
+++ b/fs/ceph/mds_client.c
@@ -4079,6 +4079,16 @@ static struct ceph_auth_handshake *get_authorizer(struct ceph_connection *con,
 	return auth;
 }
 
+static int add_authorizer_challenge(struct ceph_connection *con,
+				    void *challenge_buf, int challenge_buf_len)
+{
+	struct ceph_mds_session *s = con->private;
+	struct ceph_mds_client *mdsc = s->s_mdsc;
+	struct ceph_auth_client *ac = mdsc->fsc->client->monc.auth;
+
+	return ceph_auth_add_authorizer_challenge(ac, s->s_auth.authorizer,
+					    challenge_buf, challenge_buf_len);
+}
 
 static int verify_authorizer_reply(struct ceph_connection *con)
 {
@@ -4142,6 +4152,7 @@ static const struct ceph_connection_operations mds_con_ops = {
 	.put = con_put,
 	.dispatch = dispatch,
 	.get_authorizer = get_authorizer,
+	.add_authorizer_challenge = add_authorizer_challenge,
 	.verify_authorizer_reply = verify_authorizer_reply,
 	.invalidate_authorizer = invalidate_authorizer,
 	.peer_reset = peer_reset,
diff --git a/include/linux/ceph/auth.h b/include/linux/ceph/auth.h
index e931da8424a4..6728c2ee0205 100644
--- a/include/linux/ceph/auth.h
+++ b/include/linux/ceph/auth.h
@@ -64,6 +64,10 @@ struct ceph_auth_client_ops {
 	/* ensure that an existing authorizer is up to date */
 	int (*update_authorizer)(struct ceph_auth_client *ac, int peer_type,
 				 struct ceph_auth_handshake *auth);
+	int (*add_authorizer_challenge)(struct ceph_auth_client *ac,
+					struct ceph_authorizer *a,
+					void *challenge_buf,
+					int challenge_buf_len);
 	int (*verify_authorizer_reply)(struct ceph_auth_client *ac,
 				       struct ceph_authorizer *a);
 	void (*invalidate_authorizer)(struct ceph_auth_client *ac,
@@ -118,6 +122,10 @@ void ceph_auth_destroy_authorizer(struct ceph_authorizer *a);
 extern int ceph_auth_update_authorizer(struct ceph_auth_client *ac,
 				       int peer_type,
 				       struct ceph_auth_handshake *a);
+int ceph_auth_add_authorizer_challenge(struct ceph_auth_client *ac,
+				       struct ceph_authorizer *a,
+				       void *challenge_buf,
+				       int challenge_buf_len);
 extern int ceph_auth_verify_authorizer_reply(struct ceph_auth_client *ac,
 					     struct ceph_authorizer *a);
 extern void ceph_auth_invalidate_authorizer(struct ceph_auth_client *ac,
diff --git a/include/linux/ceph/messenger.h b/include/linux/ceph/messenger.h
index 9056077c023f..18fbe910ed55 100644
--- a/include/linux/ceph/messenger.h
+++ b/include/linux/ceph/messenger.h
@@ -31,6 +31,9 @@ struct ceph_connection_operations {
 	struct ceph_auth_handshake *(*get_authorizer) (
 				struct ceph_connection *con,
 			       int *proto, int force_new);
+	int (*add_authorizer_challenge)(struct ceph_connection *con,
+					void *challenge_buf,
+					int challenge_buf_len);
 	int (*verify_authorizer_reply) (struct ceph_connection *con);
 	int (*invalidate_authorizer)(struct ceph_connection *con);
 
diff --git a/include/linux/ceph/msgr.h b/include/linux/ceph/msgr.h
index 73ae2a926548..9e50aede46c8 100644
--- a/include/linux/ceph/msgr.h
+++ b/include/linux/ceph/msgr.h
@@ -91,7 +91,7 @@ struct ceph_entity_inst {
 #define CEPH_MSGR_TAG_SEQ           13 /* 64-bit int follows with seen seq number */
 #define CEPH_MSGR_TAG_KEEPALIVE2    14 /* keepalive2 byte + ceph_timespec */
 #define CEPH_MSGR_TAG_KEEPALIVE2_ACK 15 /* keepalive2 reply */
-
+#define CEPH_MSGR_TAG_CHALLENGE_AUTHORIZER 16  /* cephx v2 doing server challenge */
 
 /*
  * connection negotiation
diff --git a/net/ceph/auth.c b/net/ceph/auth.c
index dbde2b3c3c15..fbeee068ea14 100644
--- a/net/ceph/auth.c
+++ b/net/ceph/auth.c
@@ -315,6 +315,22 @@ int ceph_auth_update_authorizer(struct ceph_auth_client *ac,
 }
 EXPORT_SYMBOL(ceph_auth_update_authorizer);
 
+int ceph_auth_add_authorizer_challenge(struct ceph_auth_client *ac,
+				       struct ceph_authorizer *a,
+				       void *challenge_buf,
+				       int challenge_buf_len)
+{
+	int ret = 0;
+
+	mutex_lock(&ac->mutex);
+	if (ac->ops && ac->ops->add_authorizer_challenge)
+		ret = ac->ops->add_authorizer_challenge(ac, a, challenge_buf,
+							challenge_buf_len);
+	mutex_unlock(&ac->mutex);
+	return ret;
+}
+EXPORT_SYMBOL(ceph_auth_add_authorizer_challenge);
+
 int ceph_auth_verify_authorizer_reply(struct ceph_auth_client *ac,
 				      struct ceph_authorizer *a)
 {
diff --git a/net/ceph/auth_x.c b/net/ceph/auth_x.c
index 722791f45b2a..ce28bb07d8fd 100644
--- a/net/ceph/auth_x.c
+++ b/net/ceph/auth_x.c
@@ -295,7 +295,8 @@ bad:
  * authorizer.  The first part (ceph_x_authorize_a) should already be
  * encoded.
  */
-static int encrypt_authorizer(struct ceph_x_authorizer *au)
+static int encrypt_authorizer(struct ceph_x_authorizer *au,
+			      u64 *server_challenge)
 {
 	struct ceph_x_authorize_a *msg_a;
 	struct ceph_x_authorize_b *msg_b;
@@ -308,16 +309,28 @@ static int encrypt_authorizer(struct ceph_x_authorizer *au)
 	end = au->buf->vec.iov_base + au->buf->vec.iov_len;
 
 	msg_b = p + ceph_x_encrypt_offset();
-	msg_b->struct_v = 1;
+	msg_b->struct_v = 2;
 	msg_b->nonce = cpu_to_le64(au->nonce);
+	if (server_challenge) {
+		msg_b->have_challenge = 1;
+		msg_b->server_challenge_plus_one =
+		    cpu_to_le64(*server_challenge + 1);
+	} else {
+		msg_b->have_challenge = 0;
+		msg_b->server_challenge_plus_one = 0;
+	}
 
 	ret = ceph_x_encrypt(&au->session_key, p, end - p, sizeof(*msg_b));
 	if (ret < 0)
 		return ret;
 
 	p += ret;
-	WARN_ON(p > end);
-	au->buf->vec.iov_len = p - au->buf->vec.iov_base;
+	if (server_challenge) {
+		WARN_ON(p != end);
+	} else {
+		WARN_ON(p > end);
+		au->buf->vec.iov_len = p - au->buf->vec.iov_base;
+	}
 
 	return 0;
 }
@@ -382,7 +395,7 @@ static int ceph_x_build_authorizer(struct ceph_auth_client *ac,
 	     le64_to_cpu(msg_a->ticket_blob.secret_id));
 
 	get_random_bytes(&au->nonce, sizeof(au->nonce));
-	ret = encrypt_authorizer(au);
+	ret = encrypt_authorizer(au, NULL);
 	if (ret) {
 		pr_err("failed to encrypt authorizer: %d", ret);
 		goto out_au;
@@ -664,6 +677,54 @@ static int ceph_x_update_authorizer(
 	return 0;
 }
 
+static int decrypt_authorize_challenge(struct ceph_x_authorizer *au,
+				       void *challenge_buf,
+				       int challenge_buf_len,
+				       u64 *server_challenge)
+{
+	struct ceph_x_authorize_challenge *ch =
+	    challenge_buf + sizeof(struct ceph_x_encrypt_header);
+	int ret;
+
+	/* no leading len */
+	ret = __ceph_x_decrypt(&au->session_key, challenge_buf,
+			       challenge_buf_len);
+	if (ret < 0)
+		return ret;
+	if (ret < sizeof(*ch)) {
+		pr_err("bad size %d for ceph_x_authorize_challenge\n", ret);
+		return -EINVAL;
+	}
+
+	*server_challenge = le64_to_cpu(ch->server_challenge);
+	return 0;
+}
+
+static int ceph_x_add_authorizer_challenge(struct ceph_auth_client *ac,
+					   struct ceph_authorizer *a,
+					   void *challenge_buf,
+					   int challenge_buf_len)
+{
+	struct ceph_x_authorizer *au = (void *)a;
+	u64 server_challenge;
+	int ret;
+
+	ret = decrypt_authorize_challenge(au, challenge_buf, challenge_buf_len,
+					  &server_challenge);
+	if (ret) {
+		pr_err("failed to decrypt authorize challenge: %d", ret);
+		return ret;
+	}
+
+	ret = encrypt_authorizer(au, &server_challenge);
+	if (ret) {
+		pr_err("failed to encrypt authorizer w/ challenge: %d", ret);
+		return ret;
+	}
+
+	return 0;
+}
+
 static int ceph_x_verify_authorizer_reply(struct ceph_auth_client *ac,
 					  struct ceph_authorizer *a)
 {
@@ -816,6 +877,7 @@ static const struct ceph_auth_client_ops ceph_x_ops = {
 	.handle_reply = ceph_x_handle_reply,
 	.create_authorizer = ceph_x_create_authorizer,
 	.update_authorizer = ceph_x_update_authorizer,
+	.add_authorizer_challenge = ceph_x_add_authorizer_challenge,
 	.verify_authorizer_reply = ceph_x_verify_authorizer_reply,
 	.invalidate_authorizer = ceph_x_invalidate_authorizer,
 	.reset =  ceph_x_reset,
diff --git a/net/ceph/auth_x_protocol.h b/net/ceph/auth_x_protocol.h
index 32c13d763b9a..24b0b74564d0 100644
--- a/net/ceph/auth_x_protocol.h
+++ b/net/ceph/auth_x_protocol.h
@@ -70,6 +70,13 @@ struct ceph_x_authorize_a {
 struct ceph_x_authorize_b {
 	__u8 struct_v;
 	__le64 nonce;
+	__u8 have_challenge;
+	__le64 server_challenge_plus_one;
+} __attribute__ ((packed));
+
+struct ceph_x_authorize_challenge {
+	__u8 struct_v;
+	__le64 server_challenge;
 } __attribute__ ((packed));
 
 struct ceph_x_authorize_reply {
diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
index 0b121327d32f..ad33baa2008d 100644
--- a/net/ceph/messenger.c
+++ b/net/ceph/messenger.c
@@ -2052,9 +2052,24 @@ static int process_connect(struct ceph_connection *con)
 	if (con->auth) {
 		/*
 		 * Any connection that defines ->get_authorizer()
-		 * should also define ->verify_authorizer_reply().
+		 * should also define ->add_authorizer_challenge() and
+		 * ->verify_authorizer_reply().
+		 *
 		 * See get_connect_authorizer().
 		 */
+		if (con->in_reply.tag == CEPH_MSGR_TAG_CHALLENGE_AUTHORIZER) {
+			ret = con->ops->add_authorizer_challenge(
+				    con, con->auth->authorizer_reply_buf,
+				    le32_to_cpu(con->in_reply.authorizer_len));
+			if (ret < 0)
+				return ret;
+
+			con_out_kvec_reset(con);
+			__prepare_write_connect(con);
+			prepare_read_connect(con);
+			return 0;
+		}
+
 		ret = con->ops->verify_authorizer_reply(con);
 		if (ret < 0) {
 			con->error_msg = "bad authorize reply";
diff --git a/net/ceph/osd_client.c b/net/ceph/osd_client.c
index 2814dba5902d..53ea2d48896c 100644
--- a/net/ceph/osd_client.c
+++ b/net/ceph/osd_client.c
@@ -5292,6 +5292,16 @@ static struct ceph_auth_handshake *get_authorizer(struct ceph_connection *con,
 	return auth;
 }
 
+static int add_authorizer_challenge(struct ceph_connection *con,
+				    void *challenge_buf, int challenge_buf_len)
+{
+	struct ceph_osd *o = con->private;
+	struct ceph_osd_client *osdc = o->o_osdc;
+	struct ceph_auth_client *ac = osdc->client->monc.auth;
+
+	return ceph_auth_add_authorizer_challenge(ac, o->o_auth.authorizer,
+					    challenge_buf, challenge_buf_len);
+}
 
 static int verify_authorizer_reply(struct ceph_connection *con)
 {
@@ -5341,6 +5351,7 @@ static const struct ceph_connection_operations osd_con_ops = {
 	.put = put_osd_con,
 	.dispatch = dispatch,
 	.get_authorizer = get_authorizer,
+	.add_authorizer_challenge = add_authorizer_challenge,
 	.verify_authorizer_reply = verify_authorizer_reply,
 	.invalidate_authorizer = invalidate_authorizer,
 	.alloc_msg = alloc_msg,
-- 
2.17.1




^ permalink raw reply related	[flat|nested] 166+ messages in thread

* [PATCH 4.14 018/146] libceph: implement CEPHX_V2 calculation mode
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (16 preceding siblings ...)
  2018-12-04 10:48 ` [PATCH 4.14 017/146] libceph: add authorizer challenge Greg Kroah-Hartman
@ 2018-12-04 10:48 ` Greg Kroah-Hartman
  2018-12-04 12:06   ` Ilya Dryomov
  2018-12-04 10:48 ` [PATCH 4.14 019/146] bpf: Prevent memory disambiguation attack Greg Kroah-Hartman
                   ` (132 subsequent siblings)
  150 siblings, 1 reply; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:48 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Ilya Dryomov, Sage Weil,
	Ben Hutchings, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

commit cc255c76c70f7a87d97939621eae04b600d9f4a1 upstream.

Derive the signature from the entire buffer (both AES cipher blocks)
instead of using just the first half of the first block, leaving out
data_crc entirely.

This addresses CVE-2018-1129.

Link: http://tracker.ceph.com/issues/24837
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Sage Weil <sage@redhat.com>
Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 include/linux/ceph/ceph_features.h |  7 +--
 net/ceph/auth_x.c                  | 73 +++++++++++++++++++++++-------
 2 files changed, 60 insertions(+), 20 deletions(-)

diff --git a/include/linux/ceph/ceph_features.h b/include/linux/ceph/ceph_features.h
index 59042d5ac520..70f42eef813b 100644
--- a/include/linux/ceph/ceph_features.h
+++ b/include/linux/ceph/ceph_features.h
@@ -165,9 +165,9 @@ DEFINE_CEPH_FEATURE(58, 1, FS_FILE_LAYOUT_V2) // overlap
 DEFINE_CEPH_FEATURE(59, 1, FS_BTIME)
 DEFINE_CEPH_FEATURE(59, 1, FS_CHANGE_ATTR) // overlap
 DEFINE_CEPH_FEATURE(59, 1, MSG_ADDR2) // overlap
-DEFINE_CEPH_FEATURE(60, 1, BLKIN_TRACING)  // *do not share this bit*
+DEFINE_CEPH_FEATURE(60, 1, OSD_RECOVERY_DELETES) // *do not share this bit*
+DEFINE_CEPH_FEATURE(61, 1, CEPHX_V2)             // *do not share this bit*
 
-DEFINE_CEPH_FEATURE(61, 1, RESERVED2)          // unused, but slow down!
 DEFINE_CEPH_FEATURE(62, 1, RESERVED)           // do not use; used as a sentinal
 DEFINE_CEPH_FEATURE_DEPRECATED(63, 1, RESERVED_BROKEN, LUMINOUS) // client-facing
 
@@ -209,7 +209,8 @@ DEFINE_CEPH_FEATURE_DEPRECATED(63, 1, RESERVED_BROKEN, LUMINOUS) // client-facin
 	 CEPH_FEATURE_SERVER_JEWEL |		\
 	 CEPH_FEATURE_MON_STATEFUL_SUB |	\
 	 CEPH_FEATURE_CRUSH_TUNABLES5 |		\
-	 CEPH_FEATURE_NEW_OSDOPREPLY_ENCODING)
+	 CEPH_FEATURE_NEW_OSDOPREPLY_ENCODING |	\
+	 CEPH_FEATURE_CEPHX_V2)
 
 #define CEPH_FEATURES_REQUIRED_DEFAULT   \
 	(CEPH_FEATURE_NOSRCADDR |	 \
diff --git a/net/ceph/auth_x.c b/net/ceph/auth_x.c
index ce28bb07d8fd..10eb759bbcb4 100644
--- a/net/ceph/auth_x.c
+++ b/net/ceph/auth_x.c
@@ -9,6 +9,7 @@
 
 #include <linux/ceph/decode.h>
 #include <linux/ceph/auth.h>
+#include <linux/ceph/ceph_features.h>
 #include <linux/ceph/libceph.h>
 #include <linux/ceph/messenger.h>
 
@@ -803,26 +804,64 @@ static int calc_signature(struct ceph_x_authorizer *au, struct ceph_msg *msg,
 			  __le64 *psig)
 {
 	void *enc_buf = au->enc_buf;
-	struct {
-		__le32 len;
-		__le32 header_crc;
-		__le32 front_crc;
-		__le32 middle_crc;
-		__le32 data_crc;
-	} __packed *sigblock = enc_buf + ceph_x_encrypt_offset();
 	int ret;
 
-	sigblock->len = cpu_to_le32(4*sizeof(u32));
-	sigblock->header_crc = msg->hdr.crc;
-	sigblock->front_crc = msg->footer.front_crc;
-	sigblock->middle_crc = msg->footer.middle_crc;
-	sigblock->data_crc =  msg->footer.data_crc;
-	ret = ceph_x_encrypt(&au->session_key, enc_buf, CEPHX_AU_ENC_BUF_LEN,
-			     sizeof(*sigblock));
-	if (ret < 0)
-		return ret;
+	if (!CEPH_HAVE_FEATURE(msg->con->peer_features, CEPHX_V2)) {
+		struct {
+			__le32 len;
+			__le32 header_crc;
+			__le32 front_crc;
+			__le32 middle_crc;
+			__le32 data_crc;
+		} __packed *sigblock = enc_buf + ceph_x_encrypt_offset();
+
+		sigblock->len = cpu_to_le32(4*sizeof(u32));
+		sigblock->header_crc = msg->hdr.crc;
+		sigblock->front_crc = msg->footer.front_crc;
+		sigblock->middle_crc = msg->footer.middle_crc;
+		sigblock->data_crc =  msg->footer.data_crc;
+
+		ret = ceph_x_encrypt(&au->session_key, enc_buf,
+				     CEPHX_AU_ENC_BUF_LEN, sizeof(*sigblock));
+		if (ret < 0)
+			return ret;
+
+		*psig = *(__le64 *)(enc_buf + sizeof(u32));
+	} else {
+		struct {
+			__le32 header_crc;
+			__le32 front_crc;
+			__le32 front_len;
+			__le32 middle_crc;
+			__le32 middle_len;
+			__le32 data_crc;
+			__le32 data_len;
+			__le32 seq_lower_word;
+		} __packed *sigblock = enc_buf;
+		struct {
+			__le64 a, b, c, d;
+		} __packed *penc = enc_buf;
+		int ciphertext_len;
+
+		sigblock->header_crc = msg->hdr.crc;
+		sigblock->front_crc = msg->footer.front_crc;
+		sigblock->front_len = msg->hdr.front_len;
+		sigblock->middle_crc = msg->footer.middle_crc;
+		sigblock->middle_len = msg->hdr.middle_len;
+		sigblock->data_crc =  msg->footer.data_crc;
+		sigblock->data_len = msg->hdr.data_len;
+		sigblock->seq_lower_word = *(__le32 *)&msg->hdr.seq;
+
+		/* no leading len, no ceph_x_encrypt_header */
+		ret = ceph_crypt(&au->session_key, true, enc_buf,
+				 CEPHX_AU_ENC_BUF_LEN, sizeof(*sigblock),
+				 &ciphertext_len);
+		if (ret)
+			return ret;
+
+		*psig = penc->a ^ penc->b ^ penc->c ^ penc->d;
+	}
 
-	*psig = *(__le64 *)(enc_buf + sizeof(u32));
 	return 0;
 }
 
-- 
2.17.1




^ permalink raw reply related	[flat|nested] 166+ messages in thread

* [PATCH 4.14 019/146] bpf: Prevent memory disambiguation attack
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (17 preceding siblings ...)
  2018-12-04 10:48 ` [PATCH 4.14 018/146] libceph: implement CEPHX_V2 calculation mode Greg Kroah-Hartman
@ 2018-12-04 10:48 ` Greg Kroah-Hartman
  2018-12-04 10:48 ` [PATCH 4.14 020/146] tls: Add function to update the TLS socket configuration Greg Kroah-Hartman
                   ` (131 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:48 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Alexei Starovoitov, Thomas Gleixner,
	Ben Hutchings, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

commit af86ca4e3088fe5eacf2f7e58c01fa68ca067672 upstream.

Detect code patterns where malicious 'speculative store bypass' can be used
and sanitize such patterns.

 39: (bf) r3 = r10
 40: (07) r3 += -216
 41: (79) r8 = *(u64 *)(r7 +0)   // slow read
 42: (7a) *(u64 *)(r10 -72) = 0  // verifier inserts this instruction
 43: (7b) *(u64 *)(r8 +0) = r3   // this store becomes slow due to r8
 44: (79) r1 = *(u64 *)(r6 +0)   // cpu speculatively executes this load
 45: (71) r2 = *(u8 *)(r1 +0)    // speculatively arbitrary 'load byte'
                                 // is now sanitized

Above code after x86 JIT becomes:
 e5: mov    %rbp,%rdx
 e8: add    $0xffffffffffffff28,%rdx
 ef: mov    0x0(%r13),%r14
 f3: movq   $0x0,-0x48(%rbp)
 fb: mov    %rdx,0x0(%r14)
 ff: mov    0x0(%rbx),%rdi
103: movzbq 0x0(%rdi),%rsi

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
[bwh: Backported to 4.14:
 - Add bpf_verifier_env parameter to check_stack_write()
 - Look up stack slot_types with state->stack_slot_type[] rather than
   state->stack[].slot_type[]
 - Drop bpf_verifier_env argument to verbose()
 - Adjust context]
Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 include/linux/bpf_verifier.h |  1 +
 kernel/bpf/verifier.c        | 62 +++++++++++++++++++++++++++++++++---
 2 files changed, 59 insertions(+), 4 deletions(-)

diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index a3333004fd2b..8458cc5fbce5 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -113,6 +113,7 @@ struct bpf_insn_aux_data {
 		struct bpf_map *map_ptr;	/* pointer for call insn into lookup_elem */
 	};
 	int ctx_field_size; /* the ctx field size for load insn, maybe 0 */
+	int sanitize_stack_off; /* stack slot to be cleared */
 	bool seen; /* this insn was processed by the verifier */
 };
 
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 013b0cd1958e..f6755fd5bae2 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -717,8 +717,9 @@ static bool is_spillable_regtype(enum bpf_reg_type type)
 /* check_stack_read/write functions track spill/fill of registers,
  * stack boundary and alignment are checked in check_mem_access()
  */
-static int check_stack_write(struct bpf_verifier_state *state, int off,
-			     int size, int value_regno)
+static int check_stack_write(struct bpf_verifier_env *env,
+			     struct bpf_verifier_state *state, int off,
+			     int size, int value_regno, int insn_idx)
 {
 	int i, spi = (MAX_BPF_STACK + off) / BPF_REG_SIZE;
 	/* caller checked that off % size == 0 and -MAX_BPF_STACK <= off < 0,
@@ -738,8 +739,32 @@ static int check_stack_write(struct bpf_verifier_state *state, int off,
 		state->spilled_regs[spi] = state->regs[value_regno];
 		state->spilled_regs[spi].live |= REG_LIVE_WRITTEN;
 
-		for (i = 0; i < BPF_REG_SIZE; i++)
+		for (i = 0; i < BPF_REG_SIZE; i++) {
+			if (state->stack_slot_type[MAX_BPF_STACK + off + i] == STACK_MISC &&
+			    !env->allow_ptr_leaks) {
+				int *poff = &env->insn_aux_data[insn_idx].sanitize_stack_off;
+				int soff = (-spi - 1) * BPF_REG_SIZE;
+
+				/* detected reuse of integer stack slot with a pointer
+				 * which means either llvm is reusing stack slot or
+				 * an attacker is trying to exploit CVE-2018-3639
+				 * (speculative store bypass)
+				 * Have to sanitize that slot with preemptive
+				 * store of zero.
+				 */
+				if (*poff && *poff != soff) {
+					/* disallow programs where single insn stores
+					 * into two different stack slots, since verifier
+					 * cannot sanitize them
+					 */
+					verbose("insn %d cannot access two stack slots fp%d and fp%d",
+						insn_idx, *poff, soff);
+					return -EINVAL;
+				}
+				*poff = soff;
+			}
 			state->stack_slot_type[MAX_BPF_STACK + off + i] = STACK_SPILL;
+		}
 	} else {
 		/* regular write of data into stack */
 		state->spilled_regs[spi] = (struct bpf_reg_state) {};
@@ -1216,7 +1241,8 @@ static int check_mem_access(struct bpf_verifier_env *env, int insn_idx, u32 regn
 				verbose("attempt to corrupt spilled pointer on stack\n");
 				return -EACCES;
 			}
-			err = check_stack_write(state, off, size, value_regno);
+			err = check_stack_write(env, state, off, size,
+						value_regno, insn_idx);
 		} else {
 			err = check_stack_read(state, off, size, value_regno);
 		}
@@ -4270,6 +4296,34 @@ static int convert_ctx_accesses(struct bpf_verifier_env *env)
 		else
 			continue;
 
+		if (type == BPF_WRITE &&
+		    env->insn_aux_data[i + delta].sanitize_stack_off) {
+			struct bpf_insn patch[] = {
+				/* Sanitize suspicious stack slot with zero.
+				 * There are no memory dependencies for this store,
+				 * since it's only using frame pointer and immediate
+				 * constant of zero
+				 */
+				BPF_ST_MEM(BPF_DW, BPF_REG_FP,
+					   env->insn_aux_data[i + delta].sanitize_stack_off,
+					   0),
+				/* the original STX instruction will immediately
+				 * overwrite the same stack slot with appropriate value
+				 */
+				*insn,
+			};
+
+			cnt = ARRAY_SIZE(patch);
+			new_prog = bpf_patch_insn_data(env, i + delta, patch, cnt);
+			if (!new_prog)
+				return -ENOMEM;
+
+			delta    += cnt - 1;
+			env->prog = new_prog;
+			insn      = new_prog->insnsi + i + delta;
+			continue;
+		}
+
 		if (env->insn_aux_data[i + delta].ptr_type != PTR_TO_CTX)
 			continue;
 
-- 
2.17.1




^ permalink raw reply related	[flat|nested] 166+ messages in thread

* [PATCH 4.14 020/146] tls: Add function to update the TLS socket configuration
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (18 preceding siblings ...)
  2018-12-04 10:48 ` [PATCH 4.14 019/146] bpf: Prevent memory disambiguation attack Greg Kroah-Hartman
@ 2018-12-04 10:48 ` Greg Kroah-Hartman
  2018-12-04 10:48 ` [PATCH 4.14 021/146] tls: Fix TLS ulp context leak, when TLS_TX setsockopt is not used Greg Kroah-Hartman
                   ` (130 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:48 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Ilya Lesokhin, David S. Miller,
	Ben Hutchings, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

commit 6d88207fcfddc002afe3e2e4a455e5201089d5d9 upstream.

The tx configuration is now stored in ctx->tx_conf.
And sk->sk_prot is updated trough a function
This will simplify things when we add rx
and support for different possible
tx and rx cross configurations.

Signed-off-by: Ilya Lesokhin <ilyal@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 include/net/tls.h  |  2 ++
 net/tls/tls_main.c | 46 ++++++++++++++++++++++++++++++++--------------
 2 files changed, 34 insertions(+), 14 deletions(-)

diff --git a/include/net/tls.h b/include/net/tls.h
index 86ed3dd80fe7..0c3ab2af74d3 100644
--- a/include/net/tls.h
+++ b/include/net/tls.h
@@ -89,6 +89,8 @@ struct tls_context {
 
 	void *priv_ctx;
 
+	u8 tx_conf:2;
+
 	u16 prepend_size;
 	u16 tag_size;
 	u16 overhead_size;
diff --git a/net/tls/tls_main.c b/net/tls/tls_main.c
index 4f2971f528db..191a8adee3ea 100644
--- a/net/tls/tls_main.c
+++ b/net/tls/tls_main.c
@@ -46,8 +46,18 @@ MODULE_DESCRIPTION("Transport Layer Security Support");
 MODULE_LICENSE("Dual BSD/GPL");
 MODULE_ALIAS_TCP_ULP("tls");
 
-static struct proto tls_base_prot;
-static struct proto tls_sw_prot;
+enum {
+	TLS_BASE_TX,
+	TLS_SW_TX,
+	TLS_NUM_CONFIG,
+};
+
+static struct proto tls_prots[TLS_NUM_CONFIG];
+
+static inline void update_sk_prot(struct sock *sk, struct tls_context *ctx)
+{
+	sk->sk_prot = &tls_prots[ctx->tx_conf];
+}
 
 int wait_on_pending_writer(struct sock *sk, long *timeo)
 {
@@ -364,8 +374,8 @@ static int do_tls_setsockopt_tx(struct sock *sk, char __user *optval,
 {
 	struct tls_crypto_info *crypto_info, tmp_crypto_info;
 	struct tls_context *ctx = tls_get_ctx(sk);
-	struct proto *prot = NULL;
 	int rc = 0;
+	int tx_conf;
 
 	if (!optval || (optlen < sizeof(*crypto_info))) {
 		rc = -EINVAL;
@@ -422,11 +432,12 @@ static int do_tls_setsockopt_tx(struct sock *sk, char __user *optval,
 
 	/* currently SW is default, we will have ethtool in future */
 	rc = tls_set_sw_offload(sk, ctx);
-	prot = &tls_sw_prot;
+	tx_conf = TLS_SW_TX;
 	if (rc)
 		goto err_crypto_info;
 
-	sk->sk_prot = prot;
+	ctx->tx_conf = tx_conf;
+	update_sk_prot(sk, ctx);
 	goto out;
 
 err_crypto_info:
@@ -488,7 +499,9 @@ static int tls_init(struct sock *sk)
 	icsk->icsk_ulp_data = ctx;
 	ctx->setsockopt = sk->sk_prot->setsockopt;
 	ctx->getsockopt = sk->sk_prot->getsockopt;
-	sk->sk_prot = &tls_base_prot;
+
+	ctx->tx_conf = TLS_BASE_TX;
+	update_sk_prot(sk, ctx);
 out:
 	return rc;
 }
@@ -499,16 +512,21 @@ static struct tcp_ulp_ops tcp_tls_ulp_ops __read_mostly = {
 	.init			= tls_init,
 };
 
+static void build_protos(struct proto *prot, struct proto *base)
+{
+	prot[TLS_BASE_TX] = *base;
+	prot[TLS_BASE_TX].setsockopt = tls_setsockopt;
+	prot[TLS_BASE_TX].getsockopt = tls_getsockopt;
+
+	prot[TLS_SW_TX] = prot[TLS_BASE_TX];
+	prot[TLS_SW_TX].close		= tls_sk_proto_close;
+	prot[TLS_SW_TX].sendmsg		= tls_sw_sendmsg;
+	prot[TLS_SW_TX].sendpage	= tls_sw_sendpage;
+}
+
 static int __init tls_register(void)
 {
-	tls_base_prot			= tcp_prot;
-	tls_base_prot.setsockopt	= tls_setsockopt;
-	tls_base_prot.getsockopt	= tls_getsockopt;
-
-	tls_sw_prot			= tls_base_prot;
-	tls_sw_prot.sendmsg		= tls_sw_sendmsg;
-	tls_sw_prot.sendpage            = tls_sw_sendpage;
-	tls_sw_prot.close               = tls_sk_proto_close;
+	build_protos(tls_prots, &tcp_prot);
 
 	tcp_register_ulp(&tcp_tls_ulp_ops);
 
-- 
2.17.1




^ permalink raw reply related	[flat|nested] 166+ messages in thread

* [PATCH 4.14 021/146] tls: Fix TLS ulp context leak, when TLS_TX setsockopt is not used.
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (19 preceding siblings ...)
  2018-12-04 10:48 ` [PATCH 4.14 020/146] tls: Add function to update the TLS socket configuration Greg Kroah-Hartman
@ 2018-12-04 10:48 ` Greg Kroah-Hartman
  2018-12-04 10:48 ` [PATCH 4.14 022/146] tls: Avoid copying crypto_info again after cipher_type check Greg Kroah-Hartman
                   ` (129 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:48 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Ilya Lesokhin, David S. Miller,
	Ben Hutchings, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

commit ff45d820a2df163957ad8ab459b6eb6976144c18 upstream.

Previously the TLS ulp context would leak if we attached a TLS ulp
to a socket but did not use the TLS_TX setsockopt,
or did use it but it failed.
This patch solves the issue by overriding prot[TLS_BASE_TX].close
and fixing tls_sk_proto_close to work properly
when its called with ctx->tx_conf == TLS_BASE_TX.
This patch also removes ctx->free_resources as we can use ctx->tx_conf
to obtain the relevant information.

Fixes: 3c4d7559159b ('tls: kernel TLS support')
Signed-off-by: Ilya Lesokhin <ilyal@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
[bwh: Backported to 4.14: Keep using tls_ctx_free() as introduced by
 the earlier backport of "tls: zero the crypto information from
 tls_context before freeing"]
Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 include/net/tls.h  |  2 +-
 net/tls/tls_main.c | 24 ++++++++++++++++--------
 net/tls/tls_sw.c   |  3 +--
 3 files changed, 18 insertions(+), 11 deletions(-)

diff --git a/include/net/tls.h b/include/net/tls.h
index 0c3ab2af74d3..604fd982da19 100644
--- a/include/net/tls.h
+++ b/include/net/tls.h
@@ -106,7 +106,6 @@ struct tls_context {
 
 	u16 pending_open_record_frags;
 	int (*push_pending_record)(struct sock *sk, int flags);
-	void (*free_resources)(struct sock *sk);
 
 	void (*sk_write_space)(struct sock *sk);
 	void (*sk_proto_close)(struct sock *sk, long timeout);
@@ -131,6 +130,7 @@ int tls_sw_sendmsg(struct sock *sk, struct msghdr *msg, size_t size);
 int tls_sw_sendpage(struct sock *sk, struct page *page,
 		    int offset, size_t size, int flags);
 void tls_sw_close(struct sock *sk, long timeout);
+void tls_sw_free_tx_resources(struct sock *sk);
 
 void tls_sk_destruct(struct sock *sk, struct tls_context *ctx);
 void tls_icsk_clean_acked(struct sock *sk);
diff --git a/net/tls/tls_main.c b/net/tls/tls_main.c
index 191a8adee3ea..b5f9c578bcf0 100644
--- a/net/tls/tls_main.c
+++ b/net/tls/tls_main.c
@@ -249,6 +249,12 @@ static void tls_sk_proto_close(struct sock *sk, long timeout)
 	void (*sk_proto_close)(struct sock *sk, long timeout);
 
 	lock_sock(sk);
+	sk_proto_close = ctx->sk_proto_close;
+
+	if (ctx->tx_conf == TLS_BASE_TX) {
+		tls_ctx_free(ctx);
+		goto skip_tx_cleanup;
+	}
 
 	if (!tls_complete_pending_work(sk, ctx, 0, &timeo))
 		tls_handle_open_record(sk, 0);
@@ -265,13 +271,16 @@ static void tls_sk_proto_close(struct sock *sk, long timeout)
 			sg++;
 		}
 	}
-	ctx->free_resources(sk);
+
 	kfree(ctx->rec_seq);
 	kfree(ctx->iv);
 
-	sk_proto_close = ctx->sk_proto_close;
-	tls_ctx_free(ctx);
+	if (ctx->tx_conf == TLS_SW_TX) {
+		tls_sw_free_tx_resources(sk);
+		tls_ctx_free(ctx);
+	}
 
+skip_tx_cleanup:
 	release_sock(sk);
 	sk_proto_close(sk, timeout);
 }
@@ -428,8 +437,6 @@ static int do_tls_setsockopt_tx(struct sock *sk, char __user *optval,
 	ctx->sk_write_space = sk->sk_write_space;
 	sk->sk_write_space = tls_write_space;
 
-	ctx->sk_proto_close = sk->sk_prot->close;
-
 	/* currently SW is default, we will have ethtool in future */
 	rc = tls_set_sw_offload(sk, ctx);
 	tx_conf = TLS_SW_TX;
@@ -499,6 +506,7 @@ static int tls_init(struct sock *sk)
 	icsk->icsk_ulp_data = ctx;
 	ctx->setsockopt = sk->sk_prot->setsockopt;
 	ctx->getsockopt = sk->sk_prot->getsockopt;
+	ctx->sk_proto_close = sk->sk_prot->close;
 
 	ctx->tx_conf = TLS_BASE_TX;
 	update_sk_prot(sk, ctx);
@@ -515,11 +523,11 @@ static struct tcp_ulp_ops tcp_tls_ulp_ops __read_mostly = {
 static void build_protos(struct proto *prot, struct proto *base)
 {
 	prot[TLS_BASE_TX] = *base;
-	prot[TLS_BASE_TX].setsockopt = tls_setsockopt;
-	prot[TLS_BASE_TX].getsockopt = tls_getsockopt;
+	prot[TLS_BASE_TX].setsockopt	= tls_setsockopt;
+	prot[TLS_BASE_TX].getsockopt	= tls_getsockopt;
+	prot[TLS_BASE_TX].close		= tls_sk_proto_close;
 
 	prot[TLS_SW_TX] = prot[TLS_BASE_TX];
-	prot[TLS_SW_TX].close		= tls_sk_proto_close;
 	prot[TLS_SW_TX].sendmsg		= tls_sw_sendmsg;
 	prot[TLS_SW_TX].sendpage	= tls_sw_sendpage;
 }
diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
index 6ae9ca567d6c..5996fb5756e1 100644
--- a/net/tls/tls_sw.c
+++ b/net/tls/tls_sw.c
@@ -646,7 +646,7 @@ sendpage_end:
 	return ret;
 }
 
-static void tls_sw_free_resources(struct sock *sk)
+void tls_sw_free_tx_resources(struct sock *sk)
 {
 	struct tls_context *tls_ctx = tls_get_ctx(sk);
 	struct tls_sw_context *ctx = tls_sw_ctx(tls_ctx);
@@ -685,7 +685,6 @@ int tls_set_sw_offload(struct sock *sk, struct tls_context *ctx)
 	}
 
 	ctx->priv_ctx = (struct tls_offload_context *)sw_ctx;
-	ctx->free_resources = tls_sw_free_resources;
 
 	crypto_info = &ctx->crypto_send.info;
 	switch (crypto_info->cipher_type) {
-- 
2.17.1




^ permalink raw reply related	[flat|nested] 166+ messages in thread

* [PATCH 4.14 022/146] tls: Avoid copying crypto_info again after cipher_type check.
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (20 preceding siblings ...)
  2018-12-04 10:48 ` [PATCH 4.14 021/146] tls: Fix TLS ulp context leak, when TLS_TX setsockopt is not used Greg Kroah-Hartman
@ 2018-12-04 10:48 ` Greg Kroah-Hartman
  2018-12-04 10:48 ` [PATCH 4.14 023/146] tls: dont override sk_write_space if tls_set_sw_offload fails Greg Kroah-Hartman
                   ` (128 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:48 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Ilya Lesokhin, David S. Miller,
	Ben Hutchings, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

commit 196c31b4b54474b31dee3c30352c45c2a93e9226 upstream.

Avoid copying crypto_info again after cipher_type check
to avoid a TOCTOU exploits.
The temporary array on the stack is removed as we don't really need it

Fixes: 3c4d7559159b ('tls: kernel TLS support')
Signed-off-by: Ilya Lesokhin <ilyal@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
[bwh: Backported to 4.14: Preserve changes made by earlier backports of
 "tls: return -EBUSY if crypto_info is already set" and "tls: zero the
 crypto information from tls_context before freeing"]
Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 net/tls/tls_main.c | 33 ++++++++++++++-------------------
 1 file changed, 14 insertions(+), 19 deletions(-)

diff --git a/net/tls/tls_main.c b/net/tls/tls_main.c
index b5f9c578bcf0..f88df514ad5f 100644
--- a/net/tls/tls_main.c
+++ b/net/tls/tls_main.c
@@ -381,7 +381,7 @@ static int tls_getsockopt(struct sock *sk, int level, int optname,
 static int do_tls_setsockopt_tx(struct sock *sk, char __user *optval,
 				unsigned int optlen)
 {
-	struct tls_crypto_info *crypto_info, tmp_crypto_info;
+	struct tls_crypto_info *crypto_info;
 	struct tls_context *ctx = tls_get_ctx(sk);
 	int rc = 0;
 	int tx_conf;
@@ -391,38 +391,33 @@ static int do_tls_setsockopt_tx(struct sock *sk, char __user *optval,
 		goto out;
 	}
 
-	rc = copy_from_user(&tmp_crypto_info, optval, sizeof(*crypto_info));
+	crypto_info = &ctx->crypto_send.info;
+	/* Currently we don't support set crypto info more than one time */
+	if (TLS_CRYPTO_INFO_READY(crypto_info)) {
+		rc = -EBUSY;
+		goto out;
+	}
+
+	rc = copy_from_user(crypto_info, optval, sizeof(*crypto_info));
 	if (rc) {
 		rc = -EFAULT;
 		goto out;
 	}
 
 	/* check version */
-	if (tmp_crypto_info.version != TLS_1_2_VERSION) {
+	if (crypto_info->version != TLS_1_2_VERSION) {
 		rc = -ENOTSUPP;
-		goto out;
-	}
-
-	/* get user crypto info */
-	crypto_info = &ctx->crypto_send.info;
-
-	/* Currently we don't support set crypto info more than one time */
-	if (TLS_CRYPTO_INFO_READY(crypto_info)) {
-		rc = -EBUSY;
-		goto out;
+		goto err_crypto_info;
 	}
 
-	switch (tmp_crypto_info.cipher_type) {
+	switch (crypto_info->cipher_type) {
 	case TLS_CIPHER_AES_GCM_128: {
 		if (optlen != sizeof(struct tls12_crypto_info_aes_gcm_128)) {
 			rc = -EINVAL;
 			goto err_crypto_info;
 		}
-		rc = copy_from_user(
-		  crypto_info,
-		  optval,
-		  sizeof(struct tls12_crypto_info_aes_gcm_128));
-
+		rc = copy_from_user(crypto_info + 1, optval + sizeof(*crypto_info),
+				    optlen - sizeof(*crypto_info));
 		if (rc) {
 			rc = -EFAULT;
 			goto err_crypto_info;
-- 
2.17.1




^ permalink raw reply related	[flat|nested] 166+ messages in thread

* [PATCH 4.14 023/146] tls: dont override sk_write_space if tls_set_sw_offload fails.
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (21 preceding siblings ...)
  2018-12-04 10:48 ` [PATCH 4.14 022/146] tls: Avoid copying crypto_info again after cipher_type check Greg Kroah-Hartman
@ 2018-12-04 10:48 ` Greg Kroah-Hartman
  2018-12-04 10:48 ` [PATCH 4.14 024/146] tls: Use correct sk->sk_prot for IPV6 Greg Kroah-Hartman
                   ` (127 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:48 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Ilya Lesokhin, David S. Miller,
	Ben Hutchings, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

commit ee181e5201e640a4b92b217e9eab2531dab57d2c upstream.

If we fail to enable tls in the kernel we shouldn't override
the sk_write_space callback

Fixes: 3c4d7559159b ('tls: kernel TLS support')
Signed-off-by: Ilya Lesokhin <ilyal@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 net/tls/tls_main.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/net/tls/tls_main.c b/net/tls/tls_main.c
index f88df514ad5f..33187e34599b 100644
--- a/net/tls/tls_main.c
+++ b/net/tls/tls_main.c
@@ -429,9 +429,6 @@ static int do_tls_setsockopt_tx(struct sock *sk, char __user *optval,
 		goto err_crypto_info;
 	}
 
-	ctx->sk_write_space = sk->sk_write_space;
-	sk->sk_write_space = tls_write_space;
-
 	/* currently SW is default, we will have ethtool in future */
 	rc = tls_set_sw_offload(sk, ctx);
 	tx_conf = TLS_SW_TX;
@@ -440,6 +437,8 @@ static int do_tls_setsockopt_tx(struct sock *sk, char __user *optval,
 
 	ctx->tx_conf = tx_conf;
 	update_sk_prot(sk, ctx);
+	ctx->sk_write_space = sk->sk_write_space;
+	sk->sk_write_space = tls_write_space;
 	goto out;
 
 err_crypto_info:
-- 
2.17.1




^ permalink raw reply related	[flat|nested] 166+ messages in thread

* [PATCH 4.14 024/146] tls: Use correct sk->sk_prot for IPV6
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (22 preceding siblings ...)
  2018-12-04 10:48 ` [PATCH 4.14 023/146] tls: dont override sk_write_space if tls_set_sw_offload fails Greg Kroah-Hartman
@ 2018-12-04 10:48 ` Greg Kroah-Hartman
  2018-12-04 10:48 ` [PATCH 4.14 025/146] net/tls: Fixed return value when tls_complete_pending_work() fails Greg Kroah-Hartman
                   ` (126 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:48 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Boris Pismenny, Ilya Lesokhin,
	David S. Miller, Ben Hutchings, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

commit c113187d38ff85dc302a1bb55864b203ebb2ba10 upstream.

The tls ulp overrides sk->prot with a new tls specific proto structs.
The tls specific structs were previously based on the ipv4 specific
tcp_prot sturct.
As a result, attaching the tls ulp to an ipv6 tcp socket replaced
some ipv6 callback with the ipv4 equivalents.

This patch adds ipv6 tls proto structs and uses them when
attached to ipv6 sockets.

Fixes: 3c4d7559159b ('tls: kernel TLS support')
Signed-off-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: Ilya Lesokhin <ilyal@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 net/tls/tls_main.c | 52 +++++++++++++++++++++++++++++++++-------------
 1 file changed, 37 insertions(+), 15 deletions(-)

diff --git a/net/tls/tls_main.c b/net/tls/tls_main.c
index 33187e34599b..e903bdd39b9f 100644
--- a/net/tls/tls_main.c
+++ b/net/tls/tls_main.c
@@ -46,17 +46,27 @@ MODULE_DESCRIPTION("Transport Layer Security Support");
 MODULE_LICENSE("Dual BSD/GPL");
 MODULE_ALIAS_TCP_ULP("tls");
 
+enum {
+	TLSV4,
+	TLSV6,
+	TLS_NUM_PROTS,
+};
+
 enum {
 	TLS_BASE_TX,
 	TLS_SW_TX,
 	TLS_NUM_CONFIG,
 };
 
-static struct proto tls_prots[TLS_NUM_CONFIG];
+static struct proto *saved_tcpv6_prot;
+static DEFINE_MUTEX(tcpv6_prot_mutex);
+static struct proto tls_prots[TLS_NUM_PROTS][TLS_NUM_CONFIG];
 
 static inline void update_sk_prot(struct sock *sk, struct tls_context *ctx)
 {
-	sk->sk_prot = &tls_prots[ctx->tx_conf];
+	int ip_ver = sk->sk_family == AF_INET6 ? TLSV6 : TLSV4;
+
+	sk->sk_prot = &tls_prots[ip_ver][ctx->tx_conf];
 }
 
 int wait_on_pending_writer(struct sock *sk, long *timeo)
@@ -476,8 +486,21 @@ static int tls_setsockopt(struct sock *sk, int level, int optname,
 	return do_tls_setsockopt(sk, optname, optval, optlen);
 }
 
+static void build_protos(struct proto *prot, struct proto *base)
+{
+	prot[TLS_BASE_TX] = *base;
+	prot[TLS_BASE_TX].setsockopt	= tls_setsockopt;
+	prot[TLS_BASE_TX].getsockopt	= tls_getsockopt;
+	prot[TLS_BASE_TX].close		= tls_sk_proto_close;
+
+	prot[TLS_SW_TX] = prot[TLS_BASE_TX];
+	prot[TLS_SW_TX].sendmsg		= tls_sw_sendmsg;
+	prot[TLS_SW_TX].sendpage	= tls_sw_sendpage;
+}
+
 static int tls_init(struct sock *sk)
 {
+	int ip_ver = sk->sk_family == AF_INET6 ? TLSV6 : TLSV4;
 	struct inet_connection_sock *icsk = inet_csk(sk);
 	struct tls_context *ctx;
 	int rc = 0;
@@ -502,6 +525,17 @@ static int tls_init(struct sock *sk)
 	ctx->getsockopt = sk->sk_prot->getsockopt;
 	ctx->sk_proto_close = sk->sk_prot->close;
 
+	/* Build IPv6 TLS whenever the address of tcpv6_prot changes */
+	if (ip_ver == TLSV6 &&
+	    unlikely(sk->sk_prot != smp_load_acquire(&saved_tcpv6_prot))) {
+		mutex_lock(&tcpv6_prot_mutex);
+		if (likely(sk->sk_prot != saved_tcpv6_prot)) {
+			build_protos(tls_prots[TLSV6], sk->sk_prot);
+			smp_store_release(&saved_tcpv6_prot, sk->sk_prot);
+		}
+		mutex_unlock(&tcpv6_prot_mutex);
+	}
+
 	ctx->tx_conf = TLS_BASE_TX;
 	update_sk_prot(sk, ctx);
 out:
@@ -514,21 +548,9 @@ static struct tcp_ulp_ops tcp_tls_ulp_ops __read_mostly = {
 	.init			= tls_init,
 };
 
-static void build_protos(struct proto *prot, struct proto *base)
-{
-	prot[TLS_BASE_TX] = *base;
-	prot[TLS_BASE_TX].setsockopt	= tls_setsockopt;
-	prot[TLS_BASE_TX].getsockopt	= tls_getsockopt;
-	prot[TLS_BASE_TX].close		= tls_sk_proto_close;
-
-	prot[TLS_SW_TX] = prot[TLS_BASE_TX];
-	prot[TLS_SW_TX].sendmsg		= tls_sw_sendmsg;
-	prot[TLS_SW_TX].sendpage	= tls_sw_sendpage;
-}
-
 static int __init tls_register(void)
 {
-	build_protos(tls_prots, &tcp_prot);
+	build_protos(tls_prots[TLSV4], &tcp_prot);
 
 	tcp_register_ulp(&tcp_tls_ulp_ops);
 
-- 
2.17.1




^ permalink raw reply related	[flat|nested] 166+ messages in thread

* [PATCH 4.14 025/146] net/tls: Fixed return value when tls_complete_pending_work() fails
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (23 preceding siblings ...)
  2018-12-04 10:48 ` [PATCH 4.14 024/146] tls: Use correct sk->sk_prot for IPV6 Greg Kroah-Hartman
@ 2018-12-04 10:48 ` Greg Kroah-Hartman
  2018-12-04 10:48 ` [PATCH 4.14 026/146] wil6210: missing length check in wmi_set_ie Greg Kroah-Hartman
                   ` (125 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:48 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Vakul Garg, David S. Miller,
	Ben Hutchings, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

commit 150085791afb8054e11d2e080d4b9cd755dd7f69 upstream.

In tls_sw_sendmsg() and tls_sw_sendpage(), the variable 'ret' has
been set to return value of tls_complete_pending_work(). This allows
return of proper error code if tls_complete_pending_work() fails.

Fixes: 3c4d7559159b ("tls: kernel TLS support")
Signed-off-by: Vakul Garg <vakul.garg@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
[bwh: Backported to 4.14: adjust context]
Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 net/tls/tls_sw.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
index 5996fb5756e1..d18d4a478e4f 100644
--- a/net/tls/tls_sw.c
+++ b/net/tls/tls_sw.c
@@ -388,7 +388,7 @@ int tls_sw_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
 {
 	struct tls_context *tls_ctx = tls_get_ctx(sk);
 	struct tls_sw_context *ctx = tls_sw_ctx(tls_ctx);
-	int ret = 0;
+	int ret;
 	int required_size;
 	long timeo = sock_sndtimeo(sk, msg->msg_flags & MSG_DONTWAIT);
 	bool eor = !(msg->msg_flags & MSG_MORE);
@@ -403,7 +403,8 @@ int tls_sw_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
 
 	lock_sock(sk);
 
-	if (tls_complete_pending_work(sk, tls_ctx, msg->msg_flags, &timeo))
+	ret = tls_complete_pending_work(sk, tls_ctx, msg->msg_flags, &timeo);
+	if (ret)
 		goto send_end;
 
 	if (unlikely(msg->msg_controllen)) {
@@ -539,7 +540,7 @@ int tls_sw_sendpage(struct sock *sk, struct page *page,
 {
 	struct tls_context *tls_ctx = tls_get_ctx(sk);
 	struct tls_sw_context *ctx = tls_sw_ctx(tls_ctx);
-	int ret = 0;
+	int ret;
 	long timeo = sock_sndtimeo(sk, flags & MSG_DONTWAIT);
 	bool eor;
 	size_t orig_size = size;
@@ -559,7 +560,8 @@ int tls_sw_sendpage(struct sock *sk, struct page *page,
 
 	sk_clear_bit(SOCKWQ_ASYNC_NOSPACE, sk);
 
-	if (tls_complete_pending_work(sk, tls_ctx, flags, &timeo))
+	ret = tls_complete_pending_work(sk, tls_ctx, flags, &timeo);
+	if (ret)
 		goto sendpage_end;
 
 	/* Call the sk_stream functions to manage the sndbuf mem. */
-- 
2.17.1




^ permalink raw reply related	[flat|nested] 166+ messages in thread

* [PATCH 4.14 026/146] wil6210: missing length check in wmi_set_ie
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (24 preceding siblings ...)
  2018-12-04 10:48 ` [PATCH 4.14 025/146] net/tls: Fixed return value when tls_complete_pending_work() fails Greg Kroah-Hartman
@ 2018-12-04 10:48 ` Greg Kroah-Hartman
  2018-12-04 10:48 ` [PATCH 4.14 027/146] btrfs: validate type when reading a chunk Greg Kroah-Hartman
                   ` (124 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:48 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Lior David, Maya Erez, Kalle Valo,
	Ben Hutchings, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

commit b5a8ffcae4103a9d823ea3aa3a761f65779fbe2a upstream.

Add a length check in wmi_set_ie to detect unsigned integer
overflow.

Signed-off-by: Lior David <qca_liord@qca.qualcomm.com>
Signed-off-by: Maya Erez <qca_merez@qca.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/wireless/ath/wil6210/wmi.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/net/wireless/ath/wil6210/wmi.c b/drivers/net/wireless/ath/wil6210/wmi.c
index ffdd2fa401b1..d63d7c326801 100644
--- a/drivers/net/wireless/ath/wil6210/wmi.c
+++ b/drivers/net/wireless/ath/wil6210/wmi.c
@@ -1380,8 +1380,14 @@ int wmi_set_ie(struct wil6210_priv *wil, u8 type, u16 ie_len, const void *ie)
 	};
 	int rc;
 	u16 len = sizeof(struct wmi_set_appie_cmd) + ie_len;
-	struct wmi_set_appie_cmd *cmd = kzalloc(len, GFP_KERNEL);
+	struct wmi_set_appie_cmd *cmd;
 
+	if (len < ie_len) {
+		rc = -EINVAL;
+		goto out;
+	}
+
+	cmd = kzalloc(len, GFP_KERNEL);
 	if (!cmd) {
 		rc = -ENOMEM;
 		goto out;
-- 
2.17.1




^ permalink raw reply related	[flat|nested] 166+ messages in thread

* [PATCH 4.14 027/146] btrfs: validate type when reading a chunk
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (25 preceding siblings ...)
  2018-12-04 10:48 ` [PATCH 4.14 026/146] wil6210: missing length check in wmi_set_ie Greg Kroah-Hartman
@ 2018-12-04 10:48 ` Greg Kroah-Hartman
  2018-12-04 10:48 ` [PATCH 4.14 028/146] btrfs: Verify that every chunk has corresponding block group at mount time Greg Kroah-Hartman
                   ` (123 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:48 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Xu Wen, Qu Wenruo, Gu Jinxiang,
	David Sterba, Ben Hutchings, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

commit 315409b0098fb2651d86553f0436b70502b29bb2 upstream.

Reported in https://bugzilla.kernel.org/show_bug.cgi?id=199839, with an
image that has an invalid chunk type but does not return an error.

Add chunk type check in btrfs_check_chunk_valid, to detect the wrong
type combinations.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=199839
Reported-by: Xu Wen <wen.xu@gatech.edu>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Gu Jinxiang <gujx@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 fs/btrfs/volumes.c | 28 ++++++++++++++++++++++++++++
 1 file changed, 28 insertions(+)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index a0947f4a3e87..cfd5728e7519 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -6353,6 +6353,8 @@ static int btrfs_check_chunk_valid(struct btrfs_fs_info *fs_info,
 	u16 num_stripes;
 	u16 sub_stripes;
 	u64 type;
+	u64 features;
+	bool mixed = false;
 
 	length = btrfs_chunk_length(leaf, chunk);
 	stripe_len = btrfs_chunk_stripe_len(leaf, chunk);
@@ -6391,6 +6393,32 @@ static int btrfs_check_chunk_valid(struct btrfs_fs_info *fs_info,
 			  btrfs_chunk_type(leaf, chunk));
 		return -EIO;
 	}
+
+	if ((type & BTRFS_BLOCK_GROUP_TYPE_MASK) == 0) {
+		btrfs_err(fs_info, "missing chunk type flag: 0x%llx", type);
+		return -EIO;
+	}
+
+	if ((type & BTRFS_BLOCK_GROUP_SYSTEM) &&
+	    (type & (BTRFS_BLOCK_GROUP_METADATA | BTRFS_BLOCK_GROUP_DATA))) {
+		btrfs_err(fs_info,
+			"system chunk with data or metadata type: 0x%llx", type);
+		return -EIO;
+	}
+
+	features = btrfs_super_incompat_flags(fs_info->super_copy);
+	if (features & BTRFS_FEATURE_INCOMPAT_MIXED_GROUPS)
+		mixed = true;
+
+	if (!mixed) {
+		if ((type & BTRFS_BLOCK_GROUP_METADATA) &&
+		    (type & BTRFS_BLOCK_GROUP_DATA)) {
+			btrfs_err(fs_info,
+			"mixed chunk type in non-mixed mode: 0x%llx", type);
+			return -EIO;
+		}
+	}
+
 	if ((type & BTRFS_BLOCK_GROUP_RAID10 && sub_stripes != 2) ||
 	    (type & BTRFS_BLOCK_GROUP_RAID1 && num_stripes < 1) ||
 	    (type & BTRFS_BLOCK_GROUP_RAID5 && num_stripes < 2) ||
-- 
2.17.1




^ permalink raw reply related	[flat|nested] 166+ messages in thread

* [PATCH 4.14 028/146] btrfs: Verify that every chunk has corresponding block group at mount time
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (26 preceding siblings ...)
  2018-12-04 10:48 ` [PATCH 4.14 027/146] btrfs: validate type when reading a chunk Greg Kroah-Hartman
@ 2018-12-04 10:48 ` Greg Kroah-Hartman
  2018-12-04 10:48 ` [PATCH 4.14 029/146] btrfs: Refactor check_leaf function for later expansion Greg Kroah-Hartman
                   ` (122 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:48 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Xu Wen, Qu Wenruo, Gu Jinxiang,
	David Sterba, Ben Hutchings, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

commit 7ef49515fa6727cb4b6f2f5b0ffbc5fc20a9f8c6 upstream.

If a crafted image has missing block group items, it could cause
unexpected behavior and breaks the assumption of 1:1 chunk<->block group
mapping.

Although we have the block group -> chunk mapping check, we still need
chunk -> block group mapping check.

This patch will do extra check to ensure each chunk has its
corresponding block group.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=199847
Reported-by: Xu Wen <wen.xu@gatech.edu>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Gu Jinxiang <gujx@cn.fujitsu.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 fs/btrfs/extent-tree.c | 58 +++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 57 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 2cb3569ac548..fdc42eddccc2 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -10092,6 +10092,62 @@ btrfs_create_block_group_cache(struct btrfs_fs_info *fs_info,
 	return cache;
 }
 
+
+/*
+ * Iterate all chunks and verify that each of them has the corresponding block
+ * group
+ */
+static int check_chunk_block_group_mappings(struct btrfs_fs_info *fs_info)
+{
+	struct btrfs_mapping_tree *map_tree = &fs_info->mapping_tree;
+	struct extent_map *em;
+	struct btrfs_block_group_cache *bg;
+	u64 start = 0;
+	int ret = 0;
+
+	while (1) {
+		read_lock(&map_tree->map_tree.lock);
+		/*
+		 * lookup_extent_mapping will return the first extent map
+		 * intersecting the range, so setting @len to 1 is enough to
+		 * get the first chunk.
+		 */
+		em = lookup_extent_mapping(&map_tree->map_tree, start, 1);
+		read_unlock(&map_tree->map_tree.lock);
+		if (!em)
+			break;
+
+		bg = btrfs_lookup_block_group(fs_info, em->start);
+		if (!bg) {
+			btrfs_err(fs_info,
+	"chunk start=%llu len=%llu doesn't have corresponding block group",
+				     em->start, em->len);
+			ret = -EUCLEAN;
+			free_extent_map(em);
+			break;
+		}
+		if (bg->key.objectid != em->start ||
+		    bg->key.offset != em->len ||
+		    (bg->flags & BTRFS_BLOCK_GROUP_TYPE_MASK) !=
+		    (em->map_lookup->type & BTRFS_BLOCK_GROUP_TYPE_MASK)) {
+			btrfs_err(fs_info,
+"chunk start=%llu len=%llu flags=0x%llx doesn't match block group start=%llu len=%llu flags=0x%llx",
+				em->start, em->len,
+				em->map_lookup->type & BTRFS_BLOCK_GROUP_TYPE_MASK,
+				bg->key.objectid, bg->key.offset,
+				bg->flags & BTRFS_BLOCK_GROUP_TYPE_MASK);
+			ret = -EUCLEAN;
+			free_extent_map(em);
+			btrfs_put_block_group(bg);
+			break;
+		}
+		start = em->start + em->len;
+		free_extent_map(em);
+		btrfs_put_block_group(bg);
+	}
+	return ret;
+}
+
 int btrfs_read_block_groups(struct btrfs_fs_info *info)
 {
 	struct btrfs_path *path;
@@ -10264,7 +10320,7 @@ int btrfs_read_block_groups(struct btrfs_fs_info *info)
 	}
 
 	init_global_block_rsv(info);
-	ret = 0;
+	ret = check_chunk_block_group_mappings(info);
 error:
 	btrfs_free_path(path);
 	return ret;
-- 
2.17.1




^ permalink raw reply related	[flat|nested] 166+ messages in thread

* [PATCH 4.14 029/146] btrfs: Refactor check_leaf function for later expansion
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (27 preceding siblings ...)
  2018-12-04 10:48 ` [PATCH 4.14 028/146] btrfs: Verify that every chunk has corresponding block group at mount time Greg Kroah-Hartman
@ 2018-12-04 10:48 ` Greg Kroah-Hartman
  2018-12-04 10:48 ` [PATCH 4.14 030/146] btrfs: Check if item pointer overlaps with the item itself Greg Kroah-Hartman
                   ` (121 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:48 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Qu Wenruo, Nikolay Borisov,
	David Sterba, Ben Hutchings, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

commit c3267bbaa9cae09b62960eafe33ad19196803285 upstream.

Current check_leaf() function does a good job checking key order and
item offset/size.

However it only checks from slot 0 to the last but one slot, this is
good but makes later expansion hard.

So this refactoring iterates from slot 0 to the last slot.
For key comparison, it uses a key with all 0 as initial key, so all
valid keys should be larger than that.

And for item size/offset checks, it compares current item end with
previous item offset.
For slot 0, use leaf end as a special case.

This makes later item/key offset checks and item size checks easier to
be implemented.

Also, makes check_leaf() to return -EUCLEAN other than -EIO to indicate
error.

Signed-off-by: Qu Wenruo <quwenruo.btrfs@gmx.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 fs/btrfs/disk-io.c | 50 +++++++++++++++++++++++++---------------------
 1 file changed, 27 insertions(+), 23 deletions(-)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 0e67cee73c53..4a1e63df1183 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -554,8 +554,9 @@ static noinline int check_leaf(struct btrfs_root *root,
 			       struct extent_buffer *leaf)
 {
 	struct btrfs_fs_info *fs_info = root->fs_info;
+	/* No valid key type is 0, so all key should be larger than this key */
+	struct btrfs_key prev_key = {0, 0, 0};
 	struct btrfs_key key;
-	struct btrfs_key leaf_key;
 	u32 nritems = btrfs_header_nritems(leaf);
 	int slot;
 
@@ -588,7 +589,7 @@ static noinline int check_leaf(struct btrfs_root *root,
 				CORRUPT("non-root leaf's nritems is 0",
 					leaf, check_root, 0);
 				free_extent_buffer(eb);
-				return -EIO;
+				return -EUCLEAN;
 			}
 			free_extent_buffer(eb);
 		}
@@ -598,28 +599,23 @@ static noinline int check_leaf(struct btrfs_root *root,
 	if (nritems == 0)
 		return 0;
 
-	/* Check the 0 item */
-	if (btrfs_item_offset_nr(leaf, 0) + btrfs_item_size_nr(leaf, 0) !=
-	    BTRFS_LEAF_DATA_SIZE(fs_info)) {
-		CORRUPT("invalid item offset size pair", leaf, root, 0);
-		return -EIO;
-	}
-
 	/*
-	 * Check to make sure each items keys are in the correct order and their
-	 * offsets make sense.  We only have to loop through nritems-1 because
-	 * we check the current slot against the next slot, which verifies the
-	 * next slot's offset+size makes sense and that the current's slot
-	 * offset is correct.
+	 * Check the following things to make sure this is a good leaf, and
+	 * leaf users won't need to bother with similar sanity checks:
+	 *
+	 * 1) key order
+	 * 2) item offset and size
+	 *    No overlap, no hole, all inside the leaf.
 	 */
-	for (slot = 0; slot < nritems - 1; slot++) {
-		btrfs_item_key_to_cpu(leaf, &leaf_key, slot);
-		btrfs_item_key_to_cpu(leaf, &key, slot + 1);
+	for (slot = 0; slot < nritems; slot++) {
+		u32 item_end_expected;
+
+		btrfs_item_key_to_cpu(leaf, &key, slot);
 
 		/* Make sure the keys are in the right order */
-		if (btrfs_comp_cpu_keys(&leaf_key, &key) >= 0) {
+		if (btrfs_comp_cpu_keys(&prev_key, &key) >= 0) {
 			CORRUPT("bad key order", leaf, root, slot);
-			return -EIO;
+			return -EUCLEAN;
 		}
 
 		/*
@@ -627,10 +623,14 @@ static noinline int check_leaf(struct btrfs_root *root,
 		 * item data starts at the end of the leaf and grows towards the
 		 * front.
 		 */
-		if (btrfs_item_offset_nr(leaf, slot) !=
-			btrfs_item_end_nr(leaf, slot + 1)) {
+		if (slot == 0)
+			item_end_expected = BTRFS_LEAF_DATA_SIZE(fs_info);
+		else
+			item_end_expected = btrfs_item_offset_nr(leaf,
+								 slot - 1);
+		if (btrfs_item_end_nr(leaf, slot) != item_end_expected) {
 			CORRUPT("slot offset bad", leaf, root, slot);
-			return -EIO;
+			return -EUCLEAN;
 		}
 
 		/*
@@ -641,8 +641,12 @@ static noinline int check_leaf(struct btrfs_root *root,
 		if (btrfs_item_end_nr(leaf, slot) >
 		    BTRFS_LEAF_DATA_SIZE(fs_info)) {
 			CORRUPT("slot end outside of leaf", leaf, root, slot);
-			return -EIO;
+			return -EUCLEAN;
 		}
+
+		prev_key.objectid = key.objectid;
+		prev_key.type = key.type;
+		prev_key.offset = key.offset;
 	}
 
 	return 0;
-- 
2.17.1




^ permalink raw reply related	[flat|nested] 166+ messages in thread

* [PATCH 4.14 030/146] btrfs: Check if item pointer overlaps with the item itself
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (28 preceding siblings ...)
  2018-12-04 10:48 ` [PATCH 4.14 029/146] btrfs: Refactor check_leaf function for later expansion Greg Kroah-Hartman
@ 2018-12-04 10:48 ` Greg Kroah-Hartman
  2018-12-04 10:48 ` [PATCH 4.14 031/146] btrfs: Add sanity check for EXTENT_DATA when reading out leaf Greg Kroah-Hartman
                   ` (120 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:48 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Qu Wenruo, Nikolay Borisov,
	David Sterba, Ben Hutchings, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

commit 7f43d4affb2a254d421ab20b0cf65ac2569909fb upstream.

Function check_leaf() checks if any item pointer points outside of the
leaf, but it doesn't check if the pointer overlaps with the item itself.

Normally only the last item may be the victim, but adding such check is
never a bad idea anyway.

Signed-off-by: Qu Wenruo <quwenruo.btrfs@gmx.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 fs/btrfs/disk-io.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 4a1e63df1183..da7b2039e4cb 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -644,6 +644,13 @@ static noinline int check_leaf(struct btrfs_root *root,
 			return -EUCLEAN;
 		}
 
+		/* Also check if the item pointer overlaps with btrfs item. */
+		if (btrfs_item_nr_offset(slot) + sizeof(struct btrfs_item) >
+		    btrfs_item_ptr_offset(leaf, slot)) {
+			CORRUPT("slot overlap with its data", leaf, root, slot);
+			return -EUCLEAN;
+		}
+
 		prev_key.objectid = key.objectid;
 		prev_key.type = key.type;
 		prev_key.offset = key.offset;
-- 
2.17.1




^ permalink raw reply related	[flat|nested] 166+ messages in thread

* [PATCH 4.14 031/146] btrfs: Add sanity check for EXTENT_DATA when reading out leaf
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (29 preceding siblings ...)
  2018-12-04 10:48 ` [PATCH 4.14 030/146] btrfs: Check if item pointer overlaps with the item itself Greg Kroah-Hartman
@ 2018-12-04 10:48 ` Greg Kroah-Hartman
  2018-12-04 10:48 ` [PATCH 4.14 032/146] btrfs: Add checker for EXTENT_CSUM Greg Kroah-Hartman
                   ` (119 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:48 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Qu Wenruo, Nikolay Borisov,
	David Sterba, Ben Hutchings, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

commit 40c3c40947324d9f40bf47830c92c59a9bbadf4a upstream.

Add extra checks for item with EXTENT_DATA type.  This checks the
following thing:

0) Key offset
   All key offsets must be aligned to sectorsize.
   Inline extent must have 0 for key offset.

1) Item size
   Uncompressed inline file extent size must match item size.
   (Compressed inline file extent has no information about its on-disk size.)
   Regular/preallocated file extent size must be a fixed value.

2) Every member of regular file extent item
   Including alignment for bytenr and offset, possible value for
   compression/encryption/type.

3) Type/compression/encode must be one of the valid values.

This should be the most comprehensive and strict check in the context
of btrfs_item for EXTENT_DATA.

Signed-off-by: Qu Wenruo <quwenruo.btrfs@gmx.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
[ switch to BTRFS_FILE_EXTENT_TYPES, similar to what
  BTRFS_COMPRESS_TYPES does ]
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 fs/btrfs/disk-io.c              | 103 ++++++++++++++++++++++++++++++++
 include/uapi/linux/btrfs_tree.h |   1 +
 2 files changed, 104 insertions(+)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index da7b2039e4cb..ab8925b2efd1 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -550,6 +550,100 @@ static int check_tree_block_fsid(struct btrfs_fs_info *fs_info,
 		   btrfs_header_level(eb) == 0 ? "leaf" : "node",	\
 		   reason, btrfs_header_bytenr(eb), root->objectid, slot)
 
+static int check_extent_data_item(struct btrfs_root *root,
+				  struct extent_buffer *leaf,
+				  struct btrfs_key *key, int slot)
+{
+	struct btrfs_file_extent_item *fi;
+	u32 sectorsize = root->fs_info->sectorsize;
+	u32 item_size = btrfs_item_size_nr(leaf, slot);
+
+	if (!IS_ALIGNED(key->offset, sectorsize)) {
+		CORRUPT("unaligned key offset for file extent",
+			leaf, root, slot);
+		return -EUCLEAN;
+	}
+
+	fi = btrfs_item_ptr(leaf, slot, struct btrfs_file_extent_item);
+
+	if (btrfs_file_extent_type(leaf, fi) > BTRFS_FILE_EXTENT_TYPES) {
+		CORRUPT("invalid file extent type", leaf, root, slot);
+		return -EUCLEAN;
+	}
+
+	/*
+	 * Support for new compression/encrption must introduce incompat flag,
+	 * and must be caught in open_ctree().
+	 */
+	if (btrfs_file_extent_compression(leaf, fi) > BTRFS_COMPRESS_TYPES) {
+		CORRUPT("invalid file extent compression", leaf, root, slot);
+		return -EUCLEAN;
+	}
+	if (btrfs_file_extent_encryption(leaf, fi)) {
+		CORRUPT("invalid file extent encryption", leaf, root, slot);
+		return -EUCLEAN;
+	}
+	if (btrfs_file_extent_type(leaf, fi) == BTRFS_FILE_EXTENT_INLINE) {
+		/* Inline extent must have 0 as key offset */
+		if (key->offset) {
+			CORRUPT("inline extent has non-zero key offset",
+				leaf, root, slot);
+			return -EUCLEAN;
+		}
+
+		/* Compressed inline extent has no on-disk size, skip it */
+		if (btrfs_file_extent_compression(leaf, fi) !=
+		    BTRFS_COMPRESS_NONE)
+			return 0;
+
+		/* Uncompressed inline extent size must match item size */
+		if (item_size != BTRFS_FILE_EXTENT_INLINE_DATA_START +
+		    btrfs_file_extent_ram_bytes(leaf, fi)) {
+			CORRUPT("plaintext inline extent has invalid size",
+				leaf, root, slot);
+			return -EUCLEAN;
+		}
+		return 0;
+	}
+
+	/* Regular or preallocated extent has fixed item size */
+	if (item_size != sizeof(*fi)) {
+		CORRUPT(
+		"regluar or preallocated extent data item size is invalid",
+			leaf, root, slot);
+		return -EUCLEAN;
+	}
+	if (!IS_ALIGNED(btrfs_file_extent_ram_bytes(leaf, fi), sectorsize) ||
+	    !IS_ALIGNED(btrfs_file_extent_disk_bytenr(leaf, fi), sectorsize) ||
+	    !IS_ALIGNED(btrfs_file_extent_disk_num_bytes(leaf, fi), sectorsize) ||
+	    !IS_ALIGNED(btrfs_file_extent_offset(leaf, fi), sectorsize) ||
+	    !IS_ALIGNED(btrfs_file_extent_num_bytes(leaf, fi), sectorsize)) {
+		CORRUPT(
+		"regular or preallocated extent data item has unaligned value",
+			leaf, root, slot);
+		return -EUCLEAN;
+	}
+
+	return 0;
+}
+
+/*
+ * Common point to switch the item-specific validation.
+ */
+static int check_leaf_item(struct btrfs_root *root,
+			   struct extent_buffer *leaf,
+			   struct btrfs_key *key, int slot)
+{
+	int ret = 0;
+
+	switch (key->type) {
+	case BTRFS_EXTENT_DATA_KEY:
+		ret = check_extent_data_item(root, leaf, key, slot);
+		break;
+	}
+	return ret;
+}
+
 static noinline int check_leaf(struct btrfs_root *root,
 			       struct extent_buffer *leaf)
 {
@@ -606,9 +700,13 @@ static noinline int check_leaf(struct btrfs_root *root,
 	 * 1) key order
 	 * 2) item offset and size
 	 *    No overlap, no hole, all inside the leaf.
+	 * 3) item content
+	 *    If possible, do comprehensive sanity check.
+	 *    NOTE: All checks must only rely on the item data itself.
 	 */
 	for (slot = 0; slot < nritems; slot++) {
 		u32 item_end_expected;
+		int ret;
 
 		btrfs_item_key_to_cpu(leaf, &key, slot);
 
@@ -651,6 +749,11 @@ static noinline int check_leaf(struct btrfs_root *root,
 			return -EUCLEAN;
 		}
 
+		/* Check if the item size and content meet other criteria */
+		ret = check_leaf_item(root, leaf, &key, slot);
+		if (ret < 0)
+			return ret;
+
 		prev_key.objectid = key.objectid;
 		prev_key.type = key.type;
 		prev_key.offset = key.offset;
diff --git a/include/uapi/linux/btrfs_tree.h b/include/uapi/linux/btrfs_tree.h
index 7115838fbf2a..38ab0e06259a 100644
--- a/include/uapi/linux/btrfs_tree.h
+++ b/include/uapi/linux/btrfs_tree.h
@@ -734,6 +734,7 @@ struct btrfs_balance_item {
 #define BTRFS_FILE_EXTENT_INLINE 0
 #define BTRFS_FILE_EXTENT_REG 1
 #define BTRFS_FILE_EXTENT_PREALLOC 2
+#define BTRFS_FILE_EXTENT_TYPES	2
 
 struct btrfs_file_extent_item {
 	/*
-- 
2.17.1




^ permalink raw reply related	[flat|nested] 166+ messages in thread

* [PATCH 4.14 032/146] btrfs: Add checker for EXTENT_CSUM
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (30 preceding siblings ...)
  2018-12-04 10:48 ` [PATCH 4.14 031/146] btrfs: Add sanity check for EXTENT_DATA when reading out leaf Greg Kroah-Hartman
@ 2018-12-04 10:48 ` Greg Kroah-Hartman
  2018-12-04 10:48 ` [PATCH 4.14 033/146] btrfs: Move leaf and node validation checker to tree-checker.c Greg Kroah-Hartman
                   ` (118 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:48 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Qu Wenruo, Nikolay Borisov,
	David Sterba, Ben Hutchings, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

commit 4b865cab96fe2a30ed512cf667b354bd291b3b0a upstream.

EXTENT_CSUM checker is a relatively easy one, only needs to check:

1) Objectid
   Fixed to BTRFS_EXTENT_CSUM_OBJECTID

2) Key offset alignment
   Must be aligned to sectorsize

3) Item size alignedment
   Must be aligned to csum size

Signed-off-by: Qu Wenruo <quwenruo.btrfs@gmx.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 fs/btrfs/disk-io.c | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index ab8925b2efd1..53841d773a40 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -627,6 +627,27 @@ static int check_extent_data_item(struct btrfs_root *root,
 	return 0;
 }
 
+static int check_csum_item(struct btrfs_root *root, struct extent_buffer *leaf,
+			   struct btrfs_key *key, int slot)
+{
+	u32 sectorsize = root->fs_info->sectorsize;
+	u32 csumsize = btrfs_super_csum_size(root->fs_info->super_copy);
+
+	if (key->objectid != BTRFS_EXTENT_CSUM_OBJECTID) {
+		CORRUPT("invalid objectid for csum item", leaf, root, slot);
+		return -EUCLEAN;
+	}
+	if (!IS_ALIGNED(key->offset, sectorsize)) {
+		CORRUPT("unaligned key offset for csum item", leaf, root, slot);
+		return -EUCLEAN;
+	}
+	if (!IS_ALIGNED(btrfs_item_size_nr(leaf, slot), csumsize)) {
+		CORRUPT("unaligned csum item size", leaf, root, slot);
+		return -EUCLEAN;
+	}
+	return 0;
+}
+
 /*
  * Common point to switch the item-specific validation.
  */
@@ -640,6 +661,9 @@ static int check_leaf_item(struct btrfs_root *root,
 	case BTRFS_EXTENT_DATA_KEY:
 		ret = check_extent_data_item(root, leaf, key, slot);
 		break;
+	case BTRFS_EXTENT_CSUM_KEY:
+		ret = check_csum_item(root, leaf, key, slot);
+		break;
 	}
 	return ret;
 }
-- 
2.17.1




^ permalink raw reply related	[flat|nested] 166+ messages in thread

* [PATCH 4.14 033/146] btrfs: Move leaf and node validation checker to tree-checker.c
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (31 preceding siblings ...)
  2018-12-04 10:48 ` [PATCH 4.14 032/146] btrfs: Add checker for EXTENT_CSUM Greg Kroah-Hartman
@ 2018-12-04 10:48 ` Greg Kroah-Hartman
  2018-12-04 10:48 ` [PATCH 4.14 034/146] btrfs: tree-checker: Enhance btrfs_check_node output Greg Kroah-Hartman
                   ` (117 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:48 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Qu Wenruo, David Sterba,
	Ben Hutchings, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

commit 557ea5dd003d371536f6b4e8f7c8209a2b6fd4e3 upstream.

It's no doubt the comprehensive tree block checker will become larger,
so moving them into their own files is quite reasonable.

Signed-off-by: Qu Wenruo <quwenruo.btrfs@gmx.com>
[ wording adjustments ]
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 fs/btrfs/Makefile       |   2 +-
 fs/btrfs/disk-io.c      | 285 +-----------------------------------
 fs/btrfs/tree-checker.c | 309 ++++++++++++++++++++++++++++++++++++++++
 fs/btrfs/tree-checker.h |  26 ++++
 4 files changed, 340 insertions(+), 282 deletions(-)
 create mode 100644 fs/btrfs/tree-checker.c
 create mode 100644 fs/btrfs/tree-checker.h

diff --git a/fs/btrfs/Makefile b/fs/btrfs/Makefile
index f2cd9dedb037..195229df5ba0 100644
--- a/fs/btrfs/Makefile
+++ b/fs/btrfs/Makefile
@@ -10,7 +10,7 @@ btrfs-y += super.o ctree.o extent-tree.o print-tree.o root-tree.o dir-item.o \
 	   export.o tree-log.o free-space-cache.o zlib.o lzo.o zstd.o \
 	   compression.o delayed-ref.o relocation.o delayed-inode.o scrub.o \
 	   reada.o backref.o ulist.o qgroup.o send.o dev-replace.o raid56.o \
-	   uuid-tree.o props.o hash.o free-space-tree.o
+	   uuid-tree.o props.o hash.o free-space-tree.o tree-checker.o
 
 btrfs-$(CONFIG_BTRFS_FS_POSIX_ACL) += acl.o
 btrfs-$(CONFIG_BTRFS_FS_CHECK_INTEGRITY) += check-integrity.o
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 53841d773a40..2df5e906db08 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -50,6 +50,7 @@
 #include "sysfs.h"
 #include "qgroup.h"
 #include "compression.h"
+#include "tree-checker.h"
 
 #ifdef CONFIG_X86
 #include <asm/cpufeature.h>
@@ -544,284 +545,6 @@ static int check_tree_block_fsid(struct btrfs_fs_info *fs_info,
 	return ret;
 }
 
-#define CORRUPT(reason, eb, root, slot)					\
-	btrfs_crit(root->fs_info,					\
-		   "corrupt %s, %s: block=%llu, root=%llu, slot=%d",	\
-		   btrfs_header_level(eb) == 0 ? "leaf" : "node",	\
-		   reason, btrfs_header_bytenr(eb), root->objectid, slot)
-
-static int check_extent_data_item(struct btrfs_root *root,
-				  struct extent_buffer *leaf,
-				  struct btrfs_key *key, int slot)
-{
-	struct btrfs_file_extent_item *fi;
-	u32 sectorsize = root->fs_info->sectorsize;
-	u32 item_size = btrfs_item_size_nr(leaf, slot);
-
-	if (!IS_ALIGNED(key->offset, sectorsize)) {
-		CORRUPT("unaligned key offset for file extent",
-			leaf, root, slot);
-		return -EUCLEAN;
-	}
-
-	fi = btrfs_item_ptr(leaf, slot, struct btrfs_file_extent_item);
-
-	if (btrfs_file_extent_type(leaf, fi) > BTRFS_FILE_EXTENT_TYPES) {
-		CORRUPT("invalid file extent type", leaf, root, slot);
-		return -EUCLEAN;
-	}
-
-	/*
-	 * Support for new compression/encrption must introduce incompat flag,
-	 * and must be caught in open_ctree().
-	 */
-	if (btrfs_file_extent_compression(leaf, fi) > BTRFS_COMPRESS_TYPES) {
-		CORRUPT("invalid file extent compression", leaf, root, slot);
-		return -EUCLEAN;
-	}
-	if (btrfs_file_extent_encryption(leaf, fi)) {
-		CORRUPT("invalid file extent encryption", leaf, root, slot);
-		return -EUCLEAN;
-	}
-	if (btrfs_file_extent_type(leaf, fi) == BTRFS_FILE_EXTENT_INLINE) {
-		/* Inline extent must have 0 as key offset */
-		if (key->offset) {
-			CORRUPT("inline extent has non-zero key offset",
-				leaf, root, slot);
-			return -EUCLEAN;
-		}
-
-		/* Compressed inline extent has no on-disk size, skip it */
-		if (btrfs_file_extent_compression(leaf, fi) !=
-		    BTRFS_COMPRESS_NONE)
-			return 0;
-
-		/* Uncompressed inline extent size must match item size */
-		if (item_size != BTRFS_FILE_EXTENT_INLINE_DATA_START +
-		    btrfs_file_extent_ram_bytes(leaf, fi)) {
-			CORRUPT("plaintext inline extent has invalid size",
-				leaf, root, slot);
-			return -EUCLEAN;
-		}
-		return 0;
-	}
-
-	/* Regular or preallocated extent has fixed item size */
-	if (item_size != sizeof(*fi)) {
-		CORRUPT(
-		"regluar or preallocated extent data item size is invalid",
-			leaf, root, slot);
-		return -EUCLEAN;
-	}
-	if (!IS_ALIGNED(btrfs_file_extent_ram_bytes(leaf, fi), sectorsize) ||
-	    !IS_ALIGNED(btrfs_file_extent_disk_bytenr(leaf, fi), sectorsize) ||
-	    !IS_ALIGNED(btrfs_file_extent_disk_num_bytes(leaf, fi), sectorsize) ||
-	    !IS_ALIGNED(btrfs_file_extent_offset(leaf, fi), sectorsize) ||
-	    !IS_ALIGNED(btrfs_file_extent_num_bytes(leaf, fi), sectorsize)) {
-		CORRUPT(
-		"regular or preallocated extent data item has unaligned value",
-			leaf, root, slot);
-		return -EUCLEAN;
-	}
-
-	return 0;
-}
-
-static int check_csum_item(struct btrfs_root *root, struct extent_buffer *leaf,
-			   struct btrfs_key *key, int slot)
-{
-	u32 sectorsize = root->fs_info->sectorsize;
-	u32 csumsize = btrfs_super_csum_size(root->fs_info->super_copy);
-
-	if (key->objectid != BTRFS_EXTENT_CSUM_OBJECTID) {
-		CORRUPT("invalid objectid for csum item", leaf, root, slot);
-		return -EUCLEAN;
-	}
-	if (!IS_ALIGNED(key->offset, sectorsize)) {
-		CORRUPT("unaligned key offset for csum item", leaf, root, slot);
-		return -EUCLEAN;
-	}
-	if (!IS_ALIGNED(btrfs_item_size_nr(leaf, slot), csumsize)) {
-		CORRUPT("unaligned csum item size", leaf, root, slot);
-		return -EUCLEAN;
-	}
-	return 0;
-}
-
-/*
- * Common point to switch the item-specific validation.
- */
-static int check_leaf_item(struct btrfs_root *root,
-			   struct extent_buffer *leaf,
-			   struct btrfs_key *key, int slot)
-{
-	int ret = 0;
-
-	switch (key->type) {
-	case BTRFS_EXTENT_DATA_KEY:
-		ret = check_extent_data_item(root, leaf, key, slot);
-		break;
-	case BTRFS_EXTENT_CSUM_KEY:
-		ret = check_csum_item(root, leaf, key, slot);
-		break;
-	}
-	return ret;
-}
-
-static noinline int check_leaf(struct btrfs_root *root,
-			       struct extent_buffer *leaf)
-{
-	struct btrfs_fs_info *fs_info = root->fs_info;
-	/* No valid key type is 0, so all key should be larger than this key */
-	struct btrfs_key prev_key = {0, 0, 0};
-	struct btrfs_key key;
-	u32 nritems = btrfs_header_nritems(leaf);
-	int slot;
-
-	/*
-	 * Extent buffers from a relocation tree have a owner field that
-	 * corresponds to the subvolume tree they are based on. So just from an
-	 * extent buffer alone we can not find out what is the id of the
-	 * corresponding subvolume tree, so we can not figure out if the extent
-	 * buffer corresponds to the root of the relocation tree or not. So skip
-	 * this check for relocation trees.
-	 */
-	if (nritems == 0 && !btrfs_header_flag(leaf, BTRFS_HEADER_FLAG_RELOC)) {
-		struct btrfs_root *check_root;
-
-		key.objectid = btrfs_header_owner(leaf);
-		key.type = BTRFS_ROOT_ITEM_KEY;
-		key.offset = (u64)-1;
-
-		check_root = btrfs_get_fs_root(fs_info, &key, false);
-		/*
-		 * The only reason we also check NULL here is that during
-		 * open_ctree() some roots has not yet been set up.
-		 */
-		if (!IS_ERR_OR_NULL(check_root)) {
-			struct extent_buffer *eb;
-
-			eb = btrfs_root_node(check_root);
-			/* if leaf is the root, then it's fine */
-			if (leaf != eb) {
-				CORRUPT("non-root leaf's nritems is 0",
-					leaf, check_root, 0);
-				free_extent_buffer(eb);
-				return -EUCLEAN;
-			}
-			free_extent_buffer(eb);
-		}
-		return 0;
-	}
-
-	if (nritems == 0)
-		return 0;
-
-	/*
-	 * Check the following things to make sure this is a good leaf, and
-	 * leaf users won't need to bother with similar sanity checks:
-	 *
-	 * 1) key order
-	 * 2) item offset and size
-	 *    No overlap, no hole, all inside the leaf.
-	 * 3) item content
-	 *    If possible, do comprehensive sanity check.
-	 *    NOTE: All checks must only rely on the item data itself.
-	 */
-	for (slot = 0; slot < nritems; slot++) {
-		u32 item_end_expected;
-		int ret;
-
-		btrfs_item_key_to_cpu(leaf, &key, slot);
-
-		/* Make sure the keys are in the right order */
-		if (btrfs_comp_cpu_keys(&prev_key, &key) >= 0) {
-			CORRUPT("bad key order", leaf, root, slot);
-			return -EUCLEAN;
-		}
-
-		/*
-		 * Make sure the offset and ends are right, remember that the
-		 * item data starts at the end of the leaf and grows towards the
-		 * front.
-		 */
-		if (slot == 0)
-			item_end_expected = BTRFS_LEAF_DATA_SIZE(fs_info);
-		else
-			item_end_expected = btrfs_item_offset_nr(leaf,
-								 slot - 1);
-		if (btrfs_item_end_nr(leaf, slot) != item_end_expected) {
-			CORRUPT("slot offset bad", leaf, root, slot);
-			return -EUCLEAN;
-		}
-
-		/*
-		 * Check to make sure that we don't point outside of the leaf,
-		 * just in case all the items are consistent to each other, but
-		 * all point outside of the leaf.
-		 */
-		if (btrfs_item_end_nr(leaf, slot) >
-		    BTRFS_LEAF_DATA_SIZE(fs_info)) {
-			CORRUPT("slot end outside of leaf", leaf, root, slot);
-			return -EUCLEAN;
-		}
-
-		/* Also check if the item pointer overlaps with btrfs item. */
-		if (btrfs_item_nr_offset(slot) + sizeof(struct btrfs_item) >
-		    btrfs_item_ptr_offset(leaf, slot)) {
-			CORRUPT("slot overlap with its data", leaf, root, slot);
-			return -EUCLEAN;
-		}
-
-		/* Check if the item size and content meet other criteria */
-		ret = check_leaf_item(root, leaf, &key, slot);
-		if (ret < 0)
-			return ret;
-
-		prev_key.objectid = key.objectid;
-		prev_key.type = key.type;
-		prev_key.offset = key.offset;
-	}
-
-	return 0;
-}
-
-static int check_node(struct btrfs_root *root, struct extent_buffer *node)
-{
-	unsigned long nr = btrfs_header_nritems(node);
-	struct btrfs_key key, next_key;
-	int slot;
-	u64 bytenr;
-	int ret = 0;
-
-	if (nr == 0 || nr > BTRFS_NODEPTRS_PER_BLOCK(root->fs_info)) {
-		btrfs_crit(root->fs_info,
-			   "corrupt node: block %llu root %llu nritems %lu",
-			   node->start, root->objectid, nr);
-		return -EIO;
-	}
-
-	for (slot = 0; slot < nr - 1; slot++) {
-		bytenr = btrfs_node_blockptr(node, slot);
-		btrfs_node_key_to_cpu(node, &key, slot);
-		btrfs_node_key_to_cpu(node, &next_key, slot + 1);
-
-		if (!bytenr) {
-			CORRUPT("invalid item slot", node, root, slot);
-			ret = -EIO;
-			goto out;
-		}
-
-		if (btrfs_comp_cpu_keys(&key, &next_key) >= 0) {
-			CORRUPT("bad key order", node, root, slot);
-			ret = -EIO;
-			goto out;
-		}
-	}
-out:
-	return ret;
-}
-
 static int btree_readpage_end_io_hook(struct btrfs_io_bio *io_bio,
 				      u64 phy_offset, struct page *page,
 				      u64 start, u64 end, int mirror)
@@ -887,12 +610,12 @@ static int btree_readpage_end_io_hook(struct btrfs_io_bio *io_bio,
 	 * that we don't try and read the other copies of this block, just
 	 * return -EIO.
 	 */
-	if (found_level == 0 && check_leaf(root, eb)) {
+	if (found_level == 0 && btrfs_check_leaf(root, eb)) {
 		set_bit(EXTENT_BUFFER_CORRUPT, &eb->bflags);
 		ret = -EIO;
 	}
 
-	if (found_level > 0 && check_node(root, eb))
+	if (found_level > 0 && btrfs_check_node(root, eb))
 		ret = -EIO;
 
 	if (!ret)
@@ -4147,7 +3870,7 @@ void btrfs_mark_buffer_dirty(struct extent_buffer *buf)
 					 buf->len,
 					 fs_info->dirty_metadata_batch);
 #ifdef CONFIG_BTRFS_FS_CHECK_INTEGRITY
-	if (btrfs_header_level(buf) == 0 && check_leaf(root, buf)) {
+	if (btrfs_header_level(buf) == 0 && btrfs_check_leaf(root, buf)) {
 		btrfs_print_leaf(buf);
 		ASSERT(0);
 	}
diff --git a/fs/btrfs/tree-checker.c b/fs/btrfs/tree-checker.c
new file mode 100644
index 000000000000..56e25a630103
--- /dev/null
+++ b/fs/btrfs/tree-checker.c
@@ -0,0 +1,309 @@
+/*
+ * Copyright (C) Qu Wenruo 2017.  All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public
+ * License v2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public
+ * License along with this program.
+ */
+
+/*
+ * The module is used to catch unexpected/corrupted tree block data.
+ * Such behavior can be caused either by a fuzzed image or bugs.
+ *
+ * The objective is to do leaf/node validation checks when tree block is read
+ * from disk, and check *every* possible member, so other code won't
+ * need to checking them again.
+ *
+ * Due to the potential and unwanted damage, every checker needs to be
+ * carefully reviewed otherwise so it does not prevent mount of valid images.
+ */
+
+#include "ctree.h"
+#include "tree-checker.h"
+#include "disk-io.h"
+#include "compression.h"
+
+#define CORRUPT(reason, eb, root, slot)					\
+	btrfs_crit(root->fs_info,					\
+		   "corrupt %s, %s: block=%llu, root=%llu, slot=%d",	\
+		   btrfs_header_level(eb) == 0 ? "leaf" : "node",	\
+		   reason, btrfs_header_bytenr(eb), root->objectid, slot)
+
+static int check_extent_data_item(struct btrfs_root *root,
+				  struct extent_buffer *leaf,
+				  struct btrfs_key *key, int slot)
+{
+	struct btrfs_file_extent_item *fi;
+	u32 sectorsize = root->fs_info->sectorsize;
+	u32 item_size = btrfs_item_size_nr(leaf, slot);
+
+	if (!IS_ALIGNED(key->offset, sectorsize)) {
+		CORRUPT("unaligned key offset for file extent",
+			leaf, root, slot);
+		return -EUCLEAN;
+	}
+
+	fi = btrfs_item_ptr(leaf, slot, struct btrfs_file_extent_item);
+
+	if (btrfs_file_extent_type(leaf, fi) > BTRFS_FILE_EXTENT_TYPES) {
+		CORRUPT("invalid file extent type", leaf, root, slot);
+		return -EUCLEAN;
+	}
+
+	/*
+	 * Support for new compression/encrption must introduce incompat flag,
+	 * and must be caught in open_ctree().
+	 */
+	if (btrfs_file_extent_compression(leaf, fi) > BTRFS_COMPRESS_TYPES) {
+		CORRUPT("invalid file extent compression", leaf, root, slot);
+		return -EUCLEAN;
+	}
+	if (btrfs_file_extent_encryption(leaf, fi)) {
+		CORRUPT("invalid file extent encryption", leaf, root, slot);
+		return -EUCLEAN;
+	}
+	if (btrfs_file_extent_type(leaf, fi) == BTRFS_FILE_EXTENT_INLINE) {
+		/* Inline extent must have 0 as key offset */
+		if (key->offset) {
+			CORRUPT("inline extent has non-zero key offset",
+				leaf, root, slot);
+			return -EUCLEAN;
+		}
+
+		/* Compressed inline extent has no on-disk size, skip it */
+		if (btrfs_file_extent_compression(leaf, fi) !=
+		    BTRFS_COMPRESS_NONE)
+			return 0;
+
+		/* Uncompressed inline extent size must match item size */
+		if (item_size != BTRFS_FILE_EXTENT_INLINE_DATA_START +
+		    btrfs_file_extent_ram_bytes(leaf, fi)) {
+			CORRUPT("plaintext inline extent has invalid size",
+				leaf, root, slot);
+			return -EUCLEAN;
+		}
+		return 0;
+	}
+
+	/* Regular or preallocated extent has fixed item size */
+	if (item_size != sizeof(*fi)) {
+		CORRUPT(
+		"regluar or preallocated extent data item size is invalid",
+			leaf, root, slot);
+		return -EUCLEAN;
+	}
+	if (!IS_ALIGNED(btrfs_file_extent_ram_bytes(leaf, fi), sectorsize) ||
+	    !IS_ALIGNED(btrfs_file_extent_disk_bytenr(leaf, fi), sectorsize) ||
+	    !IS_ALIGNED(btrfs_file_extent_disk_num_bytes(leaf, fi), sectorsize) ||
+	    !IS_ALIGNED(btrfs_file_extent_offset(leaf, fi), sectorsize) ||
+	    !IS_ALIGNED(btrfs_file_extent_num_bytes(leaf, fi), sectorsize)) {
+		CORRUPT(
+		"regular or preallocated extent data item has unaligned value",
+			leaf, root, slot);
+		return -EUCLEAN;
+	}
+
+	return 0;
+}
+
+static int check_csum_item(struct btrfs_root *root, struct extent_buffer *leaf,
+			   struct btrfs_key *key, int slot)
+{
+	u32 sectorsize = root->fs_info->sectorsize;
+	u32 csumsize = btrfs_super_csum_size(root->fs_info->super_copy);
+
+	if (key->objectid != BTRFS_EXTENT_CSUM_OBJECTID) {
+		CORRUPT("invalid objectid for csum item", leaf, root, slot);
+		return -EUCLEAN;
+	}
+	if (!IS_ALIGNED(key->offset, sectorsize)) {
+		CORRUPT("unaligned key offset for csum item", leaf, root, slot);
+		return -EUCLEAN;
+	}
+	if (!IS_ALIGNED(btrfs_item_size_nr(leaf, slot), csumsize)) {
+		CORRUPT("unaligned csum item size", leaf, root, slot);
+		return -EUCLEAN;
+	}
+	return 0;
+}
+
+/*
+ * Common point to switch the item-specific validation.
+ */
+static int check_leaf_item(struct btrfs_root *root,
+			   struct extent_buffer *leaf,
+			   struct btrfs_key *key, int slot)
+{
+	int ret = 0;
+
+	switch (key->type) {
+	case BTRFS_EXTENT_DATA_KEY:
+		ret = check_extent_data_item(root, leaf, key, slot);
+		break;
+	case BTRFS_EXTENT_CSUM_KEY:
+		ret = check_csum_item(root, leaf, key, slot);
+		break;
+	}
+	return ret;
+}
+
+int btrfs_check_leaf(struct btrfs_root *root, struct extent_buffer *leaf)
+{
+	struct btrfs_fs_info *fs_info = root->fs_info;
+	/* No valid key type is 0, so all key should be larger than this key */
+	struct btrfs_key prev_key = {0, 0, 0};
+	struct btrfs_key key;
+	u32 nritems = btrfs_header_nritems(leaf);
+	int slot;
+
+	/*
+	 * Extent buffers from a relocation tree have a owner field that
+	 * corresponds to the subvolume tree they are based on. So just from an
+	 * extent buffer alone we can not find out what is the id of the
+	 * corresponding subvolume tree, so we can not figure out if the extent
+	 * buffer corresponds to the root of the relocation tree or not. So
+	 * skip this check for relocation trees.
+	 */
+	if (nritems == 0 && !btrfs_header_flag(leaf, BTRFS_HEADER_FLAG_RELOC)) {
+		struct btrfs_root *check_root;
+
+		key.objectid = btrfs_header_owner(leaf);
+		key.type = BTRFS_ROOT_ITEM_KEY;
+		key.offset = (u64)-1;
+
+		check_root = btrfs_get_fs_root(fs_info, &key, false);
+		/*
+		 * The only reason we also check NULL here is that during
+		 * open_ctree() some roots has not yet been set up.
+		 */
+		if (!IS_ERR_OR_NULL(check_root)) {
+			struct extent_buffer *eb;
+
+			eb = btrfs_root_node(check_root);
+			/* if leaf is the root, then it's fine */
+			if (leaf != eb) {
+				CORRUPT("non-root leaf's nritems is 0",
+					leaf, check_root, 0);
+				free_extent_buffer(eb);
+				return -EUCLEAN;
+			}
+			free_extent_buffer(eb);
+		}
+		return 0;
+	}
+
+	if (nritems == 0)
+		return 0;
+
+	/*
+	 * Check the following things to make sure this is a good leaf, and
+	 * leaf users won't need to bother with similar sanity checks:
+	 *
+	 * 1) key ordering
+	 * 2) item offset and size
+	 *    No overlap, no hole, all inside the leaf.
+	 * 3) item content
+	 *    If possible, do comprehensive sanity check.
+	 *    NOTE: All checks must only rely on the item data itself.
+	 */
+	for (slot = 0; slot < nritems; slot++) {
+		u32 item_end_expected;
+		int ret;
+
+		btrfs_item_key_to_cpu(leaf, &key, slot);
+
+		/* Make sure the keys are in the right order */
+		if (btrfs_comp_cpu_keys(&prev_key, &key) >= 0) {
+			CORRUPT("bad key order", leaf, root, slot);
+			return -EUCLEAN;
+		}
+
+		/*
+		 * Make sure the offset and ends are right, remember that the
+		 * item data starts at the end of the leaf and grows towards the
+		 * front.
+		 */
+		if (slot == 0)
+			item_end_expected = BTRFS_LEAF_DATA_SIZE(fs_info);
+		else
+			item_end_expected = btrfs_item_offset_nr(leaf,
+								 slot - 1);
+		if (btrfs_item_end_nr(leaf, slot) != item_end_expected) {
+			CORRUPT("slot offset bad", leaf, root, slot);
+			return -EUCLEAN;
+		}
+
+		/*
+		 * Check to make sure that we don't point outside of the leaf,
+		 * just in case all the items are consistent to each other, but
+		 * all point outside of the leaf.
+		 */
+		if (btrfs_item_end_nr(leaf, slot) >
+		    BTRFS_LEAF_DATA_SIZE(fs_info)) {
+			CORRUPT("slot end outside of leaf", leaf, root, slot);
+			return -EUCLEAN;
+		}
+
+		/* Also check if the item pointer overlaps with btrfs item. */
+		if (btrfs_item_nr_offset(slot) + sizeof(struct btrfs_item) >
+		    btrfs_item_ptr_offset(leaf, slot)) {
+			CORRUPT("slot overlap with its data", leaf, root, slot);
+			return -EUCLEAN;
+		}
+
+		/* Check if the item size and content meet other criteria */
+		ret = check_leaf_item(root, leaf, &key, slot);
+		if (ret < 0)
+			return ret;
+
+		prev_key.objectid = key.objectid;
+		prev_key.type = key.type;
+		prev_key.offset = key.offset;
+	}
+
+	return 0;
+}
+
+int btrfs_check_node(struct btrfs_root *root, struct extent_buffer *node)
+{
+	unsigned long nr = btrfs_header_nritems(node);
+	struct btrfs_key key, next_key;
+	int slot;
+	u64 bytenr;
+	int ret = 0;
+
+	if (nr == 0 || nr > BTRFS_NODEPTRS_PER_BLOCK(root->fs_info)) {
+		btrfs_crit(root->fs_info,
+			   "corrupt node: block %llu root %llu nritems %lu",
+			   node->start, root->objectid, nr);
+		return -EIO;
+	}
+
+	for (slot = 0; slot < nr - 1; slot++) {
+		bytenr = btrfs_node_blockptr(node, slot);
+		btrfs_node_key_to_cpu(node, &key, slot);
+		btrfs_node_key_to_cpu(node, &next_key, slot + 1);
+
+		if (!bytenr) {
+			CORRUPT("invalid item slot", node, root, slot);
+			ret = -EIO;
+			goto out;
+		}
+
+		if (btrfs_comp_cpu_keys(&key, &next_key) >= 0) {
+			CORRUPT("bad key order", node, root, slot);
+			ret = -EIO;
+			goto out;
+		}
+	}
+out:
+	return ret;
+}
diff --git a/fs/btrfs/tree-checker.h b/fs/btrfs/tree-checker.h
new file mode 100644
index 000000000000..96c486e95d70
--- /dev/null
+++ b/fs/btrfs/tree-checker.h
@@ -0,0 +1,26 @@
+/*
+ * Copyright (C) Qu Wenruo 2017.  All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public
+ * License v2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public
+ * License along with this program.
+ */
+
+#ifndef __BTRFS_TREE_CHECKER__
+#define __BTRFS_TREE_CHECKER__
+
+#include "ctree.h"
+#include "extent_io.h"
+
+int btrfs_check_leaf(struct btrfs_root *root, struct extent_buffer *leaf);
+int btrfs_check_node(struct btrfs_root *root, struct extent_buffer *node);
+
+#endif
-- 
2.17.1




^ permalink raw reply related	[flat|nested] 166+ messages in thread

* [PATCH 4.14 034/146] btrfs: tree-checker: Enhance btrfs_check_node output
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (32 preceding siblings ...)
  2018-12-04 10:48 ` [PATCH 4.14 033/146] btrfs: Move leaf and node validation checker to tree-checker.c Greg Kroah-Hartman
@ 2018-12-04 10:48 ` Greg Kroah-Hartman
  2018-12-04 10:48 ` [PATCH 4.14 035/146] btrfs: tree-checker: Fix false panic for sanity test Greg Kroah-Hartman
                   ` (116 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:48 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Qu Wenruo, David Sterba,
	Ben Hutchings, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

commit bba4f29896c986c4cec17bc0f19f2ce644fceae1 upstream.

Use inline function to replace macro since we don't need
stringification.
(Macro still exists until all callers get updated)

And add more info about the error, and replace EIO with EUCLEAN.

For nr_items error, report if it's too large or too small, and output
the valid value range.

For node block pointer, added a new alignment checker.

For key order, also output the next key to make the problem more
obvious.

Signed-off-by: Qu Wenruo <quwenruo.btrfs@gmx.com>
[ wording adjustments, unindented long strings ]
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 fs/btrfs/tree-checker.c | 68 ++++++++++++++++++++++++++++++++++++-----
 1 file changed, 61 insertions(+), 7 deletions(-)

diff --git a/fs/btrfs/tree-checker.c b/fs/btrfs/tree-checker.c
index 56e25a630103..5acdf3355a3f 100644
--- a/fs/btrfs/tree-checker.c
+++ b/fs/btrfs/tree-checker.c
@@ -37,6 +37,46 @@
 		   btrfs_header_level(eb) == 0 ? "leaf" : "node",	\
 		   reason, btrfs_header_bytenr(eb), root->objectid, slot)
 
+/*
+ * Error message should follow the following format:
+ * corrupt <type>: <identifier>, <reason>[, <bad_value>]
+ *
+ * @type:	leaf or node
+ * @identifier:	the necessary info to locate the leaf/node.
+ * 		It's recommened to decode key.objecitd/offset if it's
+ * 		meaningful.
+ * @reason:	describe the error
+ * @bad_value:	optional, it's recommened to output bad value and its
+ *		expected value (range).
+ *
+ * Since comma is used to separate the components, only space is allowed
+ * inside each component.
+ */
+
+/*
+ * Append generic "corrupt leaf/node root=%llu block=%llu slot=%d: " to @fmt.
+ * Allows callers to customize the output.
+ */
+__printf(4, 5)
+static void generic_err(const struct btrfs_root *root,
+			const struct extent_buffer *eb, int slot,
+			const char *fmt, ...)
+{
+	struct va_format vaf;
+	va_list args;
+
+	va_start(args, fmt);
+
+	vaf.fmt = fmt;
+	vaf.va = &args;
+
+	btrfs_crit(root->fs_info,
+		"corrupt %s: root=%llu block=%llu slot=%d, %pV",
+		btrfs_header_level(eb) == 0 ? "leaf" : "node",
+		root->objectid, btrfs_header_bytenr(eb), slot, &vaf);
+	va_end(args);
+}
+
 static int check_extent_data_item(struct btrfs_root *root,
 				  struct extent_buffer *leaf,
 				  struct btrfs_key *key, int slot)
@@ -282,9 +322,11 @@ int btrfs_check_node(struct btrfs_root *root, struct extent_buffer *node)
 
 	if (nr == 0 || nr > BTRFS_NODEPTRS_PER_BLOCK(root->fs_info)) {
 		btrfs_crit(root->fs_info,
-			   "corrupt node: block %llu root %llu nritems %lu",
-			   node->start, root->objectid, nr);
-		return -EIO;
+"corrupt node: root=%llu block=%llu, nritems too %s, have %lu expect range [1,%u]",
+			   root->objectid, node->start,
+			   nr == 0 ? "small" : "large", nr,
+			   BTRFS_NODEPTRS_PER_BLOCK(root->fs_info));
+		return -EUCLEAN;
 	}
 
 	for (slot = 0; slot < nr - 1; slot++) {
@@ -293,14 +335,26 @@ int btrfs_check_node(struct btrfs_root *root, struct extent_buffer *node)
 		btrfs_node_key_to_cpu(node, &next_key, slot + 1);
 
 		if (!bytenr) {
-			CORRUPT("invalid item slot", node, root, slot);
-			ret = -EIO;
+			generic_err(root, node, slot,
+				"invalid NULL node pointer");
+			ret = -EUCLEAN;
+			goto out;
+		}
+		if (!IS_ALIGNED(bytenr, root->fs_info->sectorsize)) {
+			generic_err(root, node, slot,
+			"unaligned pointer, have %llu should be aligned to %u",
+				bytenr, root->fs_info->sectorsize);
+			ret = -EUCLEAN;
 			goto out;
 		}
 
 		if (btrfs_comp_cpu_keys(&key, &next_key) >= 0) {
-			CORRUPT("bad key order", node, root, slot);
-			ret = -EIO;
+			generic_err(root, node, slot,
+	"bad key order, current (%llu %u %llu) next (%llu %u %llu)",
+				key.objectid, key.type, key.offset,
+				next_key.objectid, next_key.type,
+				next_key.offset);
+			ret = -EUCLEAN;
 			goto out;
 		}
 	}
-- 
2.17.1




^ permalink raw reply related	[flat|nested] 166+ messages in thread

* [PATCH 4.14 035/146] btrfs: tree-checker: Fix false panic for sanity test
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (33 preceding siblings ...)
  2018-12-04 10:48 ` [PATCH 4.14 034/146] btrfs: tree-checker: Enhance btrfs_check_node output Greg Kroah-Hartman
@ 2018-12-04 10:48 ` Greg Kroah-Hartman
  2018-12-04 10:48 ` [PATCH 4.14 036/146] btrfs: tree-checker: Add checker for dir item Greg Kroah-Hartman
                   ` (115 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:48 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Filipe Manana, Lakshmipathi.G,
	Qu Wenruo, Liu Bo, David Sterba, Ben Hutchings, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

commit 69fc6cbbac542c349b3d350d10f6e394c253c81d upstream.

[BUG]
If we run btrfs with CONFIG_BTRFS_FS_RUN_SANITY_TESTS=y, it will
instantly cause kernel panic like:

------
...
assertion failed: 0, file: fs/btrfs/disk-io.c, line: 3853
...
Call Trace:
 btrfs_mark_buffer_dirty+0x187/0x1f0 [btrfs]
 setup_items_for_insert+0x385/0x650 [btrfs]
 __btrfs_drop_extents+0x129a/0x1870 [btrfs]
...
-----

[Cause]
Btrfs will call btrfs_check_leaf() in btrfs_mark_buffer_dirty() to check
if the leaf is valid with CONFIG_BTRFS_FS_RUN_SANITY_TESTS=y.

However quite some btrfs_mark_buffer_dirty() callers(*) don't really
initialize its item data but only initialize its item pointers, leaving
item data uninitialized.

This makes tree-checker catch uninitialized data as error, causing
such panic.

*: These callers include but not limited to
setup_items_for_insert()
btrfs_split_item()
btrfs_expand_item()

[Fix]
Add a new parameter @check_item_data to btrfs_check_leaf().
With @check_item_data set to false, item data check will be skipped and
fallback to old btrfs_check_leaf() behavior.

So we can still get early warning if we screw up item pointers, and
avoid false panic.

Cc: Filipe Manana <fdmanana@gmail.com>
Reported-by: Lakshmipathi.G <lakshmipathi.g@gmail.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>

Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 fs/btrfs/disk-io.c      | 10 ++++++++--
 fs/btrfs/tree-checker.c | 27 ++++++++++++++++++++++-----
 fs/btrfs/tree-checker.h | 14 +++++++++++++-
 3 files changed, 43 insertions(+), 8 deletions(-)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 2df5e906db08..e42673477c25 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -610,7 +610,7 @@ static int btree_readpage_end_io_hook(struct btrfs_io_bio *io_bio,
 	 * that we don't try and read the other copies of this block, just
 	 * return -EIO.
 	 */
-	if (found_level == 0 && btrfs_check_leaf(root, eb)) {
+	if (found_level == 0 && btrfs_check_leaf_full(root, eb)) {
 		set_bit(EXTENT_BUFFER_CORRUPT, &eb->bflags);
 		ret = -EIO;
 	}
@@ -3870,7 +3870,13 @@ void btrfs_mark_buffer_dirty(struct extent_buffer *buf)
 					 buf->len,
 					 fs_info->dirty_metadata_batch);
 #ifdef CONFIG_BTRFS_FS_CHECK_INTEGRITY
-	if (btrfs_header_level(buf) == 0 && btrfs_check_leaf(root, buf)) {
+	/*
+	 * Since btrfs_mark_buffer_dirty() can be called with item pointer set
+	 * but item data not updated.
+	 * So here we should only check item pointers, not item data.
+	 */
+	if (btrfs_header_level(buf) == 0 &&
+	    btrfs_check_leaf_relaxed(root, buf)) {
 		btrfs_print_leaf(buf);
 		ASSERT(0);
 	}
diff --git a/fs/btrfs/tree-checker.c b/fs/btrfs/tree-checker.c
index 5acdf3355a3f..ff4fa8f905d3 100644
--- a/fs/btrfs/tree-checker.c
+++ b/fs/btrfs/tree-checker.c
@@ -195,7 +195,8 @@ static int check_leaf_item(struct btrfs_root *root,
 	return ret;
 }
 
-int btrfs_check_leaf(struct btrfs_root *root, struct extent_buffer *leaf)
+static int check_leaf(struct btrfs_root *root, struct extent_buffer *leaf,
+		      bool check_item_data)
 {
 	struct btrfs_fs_info *fs_info = root->fs_info;
 	/* No valid key type is 0, so all key should be larger than this key */
@@ -299,10 +300,15 @@ int btrfs_check_leaf(struct btrfs_root *root, struct extent_buffer *leaf)
 			return -EUCLEAN;
 		}
 
-		/* Check if the item size and content meet other criteria */
-		ret = check_leaf_item(root, leaf, &key, slot);
-		if (ret < 0)
-			return ret;
+		if (check_item_data) {
+			/*
+			 * Check if the item size and content meet other
+			 * criteria
+			 */
+			ret = check_leaf_item(root, leaf, &key, slot);
+			if (ret < 0)
+				return ret;
+		}
 
 		prev_key.objectid = key.objectid;
 		prev_key.type = key.type;
@@ -312,6 +318,17 @@ int btrfs_check_leaf(struct btrfs_root *root, struct extent_buffer *leaf)
 	return 0;
 }
 
+int btrfs_check_leaf_full(struct btrfs_root *root, struct extent_buffer *leaf)
+{
+	return check_leaf(root, leaf, true);
+}
+
+int btrfs_check_leaf_relaxed(struct btrfs_root *root,
+			     struct extent_buffer *leaf)
+{
+	return check_leaf(root, leaf, false);
+}
+
 int btrfs_check_node(struct btrfs_root *root, struct extent_buffer *node)
 {
 	unsigned long nr = btrfs_header_nritems(node);
diff --git a/fs/btrfs/tree-checker.h b/fs/btrfs/tree-checker.h
index 96c486e95d70..3d53e8d6fda0 100644
--- a/fs/btrfs/tree-checker.h
+++ b/fs/btrfs/tree-checker.h
@@ -20,7 +20,19 @@
 #include "ctree.h"
 #include "extent_io.h"
 
-int btrfs_check_leaf(struct btrfs_root *root, struct extent_buffer *leaf);
+/*
+ * Comprehensive leaf checker.
+ * Will check not only the item pointers, but also every possible member
+ * in item data.
+ */
+int btrfs_check_leaf_full(struct btrfs_root *root, struct extent_buffer *leaf);
+
+/*
+ * Less strict leaf checker.
+ * Will only check item pointers, not reading item data.
+ */
+int btrfs_check_leaf_relaxed(struct btrfs_root *root,
+			     struct extent_buffer *leaf);
 int btrfs_check_node(struct btrfs_root *root, struct extent_buffer *node);
 
 #endif
-- 
2.17.1




^ permalink raw reply related	[flat|nested] 166+ messages in thread

* [PATCH 4.14 036/146] btrfs: tree-checker: Add checker for dir item
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (34 preceding siblings ...)
  2018-12-04 10:48 ` [PATCH 4.14 035/146] btrfs: tree-checker: Fix false panic for sanity test Greg Kroah-Hartman
@ 2018-12-04 10:48 ` Greg Kroah-Hartman
  2018-12-04 10:48 ` [PATCH 4.14 037/146] btrfs: tree-checker: use %zu format string for size_t Greg Kroah-Hartman
                   ` (114 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:48 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Qu Wenruo, Nikolay Borisov, Su Yue,
	David Sterba, Ben Hutchings, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

commit ad7b0368f33cffe67fecd302028915926e50ef7e upstream.

Add checker for dir item, for key types DIR_ITEM, DIR_INDEX and
XATTR_ITEM.

This checker does comprehensive checks for:

1) dir_item header and its data size
   Against item boundary and maximum name/xattr length.
   This part is mostly the same as old verify_dir_item().

2) dir_type
   Against maximum file types, and against key type.
   Since XATTR key should only have FT_XATTR dir item, and normal dir
   item type should not have XATTR key.

   The check between key->type and dir_type is newly introduced by this
   patch.

3) name hash
   For XATTR and DIR_ITEM key, key->offset is name hash (crc32c).
   Check the hash of the name against the key to ensure it's correct.

   The name hash check is only found in btrfs-progs before this patch.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Su Yue <suy.fnst@cn.fujitsu.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 fs/btrfs/tree-checker.c | 141 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 141 insertions(+)

diff --git a/fs/btrfs/tree-checker.c b/fs/btrfs/tree-checker.c
index ff4fa8f905d3..b32df86de9bf 100644
--- a/fs/btrfs/tree-checker.c
+++ b/fs/btrfs/tree-checker.c
@@ -30,6 +30,7 @@
 #include "tree-checker.h"
 #include "disk-io.h"
 #include "compression.h"
+#include "hash.h"
 
 #define CORRUPT(reason, eb, root, slot)					\
 	btrfs_crit(root->fs_info,					\
@@ -175,6 +176,141 @@ static int check_csum_item(struct btrfs_root *root, struct extent_buffer *leaf,
 	return 0;
 }
 
+/*
+ * Customized reported for dir_item, only important new info is key->objectid,
+ * which represents inode number
+ */
+__printf(4, 5)
+static void dir_item_err(const struct btrfs_root *root,
+			 const struct extent_buffer *eb, int slot,
+			 const char *fmt, ...)
+{
+	struct btrfs_key key;
+	struct va_format vaf;
+	va_list args;
+
+	btrfs_item_key_to_cpu(eb, &key, slot);
+	va_start(args, fmt);
+
+	vaf.fmt = fmt;
+	vaf.va = &args;
+
+	btrfs_crit(root->fs_info,
+	"corrupt %s: root=%llu block=%llu slot=%d ino=%llu, %pV",
+		btrfs_header_level(eb) == 0 ? "leaf" : "node", root->objectid,
+		btrfs_header_bytenr(eb), slot, key.objectid, &vaf);
+	va_end(args);
+}
+
+static int check_dir_item(struct btrfs_root *root,
+			  struct extent_buffer *leaf,
+			  struct btrfs_key *key, int slot)
+{
+	struct btrfs_dir_item *di;
+	u32 item_size = btrfs_item_size_nr(leaf, slot);
+	u32 cur = 0;
+
+	di = btrfs_item_ptr(leaf, slot, struct btrfs_dir_item);
+	while (cur < item_size) {
+		char namebuf[max(BTRFS_NAME_LEN, XATTR_NAME_MAX)];
+		u32 name_len;
+		u32 data_len;
+		u32 max_name_len;
+		u32 total_size;
+		u32 name_hash;
+		u8 dir_type;
+
+		/* header itself should not cross item boundary */
+		if (cur + sizeof(*di) > item_size) {
+			dir_item_err(root, leaf, slot,
+		"dir item header crosses item boundary, have %lu boundary %u",
+				cur + sizeof(*di), item_size);
+			return -EUCLEAN;
+		}
+
+		/* dir type check */
+		dir_type = btrfs_dir_type(leaf, di);
+		if (dir_type >= BTRFS_FT_MAX) {
+			dir_item_err(root, leaf, slot,
+			"invalid dir item type, have %u expect [0, %u)",
+				dir_type, BTRFS_FT_MAX);
+			return -EUCLEAN;
+		}
+
+		if (key->type == BTRFS_XATTR_ITEM_KEY &&
+		    dir_type != BTRFS_FT_XATTR) {
+			dir_item_err(root, leaf, slot,
+		"invalid dir item type for XATTR key, have %u expect %u",
+				dir_type, BTRFS_FT_XATTR);
+			return -EUCLEAN;
+		}
+		if (dir_type == BTRFS_FT_XATTR &&
+		    key->type != BTRFS_XATTR_ITEM_KEY) {
+			dir_item_err(root, leaf, slot,
+			"xattr dir type found for non-XATTR key");
+			return -EUCLEAN;
+		}
+		if (dir_type == BTRFS_FT_XATTR)
+			max_name_len = XATTR_NAME_MAX;
+		else
+			max_name_len = BTRFS_NAME_LEN;
+
+		/* Name/data length check */
+		name_len = btrfs_dir_name_len(leaf, di);
+		data_len = btrfs_dir_data_len(leaf, di);
+		if (name_len > max_name_len) {
+			dir_item_err(root, leaf, slot,
+			"dir item name len too long, have %u max %u",
+				name_len, max_name_len);
+			return -EUCLEAN;
+		}
+		if (name_len + data_len > BTRFS_MAX_XATTR_SIZE(root->fs_info)) {
+			dir_item_err(root, leaf, slot,
+			"dir item name and data len too long, have %u max %u",
+				name_len + data_len,
+				BTRFS_MAX_XATTR_SIZE(root->fs_info));
+			return -EUCLEAN;
+		}
+
+		if (data_len && dir_type != BTRFS_FT_XATTR) {
+			dir_item_err(root, leaf, slot,
+			"dir item with invalid data len, have %u expect 0",
+				data_len);
+			return -EUCLEAN;
+		}
+
+		total_size = sizeof(*di) + name_len + data_len;
+
+		/* header and name/data should not cross item boundary */
+		if (cur + total_size > item_size) {
+			dir_item_err(root, leaf, slot,
+		"dir item data crosses item boundary, have %u boundary %u",
+				cur + total_size, item_size);
+			return -EUCLEAN;
+		}
+
+		/*
+		 * Special check for XATTR/DIR_ITEM, as key->offset is name
+		 * hash, should match its name
+		 */
+		if (key->type == BTRFS_DIR_ITEM_KEY ||
+		    key->type == BTRFS_XATTR_ITEM_KEY) {
+			read_extent_buffer(leaf, namebuf,
+					(unsigned long)(di + 1), name_len);
+			name_hash = btrfs_name_hash(namebuf, name_len);
+			if (key->offset != name_hash) {
+				dir_item_err(root, leaf, slot,
+		"name hash mismatch with key, have 0x%016x expect 0x%016llx",
+					name_hash, key->offset);
+				return -EUCLEAN;
+			}
+		}
+		cur += total_size;
+		di = (struct btrfs_dir_item *)((void *)di + total_size);
+	}
+	return 0;
+}
+
 /*
  * Common point to switch the item-specific validation.
  */
@@ -191,6 +327,11 @@ static int check_leaf_item(struct btrfs_root *root,
 	case BTRFS_EXTENT_CSUM_KEY:
 		ret = check_csum_item(root, leaf, key, slot);
 		break;
+	case BTRFS_DIR_ITEM_KEY:
+	case BTRFS_DIR_INDEX_KEY:
+	case BTRFS_XATTR_ITEM_KEY:
+		ret = check_dir_item(root, leaf, key, slot);
+		break;
 	}
 	return ret;
 }
-- 
2.17.1




^ permalink raw reply related	[flat|nested] 166+ messages in thread

* [PATCH 4.14 037/146] btrfs: tree-checker: use %zu format string for size_t
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (35 preceding siblings ...)
  2018-12-04 10:48 ` [PATCH 4.14 036/146] btrfs: tree-checker: Add checker for dir item Greg Kroah-Hartman
@ 2018-12-04 10:48 ` Greg Kroah-Hartman
  2018-12-04 10:48 ` [PATCH 4.14 038/146] btrfs: tree-check: reduce stack consumption in check_dir_item Greg Kroah-Hartman
                   ` (113 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:48 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Arnd Bergmann, David Sterba,
	Ben Hutchings, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

commit 7cfad65297bfe0aa2996cd72d21c898aa84436d9 upstream.

The return value of sizeof() is of type size_t, so we must print it
using the %z format modifier rather than %l to avoid this warning
on some architectures:

fs/btrfs/tree-checker.c: In function 'check_dir_item':
fs/btrfs/tree-checker.c:273:50: error: format '%lu' expects argument of type 'long unsigned int', but argument 5 has type 'u32' {aka 'unsigned int'} [-Werror=format=]

Fixes: 005887f2e3e0 ("btrfs: tree-checker: Add checker for dir item")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 fs/btrfs/tree-checker.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/btrfs/tree-checker.c b/fs/btrfs/tree-checker.c
index b32df86de9bf..9376739a4cc8 100644
--- a/fs/btrfs/tree-checker.c
+++ b/fs/btrfs/tree-checker.c
@@ -223,7 +223,7 @@ static int check_dir_item(struct btrfs_root *root,
 		/* header itself should not cross item boundary */
 		if (cur + sizeof(*di) > item_size) {
 			dir_item_err(root, leaf, slot,
-		"dir item header crosses item boundary, have %lu boundary %u",
+		"dir item header crosses item boundary, have %zu boundary %u",
 				cur + sizeof(*di), item_size);
 			return -EUCLEAN;
 		}
-- 
2.17.1




^ permalink raw reply related	[flat|nested] 166+ messages in thread

* [PATCH 4.14 038/146] btrfs: tree-check: reduce stack consumption in check_dir_item
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (36 preceding siblings ...)
  2018-12-04 10:48 ` [PATCH 4.14 037/146] btrfs: tree-checker: use %zu format string for size_t Greg Kroah-Hartman
@ 2018-12-04 10:48 ` Greg Kroah-Hartman
  2018-12-04 10:48 ` [PATCH 4.14 039/146] btrfs: tree-checker: Verify block_group_item Greg Kroah-Hartman
                   ` (112 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:48 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Josef Bacik, Qu Wenruo, David Sterba,
	Ben Hutchings, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

commit e2683fc9d219430f5b78889b50cde7f40efeba7b upstream.

I've noticed that the updated item checker stack consumption increased
dramatically in 542f5385e20cf97447 ("btrfs: tree-checker: Add checker
for dir item")

tree-checker.c:check_leaf                    +552 (176 -> 728)

The array is 255 bytes long, dynamic allocation would slow down the
sanity checks so it's more reasonable to keep it on-stack. Moving the
variable to the scope of use reduces the stack usage again

tree-checker.c:check_leaf                    -264 (728 -> 464)

Reviewed-by: Josef Bacik <jbacik@fb.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 fs/btrfs/tree-checker.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/tree-checker.c b/fs/btrfs/tree-checker.c
index 9376739a4cc8..f7e7a455b710 100644
--- a/fs/btrfs/tree-checker.c
+++ b/fs/btrfs/tree-checker.c
@@ -212,7 +212,6 @@ static int check_dir_item(struct btrfs_root *root,
 
 	di = btrfs_item_ptr(leaf, slot, struct btrfs_dir_item);
 	while (cur < item_size) {
-		char namebuf[max(BTRFS_NAME_LEN, XATTR_NAME_MAX)];
 		u32 name_len;
 		u32 data_len;
 		u32 max_name_len;
@@ -295,6 +294,8 @@ static int check_dir_item(struct btrfs_root *root,
 		 */
 		if (key->type == BTRFS_DIR_ITEM_KEY ||
 		    key->type == BTRFS_XATTR_ITEM_KEY) {
+			char namebuf[max(BTRFS_NAME_LEN, XATTR_NAME_MAX)];
+
 			read_extent_buffer(leaf, namebuf,
 					(unsigned long)(di + 1), name_len);
 			name_hash = btrfs_name_hash(namebuf, name_len);
-- 
2.17.1




^ permalink raw reply related	[flat|nested] 166+ messages in thread

* [PATCH 4.14 039/146] btrfs: tree-checker: Verify block_group_item
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (37 preceding siblings ...)
  2018-12-04 10:48 ` [PATCH 4.14 038/146] btrfs: tree-check: reduce stack consumption in check_dir_item Greg Kroah-Hartman
@ 2018-12-04 10:48 ` Greg Kroah-Hartman
  2018-12-04 10:48 ` [PATCH 4.14 040/146] btrfs: tree-checker: Detect invalid and empty essential trees Greg Kroah-Hartman
                   ` (111 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:48 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Xu Wen, Qu Wenruo, Gu Jinxiang,
	Nikolay Borisov, David Sterba, Ben Hutchings, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

commit fce466eab7ac6baa9d2dcd88abcf945be3d4a089 upstream.

A crafted image with invalid block group items could make free space cache
code to cause panic.

We could detect such invalid block group item by checking:
1) Item size
   Known fixed value.
2) Block group size (key.offset)
   We have an upper limit on block group item (10G)
3) Chunk objectid
   Known fixed value.
4) Type
   Only 4 valid type values, DATA, METADATA, SYSTEM and DATA|METADATA.
   No more than 1 bit set for profile type.
5) Used space
   No more than the block group size.

This should allow btrfs to detect and refuse to mount the crafted image.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=199849
Reported-by: Xu Wen <wen.xu@gatech.edu>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Gu Jinxiang <gujx@cn.fujitsu.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Tested-by: Gu Jinxiang <gujx@cn.fujitsu.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
[bwh: Backported to 4.14:
 - In check_leaf_item(), pass root->fs_info to check_block_group_item()
 - Adjust context]
Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 fs/btrfs/tree-checker.c | 100 ++++++++++++++++++++++++++++++++++++++++
 fs/btrfs/volumes.c      |   2 +-
 fs/btrfs/volumes.h      |   2 +
 3 files changed, 103 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/tree-checker.c b/fs/btrfs/tree-checker.c
index f7e7a455b710..cf9b10a07134 100644
--- a/fs/btrfs/tree-checker.c
+++ b/fs/btrfs/tree-checker.c
@@ -31,6 +31,7 @@
 #include "disk-io.h"
 #include "compression.h"
 #include "hash.h"
+#include "volumes.h"
 
 #define CORRUPT(reason, eb, root, slot)					\
 	btrfs_crit(root->fs_info,					\
@@ -312,6 +313,102 @@ static int check_dir_item(struct btrfs_root *root,
 	return 0;
 }
 
+__printf(4, 5)
+__cold
+static void block_group_err(const struct btrfs_fs_info *fs_info,
+			    const struct extent_buffer *eb, int slot,
+			    const char *fmt, ...)
+{
+	struct btrfs_key key;
+	struct va_format vaf;
+	va_list args;
+
+	btrfs_item_key_to_cpu(eb, &key, slot);
+	va_start(args, fmt);
+
+	vaf.fmt = fmt;
+	vaf.va = &args;
+
+	btrfs_crit(fs_info,
+	"corrupt %s: root=%llu block=%llu slot=%d bg_start=%llu bg_len=%llu, %pV",
+		btrfs_header_level(eb) == 0 ? "leaf" : "node",
+		btrfs_header_owner(eb), btrfs_header_bytenr(eb), slot,
+		key.objectid, key.offset, &vaf);
+	va_end(args);
+}
+
+static int check_block_group_item(struct btrfs_fs_info *fs_info,
+				  struct extent_buffer *leaf,
+				  struct btrfs_key *key, int slot)
+{
+	struct btrfs_block_group_item bgi;
+	u32 item_size = btrfs_item_size_nr(leaf, slot);
+	u64 flags;
+	u64 type;
+
+	/*
+	 * Here we don't really care about alignment since extent allocator can
+	 * handle it.  We care more about the size, as if one block group is
+	 * larger than maximum size, it's must be some obvious corruption.
+	 */
+	if (key->offset > BTRFS_MAX_DATA_CHUNK_SIZE || key->offset == 0) {
+		block_group_err(fs_info, leaf, slot,
+			"invalid block group size, have %llu expect (0, %llu]",
+				key->offset, BTRFS_MAX_DATA_CHUNK_SIZE);
+		return -EUCLEAN;
+	}
+
+	if (item_size != sizeof(bgi)) {
+		block_group_err(fs_info, leaf, slot,
+			"invalid item size, have %u expect %zu",
+				item_size, sizeof(bgi));
+		return -EUCLEAN;
+	}
+
+	read_extent_buffer(leaf, &bgi, btrfs_item_ptr_offset(leaf, slot),
+			   sizeof(bgi));
+	if (btrfs_block_group_chunk_objectid(&bgi) !=
+	    BTRFS_FIRST_CHUNK_TREE_OBJECTID) {
+		block_group_err(fs_info, leaf, slot,
+		"invalid block group chunk objectid, have %llu expect %llu",
+				btrfs_block_group_chunk_objectid(&bgi),
+				BTRFS_FIRST_CHUNK_TREE_OBJECTID);
+		return -EUCLEAN;
+	}
+
+	if (btrfs_block_group_used(&bgi) > key->offset) {
+		block_group_err(fs_info, leaf, slot,
+			"invalid block group used, have %llu expect [0, %llu)",
+				btrfs_block_group_used(&bgi), key->offset);
+		return -EUCLEAN;
+	}
+
+	flags = btrfs_block_group_flags(&bgi);
+	if (hweight64(flags & BTRFS_BLOCK_GROUP_PROFILE_MASK) > 1) {
+		block_group_err(fs_info, leaf, slot,
+"invalid profile flags, have 0x%llx (%lu bits set) expect no more than 1 bit set",
+			flags & BTRFS_BLOCK_GROUP_PROFILE_MASK,
+			hweight64(flags & BTRFS_BLOCK_GROUP_PROFILE_MASK));
+		return -EUCLEAN;
+	}
+
+	type = flags & BTRFS_BLOCK_GROUP_TYPE_MASK;
+	if (type != BTRFS_BLOCK_GROUP_DATA &&
+	    type != BTRFS_BLOCK_GROUP_METADATA &&
+	    type != BTRFS_BLOCK_GROUP_SYSTEM &&
+	    type != (BTRFS_BLOCK_GROUP_METADATA |
+			   BTRFS_BLOCK_GROUP_DATA)) {
+		block_group_err(fs_info, leaf, slot,
+"invalid type, have 0x%llx (%lu bits set) expect either 0x%llx, 0x%llx, 0x%llu or 0x%llx",
+			type, hweight64(type),
+			BTRFS_BLOCK_GROUP_DATA, BTRFS_BLOCK_GROUP_METADATA,
+			BTRFS_BLOCK_GROUP_SYSTEM,
+			BTRFS_BLOCK_GROUP_METADATA | BTRFS_BLOCK_GROUP_DATA);
+		return -EUCLEAN;
+	}
+	return 0;
+}
+
 /*
  * Common point to switch the item-specific validation.
  */
@@ -333,6 +430,9 @@ static int check_leaf_item(struct btrfs_root *root,
 	case BTRFS_XATTR_ITEM_KEY:
 		ret = check_dir_item(root, leaf, key, slot);
 		break;
+	case BTRFS_BLOCK_GROUP_ITEM_KEY:
+		ret = check_block_group_item(root->fs_info, leaf, key, slot);
+		break;
 	}
 	return ret;
 }
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index cfd5728e7519..9663b6aa2a56 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -4647,7 +4647,7 @@ static int __btrfs_alloc_chunk(struct btrfs_trans_handle *trans,
 
 	if (type & BTRFS_BLOCK_GROUP_DATA) {
 		max_stripe_size = SZ_1G;
-		max_chunk_size = 10 * max_stripe_size;
+		max_chunk_size = BTRFS_MAX_DATA_CHUNK_SIZE;
 		if (!devs_max)
 			devs_max = BTRFS_MAX_DEVS(info->chunk_root);
 	} else if (type & BTRFS_BLOCK_GROUP_METADATA) {
diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h
index c5dd48eb7b3d..76fb6e84f201 100644
--- a/fs/btrfs/volumes.h
+++ b/fs/btrfs/volumes.h
@@ -24,6 +24,8 @@
 #include <linux/btrfs.h>
 #include "async-thread.h"
 
+#define BTRFS_MAX_DATA_CHUNK_SIZE	(10ULL * SZ_1G)
+
 extern struct mutex uuid_mutex;
 
 #define BTRFS_STRIPE_LEN	SZ_64K
-- 
2.17.1




^ permalink raw reply related	[flat|nested] 166+ messages in thread

* [PATCH 4.14 040/146] btrfs: tree-checker: Detect invalid and empty essential trees
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (38 preceding siblings ...)
  2018-12-04 10:48 ` [PATCH 4.14 039/146] btrfs: tree-checker: Verify block_group_item Greg Kroah-Hartman
@ 2018-12-04 10:48 ` Greg Kroah-Hartman
  2018-12-04 10:48 ` [PATCH 4.14 041/146] btrfs: Check that each block group has corresponding chunk at mount time Greg Kroah-Hartman
                   ` (110 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:48 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Xu Wen, Qu Wenruo, Gu Jinxiang,
	David Sterba, Ben Hutchings, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

commit ba480dd4db9f1798541eb2d1c423fc95feee8d36 upstream.

A crafted image has empty root tree block, which will later cause NULL
pointer dereference.

The following trees should never be empty:
1) Tree root
   Must contain at least root items for extent tree, device tree and fs
   tree

2) Chunk tree
   Or we can't even bootstrap as it contains the mapping.

3) Fs tree
   At least inode item for top level inode (.).

4) Device tree
   Dev extents for chunks

5) Extent tree
   Must have corresponding extent for each chunk.

If any of them is empty, we are sure the fs is corrupted and no need to
mount it.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=199847
Reported-by: Xu Wen <wen.xu@gatech.edu>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Tested-by: Gu Jinxiang <gujx@cn.fujitsu.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
[bwh: Backported to 4.14: Pass root instead of fs_info to generic_err()]
Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 fs/btrfs/tree-checker.c | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/tree-checker.c b/fs/btrfs/tree-checker.c
index cf9b10a07134..31756bac75b4 100644
--- a/fs/btrfs/tree-checker.c
+++ b/fs/btrfs/tree-checker.c
@@ -456,9 +456,22 @@ static int check_leaf(struct btrfs_root *root, struct extent_buffer *leaf,
 	 * skip this check for relocation trees.
 	 */
 	if (nritems == 0 && !btrfs_header_flag(leaf, BTRFS_HEADER_FLAG_RELOC)) {
+		u64 owner = btrfs_header_owner(leaf);
 		struct btrfs_root *check_root;
 
-		key.objectid = btrfs_header_owner(leaf);
+		/* These trees must never be empty */
+		if (owner == BTRFS_ROOT_TREE_OBJECTID ||
+		    owner == BTRFS_CHUNK_TREE_OBJECTID ||
+		    owner == BTRFS_EXTENT_TREE_OBJECTID ||
+		    owner == BTRFS_DEV_TREE_OBJECTID ||
+		    owner == BTRFS_FS_TREE_OBJECTID ||
+		    owner == BTRFS_DATA_RELOC_TREE_OBJECTID) {
+			generic_err(root, leaf, 0,
+			"invalid root, root %llu must never be empty",
+				    owner);
+			return -EUCLEAN;
+		}
+		key.objectid = owner;
 		key.type = BTRFS_ROOT_ITEM_KEY;
 		key.offset = (u64)-1;
 
-- 
2.17.1




^ permalink raw reply related	[flat|nested] 166+ messages in thread

* [PATCH 4.14 041/146] btrfs: Check that each block group has corresponding chunk at mount time
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (39 preceding siblings ...)
  2018-12-04 10:48 ` [PATCH 4.14 040/146] btrfs: tree-checker: Detect invalid and empty essential trees Greg Kroah-Hartman
@ 2018-12-04 10:48 ` Greg Kroah-Hartman
  2018-12-04 10:48 ` [PATCH 4.14 042/146] btrfs: tree-checker: Check level for leaves and nodes Greg Kroah-Hartman
                   ` (109 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:48 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Xu Wen, Qu Wenruo, Su Yue,
	David Sterba, Ben Hutchings, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

commit 514c7dca85a0bf40be984dab0b477403a6db901f upstream.

A crafted btrfs image with incorrect chunk<->block group mapping will
trigger a lot of unexpected things as the mapping is essential.

Although the problem can be caught by block group item checker
added in "btrfs: tree-checker: Verify block_group_item", it's still not
sufficient.  A sufficiently valid block group item can pass the check
added by the mentioned patch but could fail to match the existing chunk.

This patch will add extra block group -> chunk mapping check, to ensure
we have a completely matching (start, len, flags) chunk for each block
group at mount time.

Here we reuse the original helper find_first_block_group(), which is
already doing the basic bg -> chunk checks, adding further checks of the
start/len and type flags.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=199837
Reported-by: Xu Wen <wen.xu@gatech.edu>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Su Yue <suy.fnst@cn.fujitsu.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 fs/btrfs/extent-tree.c | 28 +++++++++++++++++++++++++++-
 1 file changed, 27 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index fdc42eddccc2..83791d13c204 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -9828,6 +9828,8 @@ static int find_first_block_group(struct btrfs_fs_info *fs_info,
 	int ret = 0;
 	struct btrfs_key found_key;
 	struct extent_buffer *leaf;
+	struct btrfs_block_group_item bg;
+	u64 flags;
 	int slot;
 
 	ret = btrfs_search_slot(NULL, root, key, path, 0, 0);
@@ -9862,8 +9864,32 @@ static int find_first_block_group(struct btrfs_fs_info *fs_info,
 			"logical %llu len %llu found bg but no related chunk",
 					  found_key.objectid, found_key.offset);
 				ret = -ENOENT;
+			} else if (em->start != found_key.objectid ||
+				   em->len != found_key.offset) {
+				btrfs_err(fs_info,
+		"block group %llu len %llu mismatch with chunk %llu len %llu",
+					  found_key.objectid, found_key.offset,
+					  em->start, em->len);
+				ret = -EUCLEAN;
 			} else {
-				ret = 0;
+				read_extent_buffer(leaf, &bg,
+					btrfs_item_ptr_offset(leaf, slot),
+					sizeof(bg));
+				flags = btrfs_block_group_flags(&bg) &
+					BTRFS_BLOCK_GROUP_TYPE_MASK;
+
+				if (flags != (em->map_lookup->type &
+					      BTRFS_BLOCK_GROUP_TYPE_MASK)) {
+					btrfs_err(fs_info,
+"block group %llu len %llu type flags 0x%llx mismatch with chunk type flags 0x%llx",
+						found_key.objectid,
+						found_key.offset, flags,
+						(BTRFS_BLOCK_GROUP_TYPE_MASK &
+						 em->map_lookup->type));
+					ret = -EUCLEAN;
+				} else {
+					ret = 0;
+				}
 			}
 			free_extent_map(em);
 			goto out;
-- 
2.17.1




^ permalink raw reply related	[flat|nested] 166+ messages in thread

* [PATCH 4.14 042/146] btrfs: tree-checker: Check level for leaves and nodes
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (40 preceding siblings ...)
  2018-12-04 10:48 ` [PATCH 4.14 041/146] btrfs: Check that each block group has corresponding chunk at mount time Greg Kroah-Hartman
@ 2018-12-04 10:48 ` Greg Kroah-Hartman
  2018-12-04 10:48 ` [PATCH 4.14 043/146] btrfs: tree-checker: Fix misleading group system information Greg Kroah-Hartman
                   ` (108 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:48 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Qu Wenruo, Su Yue, David Sterba,
	Ben Hutchings, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

commit f556faa46eb4e96d0d0772e74ecf66781e132f72 upstream.

Although we have tree level check at tree read runtime, it's completely
based on its parent level.
We still need to do accurate level check to avoid invalid tree blocks
sneak into kernel space.

The check itself is simple, for leaf its level should always be 0.
For nodes its level should be in range [1, BTRFS_MAX_LEVEL - 1].

Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Su Yue <suy.fnst@cn.fujitsu.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
[bwh: Backported to 4.14:
 - Pass root instead of fs_info to generic_err()
 - Adjust context]
Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 fs/btrfs/tree-checker.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/fs/btrfs/tree-checker.c b/fs/btrfs/tree-checker.c
index 31756bac75b4..fa8f64119e6f 100644
--- a/fs/btrfs/tree-checker.c
+++ b/fs/btrfs/tree-checker.c
@@ -447,6 +447,13 @@ static int check_leaf(struct btrfs_root *root, struct extent_buffer *leaf,
 	u32 nritems = btrfs_header_nritems(leaf);
 	int slot;
 
+	if (btrfs_header_level(leaf) != 0) {
+		generic_err(root, leaf, 0,
+			"invalid level for leaf, have %d expect 0",
+			btrfs_header_level(leaf));
+		return -EUCLEAN;
+	}
+
 	/*
 	 * Extent buffers from a relocation tree have a owner field that
 	 * corresponds to the subvolume tree they are based on. So just from an
@@ -589,9 +596,16 @@ int btrfs_check_node(struct btrfs_root *root, struct extent_buffer *node)
 	unsigned long nr = btrfs_header_nritems(node);
 	struct btrfs_key key, next_key;
 	int slot;
+	int level = btrfs_header_level(node);
 	u64 bytenr;
 	int ret = 0;
 
+	if (level <= 0 || level >= BTRFS_MAX_LEVEL) {
+		generic_err(root, node, 0,
+			"invalid level for node, have %d expect [1, %d]",
+			level, BTRFS_MAX_LEVEL - 1);
+		return -EUCLEAN;
+	}
 	if (nr == 0 || nr > BTRFS_NODEPTRS_PER_BLOCK(root->fs_info)) {
 		btrfs_crit(root->fs_info,
 "corrupt node: root=%llu block=%llu, nritems too %s, have %lu expect range [1,%u]",
-- 
2.17.1




^ permalink raw reply related	[flat|nested] 166+ messages in thread

* [PATCH 4.14 043/146] btrfs: tree-checker: Fix misleading group system information
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (41 preceding siblings ...)
  2018-12-04 10:48 ` [PATCH 4.14 042/146] btrfs: tree-checker: Check level for leaves and nodes Greg Kroah-Hartman
@ 2018-12-04 10:48 ` Greg Kroah-Hartman
  2018-12-04 10:48 ` [PATCH 4.14 044/146] f2fs: check blkaddr more accuratly before issue a bio Greg Kroah-Hartman
                   ` (107 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:48 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Nikolay Borisov, Qu Wenruo,
	Shaokun Zhang, David Sterba, Ben Hutchings, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

commit 761333f2f50ccc887aa9957ae829300262c0d15b upstream.

block_group_err shows the group system as a decimal value with a '0x'
prefix, which is somewhat misleading.

Fix it to print hexadecimal, as was intended.

Fixes: fce466eab7ac6 ("btrfs: tree-checker: Verify block_group_item")
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 fs/btrfs/tree-checker.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/btrfs/tree-checker.c b/fs/btrfs/tree-checker.c
index fa8f64119e6f..f206aec1525d 100644
--- a/fs/btrfs/tree-checker.c
+++ b/fs/btrfs/tree-checker.c
@@ -399,7 +399,7 @@ static int check_block_group_item(struct btrfs_fs_info *fs_info,
 	    type != (BTRFS_BLOCK_GROUP_METADATA |
 			   BTRFS_BLOCK_GROUP_DATA)) {
 		block_group_err(fs_info, leaf, slot,
-"invalid type, have 0x%llx (%lu bits set) expect either 0x%llx, 0x%llx, 0x%llu or 0x%llx",
+"invalid type, have 0x%llx (%lu bits set) expect either 0x%llx, 0x%llx, 0x%llx or 0x%llx",
 			type, hweight64(type),
 			BTRFS_BLOCK_GROUP_DATA, BTRFS_BLOCK_GROUP_METADATA,
 			BTRFS_BLOCK_GROUP_SYSTEM,
-- 
2.17.1




^ permalink raw reply related	[flat|nested] 166+ messages in thread

* [PATCH 4.14 044/146] f2fs: check blkaddr more accuratly before issue a bio
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (42 preceding siblings ...)
  2018-12-04 10:48 ` [PATCH 4.14 043/146] btrfs: tree-checker: Fix misleading group system information Greg Kroah-Hartman
@ 2018-12-04 10:48 ` Greg Kroah-Hartman
  2018-12-04 10:48 ` [PATCH 4.14 045/146] f2fs: sanity check on sit entry Greg Kroah-Hartman
                   ` (106 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:48 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Yunlei He, Chao Yu, Jaegeuk Kim,
	Ben Hutchings, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

commit 0833721ec3658a4e9d5e58b6fa82cf9edc431e59 upstream.

This patch check blkaddr more accuratly before issue a
write or read bio.

Signed-off-by: Yunlei He <heyunlei@huawei.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 fs/f2fs/checkpoint.c |  2 ++
 fs/f2fs/data.c       |  5 +++--
 fs/f2fs/f2fs.h       |  1 +
 fs/f2fs/segment.h    | 25 +++++++++++++++++++------
 4 files changed, 25 insertions(+), 8 deletions(-)

diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c
index 41fce930f44c..1bd4cd8c79c6 100644
--- a/fs/f2fs/checkpoint.c
+++ b/fs/f2fs/checkpoint.c
@@ -69,6 +69,7 @@ static struct page *__get_meta_page(struct f2fs_sb_info *sbi, pgoff_t index,
 		.old_blkaddr = index,
 		.new_blkaddr = index,
 		.encrypted_page = NULL,
+		.is_meta = is_meta,
 	};
 
 	if (unlikely(!is_meta))
@@ -163,6 +164,7 @@ int ra_meta_pages(struct f2fs_sb_info *sbi, block_t start, int nrpages,
 		.op_flags = sync ? (REQ_META | REQ_PRIO) : REQ_RAHEAD,
 		.encrypted_page = NULL,
 		.in_list = false,
+		.is_meta = (type != META_POR),
 	};
 	struct blk_plug plug;
 
diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index 6fbb6d75318a..5913de3f661d 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -369,6 +369,7 @@ int f2fs_submit_page_bio(struct f2fs_io_info *fio)
 	struct page *page = fio->encrypted_page ?
 			fio->encrypted_page : fio->page;
 
+	verify_block_addr(fio, fio->new_blkaddr);
 	trace_f2fs_submit_page_bio(page, fio);
 	f2fs_trace_ios(fio, 0);
 
@@ -413,8 +414,8 @@ next:
 	}
 
 	if (fio->old_blkaddr != NEW_ADDR)
-		verify_block_addr(sbi, fio->old_blkaddr);
-	verify_block_addr(sbi, fio->new_blkaddr);
+		verify_block_addr(fio, fio->old_blkaddr);
+	verify_block_addr(fio, fio->new_blkaddr);
 
 	bio_page = fio->encrypted_page ? fio->encrypted_page : fio->page;
 
diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index 54f8520ad7a2..aa7b033af1b0 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -910,6 +910,7 @@ struct f2fs_io_info {
 	bool submitted;		/* indicate IO submission */
 	int need_lock;		/* indicate we need to lock cp_rwsem */
 	bool in_list;		/* indicate fio is in io_list */
+	bool is_meta;		/* indicate borrow meta inode mapping or not */
 	enum iostat_type io_type;	/* io type */
 };
 
diff --git a/fs/f2fs/segment.h b/fs/f2fs/segment.h
index 4dfb5080098f..4b635e8c91b0 100644
--- a/fs/f2fs/segment.h
+++ b/fs/f2fs/segment.h
@@ -53,13 +53,19 @@
 	 ((secno) == CURSEG_I(sbi, CURSEG_COLD_NODE)->segno /		\
 	  (sbi)->segs_per_sec))	\
 
-#define MAIN_BLKADDR(sbi)	(SM_I(sbi)->main_blkaddr)
-#define SEG0_BLKADDR(sbi)	(SM_I(sbi)->seg0_blkaddr)
+#define MAIN_BLKADDR(sbi)						\
+	(SM_I(sbi) ? SM_I(sbi)->main_blkaddr : 				\
+		le32_to_cpu(F2FS_RAW_SUPER(sbi)->main_blkaddr))
+#define SEG0_BLKADDR(sbi)						\
+	(SM_I(sbi) ? SM_I(sbi)->seg0_blkaddr : 				\
+		le32_to_cpu(F2FS_RAW_SUPER(sbi)->segment0_blkaddr))
 
 #define MAIN_SEGS(sbi)	(SM_I(sbi)->main_segments)
 #define MAIN_SECS(sbi)	((sbi)->total_sections)
 
-#define TOTAL_SEGS(sbi)	(SM_I(sbi)->segment_count)
+#define TOTAL_SEGS(sbi)							\
+	(SM_I(sbi) ? SM_I(sbi)->segment_count : 				\
+		le32_to_cpu(F2FS_RAW_SUPER(sbi)->segment_count))
 #define TOTAL_BLKS(sbi)	(TOTAL_SEGS(sbi) << (sbi)->log_blocks_per_seg)
 
 #define MAX_BLKADDR(sbi)	(SEG0_BLKADDR(sbi) + TOTAL_BLKS(sbi))
@@ -619,10 +625,17 @@ static inline void check_seg_range(struct f2fs_sb_info *sbi, unsigned int segno)
 	f2fs_bug_on(sbi, segno > TOTAL_SEGS(sbi) - 1);
 }
 
-static inline void verify_block_addr(struct f2fs_sb_info *sbi, block_t blk_addr)
+static inline void verify_block_addr(struct f2fs_io_info *fio, block_t blk_addr)
 {
-	BUG_ON(blk_addr < SEG0_BLKADDR(sbi)
-			|| blk_addr >= MAX_BLKADDR(sbi));
+	struct f2fs_sb_info *sbi = fio->sbi;
+
+	if (PAGE_TYPE_OF_BIO(fio->type) == META &&
+				(!is_read_io(fio->op) || fio->is_meta))
+		BUG_ON(blk_addr < SEG0_BLKADDR(sbi) ||
+				blk_addr >= MAIN_BLKADDR(sbi));
+	else
+		BUG_ON(blk_addr < MAIN_BLKADDR(sbi) ||
+				blk_addr >= MAX_BLKADDR(sbi));
 }
 
 /*
-- 
2.17.1




^ permalink raw reply related	[flat|nested] 166+ messages in thread

* [PATCH 4.14 045/146] f2fs: sanity check on sit entry
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (43 preceding siblings ...)
  2018-12-04 10:48 ` [PATCH 4.14 044/146] f2fs: check blkaddr more accuratly before issue a bio Greg Kroah-Hartman
@ 2018-12-04 10:48 ` Greg Kroah-Hartman
  2018-12-04 10:48 ` [PATCH 4.14 046/146] f2fs: enhance sanity_check_raw_super() to avoid potential overflow Greg Kroah-Hartman
                   ` (105 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:48 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, syzbot+83699adeb2d13579c31e, Chao Yu,
	Jaegeuk Kim, Ben Hutchings, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

commit b2ca374f33bd33fd822eb871876e4888cf79dc97 upstream.

syzbot hit the following crash on upstream commit
87ef12027b9b1dd0e0b12cf311fbcb19f9d92539 (Wed Apr 18 19:48:17 2018 +0000)
Merge tag 'ceph-for-4.17-rc2' of git://github.com/ceph/ceph-client
syzbot dashboard link: https://syzkaller.appspot.com/bug?extid=83699adeb2d13579c31e

C reproducer: https://syzkaller.appspot.com/x/repro.c?id=5805208181407744
syzkaller reproducer: https://syzkaller.appspot.com/x/repro.syz?id=6005073343676416
Raw console output: https://syzkaller.appspot.com/x/log.txt?id=6555047731134464
Kernel config: https://syzkaller.appspot.com/x/.config?id=1808800213120130118
compiler: gcc (GCC) 8.0.1 20180413 (experimental)

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+83699adeb2d13579c31e@syzkaller.appspotmail.com
It will help syzbot understand when the bug is fixed. See footer for details.
If you forward the report, please keep this part and the footer.

F2FS-fs (loop0): Magic Mismatch, valid(0xf2f52010) - read(0x0)
F2FS-fs (loop0): Can't find valid F2FS filesystem in 1th superblock
F2FS-fs (loop0): invalid crc value
BUG: unable to handle kernel paging request at ffffed006b2a50c0
PGD 21ffee067 P4D 21ffee067 PUD 21fbeb067 PMD 0
Oops: 0000 [#1] SMP KASAN
Dumping ftrace buffer:
   (ftrace buffer empty)
Modules linked in:
CPU: 0 PID: 4514 Comm: syzkaller989480 Not tainted 4.17.0-rc1+ #8
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:build_sit_entries fs/f2fs/segment.c:3653 [inline]
RIP: 0010:build_segment_manager+0x7ef7/0xbf70 fs/f2fs/segment.c:3852
RSP: 0018:ffff8801b102e5b0 EFLAGS: 00010a06
RAX: 1ffff1006b2a50c0 RBX: 0000000000000004 RCX: 0000000000000001
RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff8801ac74243e
RBP: ffff8801b102f410 R08: ffff8801acbd46c0 R09: fffffbfff14d9af8
R10: fffffbfff14d9af8 R11: ffff8801acbd46c0 R12: ffff8801ac742a80
R13: ffff8801d9519100 R14: dffffc0000000000 R15: ffff880359528600
FS:  0000000001e04880(0000) GS:ffff8801dae00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffed006b2a50c0 CR3: 00000001ac6ac000 CR4: 00000000001406f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 f2fs_fill_super+0x4095/0x7bf0 fs/f2fs/super.c:2803
 mount_bdev+0x30c/0x3e0 fs/super.c:1165
 f2fs_mount+0x34/0x40 fs/f2fs/super.c:3020
 mount_fs+0xae/0x328 fs/super.c:1268
 vfs_kern_mount.part.34+0xd4/0x4d0 fs/namespace.c:1037
 vfs_kern_mount fs/namespace.c:1027 [inline]
 do_new_mount fs/namespace.c:2517 [inline]
 do_mount+0x564/0x3070 fs/namespace.c:2847
 ksys_mount+0x12d/0x140 fs/namespace.c:3063
 __do_sys_mount fs/namespace.c:3077 [inline]
 __se_sys_mount fs/namespace.c:3074 [inline]
 __x64_sys_mount+0xbe/0x150 fs/namespace.c:3074
 do_syscall_64+0x1b1/0x800 arch/x86/entry/common.c:287
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x443d6a
RSP: 002b:00007ffd312813c8 EFLAGS: 00000297 ORIG_RAX: 00000000000000a5
RAX: ffffffffffffffda RBX: 0000000020000c00 RCX: 0000000000443d6a
RDX: 0000000020000000 RSI: 0000000020000100 RDI: 00007ffd312813d0
RBP: 0000000000000003 R08: 0000000020016a00 R09: 000000000000000a
R10: 0000000000000000 R11: 0000000000000297 R12: 0000000000000004
R13: 0000000000402c60 R14: 0000000000000000 R15: 0000000000000000
RIP: build_sit_entries fs/f2fs/segment.c:3653 [inline] RSP: ffff8801b102e5b0
RIP: build_segment_manager+0x7ef7/0xbf70 fs/f2fs/segment.c:3852 RSP: ffff8801b102e5b0
CR2: ffffed006b2a50c0
---[ end trace a2034989e196ff17 ]---

Reported-and-tested-by: syzbot+83699adeb2d13579c31e@syzkaller.appspotmail.com
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>

Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 fs/f2fs/segment.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index 3c7bbbae0afa..1104a6c80251 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -3304,6 +3304,15 @@ static int build_sit_entries(struct f2fs_sb_info *sbi)
 		unsigned int old_valid_blocks;
 
 		start = le32_to_cpu(segno_in_journal(journal, i));
+		if (start >= MAIN_SEGS(sbi)) {
+			f2fs_msg(sbi->sb, KERN_ERR,
+					"Wrong journal entry on segno %u",
+					start);
+			set_sbi_flag(sbi, SBI_NEED_FSCK);
+			err = -EINVAL;
+			break;
+		}
+
 		se = &sit_i->sentries[start];
 		sit = sit_in_journal(journal, i);
 
-- 
2.17.1




^ permalink raw reply related	[flat|nested] 166+ messages in thread

* [PATCH 4.14 046/146] f2fs: enhance sanity_check_raw_super() to avoid potential overflow
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (44 preceding siblings ...)
  2018-12-04 10:48 ` [PATCH 4.14 045/146] f2fs: sanity check on sit entry Greg Kroah-Hartman
@ 2018-12-04 10:48 ` Greg Kroah-Hartman
  2018-12-04 10:48 ` [PATCH 4.14 047/146] f2fs: clean up with is_valid_blkaddr() Greg Kroah-Hartman
                   ` (104 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:48 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Greg KH, Silvio Cesare,
	Linus Torvalds, Chao Yu, Jaegeuk Kim, Ben Hutchings, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

commit 0cfe75c5b011994651a4ca6d74f20aa997bfc69a upstream.

In order to avoid the below overflow issue, we should have checked the
boundaries in superblock before reaching out to allocation. As Linus suggested,
the right place should be sanity_check_raw_super().

Dr Silvio Cesare of InfoSect reported:

There are integer overflows with using the cp_payload superblock field in the
f2fs filesystem potentially leading to memory corruption.

include/linux/f2fs_fs.h

struct f2fs_super_block {
...
        __le32 cp_payload;

fs/f2fs/f2fs.h

typedef u32 block_t;    /*
                         * should not change u32, since it is the on-disk block
                         * address format, __le32.
                         */
...

static inline block_t __cp_payload(struct f2fs_sb_info *sbi)
{
        return le32_to_cpu(F2FS_RAW_SUPER(sbi)->cp_payload);
}

fs/f2fs/checkpoint.c

        block_t start_blk, orphan_blocks, i, j;
...
        start_blk = __start_cp_addr(sbi) + 1 + __cp_payload(sbi);
        orphan_blocks = __start_sum_addr(sbi) - 1 - __cp_payload(sbi);

+++ integer overflows

...
        unsigned int cp_blks = 1 + __cp_payload(sbi);
...
        sbi->ckpt = kzalloc(cp_blks * blk_size, GFP_KERNEL);

+++ integer overflow leading to incorrect heap allocation.

        int cp_payload_blks = __cp_payload(sbi);
...
        ckpt->cp_pack_start_sum = cpu_to_le32(1 + cp_payload_blks +
                        orphan_blocks);

+++ sign bug and integer overflow

...
        for (i = 1; i < 1 + cp_payload_blks; i++)

+++ integer overflow

...

      sbi->max_orphans = (sbi->blocks_per_seg - F2FS_CP_PACKS -
                        NR_CURSEG_TYPE - __cp_payload(sbi)) *
                                F2FS_ORPHANS_PER_BLOCK;

+++ integer overflow

Reported-by: Greg KH <greg@kroah.com>
Reported-by: Silvio Cesare <silvio.cesare@gmail.com>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
[bwh: Backported to 4.14: No hot file extension support]
Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 fs/f2fs/super.c | 71 ++++++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 64 insertions(+), 7 deletions(-)

diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
index 7cda685296b2..c3e1090e7c0a 100644
--- a/fs/f2fs/super.c
+++ b/fs/f2fs/super.c
@@ -1807,6 +1807,8 @@ static inline bool sanity_check_area_boundary(struct f2fs_sb_info *sbi,
 static int sanity_check_raw_super(struct f2fs_sb_info *sbi,
 				struct buffer_head *bh)
 {
+	block_t segment_count, segs_per_sec, secs_per_zone;
+	block_t total_sections, blocks_per_seg;
 	struct f2fs_super_block *raw_super = (struct f2fs_super_block *)
 					(bh->b_data + F2FS_SUPER_OFFSET);
 	struct super_block *sb = sbi->sb;
@@ -1863,6 +1865,68 @@ static int sanity_check_raw_super(struct f2fs_sb_info *sbi,
 		return 1;
 	}
 
+	segment_count = le32_to_cpu(raw_super->segment_count);
+	segs_per_sec = le32_to_cpu(raw_super->segs_per_sec);
+	secs_per_zone = le32_to_cpu(raw_super->secs_per_zone);
+	total_sections = le32_to_cpu(raw_super->section_count);
+
+	/* blocks_per_seg should be 512, given the above check */
+	blocks_per_seg = 1 << le32_to_cpu(raw_super->log_blocks_per_seg);
+
+	if (segment_count > F2FS_MAX_SEGMENT ||
+				segment_count < F2FS_MIN_SEGMENTS) {
+		f2fs_msg(sb, KERN_INFO,
+			"Invalid segment count (%u)",
+			segment_count);
+		return 1;
+	}
+
+	if (total_sections > segment_count ||
+			total_sections < F2FS_MIN_SEGMENTS ||
+			segs_per_sec > segment_count || !segs_per_sec) {
+		f2fs_msg(sb, KERN_INFO,
+			"Invalid segment/section count (%u, %u x %u)",
+			segment_count, total_sections, segs_per_sec);
+		return 1;
+	}
+
+	if ((segment_count / segs_per_sec) < total_sections) {
+		f2fs_msg(sb, KERN_INFO,
+			"Small segment_count (%u < %u * %u)",
+			segment_count, segs_per_sec, total_sections);
+		return 1;
+	}
+
+	if (segment_count > (le32_to_cpu(raw_super->block_count) >> 9)) {
+		f2fs_msg(sb, KERN_INFO,
+			"Wrong segment_count / block_count (%u > %u)",
+			segment_count, le32_to_cpu(raw_super->block_count));
+		return 1;
+	}
+
+	if (secs_per_zone > total_sections) {
+		f2fs_msg(sb, KERN_INFO,
+			"Wrong secs_per_zone (%u > %u)",
+			secs_per_zone, total_sections);
+		return 1;
+	}
+	if (le32_to_cpu(raw_super->extension_count) > F2FS_MAX_EXTENSION) {
+		f2fs_msg(sb, KERN_INFO,
+			"Corrupted extension count (%u > %u)",
+			le32_to_cpu(raw_super->extension_count),
+			F2FS_MAX_EXTENSION);
+		return 1;
+	}
+
+	if (le32_to_cpu(raw_super->cp_payload) >
+				(blocks_per_seg - F2FS_CP_PACKS)) {
+		f2fs_msg(sb, KERN_INFO,
+			"Insane cp_payload (%u > %u)",
+			le32_to_cpu(raw_super->cp_payload),
+			blocks_per_seg - F2FS_CP_PACKS);
+		return 1;
+	}
+
 	/* check reserved ino info */
 	if (le32_to_cpu(raw_super->node_ino) != 1 ||
 		le32_to_cpu(raw_super->meta_ino) != 2 ||
@@ -1875,13 +1939,6 @@ static int sanity_check_raw_super(struct f2fs_sb_info *sbi,
 		return 1;
 	}
 
-	if (le32_to_cpu(raw_super->segment_count) > F2FS_MAX_SEGMENT) {
-		f2fs_msg(sb, KERN_INFO,
-			"Invalid segment count (%u)",
-			le32_to_cpu(raw_super->segment_count));
-		return 1;
-	}
-
 	/* check CP/SIT/NAT/SSA/MAIN_AREA area boundary */
 	if (sanity_check_area_boundary(sbi, bh))
 		return 1;
-- 
2.17.1




^ permalink raw reply related	[flat|nested] 166+ messages in thread

* [PATCH 4.14 047/146] f2fs: clean up with is_valid_blkaddr()
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (45 preceding siblings ...)
  2018-12-04 10:48 ` [PATCH 4.14 046/146] f2fs: enhance sanity_check_raw_super() to avoid potential overflow Greg Kroah-Hartman
@ 2018-12-04 10:48 ` Greg Kroah-Hartman
  2018-12-04 10:48 ` [PATCH 4.14 048/146] f2fs: introduce and spread verify_blkaddr Greg Kroah-Hartman
                   ` (103 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:48 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Chao Yu, Jaegeuk Kim, Ben Hutchings,
	Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

commit 7b525dd01365c6764018e374d391c92466be1b7a upstream.

- rename is_valid_blkaddr() to is_valid_meta_blkaddr() for readability.
- introduce is_valid_blkaddr() for cleanup.

No logic change in this patch.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 fs/f2fs/checkpoint.c |  4 ++--
 fs/f2fs/data.c       | 18 +++++-------------
 fs/f2fs/f2fs.h       |  9 ++++++++-
 fs/f2fs/file.c       |  2 +-
 fs/f2fs/inode.c      |  2 +-
 fs/f2fs/node.c       |  5 ++---
 fs/f2fs/recovery.c   |  6 +++---
 fs/f2fs/segment.c    |  4 ++--
 fs/f2fs/segment.h    |  2 +-
 9 files changed, 25 insertions(+), 27 deletions(-)

diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c
index 1bd4cd8c79c6..2b951046657e 100644
--- a/fs/f2fs/checkpoint.c
+++ b/fs/f2fs/checkpoint.c
@@ -118,7 +118,7 @@ struct page *get_tmp_page(struct f2fs_sb_info *sbi, pgoff_t index)
 	return __get_meta_page(sbi, index, false);
 }
 
-bool is_valid_blkaddr(struct f2fs_sb_info *sbi, block_t blkaddr, int type)
+bool is_valid_meta_blkaddr(struct f2fs_sb_info *sbi, block_t blkaddr, int type)
 {
 	switch (type) {
 	case META_NAT:
@@ -174,7 +174,7 @@ int ra_meta_pages(struct f2fs_sb_info *sbi, block_t start, int nrpages,
 	blk_start_plug(&plug);
 	for (; nrpages-- > 0; blkno++) {
 
-		if (!is_valid_blkaddr(sbi, blkno, type))
+		if (!is_valid_meta_blkaddr(sbi, blkno, type))
 			goto out;
 
 		switch (type) {
diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index 5913de3f661d..f7b2909e9c5f 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -413,7 +413,7 @@ next:
 		spin_unlock(&io->io_lock);
 	}
 
-	if (fio->old_blkaddr != NEW_ADDR)
+	if (is_valid_blkaddr(fio->old_blkaddr))
 		verify_block_addr(fio, fio->old_blkaddr);
 	verify_block_addr(fio, fio->new_blkaddr);
 
@@ -946,7 +946,7 @@ next_dnode:
 next_block:
 	blkaddr = datablock_addr(dn.inode, dn.node_page, dn.ofs_in_node);
 
-	if (blkaddr == NEW_ADDR || blkaddr == NULL_ADDR) {
+	if (!is_valid_blkaddr(blkaddr)) {
 		if (create) {
 			if (unlikely(f2fs_cp_error(sbi))) {
 				err = -EIO;
@@ -1388,15 +1388,6 @@ static inline bool need_inplace_update(struct f2fs_io_info *fio)
 	return need_inplace_update_policy(inode, fio);
 }
 
-static inline bool valid_ipu_blkaddr(struct f2fs_io_info *fio)
-{
-	if (fio->old_blkaddr == NEW_ADDR)
-		return false;
-	if (fio->old_blkaddr == NULL_ADDR)
-		return false;
-	return true;
-}
-
 int do_write_data_page(struct f2fs_io_info *fio)
 {
 	struct page *page = fio->page;
@@ -1411,7 +1402,7 @@ int do_write_data_page(struct f2fs_io_info *fio)
 			f2fs_lookup_extent_cache(inode, page->index, &ei)) {
 		fio->old_blkaddr = ei.blk + page->index - ei.fofs;
 
-		if (valid_ipu_blkaddr(fio)) {
+		if (is_valid_blkaddr(fio->old_blkaddr)) {
 			ipu_force = true;
 			fio->need_lock = LOCK_DONE;
 			goto got_it;
@@ -1438,7 +1429,8 @@ got_it:
 	 * If current allocation needs SSR,
 	 * it had better in-place writes for updated data.
 	 */
-	if (ipu_force || (valid_ipu_blkaddr(fio) && need_inplace_update(fio))) {
+	if (ipu_force || (is_valid_blkaddr(fio->old_blkaddr) &&
+					need_inplace_update(fio))) {
 		err = encrypt_one_page(fio);
 		if (err)
 			goto out_writepage;
diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index aa7b033af1b0..d74c77b51b71 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -2355,6 +2355,13 @@ static inline void f2fs_update_iostat(struct f2fs_sb_info *sbi,
 	spin_unlock(&sbi->iostat_lock);
 }
 
+static inline bool is_valid_blkaddr(block_t blkaddr)
+{
+	if (blkaddr == NEW_ADDR || blkaddr == NULL_ADDR)
+		return false;
+	return true;
+}
+
 /*
  * file.c
  */
@@ -2565,7 +2572,7 @@ void f2fs_stop_checkpoint(struct f2fs_sb_info *sbi, bool end_io);
 struct page *grab_meta_page(struct f2fs_sb_info *sbi, pgoff_t index);
 struct page *get_meta_page(struct f2fs_sb_info *sbi, pgoff_t index);
 struct page *get_tmp_page(struct f2fs_sb_info *sbi, pgoff_t index);
-bool is_valid_blkaddr(struct f2fs_sb_info *sbi, block_t blkaddr, int type);
+bool is_valid_meta_blkaddr(struct f2fs_sb_info *sbi, block_t blkaddr, int type);
 int ra_meta_pages(struct f2fs_sb_info *sbi, block_t start, int nrpages,
 			int type, bool sync);
 void ra_meta_pages_cond(struct f2fs_sb_info *sbi, pgoff_t index);
diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
index 6f589730782d..d368eda462bb 100644
--- a/fs/f2fs/file.c
+++ b/fs/f2fs/file.c
@@ -334,7 +334,7 @@ static bool __found_offset(block_t blkaddr, pgoff_t dirty, pgoff_t pgofs,
 	switch (whence) {
 	case SEEK_DATA:
 		if ((blkaddr == NEW_ADDR && dirty == pgofs) ||
-			(blkaddr != NEW_ADDR && blkaddr != NULL_ADDR))
+			is_valid_blkaddr(blkaddr))
 			return true;
 		break;
 	case SEEK_HOLE:
diff --git a/fs/f2fs/inode.c b/fs/f2fs/inode.c
index 259b0aa283f0..2bcbbf566f0c 100644
--- a/fs/f2fs/inode.c
+++ b/fs/f2fs/inode.c
@@ -66,7 +66,7 @@ static bool __written_first_block(struct f2fs_inode *ri)
 {
 	block_t addr = le32_to_cpu(ri->i_addr[offset_in_addr(ri)]);
 
-	if (addr != NEW_ADDR && addr != NULL_ADDR)
+	if (is_valid_blkaddr(addr))
 		return true;
 	return false;
 }
diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
index 712505ec5de4..5f212fb2d62b 100644
--- a/fs/f2fs/node.c
+++ b/fs/f2fs/node.c
@@ -334,8 +334,7 @@ static void set_node_addr(struct f2fs_sb_info *sbi, struct node_info *ni,
 			new_blkaddr == NULL_ADDR);
 	f2fs_bug_on(sbi, nat_get_blkaddr(e) == NEW_ADDR &&
 			new_blkaddr == NEW_ADDR);
-	f2fs_bug_on(sbi, nat_get_blkaddr(e) != NEW_ADDR &&
-			nat_get_blkaddr(e) != NULL_ADDR &&
+	f2fs_bug_on(sbi, is_valid_blkaddr(nat_get_blkaddr(e)) &&
 			new_blkaddr == NEW_ADDR);
 
 	/* increment version no as node is removed */
@@ -350,7 +349,7 @@ static void set_node_addr(struct f2fs_sb_info *sbi, struct node_info *ni,
 
 	/* change address */
 	nat_set_blkaddr(e, new_blkaddr);
-	if (new_blkaddr == NEW_ADDR || new_blkaddr == NULL_ADDR)
+	if (!is_valid_blkaddr(new_blkaddr))
 		set_nat_flag(e, IS_CHECKPOINTED, false);
 	__set_nat_cache_dirty(nm_i, e);
 
diff --git a/fs/f2fs/recovery.c b/fs/f2fs/recovery.c
index 765fadf954af..53f41ad4cbe1 100644
--- a/fs/f2fs/recovery.c
+++ b/fs/f2fs/recovery.c
@@ -236,7 +236,7 @@ static int find_fsync_dnodes(struct f2fs_sb_info *sbi, struct list_head *head,
 	while (1) {
 		struct fsync_inode_entry *entry;
 
-		if (!is_valid_blkaddr(sbi, blkaddr, META_POR))
+		if (!is_valid_meta_blkaddr(sbi, blkaddr, META_POR))
 			return 0;
 
 		page = get_tmp_page(sbi, blkaddr);
@@ -479,7 +479,7 @@ retry_dn:
 		}
 
 		/* dest is valid block, try to recover from src to dest */
-		if (is_valid_blkaddr(sbi, dest, META_POR)) {
+		if (is_valid_meta_blkaddr(sbi, dest, META_POR)) {
 
 			if (src == NULL_ADDR) {
 				err = reserve_new_block(&dn);
@@ -540,7 +540,7 @@ static int recover_data(struct f2fs_sb_info *sbi, struct list_head *inode_list,
 	while (1) {
 		struct fsync_inode_entry *entry;
 
-		if (!is_valid_blkaddr(sbi, blkaddr, META_POR))
+		if (!is_valid_meta_blkaddr(sbi, blkaddr, META_POR))
 			break;
 
 		ra_meta_pages_cond(sbi, blkaddr);
diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index 1104a6c80251..483d7a869679 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -1758,7 +1758,7 @@ bool is_checkpointed_data(struct f2fs_sb_info *sbi, block_t blkaddr)
 	struct seg_entry *se;
 	bool is_cp = false;
 
-	if (blkaddr == NEW_ADDR || blkaddr == NULL_ADDR)
+	if (!is_valid_blkaddr(blkaddr))
 		return true;
 
 	mutex_lock(&sit_i->sentry_lock);
@@ -2571,7 +2571,7 @@ void f2fs_wait_on_block_writeback(struct f2fs_sb_info *sbi, block_t blkaddr)
 {
 	struct page *cpage;
 
-	if (blkaddr == NEW_ADDR || blkaddr == NULL_ADDR)
+	if (!is_valid_blkaddr(blkaddr))
 		return;
 
 	cpage = find_lock_page(META_MAPPING(sbi), blkaddr);
diff --git a/fs/f2fs/segment.h b/fs/f2fs/segment.h
index 4b635e8c91b0..f977774338c3 100644
--- a/fs/f2fs/segment.h
+++ b/fs/f2fs/segment.h
@@ -85,7 +85,7 @@
 	(GET_SEGOFF_FROM_SEG0(sbi, blk_addr) & ((sbi)->blocks_per_seg - 1))
 
 #define GET_SEGNO(sbi, blk_addr)					\
-	((((blk_addr) == NULL_ADDR) || ((blk_addr) == NEW_ADDR)) ?	\
+	((!is_valid_blkaddr(blk_addr)) ?			\
 	NULL_SEGNO : GET_L2R_SEGNO(FREE_I(sbi),			\
 		GET_SEGNO_FROM_SEG0(sbi, blk_addr)))
 #define BLKS_PER_SEC(sbi)					\
-- 
2.17.1




^ permalink raw reply related	[flat|nested] 166+ messages in thread

* [PATCH 4.14 048/146] f2fs: introduce and spread verify_blkaddr
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (46 preceding siblings ...)
  2018-12-04 10:48 ` [PATCH 4.14 047/146] f2fs: clean up with is_valid_blkaddr() Greg Kroah-Hartman
@ 2018-12-04 10:48 ` Greg Kroah-Hartman
  2018-12-04 10:48 ` [PATCH 4.14 049/146] f2fs: fix to do sanity check with secs_per_zone Greg Kroah-Hartman
                   ` (102 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:48 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Chao Yu, Jaegeuk Kim, Ben Hutchings,
	Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

commit e1da7872f6eda977bd812346bf588c35e4495a1e upstream.

This patch introduces verify_blkaddr to check meta/data block address
with valid range to detect bug earlier.

In addition, once we encounter an invalid blkaddr, notice user to run
fsck to fix, and let the kernel panic.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
[bwh: Backported to 4.14: I skipped an earlier renaming of
 is_valid_meta_blkaddr() to f2fs_is_valid_meta_blkaddr()]
Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 fs/f2fs/checkpoint.c | 11 +++++++++--
 fs/f2fs/data.c       |  8 ++++----
 fs/f2fs/f2fs.h       | 32 +++++++++++++++++++++++++++++---
 fs/f2fs/file.c       |  9 +++++----
 fs/f2fs/inode.c      |  7 ++++---
 fs/f2fs/node.c       |  4 ++--
 fs/f2fs/recovery.c   |  6 +++---
 fs/f2fs/segment.c    |  4 ++--
 fs/f2fs/segment.h    |  8 +++-----
 9 files changed, 61 insertions(+), 28 deletions(-)

diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c
index 2b951046657e..d7bd9745e883 100644
--- a/fs/f2fs/checkpoint.c
+++ b/fs/f2fs/checkpoint.c
@@ -118,7 +118,8 @@ struct page *get_tmp_page(struct f2fs_sb_info *sbi, pgoff_t index)
 	return __get_meta_page(sbi, index, false);
 }
 
-bool is_valid_meta_blkaddr(struct f2fs_sb_info *sbi, block_t blkaddr, int type)
+bool f2fs_is_valid_blkaddr(struct f2fs_sb_info *sbi,
+					block_t blkaddr, int type)
 {
 	switch (type) {
 	case META_NAT:
@@ -138,10 +139,16 @@ bool is_valid_meta_blkaddr(struct f2fs_sb_info *sbi, block_t blkaddr, int type)
 			return false;
 		break;
 	case META_POR:
+	case DATA_GENERIC:
 		if (unlikely(blkaddr >= MAX_BLKADDR(sbi) ||
 			blkaddr < MAIN_BLKADDR(sbi)))
 			return false;
 		break;
+	case META_GENERIC:
+		if (unlikely(blkaddr < SEG0_BLKADDR(sbi) ||
+			blkaddr >= MAIN_BLKADDR(sbi)))
+			return false;
+		break;
 	default:
 		BUG();
 	}
@@ -174,7 +181,7 @@ int ra_meta_pages(struct f2fs_sb_info *sbi, block_t start, int nrpages,
 	blk_start_plug(&plug);
 	for (; nrpages-- > 0; blkno++) {
 
-		if (!is_valid_meta_blkaddr(sbi, blkno, type))
+		if (!f2fs_is_valid_blkaddr(sbi, blkno, type))
 			goto out;
 
 		switch (type) {
diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index f7b2909e9c5f..615878806611 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -413,7 +413,7 @@ next:
 		spin_unlock(&io->io_lock);
 	}
 
-	if (is_valid_blkaddr(fio->old_blkaddr))
+	if (__is_valid_data_blkaddr(fio->old_blkaddr))
 		verify_block_addr(fio, fio->old_blkaddr);
 	verify_block_addr(fio, fio->new_blkaddr);
 
@@ -946,7 +946,7 @@ next_dnode:
 next_block:
 	blkaddr = datablock_addr(dn.inode, dn.node_page, dn.ofs_in_node);
 
-	if (!is_valid_blkaddr(blkaddr)) {
+	if (!is_valid_data_blkaddr(sbi, blkaddr)) {
 		if (create) {
 			if (unlikely(f2fs_cp_error(sbi))) {
 				err = -EIO;
@@ -1402,7 +1402,7 @@ int do_write_data_page(struct f2fs_io_info *fio)
 			f2fs_lookup_extent_cache(inode, page->index, &ei)) {
 		fio->old_blkaddr = ei.blk + page->index - ei.fofs;
 
-		if (is_valid_blkaddr(fio->old_blkaddr)) {
+		if (is_valid_data_blkaddr(fio->sbi, fio->old_blkaddr)) {
 			ipu_force = true;
 			fio->need_lock = LOCK_DONE;
 			goto got_it;
@@ -1429,7 +1429,7 @@ got_it:
 	 * If current allocation needs SSR,
 	 * it had better in-place writes for updated data.
 	 */
-	if (ipu_force || (is_valid_blkaddr(fio->old_blkaddr) &&
+	if (ipu_force || (is_valid_data_blkaddr(fio->sbi, fio->old_blkaddr) &&
 					need_inplace_update(fio))) {
 		err = encrypt_one_page(fio);
 		if (err)
diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index d74c77b51b71..d15d79457f5c 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -162,7 +162,7 @@ struct cp_control {
 };
 
 /*
- * For CP/NAT/SIT/SSA readahead
+ * indicate meta/data type
  */
 enum {
 	META_CP,
@@ -170,6 +170,8 @@ enum {
 	META_SIT,
 	META_SSA,
 	META_POR,
+	DATA_GENERIC,
+	META_GENERIC,
 };
 
 /* for the list of ino */
@@ -2355,13 +2357,36 @@ static inline void f2fs_update_iostat(struct f2fs_sb_info *sbi,
 	spin_unlock(&sbi->iostat_lock);
 }
 
-static inline bool is_valid_blkaddr(block_t blkaddr)
+bool f2fs_is_valid_blkaddr(struct f2fs_sb_info *sbi,
+					block_t blkaddr, int type);
+void f2fs_msg(struct super_block *sb, const char *level, const char *fmt, ...);
+static inline void verify_blkaddr(struct f2fs_sb_info *sbi,
+					block_t blkaddr, int type)
+{
+	if (!f2fs_is_valid_blkaddr(sbi, blkaddr, type)) {
+		f2fs_msg(sbi->sb, KERN_ERR,
+			"invalid blkaddr: %u, type: %d, run fsck to fix.",
+			blkaddr, type);
+		f2fs_bug_on(sbi, 1);
+	}
+}
+
+static inline bool __is_valid_data_blkaddr(block_t blkaddr)
 {
 	if (blkaddr == NEW_ADDR || blkaddr == NULL_ADDR)
 		return false;
 	return true;
 }
 
+static inline bool is_valid_data_blkaddr(struct f2fs_sb_info *sbi,
+						block_t blkaddr)
+{
+	if (!__is_valid_data_blkaddr(blkaddr))
+		return false;
+	verify_blkaddr(sbi, blkaddr, DATA_GENERIC);
+	return true;
+}
+
 /*
  * file.c
  */
@@ -2572,7 +2597,8 @@ void f2fs_stop_checkpoint(struct f2fs_sb_info *sbi, bool end_io);
 struct page *grab_meta_page(struct f2fs_sb_info *sbi, pgoff_t index);
 struct page *get_meta_page(struct f2fs_sb_info *sbi, pgoff_t index);
 struct page *get_tmp_page(struct f2fs_sb_info *sbi, pgoff_t index);
-bool is_valid_meta_blkaddr(struct f2fs_sb_info *sbi, block_t blkaddr, int type);
+bool f2fs_is_valid_blkaddr(struct f2fs_sb_info *sbi,
+					block_t blkaddr, int type);
 int ra_meta_pages(struct f2fs_sb_info *sbi, block_t start, int nrpages,
 			int type, bool sync);
 void ra_meta_pages_cond(struct f2fs_sb_info *sbi, pgoff_t index);
diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
index d368eda462bb..e9c575ef70b5 100644
--- a/fs/f2fs/file.c
+++ b/fs/f2fs/file.c
@@ -328,13 +328,13 @@ static pgoff_t __get_first_dirty_index(struct address_space *mapping,
 	return pgofs;
 }
 
-static bool __found_offset(block_t blkaddr, pgoff_t dirty, pgoff_t pgofs,
-							int whence)
+static bool __found_offset(struct f2fs_sb_info *sbi, block_t blkaddr,
+				pgoff_t dirty, pgoff_t pgofs, int whence)
 {
 	switch (whence) {
 	case SEEK_DATA:
 		if ((blkaddr == NEW_ADDR && dirty == pgofs) ||
-			is_valid_blkaddr(blkaddr))
+			is_valid_data_blkaddr(sbi, blkaddr))
 			return true;
 		break;
 	case SEEK_HOLE:
@@ -397,7 +397,8 @@ static loff_t f2fs_seek_block(struct file *file, loff_t offset, int whence)
 			blkaddr = datablock_addr(dn.inode,
 					dn.node_page, dn.ofs_in_node);
 
-			if (__found_offset(blkaddr, dirty, pgofs, whence)) {
+			if (__found_offset(F2FS_I_SB(inode), blkaddr, dirty,
+							pgofs, whence)) {
 				f2fs_put_dnode(&dn);
 				goto found;
 			}
diff --git a/fs/f2fs/inode.c b/fs/f2fs/inode.c
index 2bcbbf566f0c..b1c0eb433841 100644
--- a/fs/f2fs/inode.c
+++ b/fs/f2fs/inode.c
@@ -62,11 +62,12 @@ static void __get_inode_rdev(struct inode *inode, struct f2fs_inode *ri)
 	}
 }
 
-static bool __written_first_block(struct f2fs_inode *ri)
+static bool __written_first_block(struct f2fs_sb_info *sbi,
+					struct f2fs_inode *ri)
 {
 	block_t addr = le32_to_cpu(ri->i_addr[offset_in_addr(ri)]);
 
-	if (is_valid_blkaddr(addr))
+	if (is_valid_data_blkaddr(sbi, addr))
 		return true;
 	return false;
 }
@@ -235,7 +236,7 @@ static int do_read_inode(struct inode *inode)
 	/* get rdev by using inline_info */
 	__get_inode_rdev(inode, ri);
 
-	if (__written_first_block(ri))
+	if (__written_first_block(sbi, ri))
 		set_inode_flag(inode, FI_FIRST_BLOCK_WRITTEN);
 
 	if (!need_inode_block_update(sbi, inode->i_ino))
diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
index 5f212fb2d62b..999814c8cbea 100644
--- a/fs/f2fs/node.c
+++ b/fs/f2fs/node.c
@@ -334,7 +334,7 @@ static void set_node_addr(struct f2fs_sb_info *sbi, struct node_info *ni,
 			new_blkaddr == NULL_ADDR);
 	f2fs_bug_on(sbi, nat_get_blkaddr(e) == NEW_ADDR &&
 			new_blkaddr == NEW_ADDR);
-	f2fs_bug_on(sbi, is_valid_blkaddr(nat_get_blkaddr(e)) &&
+	f2fs_bug_on(sbi, is_valid_data_blkaddr(sbi, nat_get_blkaddr(e)) &&
 			new_blkaddr == NEW_ADDR);
 
 	/* increment version no as node is removed */
@@ -349,7 +349,7 @@ static void set_node_addr(struct f2fs_sb_info *sbi, struct node_info *ni,
 
 	/* change address */
 	nat_set_blkaddr(e, new_blkaddr);
-	if (!is_valid_blkaddr(new_blkaddr))
+	if (!is_valid_data_blkaddr(sbi, new_blkaddr))
 		set_nat_flag(e, IS_CHECKPOINTED, false);
 	__set_nat_cache_dirty(nm_i, e);
 
diff --git a/fs/f2fs/recovery.c b/fs/f2fs/recovery.c
index 53f41ad4cbe1..6ea445377767 100644
--- a/fs/f2fs/recovery.c
+++ b/fs/f2fs/recovery.c
@@ -236,7 +236,7 @@ static int find_fsync_dnodes(struct f2fs_sb_info *sbi, struct list_head *head,
 	while (1) {
 		struct fsync_inode_entry *entry;
 
-		if (!is_valid_meta_blkaddr(sbi, blkaddr, META_POR))
+		if (!f2fs_is_valid_blkaddr(sbi, blkaddr, META_POR))
 			return 0;
 
 		page = get_tmp_page(sbi, blkaddr);
@@ -479,7 +479,7 @@ retry_dn:
 		}
 
 		/* dest is valid block, try to recover from src to dest */
-		if (is_valid_meta_blkaddr(sbi, dest, META_POR)) {
+		if (f2fs_is_valid_blkaddr(sbi, dest, META_POR)) {
 
 			if (src == NULL_ADDR) {
 				err = reserve_new_block(&dn);
@@ -540,7 +540,7 @@ static int recover_data(struct f2fs_sb_info *sbi, struct list_head *inode_list,
 	while (1) {
 		struct fsync_inode_entry *entry;
 
-		if (!is_valid_meta_blkaddr(sbi, blkaddr, META_POR))
+		if (!f2fs_is_valid_blkaddr(sbi, blkaddr, META_POR))
 			break;
 
 		ra_meta_pages_cond(sbi, blkaddr);
diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index 483d7a869679..5c698757e116 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -1758,7 +1758,7 @@ bool is_checkpointed_data(struct f2fs_sb_info *sbi, block_t blkaddr)
 	struct seg_entry *se;
 	bool is_cp = false;
 
-	if (!is_valid_blkaddr(blkaddr))
+	if (!is_valid_data_blkaddr(sbi, blkaddr))
 		return true;
 
 	mutex_lock(&sit_i->sentry_lock);
@@ -2571,7 +2571,7 @@ void f2fs_wait_on_block_writeback(struct f2fs_sb_info *sbi, block_t blkaddr)
 {
 	struct page *cpage;
 
-	if (!is_valid_blkaddr(blkaddr))
+	if (!is_valid_data_blkaddr(sbi, blkaddr))
 		return;
 
 	cpage = find_lock_page(META_MAPPING(sbi), blkaddr);
diff --git a/fs/f2fs/segment.h b/fs/f2fs/segment.h
index f977774338c3..dd8f977fc273 100644
--- a/fs/f2fs/segment.h
+++ b/fs/f2fs/segment.h
@@ -85,7 +85,7 @@
 	(GET_SEGOFF_FROM_SEG0(sbi, blk_addr) & ((sbi)->blocks_per_seg - 1))
 
 #define GET_SEGNO(sbi, blk_addr)					\
-	((!is_valid_blkaddr(blk_addr)) ?			\
+	((!is_valid_data_blkaddr(sbi, blk_addr)) ?			\
 	NULL_SEGNO : GET_L2R_SEGNO(FREE_I(sbi),			\
 		GET_SEGNO_FROM_SEG0(sbi, blk_addr)))
 #define BLKS_PER_SEC(sbi)					\
@@ -631,11 +631,9 @@ static inline void verify_block_addr(struct f2fs_io_info *fio, block_t blk_addr)
 
 	if (PAGE_TYPE_OF_BIO(fio->type) == META &&
 				(!is_read_io(fio->op) || fio->is_meta))
-		BUG_ON(blk_addr < SEG0_BLKADDR(sbi) ||
-				blk_addr >= MAIN_BLKADDR(sbi));
+		verify_blkaddr(sbi, blk_addr, META_GENERIC);
 	else
-		BUG_ON(blk_addr < MAIN_BLKADDR(sbi) ||
-				blk_addr >= MAX_BLKADDR(sbi));
+		verify_blkaddr(sbi, blk_addr, DATA_GENERIC);
 }
 
 /*
-- 
2.17.1




^ permalink raw reply related	[flat|nested] 166+ messages in thread

* [PATCH 4.14 049/146] f2fs: fix to do sanity check with secs_per_zone
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (47 preceding siblings ...)
  2018-12-04 10:48 ` [PATCH 4.14 048/146] f2fs: introduce and spread verify_blkaddr Greg Kroah-Hartman
@ 2018-12-04 10:48 ` Greg Kroah-Hartman
  2018-12-04 10:48 ` [PATCH 4.14 050/146] f2fs: Add sanity_check_inode() function Greg Kroah-Hartman
                   ` (101 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:48 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Chao Yu, Jaegeuk Kim, Ben Hutchings,
	Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

commit 42bf546c1fe3f3654bdf914e977acbc2b80a5be5 upstream.

As Wen Xu reported in below link:

https://bugzilla.kernel.org/show_bug.cgi?id=200183

- Overview
Divide zero in reset_curseg() when mounting a crafted f2fs image

- Reproduce

- Kernel message
[  588.281510] divide error: 0000 [#1] SMP KASAN PTI
[  588.282701] CPU: 0 PID: 1293 Comm: mount Not tainted 4.18.0-rc1+ #4
[  588.284000] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
[  588.286178] RIP: 0010:reset_curseg+0x94/0x1a0
[  588.298166] RSP: 0018:ffff8801e88d7940 EFLAGS: 00010246
[  588.299360] RAX: 0000000000000014 RBX: ffff8801e1d46d00 RCX: ffffffffb88bf60b
[  588.300809] RDX: 0000000000000000 RSI: dffffc0000000000 RDI: ffff8801e1d46d64
[  588.305272] R13: 0000000000000000 R14: 0000000000000014 R15: 0000000000000000
[  588.306822] FS:  00007fad85008840(0000) GS:ffff8801f6e00000(0000) knlGS:0000000000000000
[  588.308456] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  588.309623] CR2: 0000000001705078 CR3: 00000001f30f8000 CR4: 00000000000006f0
[  588.311085] Call Trace:
[  588.311637]  f2fs_build_segment_manager+0x103f/0x3410
[  588.316136]  ? f2fs_commit_super+0x1b0/0x1b0
[  588.317031]  ? set_blocksize+0x90/0x140
[  588.319473]  f2fs_mount+0x15/0x20
[  588.320166]  mount_fs+0x60/0x1a0
[  588.320847]  ? alloc_vfsmnt+0x309/0x360
[  588.321647]  vfs_kern_mount+0x6b/0x1a0
[  588.322432]  do_mount+0x34a/0x18c0
[  588.323175]  ? strndup_user+0x46/0x70
[  588.323937]  ? copy_mount_string+0x20/0x20
[  588.324793]  ? memcg_kmem_put_cache+0x1b/0xa0
[  588.325702]  ? kasan_check_write+0x14/0x20
[  588.326562]  ? _copy_from_user+0x6a/0x90
[  588.327375]  ? memdup_user+0x42/0x60
[  588.328118]  ksys_mount+0x83/0xd0
[  588.328808]  __x64_sys_mount+0x67/0x80
[  588.329607]  do_syscall_64+0x78/0x170
[  588.330400]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  588.331461] RIP: 0033:0x7fad848e8b9a
[  588.336022] RSP: 002b:00007ffd7c5b6be8 EFLAGS: 00000206 ORIG_RAX: 00000000000000a5
[  588.337547] RAX: ffffffffffffffda RBX: 00000000016f8030 RCX: 00007fad848e8b9a
[  588.338999] RDX: 00000000016f8210 RSI: 00000000016f9f30 RDI: 0000000001700ec0
[  588.340442] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000013
[  588.341887] R10: 00000000c0ed0000 R11: 0000000000000206 R12: 0000000001700ec0
[  588.343341] R13: 00000000016f8210 R14: 0000000000000000 R15: 0000000000000003
[  588.354891] ---[ end trace 4ce02f25ff7d3df5 ]---
[  588.355862] RIP: 0010:reset_curseg+0x94/0x1a0
[  588.360742] RSP: 0018:ffff8801e88d7940 EFLAGS: 00010246
[  588.361812] RAX: 0000000000000014 RBX: ffff8801e1d46d00 RCX: ffffffffb88bf60b
[  588.363485] RDX: 0000000000000000 RSI: dffffc0000000000 RDI: ffff8801e1d46d64
[  588.365213] RBP: ffff8801e88d7968 R08: ffffed003c32266f R09: ffffed003c32266f
[  588.366661] R10: 0000000000000001 R11: ffffed003c32266e R12: ffff8801f0337700
[  588.368110] R13: 0000000000000000 R14: 0000000000000014 R15: 0000000000000000
[  588.370057] FS:  00007fad85008840(0000) GS:ffff8801f6e00000(0000) knlGS:0000000000000000
[  588.372099] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  588.373291] CR2: 0000000001705078 CR3: 00000001f30f8000 CR4: 00000000000006f0

- Location
https://elixir.bootlin.com/linux/latest/source/fs/f2fs/segment.c#L2147
        curseg->zone = GET_ZONE_FROM_SEG(sbi, curseg->segno);

If secs_per_zone is corrupted due to fuzzing test, it will cause divide
zero operation when using GET_ZONE_FROM_SEG macro, so we should do more
sanity check with secs_per_zone during mount to avoid this issue.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 fs/f2fs/super.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
index c3e1090e7c0a..084ff9c82a37 100644
--- a/fs/f2fs/super.c
+++ b/fs/f2fs/super.c
@@ -1904,9 +1904,9 @@ static int sanity_check_raw_super(struct f2fs_sb_info *sbi,
 		return 1;
 	}
 
-	if (secs_per_zone > total_sections) {
+	if (secs_per_zone > total_sections || !secs_per_zone) {
 		f2fs_msg(sb, KERN_INFO,
-			"Wrong secs_per_zone (%u > %u)",
+			"Wrong secs_per_zone / total_sections (%u, %u)",
 			secs_per_zone, total_sections);
 		return 1;
 	}
-- 
2.17.1




^ permalink raw reply related	[flat|nested] 166+ messages in thread

* [PATCH 4.14 050/146] f2fs: Add sanity_check_inode() function
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (48 preceding siblings ...)
  2018-12-04 10:48 ` [PATCH 4.14 049/146] f2fs: fix to do sanity check with secs_per_zone Greg Kroah-Hartman
@ 2018-12-04 10:48 ` Greg Kroah-Hartman
  2018-12-04 10:48 ` [PATCH 4.14 051/146] f2fs: fix to do sanity check with extra_attr feature Greg Kroah-Hartman
                   ` (100 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:48 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Jaegeuk Kim, Chao Yu, Ben Hutchings,
	Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

This was done as part of commit 5d64600d4f33 "f2fs: avoid bug_on on
corrupted inode" upstream, but the specific check that commit added is
not applicable to 4.14.

Cc: Jaegeuk Kim <jaegeuk@kernel.org>
Cc: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 fs/f2fs/inode.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/fs/f2fs/inode.c b/fs/f2fs/inode.c
index b1c0eb433841..917a6c3d5649 100644
--- a/fs/f2fs/inode.c
+++ b/fs/f2fs/inode.c
@@ -180,6 +180,13 @@ void f2fs_inode_chksum_set(struct f2fs_sb_info *sbi, struct page *page)
 	ri->i_inode_checksum = cpu_to_le32(f2fs_inode_chksum(sbi, page));
 }
 
+static bool sanity_check_inode(struct inode *inode)
+{
+	struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
+
+	return true;
+}
+
 static int do_read_inode(struct inode *inode)
 {
 	struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
@@ -281,6 +288,10 @@ struct inode *f2fs_iget(struct super_block *sb, unsigned long ino)
 	ret = do_read_inode(inode);
 	if (ret)
 		goto bad_inode;
+	if (!sanity_check_inode(inode)) {
+		ret = -EINVAL;
+		goto bad_inode;
+	}
 make_now:
 	if (ino == F2FS_NODE_INO(sbi)) {
 		inode->i_mapping->a_ops = &f2fs_node_aops;
-- 
2.17.1




^ permalink raw reply related	[flat|nested] 166+ messages in thread

* [PATCH 4.14 051/146] f2fs: fix to do sanity check with extra_attr feature
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (49 preceding siblings ...)
  2018-12-04 10:48 ` [PATCH 4.14 050/146] f2fs: Add sanity_check_inode() function Greg Kroah-Hartman
@ 2018-12-04 10:48 ` Greg Kroah-Hartman
  2018-12-04 10:48 ` [PATCH 4.14 052/146] f2fs: fix to do sanity check with user_block_count Greg Kroah-Hartman
                   ` (99 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:48 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Wen Xu, Chao Yu, Jaegeuk Kim,
	Ben Hutchings, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

commit 76d56d4ab4f2a9e4f085c7d77172194ddaccf7d2 upstream.

If FI_EXTRA_ATTR is set in inode by fuzzing, inode.i_addr[0] will be
parsed as inode.i_extra_isize, then in __recover_inline_status, inline
data address will beyond boundary of page, result in accessing invalid
memory.

So in this condition, during reading inode page, let's do sanity check
with EXTRA_ATTR feature of fs and extra_attr bit of inode, if they're
inconsistent, deny to load this inode.

- Overview
Out-of-bound access in f2fs_iget() when mounting a corrupted f2fs image

- Reproduce

The following message will be got in KASAN build of 4.18 upstream kernel.
[  819.392227] ==================================================================
[  819.393901] BUG: KASAN: slab-out-of-bounds in f2fs_iget+0x736/0x1530
[  819.395329] Read of size 4 at addr ffff8801f099c968 by task mount/1292

[  819.397079] CPU: 1 PID: 1292 Comm: mount Not tainted 4.18.0-rc1+ #4
[  819.397082] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
[  819.397088] Call Trace:
[  819.397124]  dump_stack+0x7b/0xb5
[  819.397154]  print_address_description+0x70/0x290
[  819.397159]  kasan_report+0x291/0x390
[  819.397163]  ? f2fs_iget+0x736/0x1530
[  819.397176]  check_memory_region+0x139/0x190
[  819.397182]  __asan_loadN+0xf/0x20
[  819.397185]  f2fs_iget+0x736/0x1530
[  819.397197]  f2fs_fill_super+0x1b4f/0x2b40
[  819.397202]  ? f2fs_fill_super+0x1b4f/0x2b40
[  819.397208]  ? f2fs_commit_super+0x1b0/0x1b0
[  819.397227]  ? set_blocksize+0x90/0x140
[  819.397241]  mount_bdev+0x1c5/0x210
[  819.397245]  ? f2fs_commit_super+0x1b0/0x1b0
[  819.397252]  f2fs_mount+0x15/0x20
[  819.397256]  mount_fs+0x60/0x1a0
[  819.397267]  ? alloc_vfsmnt+0x309/0x360
[  819.397272]  vfs_kern_mount+0x6b/0x1a0
[  819.397282]  do_mount+0x34a/0x18c0
[  819.397300]  ? lockref_put_or_lock+0xcf/0x160
[  819.397306]  ? copy_mount_string+0x20/0x20
[  819.397318]  ? memcg_kmem_put_cache+0x1b/0xa0
[  819.397324]  ? kasan_check_write+0x14/0x20
[  819.397334]  ? _copy_from_user+0x6a/0x90
[  819.397353]  ? memdup_user+0x42/0x60
[  819.397359]  ksys_mount+0x83/0xd0
[  819.397365]  __x64_sys_mount+0x67/0x80
[  819.397388]  do_syscall_64+0x78/0x170
[  819.397403]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  819.397422] RIP: 0033:0x7f54c667cb9a
[  819.397424] Code: 48 8b 0d 01 c3 2b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d ce c2 2b 00 f7 d8 64 89 01 48
[  819.397483] RSP: 002b:00007ffd8f46cd08 EFLAGS: 00000202 ORIG_RAX: 00000000000000a5
[  819.397496] RAX: ffffffffffffffda RBX: 0000000000dfa030 RCX: 00007f54c667cb9a
[  819.397498] RDX: 0000000000dfa210 RSI: 0000000000dfbf30 RDI: 0000000000e02ec0
[  819.397501] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000013
[  819.397503] R10: 00000000c0ed0000 R11: 0000000000000202 R12: 0000000000e02ec0
[  819.397505] R13: 0000000000dfa210 R14: 0000000000000000 R15: 0000000000000003

[  819.397866] Allocated by task 139:
[  819.398702]  save_stack+0x46/0xd0
[  819.398705]  kasan_kmalloc+0xad/0xe0
[  819.398709]  kasan_slab_alloc+0x11/0x20
[  819.398713]  kmem_cache_alloc+0xd1/0x1e0
[  819.398717]  dup_fd+0x50/0x4c0
[  819.398740]  copy_process.part.37+0xbed/0x32e0
[  819.398744]  _do_fork+0x16e/0x590
[  819.398748]  __x64_sys_clone+0x69/0x80
[  819.398752]  do_syscall_64+0x78/0x170
[  819.398756]  entry_SYSCALL_64_after_hwframe+0x44/0xa9

[  819.399097] Freed by task 159:
[  819.399743]  save_stack+0x46/0xd0
[  819.399747]  __kasan_slab_free+0x13c/0x1a0
[  819.399750]  kasan_slab_free+0xe/0x10
[  819.399754]  kmem_cache_free+0x89/0x1e0
[  819.399757]  put_files_struct+0x132/0x150
[  819.399761]  exit_files+0x62/0x70
[  819.399766]  do_exit+0x47b/0x1390
[  819.399770]  do_group_exit+0x86/0x130
[  819.399774]  __x64_sys_exit_group+0x2c/0x30
[  819.399778]  do_syscall_64+0x78/0x170
[  819.399782]  entry_SYSCALL_64_after_hwframe+0x44/0xa9

[  819.400115] The buggy address belongs to the object at ffff8801f099c680
                which belongs to the cache files_cache of size 704
[  819.403234] The buggy address is located 40 bytes to the right of
                704-byte region [ffff8801f099c680, ffff8801f099c940)
[  819.405689] The buggy address belongs to the page:
[  819.406709] page:ffffea0007c26700 count:1 mapcount:0 mapping:ffff8801f69a3340 index:0xffff8801f099d380 compound_mapcount: 0
[  819.408984] flags: 0x2ffff0000008100(slab|head)
[  819.409932] raw: 02ffff0000008100 ffffea00077fb600 0000000200000002 ffff8801f69a3340
[  819.411514] raw: ffff8801f099d380 0000000080130000 00000001ffffffff 0000000000000000
[  819.413073] page dumped because: kasan: bad access detected

[  819.414539] Memory state around the buggy address:
[  819.415521]  ffff8801f099c800: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[  819.416981]  ffff8801f099c880: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[  819.418454] >ffff8801f099c900: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
[  819.419921]                                                           ^
[  819.421265]  ffff8801f099c980: fc fc fc fc fc fc fc fc fb fb fb fb fb fb fb fb
[  819.422745]  ffff8801f099ca00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[  819.424206] ==================================================================
[  819.425668] Disabling lock debugging due to kernel taint
[  819.457463] F2FS-fs (loop0): Mounted with checkpoint version = 3

The kernel still mounts the image. If you run the following program on the mounted folder mnt,

(poc.c)

static void activity(char *mpoint) {

  char *foo_bar_baz;
  int err;

  static int buf[8192];
  memset(buf, 0, sizeof(buf));

  err = asprintf(&foo_bar_baz, "%s/foo/bar/baz", mpoint);
    int fd = open(foo_bar_baz, O_RDONLY, 0);
  if (fd >= 0) {
      read(fd, (char *)buf, 11);
      close(fd);
  }
}

int main(int argc, char *argv[]) {
  activity(argv[1]);
  return 0;
}

You can get kernel crash:
[  819.457463] F2FS-fs (loop0): Mounted with checkpoint version = 3
[  918.028501] BUG: unable to handle kernel paging request at ffffed0048000d82
[  918.044020] PGD 23ffee067 P4D 23ffee067 PUD 23fbef067 PMD 0
[  918.045207] Oops: 0000 [#1] SMP KASAN PTI
[  918.046048] CPU: 0 PID: 1309 Comm: poc Tainted: G    B             4.18.0-rc1+ #4
[  918.047573] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
[  918.049552] RIP: 0010:check_memory_region+0x5e/0x190
[  918.050565] Code: f8 49 c1 e8 03 49 89 db 49 c1 eb 03 4d 01 cb 4d 01 c1 4d 8d 63 01 4c 89 c8 4d 89 e2 4d 29 ca 49 83 fa 10 7f 3d 4d 85 d2 74 32 <41> 80 39 00 75 23 48 b8 01 00 00 00 00 fc ff df 4d 01 d1 49 01 c0
[  918.054322] RSP: 0018:ffff8801e3a1f258 EFLAGS: 00010202
[  918.055400] RAX: ffffed0048000d82 RBX: ffff880240006c11 RCX: ffffffffb8867d14
[  918.056832] RDX: 0000000000000000 RSI: 0000000000000002 RDI: ffff880240006c10
[  918.058253] RBP: ffff8801e3a1f268 R08: 1ffff10048000d82 R09: ffffed0048000d82
[  918.059717] R10: 0000000000000001 R11: ffffed0048000d82 R12: ffffed0048000d83
[  918.061159] R13: ffff8801e3a1f390 R14: 0000000000000000 R15: ffff880240006c08
[  918.062614] FS:  00007fac9732c700(0000) GS:ffff8801f6e00000(0000) knlGS:0000000000000000
[  918.064246] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  918.065412] CR2: ffffed0048000d82 CR3: 00000001df77a000 CR4: 00000000000006f0
[  918.066882] Call Trace:
[  918.067410]  __asan_loadN+0xf/0x20
[  918.068149]  f2fs_find_target_dentry+0xf4/0x270
[  918.069083]  ? __get_node_page+0x331/0x5b0
[  918.069925]  f2fs_find_in_inline_dir+0x24b/0x310
[  918.070881]  ? f2fs_recover_inline_data+0x4c0/0x4c0
[  918.071905]  ? unwind_next_frame.part.5+0x34f/0x490
[  918.072901]  ? unwind_dump+0x290/0x290
[  918.073695]  ? is_bpf_text_address+0xe/0x20
[  918.074566]  __f2fs_find_entry+0x599/0x670
[  918.075408]  ? kasan_unpoison_shadow+0x36/0x50
[  918.076315]  ? kasan_kmalloc+0xad/0xe0
[  918.077100]  ? memcg_kmem_put_cache+0x55/0xa0
[  918.077998]  ? f2fs_find_target_dentry+0x270/0x270
[  918.079006]  ? d_set_d_op+0x30/0x100
[  918.079749]  ? __d_lookup_rcu+0x69/0x2e0
[  918.080556]  ? __d_alloc+0x275/0x450
[  918.081297]  ? kasan_check_write+0x14/0x20
[  918.082135]  ? memset+0x31/0x40
[  918.082820]  ? fscrypt_setup_filename+0x1ec/0x4c0
[  918.083782]  ? d_alloc_parallel+0x5bb/0x8c0
[  918.084640]  f2fs_find_entry+0xe9/0x110
[  918.085432]  ? __f2fs_find_entry+0x670/0x670
[  918.086308]  ? kasan_check_write+0x14/0x20
[  918.087163]  f2fs_lookup+0x297/0x590
[  918.087902]  ? f2fs_link+0x2b0/0x2b0
[  918.088646]  ? legitimize_path.isra.29+0x61/0xa0
[  918.089589]  __lookup_slow+0x12e/0x240
[  918.090371]  ? may_delete+0x2b0/0x2b0
[  918.091123]  ? __nd_alloc_stack+0xa0/0xa0
[  918.091944]  lookup_slow+0x44/0x60
[  918.092642]  walk_component+0x3ee/0xa40
[  918.093428]  ? is_bpf_text_address+0xe/0x20
[  918.094283]  ? pick_link+0x3e0/0x3e0
[  918.095047]  ? in_group_p+0xa5/0xe0
[  918.095771]  ? generic_permission+0x53/0x1e0
[  918.096666]  ? security_inode_permission+0x1d/0x70
[  918.097646]  ? inode_permission+0x7a/0x1f0
[  918.098497]  link_path_walk+0x2a2/0x7b0
[  918.099298]  ? apparmor_capget+0x3d0/0x3d0
[  918.100140]  ? walk_component+0xa40/0xa40
[  918.100958]  ? path_init+0x2e6/0x580
[  918.101695]  path_openat+0x1bb/0x2160
[  918.102471]  ? __save_stack_trace+0x92/0x100
[  918.103352]  ? save_stack+0xb5/0xd0
[  918.104070]  ? vfs_unlink+0x250/0x250
[  918.104822]  ? save_stack+0x46/0xd0
[  918.105538]  ? kasan_slab_alloc+0x11/0x20
[  918.106370]  ? kmem_cache_alloc+0xd1/0x1e0
[  918.107213]  ? getname_flags+0x76/0x2c0
[  918.107997]  ? getname+0x12/0x20
[  918.108677]  ? do_sys_open+0x14b/0x2c0
[  918.109450]  ? __x64_sys_open+0x4c/0x60
[  918.110255]  ? do_syscall_64+0x78/0x170
[  918.111083]  ? entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  918.112148]  ? entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  918.113204]  ? f2fs_empty_inline_dir+0x1e0/0x1e0
[  918.114150]  ? timespec64_trunc+0x5c/0x90
[  918.114993]  ? wb_io_lists_depopulated+0x1a/0xc0
[  918.115937]  ? inode_io_list_move_locked+0x102/0x110
[  918.116949]  do_filp_open+0x12b/0x1d0
[  918.117709]  ? may_open_dev+0x50/0x50
[  918.118475]  ? kasan_kmalloc+0xad/0xe0
[  918.119246]  do_sys_open+0x17c/0x2c0
[  918.119983]  ? do_sys_open+0x17c/0x2c0
[  918.120751]  ? filp_open+0x60/0x60
[  918.121463]  ? task_work_run+0x4d/0xf0
[  918.122237]  __x64_sys_open+0x4c/0x60
[  918.123001]  do_syscall_64+0x78/0x170
[  918.123759]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  918.124802] RIP: 0033:0x7fac96e3e040
[  918.125537] Code: 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 83 3d 09 27 2d 00 00 75 10 b8 02 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8 7e e0 01 00 48 89 04 24
[  918.129341] RSP: 002b:00007fff1b37f848 EFLAGS: 00000246 ORIG_RAX: 0000000000000002
[  918.130870] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fac96e3e040
[  918.132295] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 000000000122d080
[  918.133748] RBP: 00007fff1b37f9b0 R08: 00007fac9710bbd8 R09: 0000000000000001
[  918.135209] R10: 000000000000069d R11: 0000000000000246 R12: 0000000000400c20
[  918.136650] R13: 00007fff1b37fab0 R14: 0000000000000000 R15: 0000000000000000
[  918.138093] Modules linked in: snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hwdep snd_hda_core snd_pcm snd_timer snd mac_hid i2c_piix4 soundcore ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx raid1 raid0 multipath linear 8139too crct10dif_pclmul crc32_pclmul qxl drm_kms_helper syscopyarea aesni_intel sysfillrect sysimgblt fb_sys_fops ttm drm aes_x86_64 crypto_simd cryptd 8139cp glue_helper mii pata_acpi floppy
[  918.147924] CR2: ffffed0048000d82
[  918.148619] ---[ end trace 4ce02f25ff7d3df5 ]---
[  918.149563] RIP: 0010:check_memory_region+0x5e/0x190
[  918.150576] Code: f8 49 c1 e8 03 49 89 db 49 c1 eb 03 4d 01 cb 4d 01 c1 4d 8d 63 01 4c 89 c8 4d 89 e2 4d 29 ca 49 83 fa 10 7f 3d 4d 85 d2 74 32 <41> 80 39 00 75 23 48 b8 01 00 00 00 00 fc ff df 4d 01 d1 49 01 c0
[  918.154360] RSP: 0018:ffff8801e3a1f258 EFLAGS: 00010202
[  918.155411] RAX: ffffed0048000d82 RBX: ffff880240006c11 RCX: ffffffffb8867d14
[  918.156833] RDX: 0000000000000000 RSI: 0000000000000002 RDI: ffff880240006c10
[  918.158257] RBP: ffff8801e3a1f268 R08: 1ffff10048000d82 R09: ffffed0048000d82
[  918.159722] R10: 0000000000000001 R11: ffffed0048000d82 R12: ffffed0048000d83
[  918.161149] R13: ffff8801e3a1f390 R14: 0000000000000000 R15: ffff880240006c08
[  918.162587] FS:  00007fac9732c700(0000) GS:ffff8801f6e00000(0000) knlGS:0000000000000000
[  918.164203] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  918.165356] CR2: ffffed0048000d82 CR3: 00000001df77a000 CR4: 00000000000006f0

Reported-by: Wen Xu <wen.xu@gatech.edu>
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
[bwh: Backported to 4.14: adjust context]
Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 fs/f2fs/inode.c | 18 ++++++++++++++----
 1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/fs/f2fs/inode.c b/fs/f2fs/inode.c
index 917a6c3d5649..428d502f23e8 100644
--- a/fs/f2fs/inode.c
+++ b/fs/f2fs/inode.c
@@ -184,6 +184,15 @@ static bool sanity_check_inode(struct inode *inode)
 {
 	struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
 
+	if (f2fs_has_extra_attr(inode) &&
+			!f2fs_sb_has_extra_attr(sbi->sb)) {
+		set_sbi_flag(sbi, SBI_NEED_FSCK);
+		f2fs_msg(sbi->sb, KERN_WARNING,
+			"%s: inode (ino=%lx) is with extra_attr, "
+			"but extra_attr feature is off",
+			__func__, inode->i_ino);
+		return false;
+	}
 	return true;
 }
 
@@ -233,6 +242,11 @@ static int do_read_inode(struct inode *inode)
 
 	get_inline_info(inode, ri);
 
+	if (!sanity_check_inode(inode)) {
+		f2fs_put_page(node_page, 1);
+		return -EINVAL;
+	}
+
 	fi->i_extra_isize = f2fs_has_extra_attr(inode) ?
 					le16_to_cpu(ri->i_extra_isize) : 0;
 
@@ -288,10 +302,6 @@ struct inode *f2fs_iget(struct super_block *sb, unsigned long ino)
 	ret = do_read_inode(inode);
 	if (ret)
 		goto bad_inode;
-	if (!sanity_check_inode(inode)) {
-		ret = -EINVAL;
-		goto bad_inode;
-	}
 make_now:
 	if (ino == F2FS_NODE_INO(sbi)) {
 		inode->i_mapping->a_ops = &f2fs_node_aops;
-- 
2.17.1




^ permalink raw reply related	[flat|nested] 166+ messages in thread

* [PATCH 4.14 052/146] f2fs: fix to do sanity check with user_block_count
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (50 preceding siblings ...)
  2018-12-04 10:48 ` [PATCH 4.14 051/146] f2fs: fix to do sanity check with extra_attr feature Greg Kroah-Hartman
@ 2018-12-04 10:48 ` Greg Kroah-Hartman
  2018-12-04 10:48 ` [PATCH 4.14 053/146] f2fs: fix to do sanity check with node footer and iblocks Greg Kroah-Hartman
                   ` (98 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:48 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Wen Xu, Chao Yu, Jaegeuk Kim,
	Ben Hutchings, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

commit 9dc956b2c8523aed39d1e6508438be9fea28c8fc upstream.

This patch fixs to do sanity check with user_block_count.

- Overview
Divide zero in utilization when mount() a corrupted f2fs image

- Reproduce (4.18 upstream kernel)

- Kernel message
[  564.099503] F2FS-fs (loop0): invalid crc value
[  564.101991] divide error: 0000 [#1] SMP KASAN PTI
[  564.103103] CPU: 1 PID: 1298 Comm: f2fs_discard-7: Not tainted 4.18.0-rc1+ #4
[  564.104584] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
[  564.106624] RIP: 0010:issue_discard_thread+0x248/0x5c0
[  564.107692] Code: ff ff 48 8b bd e8 fe ff ff 41 8b 9d 4c 04 00 00 e8 cd b8 ad ff 41 8b 85 50 04 00 00 31 d2 48 8d 04 80 48 8d 04 80 48 c1 e0 02 <48> f7 f3 83 f8 50 7e 16 41 c7 86 7c ff ff ff 01 00 00 00 41 c7 86
[  564.111686] RSP: 0018:ffff8801f3117dc0 EFLAGS: 00010206
[  564.112775] RAX: 0000000000000384 RBX: 0000000000000000 RCX: ffffffffb88c1e03
[  564.114250] RDX: 0000000000000000 RSI: dffffc0000000000 RDI: ffff8801e3aa4850
[  564.115706] RBP: ffff8801f3117f00 R08: 1ffffffff751a1d0 R09: fffffbfff751a1d0
[  564.117177] R10: 0000000000000001 R11: fffffbfff751a1d0 R12: 00000000fffffffc
[  564.118634] R13: ffff8801e3aa4400 R14: ffff8801f3117ed8 R15: ffff8801e2050000
[  564.120094] FS:  0000000000000000(0000) GS:ffff8801f6f00000(0000) knlGS:0000000000000000
[  564.121748] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  564.122923] CR2: 000000000202b078 CR3: 00000001f11ac000 CR4: 00000000000006e0
[  564.124383] Call Trace:
[  564.124924]  ? __issue_discard_cmd+0x480/0x480
[  564.125882]  ? __sched_text_start+0x8/0x8
[  564.126756]  ? __kthread_parkme+0xcb/0x100
[  564.127620]  ? kthread_blkcg+0x70/0x70
[  564.128412]  kthread+0x180/0x1d0
[  564.129105]  ? __issue_discard_cmd+0x480/0x480
[  564.130029]  ? kthread_associate_blkcg+0x150/0x150
[  564.131033]  ret_from_fork+0x35/0x40
[  564.131794] Modules linked in: snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hwdep snd_hda_core snd_pcm snd_timer snd mac_hid i2c_piix4 soundcore ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx raid1 raid0 multipath linear 8139too crct10dif_pclmul crc32_pclmul qxl drm_kms_helper syscopyarea aesni_intel sysfillrect sysimgblt fb_sys_fops ttm drm aes_x86_64 crypto_simd cryptd 8139cp glue_helper mii pata_acpi floppy
[  564.141798] ---[ end trace 4ce02f25ff7d3df5 ]---
[  564.142773] RIP: 0010:issue_discard_thread+0x248/0x5c0
[  564.143885] Code: ff ff 48 8b bd e8 fe ff ff 41 8b 9d 4c 04 00 00 e8 cd b8 ad ff 41 8b 85 50 04 00 00 31 d2 48 8d 04 80 48 8d 04 80 48 c1 e0 02 <48> f7 f3 83 f8 50 7e 16 41 c7 86 7c ff ff ff 01 00 00 00 41 c7 86
[  564.147776] RSP: 0018:ffff8801f3117dc0 EFLAGS: 00010206
[  564.148856] RAX: 0000000000000384 RBX: 0000000000000000 RCX: ffffffffb88c1e03
[  564.150424] RDX: 0000000000000000 RSI: dffffc0000000000 RDI: ffff8801e3aa4850
[  564.151906] RBP: ffff8801f3117f00 R08: 1ffffffff751a1d0 R09: fffffbfff751a1d0
[  564.153463] R10: 0000000000000001 R11: fffffbfff751a1d0 R12: 00000000fffffffc
[  564.154915] R13: ffff8801e3aa4400 R14: ffff8801f3117ed8 R15: ffff8801e2050000
[  564.156405] FS:  0000000000000000(0000) GS:ffff8801f6f00000(0000) knlGS:0000000000000000
[  564.158070] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  564.159279] CR2: 000000000202b078 CR3: 00000001f11ac000 CR4: 00000000000006e0
[  564.161043] ==================================================================
[  564.162587] BUG: KASAN: stack-out-of-bounds in from_kuid_munged+0x1d/0x50
[  564.163994] Read of size 4 at addr ffff8801f3117c84 by task f2fs_discard-7:/1298

[  564.165852] CPU: 1 PID: 1298 Comm: f2fs_discard-7: Tainted: G      D           4.18.0-rc1+ #4
[  564.167593] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
[  564.169522] Call Trace:
[  564.170057]  dump_stack+0x7b/0xb5
[  564.170778]  print_address_description+0x70/0x290
[  564.171765]  kasan_report+0x291/0x390
[  564.172540]  ? from_kuid_munged+0x1d/0x50
[  564.173408]  __asan_load4+0x78/0x80
[  564.174148]  from_kuid_munged+0x1d/0x50
[  564.174962]  do_notify_parent+0x1f5/0x4f0
[  564.175808]  ? send_sigqueue+0x390/0x390
[  564.176639]  ? css_set_move_task+0x152/0x340
[  564.184197]  do_exit+0x1290/0x1390
[  564.184950]  ? __issue_discard_cmd+0x480/0x480
[  564.185884]  ? mm_update_next_owner+0x380/0x380
[  564.186829]  ? __sched_text_start+0x8/0x8
[  564.187672]  ? __kthread_parkme+0xcb/0x100
[  564.188528]  ? kthread_blkcg+0x70/0x70
[  564.189333]  ? kthread+0x180/0x1d0
[  564.190052]  ? __issue_discard_cmd+0x480/0x480
[  564.190983]  rewind_stack_do_exit+0x17/0x20

[  564.192190] The buggy address belongs to the page:
[  564.193213] page:ffffea0007cc45c0 count:0 mapcount:0 mapping:0000000000000000 index:0x0
[  564.194856] flags: 0x2ffff0000000000()
[  564.195644] raw: 02ffff0000000000 0000000000000000 dead000000000200 0000000000000000
[  564.197247] raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
[  564.198826] page dumped because: kasan: bad access detected

[  564.200299] Memory state around the buggy address:
[  564.201306]  ffff8801f3117b80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  564.202779]  ffff8801f3117c00: 00 00 00 00 00 00 00 00 00 00 00 f3 f3 f3 f3 f3
[  564.204252] >ffff8801f3117c80: f3 f3 f3 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1
[  564.205742]                    ^
[  564.206424]  ffff8801f3117d00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  564.207908]  ffff8801f3117d80: f3 f3 f3 f3 f3 f3 f3 f3 00 00 00 00 00 00 00 00
[  564.209389] ==================================================================
[  564.231795] F2FS-fs (loop0): Mounted with checkpoint version = 2

- Location
https://elixir.bootlin.com/linux/v4.18-rc1/source/fs/f2fs/segment.h#L586
	return div_u64((u64)valid_user_blocks(sbi) * 100,
					sbi->user_block_count);
Missing checks on sbi->user_block_count.

Reported-by: Wen Xu <wen.xu@gatech.edu>
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 fs/f2fs/super.c | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
index 084ff9c82a37..9fafb1404f39 100644
--- a/fs/f2fs/super.c
+++ b/fs/f2fs/super.c
@@ -1956,6 +1956,8 @@ int sanity_check_ckpt(struct f2fs_sb_info *sbi)
 	unsigned int sit_segs, nat_segs;
 	unsigned int sit_bitmap_size, nat_bitmap_size;
 	unsigned int log_blocks_per_seg;
+	unsigned int segment_count_main;
+	block_t user_block_count;
 	int i;
 
 	total = le32_to_cpu(raw_super->segment_count);
@@ -1980,6 +1982,16 @@ int sanity_check_ckpt(struct f2fs_sb_info *sbi)
 		return 1;
 	}
 
+	user_block_count = le64_to_cpu(ckpt->user_block_count);
+	segment_count_main = le32_to_cpu(raw_super->segment_count_main);
+	log_blocks_per_seg = le32_to_cpu(raw_super->log_blocks_per_seg);
+	if (!user_block_count || user_block_count >=
+			segment_count_main << log_blocks_per_seg) {
+		f2fs_msg(sbi->sb, KERN_ERR,
+			"Wrong user_block_count: %u", user_block_count);
+		return 1;
+	}
+
 	main_segs = le32_to_cpu(raw_super->segment_count_main);
 	blocks_per_seg = sbi->blocks_per_seg;
 
@@ -1996,7 +2008,6 @@ int sanity_check_ckpt(struct f2fs_sb_info *sbi)
 
 	sit_bitmap_size = le32_to_cpu(ckpt->sit_ver_bitmap_bytesize);
 	nat_bitmap_size = le32_to_cpu(ckpt->nat_ver_bitmap_bytesize);
-	log_blocks_per_seg = le32_to_cpu(raw_super->log_blocks_per_seg);
 
 	if (sit_bitmap_size != ((sit_segs / 2) << log_blocks_per_seg) / 8 ||
 		nat_bitmap_size != ((nat_segs / 2) << log_blocks_per_seg) / 8) {
-- 
2.17.1




^ permalink raw reply related	[flat|nested] 166+ messages in thread

* [PATCH 4.14 053/146] f2fs: fix to do sanity check with node footer and iblocks
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (51 preceding siblings ...)
  2018-12-04 10:48 ` [PATCH 4.14 052/146] f2fs: fix to do sanity check with user_block_count Greg Kroah-Hartman
@ 2018-12-04 10:48 ` Greg Kroah-Hartman
  2018-12-04 10:49 ` [PATCH 4.14 054/146] f2fs: fix to do sanity check with block address in main area Greg Kroah-Hartman
                   ` (97 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:48 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Chao Yu, Jaegeuk Kim, Ben Hutchings,
	Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

commit e34438c903b653daca2b2a7de95aed46226f8ed3 upstream.

This patch adds to do sanity check with below fields of inode to
avoid reported panic.
- node footer
- iblocks

https://bugzilla.kernel.org/show_bug.cgi?id=200223

- Overview
BUG() triggered in f2fs_truncate_inode_blocks() when un-mounting a mounted f2fs image after writing to it

- Reproduce

- POC (poc.c)

static void activity(char *mpoint) {

  char *foo_bar_baz;
  int err;

  static int buf[8192];
  memset(buf, 0, sizeof(buf));

  err = asprintf(&foo_bar_baz, "%s/foo/bar/baz", mpoint);

  // open / write / read
  int fd = open(foo_bar_baz, O_RDWR | O_TRUNC, 0777);
  if (fd >= 0) {
    write(fd, (char *)buf, 517);
    write(fd, (char *)buf, sizeof(buf));
    close(fd);
  }

}

int main(int argc, char *argv[]) {
  activity(argv[1]);
  return 0;
}

- Kernel meesage
[  552.479723] F2FS-fs (loop0): Mounted with checkpoint version = 2
[  556.451891] ------------[ cut here ]------------
[  556.451899] kernel BUG at fs/f2fs/node.c:987!
[  556.452920] invalid opcode: 0000 [#1] SMP KASAN PTI
[  556.453936] CPU: 1 PID: 1310 Comm: umount Not tainted 4.18.0-rc1+ #4
[  556.455213] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
[  556.457140] RIP: 0010:f2fs_truncate_inode_blocks+0x4a7/0x6f0
[  556.458280] Code: e8 ae ea ff ff 41 89 c7 c1 e8 1f 84 c0 74 0a 41 83 ff fe 0f 85 35 ff ff ff 81 85 b0 fe ff ff fb 03 00 00 e9 f7 fd ff ff 0f 0b <0f> 0b e8 62 b7 9a 00 48 8b bd a0 fe ff ff e8 56 54 ae ff 48 8b b5
[  556.462015] RSP: 0018:ffff8801f292f808 EFLAGS: 00010286
[  556.463068] RAX: ffffed003e73242d RBX: ffff8801f292f958 RCX: ffffffffb88b81bc
[  556.464479] RDX: 0000000000000000 RSI: 0000000000000004 RDI: ffff8801f3992164
[  556.465901] RBP: ffff8801f292f980 R08: ffffed003e73242d R09: ffffed003e73242d
[  556.467311] R10: 0000000000000001 R11: ffffed003e73242c R12: 00000000fffffc64
[  556.468706] R13: ffff8801f3992000 R14: 0000000000000058 R15: 00000000ffff8801
[  556.470117] FS:  00007f8029297840(0000) GS:ffff8801f6f00000(0000) knlGS:0000000000000000
[  556.471702] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  556.472838] CR2: 000055f5f57305d8 CR3: 00000001f18b0000 CR4: 00000000000006e0
[  556.474265] Call Trace:
[  556.474782]  ? f2fs_alloc_nid_failed+0xf0/0xf0
[  556.475686]  ? truncate_nodes+0x980/0x980
[  556.476516]  ? pagecache_get_page+0x21f/0x2f0
[  556.477412]  ? __asan_loadN+0xf/0x20
[  556.478153]  ? __get_node_page+0x331/0x5b0
[  556.478992]  ? reweight_entity+0x1e6/0x3b0
[  556.479826]  f2fs_truncate_blocks+0x55e/0x740
[  556.480709]  ? f2fs_truncate_data_blocks+0x20/0x20
[  556.481689]  ? __radix_tree_lookup+0x34/0x160
[  556.482630]  ? radix_tree_lookup+0xd/0x10
[  556.483445]  f2fs_truncate+0xd4/0x1a0
[  556.484206]  f2fs_evict_inode+0x5ce/0x630
[  556.485032]  evict+0x16f/0x290
[  556.485664]  iput+0x280/0x300
[  556.486300]  dentry_unlink_inode+0x165/0x1e0
[  556.487169]  __dentry_kill+0x16a/0x260
[  556.487936]  dentry_kill+0x70/0x250
[  556.488651]  shrink_dentry_list+0x125/0x260
[  556.489504]  shrink_dcache_parent+0xc1/0x110
[  556.490379]  ? shrink_dcache_sb+0x200/0x200
[  556.491231]  ? bit_wait_timeout+0xc0/0xc0
[  556.492047]  do_one_tree+0x12/0x40
[  556.492743]  shrink_dcache_for_umount+0x3f/0xa0
[  556.493656]  generic_shutdown_super+0x43/0x1c0
[  556.494561]  kill_block_super+0x52/0x80
[  556.495341]  kill_f2fs_super+0x62/0x70
[  556.496105]  deactivate_locked_super+0x6f/0xa0
[  556.497004]  deactivate_super+0x5e/0x80
[  556.497785]  cleanup_mnt+0x61/0xa0
[  556.498492]  __cleanup_mnt+0x12/0x20
[  556.499218]  task_work_run+0xc8/0xf0
[  556.499949]  exit_to_usermode_loop+0x125/0x130
[  556.500846]  do_syscall_64+0x138/0x170
[  556.501609]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  556.502659] RIP: 0033:0x7f8028b77487
[  556.503384] Code: 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 31 f6 e9 09 00 00 00 66 0f 1f 84 00 00 00 00 00 b8 a6 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d e1 c9 2b 00 f7 d8 64 89 01 48
[  556.507137] RSP: 002b:00007fff9f2e3598 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
[  556.508637] RAX: 0000000000000000 RBX: 0000000000ebd030 RCX: 00007f8028b77487
[  556.510069] RDX: 0000000000000001 RSI: 0000000000000000 RDI: 0000000000ec41e0
[  556.511481] RBP: 0000000000ec41e0 R08: 0000000000000000 R09: 0000000000000014
[  556.512892] R10: 00000000000006b2 R11: 0000000000000246 R12: 00007f802908083c
[  556.514320] R13: 0000000000000000 R14: 0000000000ebd210 R15: 00007fff9f2e3820
[  556.515745] Modules linked in: snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hwdep snd_hda_core snd_pcm snd_timer snd mac_hid i2c_piix4 soundcore ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx raid1 raid0 multipath linear 8139too crct10dif_pclmul crc32_pclmul qxl drm_kms_helper syscopyarea aesni_intel sysfillrect sysimgblt fb_sys_fops ttm drm aes_x86_64 crypto_simd cryptd 8139cp glue_helper mii pata_acpi floppy
[  556.529276] ---[ end trace 4ce02f25ff7d3df5 ]---
[  556.530340] RIP: 0010:f2fs_truncate_inode_blocks+0x4a7/0x6f0
[  556.531513] Code: e8 ae ea ff ff 41 89 c7 c1 e8 1f 84 c0 74 0a 41 83 ff fe 0f 85 35 ff ff ff 81 85 b0 fe ff ff fb 03 00 00 e9 f7 fd ff ff 0f 0b <0f> 0b e8 62 b7 9a 00 48 8b bd a0 fe ff ff e8 56 54 ae ff 48 8b b5
[  556.535330] RSP: 0018:ffff8801f292f808 EFLAGS: 00010286
[  556.536395] RAX: ffffed003e73242d RBX: ffff8801f292f958 RCX: ffffffffb88b81bc
[  556.537824] RDX: 0000000000000000 RSI: 0000000000000004 RDI: ffff8801f3992164
[  556.539290] RBP: ffff8801f292f980 R08: ffffed003e73242d R09: ffffed003e73242d
[  556.540709] R10: 0000000000000001 R11: ffffed003e73242c R12: 00000000fffffc64
[  556.542131] R13: ffff8801f3992000 R14: 0000000000000058 R15: 00000000ffff8801
[  556.543579] FS:  00007f8029297840(0000) GS:ffff8801f6f00000(0000) knlGS:0000000000000000
[  556.545180] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  556.546338] CR2: 000055f5f57305d8 CR3: 00000001f18b0000 CR4: 00000000000006e0
[  556.547809] ==================================================================
[  556.549248] BUG: KASAN: stack-out-of-bounds in arch_tlb_gather_mmu+0x52/0x170
[  556.550672] Write of size 8 at addr ffff8801f292fd10 by task umount/1310

[  556.552338] CPU: 1 PID: 1310 Comm: umount Tainted: G      D           4.18.0-rc1+ #4
[  556.553886] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
[  556.555756] Call Trace:
[  556.556264]  dump_stack+0x7b/0xb5
[  556.556944]  print_address_description+0x70/0x290
[  556.557903]  kasan_report+0x291/0x390
[  556.558649]  ? arch_tlb_gather_mmu+0x52/0x170
[  556.559537]  __asan_store8+0x57/0x90
[  556.560268]  arch_tlb_gather_mmu+0x52/0x170
[  556.561110]  tlb_gather_mmu+0x12/0x40
[  556.561862]  exit_mmap+0x123/0x2a0
[  556.562555]  ? __ia32_sys_munmap+0x50/0x50
[  556.563384]  ? exit_aio+0x98/0x230
[  556.564079]  ? __x32_compat_sys_io_submit+0x260/0x260
[  556.565099]  ? taskstats_exit+0x1f4/0x640
[  556.565925]  ? kasan_check_read+0x11/0x20
[  556.566739]  ? mm_update_next_owner+0x322/0x380
[  556.567652]  mmput+0x8b/0x1d0
[  556.568260]  do_exit+0x43a/0x1390
[  556.568937]  ? mm_update_next_owner+0x380/0x380
[  556.569855]  ? deactivate_super+0x5e/0x80
[  556.570668]  ? cleanup_mnt+0x61/0xa0
[  556.571395]  ? __cleanup_mnt+0x12/0x20
[  556.572156]  ? task_work_run+0xc8/0xf0
[  556.572917]  ? exit_to_usermode_loop+0x125/0x130
[  556.573861]  rewind_stack_do_exit+0x17/0x20
[  556.574707] RIP: 0033:0x7f8028b77487
[  556.575428] Code: Bad RIP value.
[  556.576106] RSP: 002b:00007fff9f2e3598 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
[  556.577599] RAX: 0000000000000000 RBX: 0000000000ebd030 RCX: 00007f8028b77487
[  556.579020] RDX: 0000000000000001 RSI: 0000000000000000 RDI: 0000000000ec41e0
[  556.580422] RBP: 0000000000ec41e0 R08: 0000000000000000 R09: 0000000000000014
[  556.581833] R10: 00000000000006b2 R11: 0000000000000246 R12: 00007f802908083c
[  556.583252] R13: 0000000000000000 R14: 0000000000ebd210 R15: 00007fff9f2e3820

[  556.584983] The buggy address belongs to the page:
[  556.585961] page:ffffea0007ca4bc0 count:0 mapcount:0 mapping:0000000000000000 index:0x0
[  556.587540] flags: 0x2ffff0000000000()
[  556.588296] raw: 02ffff0000000000 0000000000000000 dead000000000200 0000000000000000
[  556.589822] raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
[  556.591359] page dumped because: kasan: bad access detected

[  556.592786] Memory state around the buggy address:
[  556.593753]  ffff8801f292fc00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  556.595191]  ffff8801f292fc80: 00 00 00 00 00 00 00 00 f1 f1 f1 f1 00 00 00 00
[  556.596613] >ffff8801f292fd00: 00 00 f3 00 00 00 00 f3 f3 00 00 00 00 f4 f4 f4
[  556.598044]                          ^
[  556.598797]  ffff8801f292fd80: f3 f3 f3 f3 00 00 00 00 00 00 00 00 00 00 00 00
[  556.600225]  ffff8801f292fe00: 00 00 00 00 00 00 00 00 f1 f1 f1 f1 00 f4 f4 f4
[  556.601647] ==================================================================

- Location
https://elixir.bootlin.com/linux/v4.18-rc1/source/fs/f2fs/node.c#L987
		case NODE_DIND_BLOCK:
			err = truncate_nodes(&dn, nofs, offset[1], 3);
			cont = 0;
			break;

		default:
			BUG(); <---
		}

Reported-by Wen Xu <wen.xu@gatech.edu>
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 fs/f2fs/inode.c | 25 +++++++++++++++++++++++--
 1 file changed, 23 insertions(+), 2 deletions(-)

diff --git a/fs/f2fs/inode.c b/fs/f2fs/inode.c
index 428d502f23e8..be7d9773d291 100644
--- a/fs/f2fs/inode.c
+++ b/fs/f2fs/inode.c
@@ -180,9 +180,30 @@ void f2fs_inode_chksum_set(struct f2fs_sb_info *sbi, struct page *page)
 	ri->i_inode_checksum = cpu_to_le32(f2fs_inode_chksum(sbi, page));
 }
 
-static bool sanity_check_inode(struct inode *inode)
+static bool sanity_check_inode(struct inode *inode, struct page *node_page)
 {
 	struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
+	unsigned long long iblocks;
+
+	iblocks = le64_to_cpu(F2FS_INODE(node_page)->i_blocks);
+	if (!iblocks) {
+		set_sbi_flag(sbi, SBI_NEED_FSCK);
+		f2fs_msg(sbi->sb, KERN_WARNING,
+			"%s: corrupted inode i_blocks i_ino=%lx iblocks=%llu, "
+			"run fsck to fix.",
+			__func__, inode->i_ino, iblocks);
+		return false;
+	}
+
+	if (ino_of_node(node_page) != nid_of_node(node_page)) {
+		set_sbi_flag(sbi, SBI_NEED_FSCK);
+		f2fs_msg(sbi->sb, KERN_WARNING,
+			"%s: corrupted inode footer i_ino=%lx, ino,nid: "
+			"[%u, %u] run fsck to fix.",
+			__func__, inode->i_ino,
+			ino_of_node(node_page), nid_of_node(node_page));
+		return false;
+	}
 
 	if (f2fs_has_extra_attr(inode) &&
 			!f2fs_sb_has_extra_attr(sbi->sb)) {
@@ -242,7 +263,7 @@ static int do_read_inode(struct inode *inode)
 
 	get_inline_info(inode, ri);
 
-	if (!sanity_check_inode(inode)) {
+	if (!sanity_check_inode(inode, node_page)) {
 		f2fs_put_page(node_page, 1);
 		return -EINVAL;
 	}
-- 
2.17.1




^ permalink raw reply related	[flat|nested] 166+ messages in thread

* [PATCH 4.14 054/146] f2fs: fix to do sanity check with block address in main area
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (52 preceding siblings ...)
  2018-12-04 10:48 ` [PATCH 4.14 053/146] f2fs: fix to do sanity check with node footer and iblocks Greg Kroah-Hartman
@ 2018-12-04 10:49 ` Greg Kroah-Hartman
  2018-12-04 20:27   ` Sudip Mukherjee
  2018-12-04 10:49 ` [PATCH 4.14 055/146] f2fs: fix to do sanity check with i_extra_isize Greg Kroah-Hartman
                   ` (96 subsequent siblings)
  150 siblings, 1 reply; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:49 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Chao Yu, Jaegeuk Kim, Ben Hutchings,
	Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

commit c9b60788fc760d136211853f10ce73dc152d1f4a upstream.

This patch add to do sanity check with below field:
- cp_pack_total_block_count
- blkaddr of data/node
- extent info

- Overview
BUG() in verify_block_addr() when writing to a corrupted f2fs image

- Reproduce (4.18 upstream kernel)

- POC (poc.c)

static void activity(char *mpoint) {

  char *foo_bar_baz;
  int err;

  static int buf[8192];
  memset(buf, 0, sizeof(buf));

  err = asprintf(&foo_bar_baz, "%s/foo/bar/baz", mpoint);

  int fd = open(foo_bar_baz, O_RDWR | O_TRUNC, 0777);
  if (fd >= 0) {
    write(fd, (char *)buf, sizeof(buf));
    fdatasync(fd);
    close(fd);
  }
}

int main(int argc, char *argv[]) {
  activity(argv[1]);
  return 0;
}

- Kernel message
[  689.349473] F2FS-fs (loop0): Mounted with checkpoint version = 3
[  699.728662] WARNING: CPU: 0 PID: 1309 at fs/f2fs/segment.c:2860 f2fs_inplace_write_data+0x232/0x240
[  699.728670] Modules linked in: snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hwdep snd_hda_core snd_pcm snd_timer snd mac_hid i2c_piix4 soundcore ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx raid1 raid0 multipath linear 8139too crct10dif_pclmul crc32_pclmul qxl drm_kms_helper syscopyarea aesni_intel sysfillrect sysimgblt fb_sys_fops ttm drm aes_x86_64 crypto_simd cryptd 8139cp glue_helper mii pata_acpi floppy
[  699.729056] CPU: 0 PID: 1309 Comm: a.out Not tainted 4.18.0-rc1+ #4
[  699.729064] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
[  699.729074] RIP: 0010:f2fs_inplace_write_data+0x232/0x240
[  699.729076] Code: ff e9 cf fe ff ff 49 8d 7d 10 e8 39 45 ad ff 4d 8b 7d 10 be 04 00 00 00 49 8d 7f 48 e8 07 49 ad ff 45 8b 7f 48 e9 fb fe ff ff <0f> 0b f0 41 80 4d 48 04 e9 65 fe ff ff 90 66 66 66 66 90 55 48 8d
[  699.729130] RSP: 0018:ffff8801f43af568 EFLAGS: 00010202
[  699.729139] RAX: 000000000000003f RBX: ffff8801f43af7b8 RCX: ffffffffb88c9113
[  699.729142] RDX: 0000000000000003 RSI: dffffc0000000000 RDI: ffff8802024e5540
[  699.729144] RBP: ffff8801f43af590 R08: 0000000000000009 R09: ffffffffffffffe8
[  699.729147] R10: 0000000000000001 R11: ffffed0039b0596a R12: ffff8802024e5540
[  699.729149] R13: ffff8801f0335500 R14: ffff8801e3e7a700 R15: ffff8801e1ee4450
[  699.729154] FS:  00007f9bf97f5700(0000) GS:ffff8801f6e00000(0000) knlGS:0000000000000000
[  699.729156] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  699.729159] CR2: 00007f9bf925d170 CR3: 00000001f0c34000 CR4: 00000000000006f0
[  699.729171] Call Trace:
[  699.729192]  f2fs_do_write_data_page+0x2e2/0xe00
[  699.729203]  ? f2fs_should_update_outplace+0xd0/0xd0
[  699.729238]  ? memcg_drain_all_list_lrus+0x280/0x280
[  699.729269]  ? __radix_tree_replace+0xa3/0x120
[  699.729276]  __write_data_page+0x5c7/0xe30
[  699.729291]  ? kasan_check_read+0x11/0x20
[  699.729310]  ? page_mapped+0x8a/0x110
[  699.729321]  ? page_mkclean+0xe9/0x160
[  699.729327]  ? f2fs_do_write_data_page+0xe00/0xe00
[  699.729331]  ? invalid_page_referenced_vma+0x130/0x130
[  699.729345]  ? clear_page_dirty_for_io+0x332/0x450
[  699.729351]  f2fs_write_cache_pages+0x4ca/0x860
[  699.729358]  ? __write_data_page+0xe30/0xe30
[  699.729374]  ? percpu_counter_add_batch+0x22/0xa0
[  699.729380]  ? kasan_check_write+0x14/0x20
[  699.729391]  ? _raw_spin_lock+0x17/0x40
[  699.729403]  ? f2fs_mark_inode_dirty_sync.part.18+0x16/0x30
[  699.729413]  ? iov_iter_advance+0x113/0x640
[  699.729418]  ? f2fs_write_end+0x133/0x2e0
[  699.729423]  ? balance_dirty_pages_ratelimited+0x239/0x640
[  699.729428]  f2fs_write_data_pages+0x329/0x520
[  699.729433]  ? generic_perform_write+0x250/0x320
[  699.729438]  ? f2fs_write_cache_pages+0x860/0x860
[  699.729454]  ? current_time+0x110/0x110
[  699.729459]  ? f2fs_preallocate_blocks+0x1ef/0x370
[  699.729464]  do_writepages+0x37/0xb0
[  699.729468]  ? f2fs_write_cache_pages+0x860/0x860
[  699.729472]  ? do_writepages+0x37/0xb0
[  699.729478]  __filemap_fdatawrite_range+0x19a/0x1f0
[  699.729483]  ? delete_from_page_cache_batch+0x4e0/0x4e0
[  699.729496]  ? __vfs_write+0x2b2/0x410
[  699.729501]  file_write_and_wait_range+0x66/0xb0
[  699.729506]  f2fs_do_sync_file+0x1f9/0xd90
[  699.729511]  ? truncate_partial_data_page+0x290/0x290
[  699.729521]  ? __sb_end_write+0x30/0x50
[  699.729526]  ? vfs_write+0x20f/0x260
[  699.729530]  f2fs_sync_file+0x9a/0xb0
[  699.729534]  ? f2fs_do_sync_file+0xd90/0xd90
[  699.729548]  vfs_fsync_range+0x68/0x100
[  699.729554]  ? __fget_light+0xc9/0xe0
[  699.729558]  do_fsync+0x3d/0x70
[  699.729562]  __x64_sys_fdatasync+0x24/0x30
[  699.729585]  do_syscall_64+0x78/0x170
[  699.729595]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  699.729613] RIP: 0033:0x7f9bf930d800
[  699.729615] Code: 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 83 3d 49 bf 2c 00 00 75 10 b8 4b 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8 be 78 01 00 48 89 04 24
[  699.729668] RSP: 002b:00007ffee3606c68 EFLAGS: 00000246 ORIG_RAX: 000000000000004b
[  699.729673] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f9bf930d800
[  699.729675] RDX: 0000000000008000 RSI: 00000000006010a0 RDI: 0000000000000003
[  699.729678] RBP: 00007ffee3606ca0 R08: 0000000001503010 R09: 0000000000000000
[  699.729680] R10: 00000000000002e8 R11: 0000000000000246 R12: 0000000000400610
[  699.729683] R13: 00007ffee3606da0 R14: 0000000000000000 R15: 0000000000000000
[  699.729687] ---[ end trace 4ce02f25ff7d3df5 ]---
[  699.729782] ------------[ cut here ]------------
[  699.729785] kernel BUG at fs/f2fs/segment.h:654!
[  699.731055] invalid opcode: 0000 [#1] SMP KASAN PTI
[  699.732104] CPU: 0 PID: 1309 Comm: a.out Tainted: G        W         4.18.0-rc1+ #4
[  699.733684] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
[  699.735611] RIP: 0010:f2fs_submit_page_bio+0x29b/0x730
[  699.736649] Code: 54 49 8d bd 18 04 00 00 e8 b2 59 af ff 41 8b 8d 18 04 00 00 8b 45 b8 41 d3 e6 44 01 f0 4c 8d 73 14 41 39 c7 0f 82 37 fe ff ff <0f> 0b 65 8b 05 2c 04 77 47 89 c0 48 0f a3 05 52 c1 d5 01 0f 92 c0
[  699.740524] RSP: 0018:ffff8801f43af508 EFLAGS: 00010283
[  699.741573] RAX: 0000000000000000 RBX: ffff8801f43af7b8 RCX: ffffffffb88a7cef
[  699.743006] RDX: 0000000000000007 RSI: dffffc0000000000 RDI: ffff8801e3e7a64c
[  699.744426] RBP: ffff8801f43af558 R08: ffffed003e066b55 R09: ffffed003e066b55
[  699.745833] R10: 0000000000000001 R11: ffffed003e066b54 R12: ffffea0007876940
[  699.747256] R13: ffff8801f0335500 R14: ffff8801e3e7a600 R15: 0000000000000001
[  699.748683] FS:  00007f9bf97f5700(0000) GS:ffff8801f6e00000(0000) knlGS:0000000000000000
[  699.750293] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  699.751462] CR2: 00007f9bf925d170 CR3: 00000001f0c34000 CR4: 00000000000006f0
[  699.752874] Call Trace:
[  699.753386]  ? f2fs_inplace_write_data+0x93/0x240
[  699.754341]  f2fs_inplace_write_data+0xd2/0x240
[  699.755271]  f2fs_do_write_data_page+0x2e2/0xe00
[  699.756214]  ? f2fs_should_update_outplace+0xd0/0xd0
[  699.757215]  ? memcg_drain_all_list_lrus+0x280/0x280
[  699.758209]  ? __radix_tree_replace+0xa3/0x120
[  699.759164]  __write_data_page+0x5c7/0xe30
[  699.760002]  ? kasan_check_read+0x11/0x20
[  699.760823]  ? page_mapped+0x8a/0x110
[  699.761573]  ? page_mkclean+0xe9/0x160
[  699.762345]  ? f2fs_do_write_data_page+0xe00/0xe00
[  699.763332]  ? invalid_page_referenced_vma+0x130/0x130
[  699.764374]  ? clear_page_dirty_for_io+0x332/0x450
[  699.765347]  f2fs_write_cache_pages+0x4ca/0x860
[  699.766276]  ? __write_data_page+0xe30/0xe30
[  699.767161]  ? percpu_counter_add_batch+0x22/0xa0
[  699.768112]  ? kasan_check_write+0x14/0x20
[  699.768951]  ? _raw_spin_lock+0x17/0x40
[  699.769739]  ? f2fs_mark_inode_dirty_sync.part.18+0x16/0x30
[  699.770885]  ? iov_iter_advance+0x113/0x640
[  699.771743]  ? f2fs_write_end+0x133/0x2e0
[  699.772569]  ? balance_dirty_pages_ratelimited+0x239/0x640
[  699.773680]  f2fs_write_data_pages+0x329/0x520
[  699.774603]  ? generic_perform_write+0x250/0x320
[  699.775544]  ? f2fs_write_cache_pages+0x860/0x860
[  699.776510]  ? current_time+0x110/0x110
[  699.777299]  ? f2fs_preallocate_blocks+0x1ef/0x370
[  699.778279]  do_writepages+0x37/0xb0
[  699.779026]  ? f2fs_write_cache_pages+0x860/0x860
[  699.779978]  ? do_writepages+0x37/0xb0
[  699.780755]  __filemap_fdatawrite_range+0x19a/0x1f0
[  699.781746]  ? delete_from_page_cache_batch+0x4e0/0x4e0
[  699.782820]  ? __vfs_write+0x2b2/0x410
[  699.783597]  file_write_and_wait_range+0x66/0xb0
[  699.784540]  f2fs_do_sync_file+0x1f9/0xd90
[  699.785381]  ? truncate_partial_data_page+0x290/0x290
[  699.786415]  ? __sb_end_write+0x30/0x50
[  699.787204]  ? vfs_write+0x20f/0x260
[  699.787941]  f2fs_sync_file+0x9a/0xb0
[  699.788694]  ? f2fs_do_sync_file+0xd90/0xd90
[  699.789572]  vfs_fsync_range+0x68/0x100
[  699.790360]  ? __fget_light+0xc9/0xe0
[  699.791128]  do_fsync+0x3d/0x70
[  699.791779]  __x64_sys_fdatasync+0x24/0x30
[  699.792614]  do_syscall_64+0x78/0x170
[  699.793371]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  699.794406] RIP: 0033:0x7f9bf930d800
[  699.795134] Code: 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 83 3d 49 bf 2c 00 00 75 10 b8 4b 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8 be 78 01 00 48 89 04 24
[  699.798960] RSP: 002b:00007ffee3606c68 EFLAGS: 00000246 ORIG_RAX: 000000000000004b
[  699.800483] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f9bf930d800
[  699.801923] RDX: 0000000000008000 RSI: 00000000006010a0 RDI: 0000000000000003
[  699.803373] RBP: 00007ffee3606ca0 R08: 0000000001503010 R09: 0000000000000000
[  699.804798] R10: 00000000000002e8 R11: 0000000000000246 R12: 0000000000400610
[  699.806233] R13: 00007ffee3606da0 R14: 0000000000000000 R15: 0000000000000000
[  699.807667] Modules linked in: snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hwdep snd_hda_core snd_pcm snd_timer snd mac_hid i2c_piix4 soundcore ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx raid1 raid0 multipath linear 8139too crct10dif_pclmul crc32_pclmul qxl drm_kms_helper syscopyarea aesni_intel sysfillrect sysimgblt fb_sys_fops ttm drm aes_x86_64 crypto_simd cryptd 8139cp glue_helper mii pata_acpi floppy
[  699.817079] ---[ end trace 4ce02f25ff7d3df6 ]---
[  699.818068] RIP: 0010:f2fs_submit_page_bio+0x29b/0x730
[  699.819114] Code: 54 49 8d bd 18 04 00 00 e8 b2 59 af ff 41 8b 8d 18 04 00 00 8b 45 b8 41 d3 e6 44 01 f0 4c 8d 73 14 41 39 c7 0f 82 37 fe ff ff <0f> 0b 65 8b 05 2c 04 77 47 89 c0 48 0f a3 05 52 c1 d5 01 0f 92 c0
[  699.822919] RSP: 0018:ffff8801f43af508 EFLAGS: 00010283
[  699.823977] RAX: 0000000000000000 RBX: ffff8801f43af7b8 RCX: ffffffffb88a7cef
[  699.825436] RDX: 0000000000000007 RSI: dffffc0000000000 RDI: ffff8801e3e7a64c
[  699.826881] RBP: ffff8801f43af558 R08: ffffed003e066b55 R09: ffffed003e066b55
[  699.828292] R10: 0000000000000001 R11: ffffed003e066b54 R12: ffffea0007876940
[  699.829750] R13: ffff8801f0335500 R14: ffff8801e3e7a600 R15: 0000000000000001
[  699.831192] FS:  00007f9bf97f5700(0000) GS:ffff8801f6e00000(0000) knlGS:0000000000000000
[  699.832793] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  699.833981] CR2: 00007f9bf925d170 CR3: 00000001f0c34000 CR4: 00000000000006f0
[  699.835556] ==================================================================
[  699.837029] BUG: KASAN: stack-out-of-bounds in update_stack_state+0x38c/0x3e0
[  699.838462] Read of size 8 at addr ffff8801f43af970 by task a.out/1309

[  699.840086] CPU: 0 PID: 1309 Comm: a.out Tainted: G      D W         4.18.0-rc1+ #4
[  699.841603] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
[  699.843475] Call Trace:
[  699.843982]  dump_stack+0x7b/0xb5
[  699.844661]  print_address_description+0x70/0x290
[  699.845607]  kasan_report+0x291/0x390
[  699.846351]  ? update_stack_state+0x38c/0x3e0
[  699.853831]  __asan_load8+0x54/0x90
[  699.854569]  update_stack_state+0x38c/0x3e0
[  699.855428]  ? __read_once_size_nocheck.constprop.7+0x20/0x20
[  699.856601]  ? __save_stack_trace+0x5e/0x100
[  699.857476]  unwind_next_frame.part.5+0x18e/0x490
[  699.858448]  ? unwind_dump+0x290/0x290
[  699.859217]  ? clear_page_dirty_for_io+0x332/0x450
[  699.860185]  __unwind_start+0x106/0x190
[  699.860974]  __save_stack_trace+0x5e/0x100
[  699.861808]  ? __save_stack_trace+0x5e/0x100
[  699.862691]  ? unlink_anon_vmas+0xba/0x2c0
[  699.863525]  save_stack_trace+0x1f/0x30
[  699.864312]  save_stack+0x46/0xd0
[  699.864993]  ? __alloc_pages_slowpath+0x1420/0x1420
[  699.865990]  ? flush_tlb_mm_range+0x15e/0x220
[  699.866889]  ? kasan_check_write+0x14/0x20
[  699.867724]  ? __dec_node_state+0x92/0xb0
[  699.868543]  ? lock_page_memcg+0x85/0xf0
[  699.869350]  ? unlock_page_memcg+0x16/0x80
[  699.870185]  ? page_remove_rmap+0x198/0x520
[  699.871048]  ? mark_page_accessed+0x133/0x200
[  699.871930]  ? _cond_resched+0x1a/0x50
[  699.872700]  ? unmap_page_range+0xcd4/0xe50
[  699.873551]  ? rb_next+0x58/0x80
[  699.874217]  ? rb_next+0x58/0x80
[  699.874895]  __kasan_slab_free+0x13c/0x1a0
[  699.875734]  ? unlink_anon_vmas+0xba/0x2c0
[  699.876563]  kasan_slab_free+0xe/0x10
[  699.877315]  kmem_cache_free+0x89/0x1e0
[  699.878095]  unlink_anon_vmas+0xba/0x2c0
[  699.878913]  free_pgtables+0x101/0x1b0
[  699.879677]  exit_mmap+0x146/0x2a0
[  699.880378]  ? __ia32_sys_munmap+0x50/0x50
[  699.881214]  ? kasan_check_read+0x11/0x20
[  699.882052]  ? mm_update_next_owner+0x322/0x380
[  699.882985]  mmput+0x8b/0x1d0
[  699.883602]  do_exit+0x43a/0x1390
[  699.884288]  ? mm_update_next_owner+0x380/0x380
[  699.885212]  ? f2fs_sync_file+0x9a/0xb0
[  699.885995]  ? f2fs_do_sync_file+0xd90/0xd90
[  699.886877]  ? vfs_fsync_range+0x68/0x100
[  699.887694]  ? __fget_light+0xc9/0xe0
[  699.888442]  ? do_fsync+0x3d/0x70
[  699.889118]  ? __x64_sys_fdatasync+0x24/0x30
[  699.889996]  rewind_stack_do_exit+0x17/0x20
[  699.890860] RIP: 0033:0x7f9bf930d800
[  699.891585] Code: Bad RIP value.
[  699.892268] RSP: 002b:00007ffee3606c68 EFLAGS: 00000246 ORIG_RAX: 000000000000004b
[  699.893781] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f9bf930d800
[  699.895220] RDX: 0000000000008000 RSI: 00000000006010a0 RDI: 0000000000000003
[  699.896643] RBP: 00007ffee3606ca0 R08: 0000000001503010 R09: 0000000000000000
[  699.898069] R10: 00000000000002e8 R11: 0000000000000246 R12: 0000000000400610
[  699.899505] R13: 00007ffee3606da0 R14: 0000000000000000 R15: 0000000000000000

[  699.901241] The buggy address belongs to the page:
[  699.902215] page:ffffea0007d0ebc0 count:0 mapcount:0 mapping:0000000000000000 index:0x0
[  699.903811] flags: 0x2ffff0000000000()
[  699.904585] raw: 02ffff0000000000 0000000000000000 ffffffff07d00101 0000000000000000
[  699.906125] raw: 0000000000000000 0000000000240000 00000000ffffffff 0000000000000000
[  699.907673] page dumped because: kasan: bad access detected

[  699.909108] Memory state around the buggy address:
[  699.910077]  ffff8801f43af800: 00 f1 f1 f1 f1 00 f4 f4 f4 f3 f3 f3 f3 00 00 00
[  699.911528]  ffff8801f43af880: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  699.912953] >ffff8801f43af900: 00 00 00 00 00 00 00 00 f1 01 f4 f4 f4 f2 f2 f2
[  699.914392]                                                              ^
[  699.915758]  ffff8801f43af980: f2 00 f4 f4 00 00 00 00 f2 00 00 00 00 00 00 00
[  699.917193]  ffff8801f43afa00: 00 00 00 00 00 00 00 00 00 f3 f3 f3 00 00 00 00
[  699.918634] ==================================================================

- Location
https://elixir.bootlin.com/linux/v4.18-rc1/source/fs/f2fs/segment.h#L644

Reported-by Wen Xu <wen.xu@gatech.edu>
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
[bwh: Backported to 4.14:
 - Error label is different in validate_checkpoint() due to the earlier
   backport of "f2fs: fix invalid memory access"
 - Adjust context]
Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 fs/f2fs/checkpoint.c | 22 +++++++++++++++++++---
 fs/f2fs/data.c       | 33 +++++++++++++++++++++++++++------
 fs/f2fs/f2fs.h       |  3 +++
 fs/f2fs/file.c       | 12 ++++++++++++
 fs/f2fs/inode.c      | 17 +++++++++++++++++
 fs/f2fs/node.c       |  4 ++++
 fs/f2fs/segment.h    |  3 +--
 7 files changed, 83 insertions(+), 11 deletions(-)

diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c
index d7bd9745e883..c81cd5057b8e 100644
--- a/fs/f2fs/checkpoint.c
+++ b/fs/f2fs/checkpoint.c
@@ -86,8 +86,10 @@ repeat:
 	fio.page = page;
 
 	if (f2fs_submit_page_bio(&fio)) {
-		f2fs_put_page(page, 1);
-		goto repeat;
+		memset(page_address(page), 0, PAGE_SIZE);
+		f2fs_stop_checkpoint(sbi, false);
+		f2fs_bug_on(sbi, 1);
+		return page;
 	}
 
 	lock_page(page);
@@ -141,8 +143,14 @@ bool f2fs_is_valid_blkaddr(struct f2fs_sb_info *sbi,
 	case META_POR:
 	case DATA_GENERIC:
 		if (unlikely(blkaddr >= MAX_BLKADDR(sbi) ||
-			blkaddr < MAIN_BLKADDR(sbi)))
+			blkaddr < MAIN_BLKADDR(sbi))) {
+			if (type == DATA_GENERIC) {
+				f2fs_msg(sbi->sb, KERN_WARNING,
+					"access invalid blkaddr:%u", blkaddr);
+				WARN_ON(1);
+			}
 			return false;
+		}
 		break;
 	case META_GENERIC:
 		if (unlikely(blkaddr < SEG0_BLKADDR(sbi) ||
@@ -746,6 +754,14 @@ static struct page *validate_checkpoint(struct f2fs_sb_info *sbi,
 					&cp_page_1, version);
 	if (err)
 		return NULL;
+
+	if (le32_to_cpu(cp_block->cp_pack_total_block_count) >
+					sbi->blocks_per_seg) {
+		f2fs_msg(sbi->sb, KERN_WARNING,
+			"invalid cp_pack_total_block_count:%u",
+			le32_to_cpu(cp_block->cp_pack_total_block_count));
+		goto invalid_cp;
+	}
 	pre_version = *version;
 
 	cp_addr += le32_to_cpu(cp_block->cp_pack_total_block_count) - 1;
diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index 615878806611..8f6e7c3a10f8 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -369,7 +369,10 @@ int f2fs_submit_page_bio(struct f2fs_io_info *fio)
 	struct page *page = fio->encrypted_page ?
 			fio->encrypted_page : fio->page;
 
-	verify_block_addr(fio, fio->new_blkaddr);
+	if (!f2fs_is_valid_blkaddr(fio->sbi, fio->new_blkaddr,
+			__is_meta_io(fio) ? META_GENERIC : DATA_GENERIC))
+		return -EFAULT;
+
 	trace_f2fs_submit_page_bio(page, fio);
 	f2fs_trace_ios(fio, 0);
 
@@ -946,6 +949,12 @@ next_dnode:
 next_block:
 	blkaddr = datablock_addr(dn.inode, dn.node_page, dn.ofs_in_node);
 
+	if (__is_valid_data_blkaddr(blkaddr) &&
+		!f2fs_is_valid_blkaddr(sbi, blkaddr, DATA_GENERIC)) {
+		err = -EFAULT;
+		goto sync_out;
+	}
+
 	if (!is_valid_data_blkaddr(sbi, blkaddr)) {
 		if (create) {
 			if (unlikely(f2fs_cp_error(sbi))) {
@@ -1264,6 +1273,10 @@ got_it:
 				SetPageUptodate(page);
 				goto confused;
 			}
+
+			if (!f2fs_is_valid_blkaddr(F2FS_I_SB(inode), block_nr,
+								DATA_GENERIC))
+				goto set_error_page;
 		} else {
 			zero_user_segment(page, 0, PAGE_SIZE);
 			if (!PageUptodate(page))
@@ -1402,11 +1415,13 @@ int do_write_data_page(struct f2fs_io_info *fio)
 			f2fs_lookup_extent_cache(inode, page->index, &ei)) {
 		fio->old_blkaddr = ei.blk + page->index - ei.fofs;
 
-		if (is_valid_data_blkaddr(fio->sbi, fio->old_blkaddr)) {
-			ipu_force = true;
-			fio->need_lock = LOCK_DONE;
-			goto got_it;
-		}
+		if (!f2fs_is_valid_blkaddr(fio->sbi, fio->old_blkaddr,
+							DATA_GENERIC))
+			return -EFAULT;
+
+		ipu_force = true;
+		fio->need_lock = LOCK_DONE;
+		goto got_it;
 	}
 
 	/* Deadlock due to between page->lock and f2fs_lock_op */
@@ -1425,6 +1440,12 @@ int do_write_data_page(struct f2fs_io_info *fio)
 		goto out_writepage;
 	}
 got_it:
+	if (__is_valid_data_blkaddr(fio->old_blkaddr) &&
+		!f2fs_is_valid_blkaddr(fio->sbi, fio->old_blkaddr,
+							DATA_GENERIC)) {
+		err = -EFAULT;
+		goto out_writepage;
+	}
 	/*
 	 * If current allocation needs SSR,
 	 * it had better in-place writes for updated data.
diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index d15d79457f5c..3f1a44696036 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -2357,6 +2357,9 @@ static inline void f2fs_update_iostat(struct f2fs_sb_info *sbi,
 	spin_unlock(&sbi->iostat_lock);
 }
 
+#define __is_meta_io(fio) (PAGE_TYPE_OF_BIO(fio->type) == META &&	\
+				(!is_read_io(fio->op) || fio->is_meta))
+
 bool f2fs_is_valid_blkaddr(struct f2fs_sb_info *sbi,
 					block_t blkaddr, int type);
 void f2fs_msg(struct super_block *sb, const char *level, const char *fmt, ...);
diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
index e9c575ef70b5..7d3189f1941c 100644
--- a/fs/f2fs/file.c
+++ b/fs/f2fs/file.c
@@ -397,6 +397,13 @@ static loff_t f2fs_seek_block(struct file *file, loff_t offset, int whence)
 			blkaddr = datablock_addr(dn.inode,
 					dn.node_page, dn.ofs_in_node);
 
+			if (__is_valid_data_blkaddr(blkaddr) &&
+				!f2fs_is_valid_blkaddr(F2FS_I_SB(inode),
+						blkaddr, DATA_GENERIC)) {
+				f2fs_put_dnode(&dn);
+				goto fail;
+			}
+
 			if (__found_offset(F2FS_I_SB(inode), blkaddr, dirty,
 							pgofs, whence)) {
 				f2fs_put_dnode(&dn);
@@ -496,6 +503,11 @@ int truncate_data_blocks_range(struct dnode_of_data *dn, int count)
 
 		dn->data_blkaddr = NULL_ADDR;
 		set_data_blkaddr(dn);
+
+		if (__is_valid_data_blkaddr(blkaddr) &&
+			!f2fs_is_valid_blkaddr(sbi, blkaddr, DATA_GENERIC))
+			continue;
+
 		invalidate_blocks(sbi, blkaddr);
 		if (dn->ofs_in_node == 0 && IS_INODE(dn->node_page))
 			clear_inode_flag(dn->inode, FI_FIRST_BLOCK_WRITTEN);
diff --git a/fs/f2fs/inode.c b/fs/f2fs/inode.c
index be7d9773d291..aeed9943836a 100644
--- a/fs/f2fs/inode.c
+++ b/fs/f2fs/inode.c
@@ -214,6 +214,23 @@ static bool sanity_check_inode(struct inode *inode, struct page *node_page)
 			__func__, inode->i_ino);
 		return false;
 	}
+
+	if (F2FS_I(inode)->extent_tree) {
+		struct extent_info *ei = &F2FS_I(inode)->extent_tree->largest;
+
+		if (ei->len &&
+			(!f2fs_is_valid_blkaddr(sbi, ei->blk, DATA_GENERIC) ||
+			!f2fs_is_valid_blkaddr(sbi, ei->blk + ei->len - 1,
+							DATA_GENERIC))) {
+			set_sbi_flag(sbi, SBI_NEED_FSCK);
+			f2fs_msg(sbi->sb, KERN_WARNING,
+				"%s: inode (ino=%lx) extent info [%u, %u, %u] "
+				"is incorrect, run fsck to fix",
+				__func__, inode->i_ino,
+				ei->blk, ei->fofs, ei->len);
+			return false;
+		}
+	}
 	return true;
 }
 
diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
index 999814c8cbea..6adb6c60f017 100644
--- a/fs/f2fs/node.c
+++ b/fs/f2fs/node.c
@@ -1398,6 +1398,10 @@ static int __write_node_page(struct page *page, bool atomic, bool *submitted,
 		return 0;
 	}
 
+	if (__is_valid_data_blkaddr(ni.blk_addr) &&
+		!f2fs_is_valid_blkaddr(sbi, ni.blk_addr, DATA_GENERIC))
+		goto redirty_out;
+
 	if (atomic && !test_opt(sbi, NOBARRIER))
 		fio.op_flags |= REQ_PREFLUSH | REQ_FUA;
 
diff --git a/fs/f2fs/segment.h b/fs/f2fs/segment.h
index dd8f977fc273..47348d98165b 100644
--- a/fs/f2fs/segment.h
+++ b/fs/f2fs/segment.h
@@ -629,8 +629,7 @@ static inline void verify_block_addr(struct f2fs_io_info *fio, block_t blk_addr)
 {
 	struct f2fs_sb_info *sbi = fio->sbi;
 
-	if (PAGE_TYPE_OF_BIO(fio->type) == META &&
-				(!is_read_io(fio->op) || fio->is_meta))
+	if (__is_meta_io(fio))
 		verify_blkaddr(sbi, blk_addr, META_GENERIC);
 	else
 		verify_blkaddr(sbi, blk_addr, DATA_GENERIC);
-- 
2.17.1




^ permalink raw reply related	[flat|nested] 166+ messages in thread

* [PATCH 4.14 055/146] f2fs: fix to do sanity check with i_extra_isize
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (53 preceding siblings ...)
  2018-12-04 10:49 ` [PATCH 4.14 054/146] f2fs: fix to do sanity check with block address in main area Greg Kroah-Hartman
@ 2018-12-04 10:49 ` Greg Kroah-Hartman
  2018-12-04 10:49 ` [PATCH 4.14 056/146] f2fs: fix to do sanity check with cp_pack_start_sum Greg Kroah-Hartman
                   ` (95 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:49 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Chao Yu, Jaegeuk Kim, Ben Hutchings,
	Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

commit 18dd6470c2d14d10f5a2dd926925dc80dbd3abfd upstream.

If inode.i_extra_isize was fuzzed to an abnormal value, when
calculating inline data size, the result will overflow, result
in accessing invalid memory area when operating inline data.

Let's do sanity check with i_extra_isize during inode loading
for fixing.

https://bugzilla.kernel.org/show_bug.cgi?id=200421

- Reproduce

- POC (poc.c)
    #define _GNU_SOURCE
    #include <sys/types.h>
    #include <sys/mount.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <sys/xattr.h>

    #include <dirent.h>
    #include <errno.h>
    #include <error.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>

    #include <linux/falloc.h>
    #include <linux/loop.h>

    static void activity(char *mpoint) {

      char *foo_bar_baz;
      char *foo_baz;
      char *xattr;
      int err;

      err = asprintf(&foo_bar_baz, "%s/foo/bar/baz", mpoint);
      err = asprintf(&foo_baz, "%s/foo/baz", mpoint);
      err = asprintf(&xattr, "%s/foo/bar/xattr", mpoint);

      rename(foo_bar_baz, foo_baz);

      char buf2[113];
      memset(buf2, 0, sizeof(buf2));
      listxattr(xattr, buf2, sizeof(buf2));
      removexattr(xattr, "user.mime_type");

    }

    int main(int argc, char *argv[]) {
      activity(argv[1]);
      return 0;
    }

- Kernel message
Umount the image will leave the following message
[ 2910.995489] F2FS-fs (loop0): Mounted with checkpoint version = 2
[ 2918.416465] ==================================================================
[ 2918.416807] BUG: KASAN: slab-out-of-bounds in f2fs_iget+0xcb9/0x1a80
[ 2918.417009] Read of size 4 at addr ffff88018efc2068 by task a.out/1229

[ 2918.417311] CPU: 1 PID: 1229 Comm: a.out Not tainted 4.17.0+ #1
[ 2918.417314] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
[ 2918.417323] Call Trace:
[ 2918.417366]  dump_stack+0x71/0xab
[ 2918.417401]  print_address_description+0x6b/0x290
[ 2918.417407]  kasan_report+0x28e/0x390
[ 2918.417411]  ? f2fs_iget+0xcb9/0x1a80
[ 2918.417415]  f2fs_iget+0xcb9/0x1a80
[ 2918.417422]  ? f2fs_lookup+0x2e7/0x580
[ 2918.417425]  f2fs_lookup+0x2e7/0x580
[ 2918.417433]  ? __recover_dot_dentries+0x400/0x400
[ 2918.417447]  ? legitimize_path.isra.29+0x5a/0xa0
[ 2918.417453]  __lookup_slow+0x11c/0x220
[ 2918.417457]  ? may_delete+0x2a0/0x2a0
[ 2918.417475]  ? deref_stack_reg+0xe0/0xe0
[ 2918.417479]  ? __lookup_hash+0xb0/0xb0
[ 2918.417483]  lookup_slow+0x3e/0x60
[ 2918.417488]  walk_component+0x3ac/0x990
[ 2918.417492]  ? generic_permission+0x51/0x1e0
[ 2918.417495]  ? inode_permission+0x51/0x1d0
[ 2918.417499]  ? pick_link+0x3e0/0x3e0
[ 2918.417502]  ? link_path_walk+0x4b1/0x770
[ 2918.417513]  ? _raw_spin_lock_irqsave+0x25/0x50
[ 2918.417518]  ? walk_component+0x990/0x990
[ 2918.417522]  ? path_init+0x2e6/0x580
[ 2918.417526]  path_lookupat+0x13f/0x430
[ 2918.417531]  ? trailing_symlink+0x3a0/0x3a0
[ 2918.417534]  ? do_renameat2+0x270/0x7b0
[ 2918.417538]  ? __kasan_slab_free+0x14c/0x190
[ 2918.417541]  ? do_renameat2+0x270/0x7b0
[ 2918.417553]  ? kmem_cache_free+0x85/0x1e0
[ 2918.417558]  ? do_renameat2+0x270/0x7b0
[ 2918.417563]  filename_lookup+0x13c/0x280
[ 2918.417567]  ? filename_parentat+0x2b0/0x2b0
[ 2918.417572]  ? kasan_unpoison_shadow+0x31/0x40
[ 2918.417575]  ? kasan_kmalloc+0xa6/0xd0
[ 2918.417593]  ? strncpy_from_user+0xaa/0x1c0
[ 2918.417598]  ? getname_flags+0x101/0x2b0
[ 2918.417614]  ? path_listxattr+0x87/0x110
[ 2918.417619]  path_listxattr+0x87/0x110
[ 2918.417623]  ? listxattr+0xc0/0xc0
[ 2918.417637]  ? mm_fault_error+0x1b0/0x1b0
[ 2918.417654]  do_syscall_64+0x73/0x160
[ 2918.417660]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 2918.417676] RIP: 0033:0x7f2f3a3480d7
[ 2918.417677] Code: f0 ff ff 73 01 c3 48 8b 0d be dd 2b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 b8 c2 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 91 dd 2b 00 f7 d8 64 89 01 48
[ 2918.417732] RSP: 002b:00007fff4095b7d8 EFLAGS: 00000206 ORIG_RAX: 00000000000000c2
[ 2918.417744] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f2f3a3480d7
[ 2918.417746] RDX: 0000000000000071 RSI: 00007fff4095b810 RDI: 000000000126a0c0
[ 2918.417749] RBP: 00007fff4095b890 R08: 000000000126a010 R09: 0000000000000000
[ 2918.417751] R10: 00000000000001ab R11: 0000000000000206 R12: 00000000004005e0
[ 2918.417753] R13: 00007fff4095b990 R14: 0000000000000000 R15: 0000000000000000

[ 2918.417853] Allocated by task 329:
[ 2918.418002]  kasan_kmalloc+0xa6/0xd0
[ 2918.418007]  kmem_cache_alloc+0xc8/0x1e0
[ 2918.418023]  mempool_init_node+0x194/0x230
[ 2918.418027]  mempool_init+0x12/0x20
[ 2918.418042]  bioset_init+0x2bd/0x380
[ 2918.418052]  blk_alloc_queue_node+0xe9/0x540
[ 2918.418075]  dm_create+0x2c0/0x800
[ 2918.418080]  dev_create+0xd2/0x530
[ 2918.418083]  ctl_ioctl+0x2a3/0x5b0
[ 2918.418087]  dm_ctl_ioctl+0xa/0x10
[ 2918.418092]  do_vfs_ioctl+0x13e/0x8c0
[ 2918.418095]  ksys_ioctl+0x66/0x70
[ 2918.418098]  __x64_sys_ioctl+0x3d/0x50
[ 2918.418102]  do_syscall_64+0x73/0x160
[ 2918.418106]  entry_SYSCALL_64_after_hwframe+0x44/0xa9

[ 2918.418204] Freed by task 0:
[ 2918.418301] (stack is not available)

[ 2918.418521] The buggy address belongs to the object at ffff88018efc0000
                which belongs to the cache biovec-max of size 8192
[ 2918.418894] The buggy address is located 104 bytes to the right of
                8192-byte region [ffff88018efc0000, ffff88018efc2000)
[ 2918.419257] The buggy address belongs to the page:
[ 2918.419431] page:ffffea00063bf000 count:1 mapcount:0 mapping:ffff8801f2242540 index:0x0 compound_mapcount: 0
[ 2918.419702] flags: 0x17fff8000008100(slab|head)
[ 2918.419879] raw: 017fff8000008100 dead000000000100 dead000000000200 ffff8801f2242540
[ 2918.420101] raw: 0000000000000000 0000000000030003 00000001ffffffff 0000000000000000
[ 2918.420322] page dumped because: kasan: bad access detected

[ 2918.420599] Memory state around the buggy address:
[ 2918.420764]  ffff88018efc1f00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 2918.420975]  ffff88018efc1f80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 2918.421194] >ffff88018efc2000: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 2918.421406]                                                           ^
[ 2918.421627]  ffff88018efc2080: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 2918.421838]  ffff88018efc2100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 2918.422046] ==================================================================
[ 2918.422264] Disabling lock debugging due to kernel taint
[ 2923.901641] BUG: unable to handle kernel paging request at ffff88018f0db000
[ 2923.901884] PGD 22226a067 P4D 22226a067 PUD 222273067 PMD 18e642063 PTE 800000018f0db061
[ 2923.902120] Oops: 0003 [#1] SMP KASAN PTI
[ 2923.902274] CPU: 1 PID: 1231 Comm: umount Tainted: G    B             4.17.0+ #1
[ 2923.902490] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
[ 2923.902761] RIP: 0010:__memset+0x24/0x30
[ 2923.902906] Code: 90 90 90 90 90 90 66 66 90 66 90 49 89 f9 48 89 d1 83 e2 07 48 c1 e9 03 40 0f b6 f6 48 b8 01 01 01 01 01 01 01 01 48 0f af c6 <f3> 48 ab 89 d1 f3 aa 4c 89 c8 c3 90 49 89 f9 40 88 f0 48 89 d1 f3
[ 2923.903446] RSP: 0018:ffff88018ddf7ae0 EFLAGS: 00010206
[ 2923.903622] RAX: 0000000000000000 RBX: ffff8801d549d888 RCX: 1ffffffffffdaffb
[ 2923.903833] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88018f0daffc
[ 2923.904062] RBP: ffff88018efc206c R08: 1ffff10031df840d R09: ffff88018efc206c
[ 2923.904273] R10: ffffffffffffe1ee R11: ffffed0031df65fa R12: 0000000000000000
[ 2923.904485] R13: ffff8801d549dc98 R14: 00000000ffffc3db R15: ffffea00063bec80
[ 2923.904693] FS:  00007fa8b2f8a840(0000) GS:ffff8801f3b00000(0000) knlGS:0000000000000000
[ 2923.904937] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2923.910080] CR2: ffff88018f0db000 CR3: 000000018f892000 CR4: 00000000000006e0
[ 2923.914930] Call Trace:
[ 2923.919724]  f2fs_truncate_inline_inode+0x114/0x170
[ 2923.924487]  f2fs_truncate_blocks+0x11b/0x7c0
[ 2923.929178]  ? f2fs_truncate_data_blocks+0x10/0x10
[ 2923.933834]  ? dqget+0x670/0x670
[ 2923.938437]  ? f2fs_destroy_extent_tree+0xd6/0x270
[ 2923.943107]  ? __radix_tree_lookup+0x2f/0x150
[ 2923.947772]  f2fs_truncate+0xd4/0x1a0
[ 2923.952491]  f2fs_evict_inode+0x5ab/0x610
[ 2923.957204]  evict+0x15f/0x280
[ 2923.961898]  __dentry_kill+0x161/0x250
[ 2923.966634]  shrink_dentry_list+0xf3/0x250
[ 2923.971897]  shrink_dcache_parent+0xa9/0x100
[ 2923.976561]  ? shrink_dcache_sb+0x1f0/0x1f0
[ 2923.981177]  ? wait_for_completion+0x8a/0x210
[ 2923.985781]  ? migrate_swap_stop+0x2d0/0x2d0
[ 2923.990332]  do_one_tree+0xe/0x40
[ 2923.994735]  shrink_dcache_for_umount+0x3a/0xa0
[ 2923.999077]  generic_shutdown_super+0x3e/0x1c0
[ 2924.003350]  kill_block_super+0x4b/0x70
[ 2924.007619]  deactivate_locked_super+0x65/0x90
[ 2924.011812]  cleanup_mnt+0x5c/0xa0
[ 2924.015995]  task_work_run+0xce/0xf0
[ 2924.020174]  exit_to_usermode_loop+0x115/0x120
[ 2924.024293]  do_syscall_64+0x12f/0x160
[ 2924.028479]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 2924.032709] RIP: 0033:0x7fa8b2868487
[ 2924.036888] Code: 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 31 f6 e9 09 00 00 00 66 0f 1f 84 00 00 00 00 00 b8 a6 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d e1 c9 2b 00 f7 d8 64 89 01 48
[ 2924.045750] RSP: 002b:00007ffc39824d58 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
[ 2924.050190] RAX: 0000000000000000 RBX: 00000000008ea030 RCX: 00007fa8b2868487
[ 2924.054604] RDX: 0000000000000001 RSI: 0000000000000000 RDI: 00000000008f4360
[ 2924.058940] RBP: 00000000008f4360 R08: 0000000000000000 R09: 0000000000000014
[ 2924.063186] R10: 00000000000006b2 R11: 0000000000000246 R12: 00007fa8b2d7183c
[ 2924.067418] R13: 0000000000000000 R14: 00000000008ea210 R15: 00007ffc39824fe0
[ 2924.071534] Modules linked in: snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_pcm snd_timer joydev input_leds serio_raw snd soundcore mac_hid i2c_piix4 ib_iser rdma_cm iw_cm ib_cm ib_core configfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi btrfs zstd_decompress zstd_compress xxhash raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear 8139too qxl ttm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel psmouse aes_x86_64 8139cp crypto_simd cryptd mii glue_helper pata_acpi floppy
[ 2924.098044] CR2: ffff88018f0db000
[ 2924.102520] ---[ end trace a8e0d899985faf31 ]---
[ 2924.107012] RIP: 0010:__memset+0x24/0x30
[ 2924.111448] Code: 90 90 90 90 90 90 66 66 90 66 90 49 89 f9 48 89 d1 83 e2 07 48 c1 e9 03 40 0f b6 f6 48 b8 01 01 01 01 01 01 01 01 48 0f af c6 <f3> 48 ab 89 d1 f3 aa 4c 89 c8 c3 90 49 89 f9 40 88 f0 48 89 d1 f3
[ 2924.120724] RSP: 0018:ffff88018ddf7ae0 EFLAGS: 00010206
[ 2924.125312] RAX: 0000000000000000 RBX: ffff8801d549d888 RCX: 1ffffffffffdaffb
[ 2924.129931] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88018f0daffc
[ 2924.134537] RBP: ffff88018efc206c R08: 1ffff10031df840d R09: ffff88018efc206c
[ 2924.139175] R10: ffffffffffffe1ee R11: ffffed0031df65fa R12: 0000000000000000
[ 2924.143825] R13: ffff8801d549dc98 R14: 00000000ffffc3db R15: ffffea00063bec80
[ 2924.148500] FS:  00007fa8b2f8a840(0000) GS:ffff8801f3b00000(0000) knlGS:0000000000000000
[ 2924.153247] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2924.158003] CR2: ffff88018f0db000 CR3: 000000018f892000 CR4: 00000000000006e0
[ 2924.164641] BUG: Bad rss-counter state mm:00000000fa04621e idx:0 val:4
[ 2924.170007] BUG: Bad rss-counter
tate mm:00000000fa04621e idx:1 val:2

- Location
https://elixir.bootlin.com/linux/v4.18-rc3/source/fs/f2fs/inline.c#L78
	memset(addr + from, 0, MAX_INLINE_DATA(inode) - from);
Here the length can be negative.

Reported-by Wen Xu <wen.xu@gatech.edu>
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
[bwh: Backported to 4.14: adjust context]
Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 fs/f2fs/inode.c | 18 +++++++++++++++---
 1 file changed, 15 insertions(+), 3 deletions(-)

diff --git a/fs/f2fs/inode.c b/fs/f2fs/inode.c
index aeed9943836a..9a40724dbaa6 100644
--- a/fs/f2fs/inode.c
+++ b/fs/f2fs/inode.c
@@ -183,6 +183,7 @@ void f2fs_inode_chksum_set(struct f2fs_sb_info *sbi, struct page *page)
 static bool sanity_check_inode(struct inode *inode, struct page *node_page)
 {
 	struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
+	struct f2fs_inode_info *fi = F2FS_I(inode);
 	unsigned long long iblocks;
 
 	iblocks = le64_to_cpu(F2FS_INODE(node_page)->i_blocks);
@@ -215,6 +216,17 @@ static bool sanity_check_inode(struct inode *inode, struct page *node_page)
 		return false;
 	}
 
+	if (fi->i_extra_isize > F2FS_TOTAL_EXTRA_ATTR_SIZE ||
+			fi->i_extra_isize % sizeof(__le32)) {
+		set_sbi_flag(sbi, SBI_NEED_FSCK);
+		f2fs_msg(sbi->sb, KERN_WARNING,
+			"%s: inode (ino=%lx) has corrupted i_extra_isize: %d, "
+			"max: %zu",
+			__func__, inode->i_ino, fi->i_extra_isize,
+			F2FS_TOTAL_EXTRA_ATTR_SIZE);
+		return false;
+	}
+
 	if (F2FS_I(inode)->extent_tree) {
 		struct extent_info *ei = &F2FS_I(inode)->extent_tree->largest;
 
@@ -280,14 +292,14 @@ static int do_read_inode(struct inode *inode)
 
 	get_inline_info(inode, ri);
 
+	fi->i_extra_isize = f2fs_has_extra_attr(inode) ?
+					le16_to_cpu(ri->i_extra_isize) : 0;
+
 	if (!sanity_check_inode(inode, node_page)) {
 		f2fs_put_page(node_page, 1);
 		return -EINVAL;
 	}
 
-	fi->i_extra_isize = f2fs_has_extra_attr(inode) ?
-					le16_to_cpu(ri->i_extra_isize) : 0;
-
 	/* check data exist */
 	if (f2fs_has_inline_data(inode) && !f2fs_exist_data(inode))
 		__recover_inline_status(inode, node_page);
-- 
2.17.1




^ permalink raw reply related	[flat|nested] 166+ messages in thread

* [PATCH 4.14 056/146] f2fs: fix to do sanity check with cp_pack_start_sum
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (54 preceding siblings ...)
  2018-12-04 10:49 ` [PATCH 4.14 055/146] f2fs: fix to do sanity check with i_extra_isize Greg Kroah-Hartman
@ 2018-12-04 10:49 ` Greg Kroah-Hartman
  2018-12-04 10:49 ` [PATCH 4.14 057/146] xfs: dont fail when converting shortform attr to long form during ATTR_REPLACE Greg Kroah-Hartman
                   ` (94 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:49 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Chao Yu, Jaegeuk Kim, Ben Hutchings,
	Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

commit e494c2f995d6181d6e29c4927d68e0f295ecf75b upstream.

After fuzzing, cp_pack_start_sum could be corrupted, so current log's
summary info should be wrong due to loading incorrect summary block.
Then, if segment's type in current log is exceeded NR_CURSEG_TYPE, it
can lead accessing invalid dirty_i->dirty_segmap bitmap finally.

Add sanity check for cp_pack_start_sum to fix this issue.

https://bugzilla.kernel.org/show_bug.cgi?id=200419

- Reproduce

- Kernel message (f2fs-dev w/ KASAN)
[ 3117.578432] F2FS-fs (loop0): Invalid log blocks per segment (8)

[ 3117.578445] F2FS-fs (loop0): Can't find valid F2FS filesystem in 2th superblock
[ 3117.581364] F2FS-fs (loop0): invalid crc_offset: 30716
[ 3117.583564] WARNING: CPU: 1 PID: 1225 at fs/f2fs/checkpoint.c:90 __get_meta_page+0x448/0x4b0
[ 3117.583570] Modules linked in: snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_pcm snd_timer joydev input_leds serio_raw snd soundcore mac_hid i2c_piix4 ib_iser rdma_cm iw_cm ib_cm ib_core configfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi btrfs zstd_decompress zstd_compress xxhash raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear 8139too qxl ttm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel psmouse aes_x86_64 8139cp crypto_simd cryptd mii glue_helper pata_acpi floppy
[ 3117.584014] CPU: 1 PID: 1225 Comm: mount Not tainted 4.17.0+ #1
[ 3117.584017] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
[ 3117.584022] RIP: 0010:__get_meta_page+0x448/0x4b0
[ 3117.584023] Code: 00 49 8d bc 24 84 00 00 00 e8 74 54 da ff 41 83 8c 24 84 00 00 00 08 4c 89 f6 4c 89 ef e8 c0 d9 95 00 48 89 ef e8 18 e3 00 00 <0f> 0b f0 80 4d 48 04 e9 0f fe ff ff 0f 0b 48 89 c7 48 89 04 24 e8
[ 3117.584072] RSP: 0018:ffff88018eb678c0 EFLAGS: 00010286
[ 3117.584082] RAX: ffff88018f0a6a78 RBX: ffffea0007a46600 RCX: ffffffff9314d1b2
[ 3117.584085] RDX: ffffffff00000001 RSI: 0000000000000000 RDI: ffff88018f0a6a98
[ 3117.584087] RBP: ffff88018ebe9980 R08: 0000000000000002 R09: 0000000000000001
[ 3117.584090] R10: 0000000000000001 R11: ffffed00326e4450 R12: ffff880193722200
[ 3117.584092] R13: ffff88018ebe9afc R14: 0000000000000206 R15: ffff88018eb67900
[ 3117.584096] FS:  00007f5694636840(0000) GS:ffff8801f3b00000(0000) knlGS:0000000000000000
[ 3117.584098] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 3117.584101] CR2: 00000000016f21b8 CR3: 0000000191c22000 CR4: 00000000000006e0
[ 3117.584112] Call Trace:
[ 3117.584121]  ? f2fs_set_meta_page_dirty+0x150/0x150
[ 3117.584127]  ? f2fs_build_segment_manager+0xbf9/0x3190
[ 3117.584133]  ? f2fs_npages_for_summary_flush+0x75/0x120
[ 3117.584145]  f2fs_build_segment_manager+0xda8/0x3190
[ 3117.584151]  ? f2fs_get_valid_checkpoint+0x298/0xa00
[ 3117.584156]  ? f2fs_flush_sit_entries+0x10e0/0x10e0
[ 3117.584184]  ? map_id_range_down+0x17c/0x1b0
[ 3117.584188]  ? __put_user_ns+0x30/0x30
[ 3117.584206]  ? find_next_bit+0x53/0x90
[ 3117.584237]  ? cpumask_next+0x16/0x20
[ 3117.584249]  f2fs_fill_super+0x1948/0x2b40
[ 3117.584258]  ? f2fs_commit_super+0x1a0/0x1a0
[ 3117.584279]  ? sget_userns+0x65e/0x690
[ 3117.584296]  ? set_blocksize+0x88/0x130
[ 3117.584302]  ? f2fs_commit_super+0x1a0/0x1a0
[ 3117.584305]  mount_bdev+0x1c0/0x200
[ 3117.584310]  mount_fs+0x5c/0x190
[ 3117.584320]  vfs_kern_mount+0x64/0x190
[ 3117.584330]  do_mount+0x2e4/0x1450
[ 3117.584343]  ? lockref_put_return+0x130/0x130
[ 3117.584347]  ? copy_mount_string+0x20/0x20
[ 3117.584357]  ? kasan_unpoison_shadow+0x31/0x40
[ 3117.584362]  ? kasan_kmalloc+0xa6/0xd0
[ 3117.584373]  ? memcg_kmem_put_cache+0x16/0x90
[ 3117.584377]  ? __kmalloc_track_caller+0x196/0x210
[ 3117.584383]  ? _copy_from_user+0x61/0x90
[ 3117.584396]  ? memdup_user+0x3e/0x60
[ 3117.584401]  ksys_mount+0x7e/0xd0
[ 3117.584405]  __x64_sys_mount+0x62/0x70
[ 3117.584427]  do_syscall_64+0x73/0x160
[ 3117.584440]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 3117.584455] RIP: 0033:0x7f5693f14b9a
[ 3117.584456] Code: 48 8b 0d 01 c3 2b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d ce c2 2b 00 f7 d8 64 89 01 48
[ 3117.584505] RSP: 002b:00007fff27346488 EFLAGS: 00000206 ORIG_RAX: 00000000000000a5
[ 3117.584510] RAX: ffffffffffffffda RBX: 00000000016e2030 RCX: 00007f5693f14b9a
[ 3117.584512] RDX: 00000000016e2210 RSI: 00000000016e3f30 RDI: 00000000016ee040
[ 3117.584514] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000013
[ 3117.584516] R10: 00000000c0ed0000 R11: 0000000000000206 R12: 00000000016ee040
[ 3117.584519] R13: 00000000016e2210 R14: 0000000000000000 R15: 0000000000000003
[ 3117.584523] ---[ end trace a8e0d899985faf31 ]---
[ 3117.685663] F2FS-fs (loop0): f2fs_check_nid_range: out-of-range nid=2, run fsck to fix.
[ 3117.685673] F2FS-fs (loop0): recover_data: ino = 2 (i_size: recover) recovered = 1, err = 0
[ 3117.685707] ==================================================================
[ 3117.685955] BUG: KASAN: slab-out-of-bounds in __remove_dirty_segment+0xdd/0x1e0
[ 3117.686175] Read of size 8 at addr ffff88018f0a63d0 by task mount/1225

[ 3117.686477] CPU: 0 PID: 1225 Comm: mount Tainted: G        W         4.17.0+ #1
[ 3117.686481] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
[ 3117.686483] Call Trace:
[ 3117.686494]  dump_stack+0x71/0xab
[ 3117.686512]  print_address_description+0x6b/0x290
[ 3117.686517]  kasan_report+0x28e/0x390
[ 3117.686522]  ? __remove_dirty_segment+0xdd/0x1e0
[ 3117.686527]  __remove_dirty_segment+0xdd/0x1e0
[ 3117.686532]  locate_dirty_segment+0x189/0x190
[ 3117.686538]  f2fs_allocate_new_segments+0xa9/0xe0
[ 3117.686543]  recover_data+0x703/0x2c20
[ 3117.686547]  ? f2fs_recover_fsync_data+0x48f/0xd50
[ 3117.686553]  ? ksys_mount+0x7e/0xd0
[ 3117.686564]  ? policy_nodemask+0x1a/0x90
[ 3117.686567]  ? policy_node+0x56/0x70
[ 3117.686571]  ? add_fsync_inode+0xf0/0xf0
[ 3117.686592]  ? blk_finish_plug+0x44/0x60
[ 3117.686597]  ? f2fs_ra_meta_pages+0x38b/0x5e0
[ 3117.686602]  ? find_inode_fast+0xac/0xc0
[ 3117.686606]  ? f2fs_is_valid_blkaddr+0x320/0x320
[ 3117.686618]  ? __radix_tree_lookup+0x150/0x150
[ 3117.686633]  ? dqget+0x670/0x670
[ 3117.686648]  ? pagecache_get_page+0x29/0x410
[ 3117.686656]  ? kmem_cache_alloc+0x176/0x1e0
[ 3117.686660]  ? f2fs_is_valid_blkaddr+0x11d/0x320
[ 3117.686664]  f2fs_recover_fsync_data+0xc23/0xd50
[ 3117.686670]  ? f2fs_space_for_roll_forward+0x60/0x60
[ 3117.686674]  ? rb_insert_color+0x323/0x3d0
[ 3117.686678]  ? f2fs_recover_orphan_inodes+0xa5/0x700
[ 3117.686683]  ? proc_register+0x153/0x1d0
[ 3117.686686]  ? f2fs_remove_orphan_inode+0x10/0x10
[ 3117.686695]  ? f2fs_attr_store+0x50/0x50
[ 3117.686700]  ? proc_create_single_data+0x52/0x60
[ 3117.686707]  f2fs_fill_super+0x1d06/0x2b40
[ 3117.686728]  ? f2fs_commit_super+0x1a0/0x1a0
[ 3117.686735]  ? sget_userns+0x65e/0x690
[ 3117.686740]  ? set_blocksize+0x88/0x130
[ 3117.686745]  ? f2fs_commit_super+0x1a0/0x1a0
[ 3117.686748]  mount_bdev+0x1c0/0x200
[ 3117.686753]  mount_fs+0x5c/0x190
[ 3117.686758]  vfs_kern_mount+0x64/0x190
[ 3117.686762]  do_mount+0x2e4/0x1450
[ 3117.686769]  ? lockref_put_return+0x130/0x130
[ 3117.686773]  ? copy_mount_string+0x20/0x20
[ 3117.686777]  ? kasan_unpoison_shadow+0x31/0x40
[ 3117.686780]  ? kasan_kmalloc+0xa6/0xd0
[ 3117.686786]  ? memcg_kmem_put_cache+0x16/0x90
[ 3117.686790]  ? __kmalloc_track_caller+0x196/0x210
[ 3117.686795]  ? _copy_from_user+0x61/0x90
[ 3117.686801]  ? memdup_user+0x3e/0x60
[ 3117.686804]  ksys_mount+0x7e/0xd0
[ 3117.686809]  __x64_sys_mount+0x62/0x70
[ 3117.686816]  do_syscall_64+0x73/0x160
[ 3117.686824]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 3117.686829] RIP: 0033:0x7f5693f14b9a
[ 3117.686830] Code: 48 8b 0d 01 c3 2b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d ce c2 2b 00 f7 d8 64 89 01 48
[ 3117.686887] RSP: 002b:00007fff27346488 EFLAGS: 00000206 ORIG_RAX: 00000000000000a5
[ 3117.686892] RAX: ffffffffffffffda RBX: 00000000016e2030 RCX: 00007f5693f14b9a
[ 3117.686894] RDX: 00000000016e2210 RSI: 00000000016e3f30 RDI: 00000000016ee040
[ 3117.686896] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000013
[ 3117.686899] R10: 00000000c0ed0000 R11: 0000000000000206 R12: 00000000016ee040
[ 3117.686901] R13: 00000000016e2210 R14: 0000000000000000 R15: 0000000000000003

[ 3117.687005] Allocated by task 1225:
[ 3117.687152]  kasan_kmalloc+0xa6/0xd0
[ 3117.687157]  kmem_cache_alloc_trace+0xfd/0x200
[ 3117.687161]  f2fs_build_segment_manager+0x2d09/0x3190
[ 3117.687165]  f2fs_fill_super+0x1948/0x2b40
[ 3117.687168]  mount_bdev+0x1c0/0x200
[ 3117.687171]  mount_fs+0x5c/0x190
[ 3117.687174]  vfs_kern_mount+0x64/0x190
[ 3117.687177]  do_mount+0x2e4/0x1450
[ 3117.687180]  ksys_mount+0x7e/0xd0
[ 3117.687182]  __x64_sys_mount+0x62/0x70
[ 3117.687186]  do_syscall_64+0x73/0x160
[ 3117.687190]  entry_SYSCALL_64_after_hwframe+0x44/0xa9

[ 3117.687285] Freed by task 19:
[ 3117.687412]  __kasan_slab_free+0x137/0x190
[ 3117.687416]  kfree+0x8b/0x1b0
[ 3117.687460]  ttm_bo_man_put_node+0x61/0x80 [ttm]
[ 3117.687476]  ttm_bo_cleanup_refs+0x15f/0x250 [ttm]
[ 3117.687492]  ttm_bo_delayed_delete+0x2f0/0x300 [ttm]
[ 3117.687507]  ttm_bo_delayed_workqueue+0x17/0x50 [ttm]
[ 3117.687528]  process_one_work+0x2f9/0x740
[ 3117.687531]  worker_thread+0x78/0x6b0
[ 3117.687541]  kthread+0x177/0x1c0
[ 3117.687545]  ret_from_fork+0x35/0x40

[ 3117.687638] The buggy address belongs to the object at ffff88018f0a6300
                which belongs to the cache kmalloc-192 of size 192
[ 3117.688014] The buggy address is located 16 bytes to the right of
                192-byte region [ffff88018f0a6300, ffff88018f0a63c0)
[ 3117.688382] The buggy address belongs to the page:
[ 3117.688554] page:ffffea00063c2980 count:1 mapcount:0 mapping:ffff8801f3403180 index:0x0
[ 3117.688788] flags: 0x17fff8000000100(slab)
[ 3117.688944] raw: 017fff8000000100 ffffea00063c2840 0000000e0000000e ffff8801f3403180
[ 3117.689166] raw: 0000000000000000 0000000080100010 00000001ffffffff 0000000000000000
[ 3117.689386] page dumped because: kasan: bad access detected

[ 3117.689653] Memory state around the buggy address:
[ 3117.689816]  ffff88018f0a6280: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
[ 3117.690027]  ffff88018f0a6300: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 3117.690239] >ffff88018f0a6380: 00 00 fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 3117.690448]                                                  ^
[ 3117.690644]  ffff88018f0a6400: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 3117.690868]  ffff88018f0a6480: 00 00 fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 3117.691077] ==================================================================
[ 3117.691290] Disabling lock debugging due to kernel taint
[ 3117.693893] BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
[ 3117.694120] PGD 80000001f01bc067 P4D 80000001f01bc067 PUD 1d9638067 PMD 0
[ 3117.694338] Oops: 0002 [#1] SMP KASAN PTI
[ 3117.694490] CPU: 1 PID: 1225 Comm: mount Tainted: G    B   W         4.17.0+ #1
[ 3117.694703] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
[ 3117.695073] RIP: 0010:__remove_dirty_segment+0xe2/0x1e0
[ 3117.695246] Code: c4 48 89 c7 e8 cf bb d7 ff 45 0f b6 24 24 41 83 e4 3f 44 88 64 24 07 41 83 e4 3f 4a 8d 7c e3 08 e8 b3 bc d7 ff 4a 8b 4c e3 08 <f0> 4c 0f b3 29 0f 82 94 00 00 00 48 8d bd 20 04 00 00 e8 97 bb d7
[ 3117.695793] RSP: 0018:ffff88018eb67638 EFLAGS: 00010292
[ 3117.695969] RAX: 0000000000000000 RBX: ffff88018f0a6300 RCX: 0000000000000000
[ 3117.696182] RDX: 0000000000000000 RSI: 0000000000000297 RDI: 0000000000000297
[ 3117.696391] RBP: ffff88018ebe9980 R08: ffffed003e743ebb R09: ffffed003e743ebb
[ 3117.696604] R10: 0000000000000001 R11: ffffed003e743eba R12: 0000000000000019
[ 3117.696813] R13: 0000000000000014 R14: 0000000000000320 R15: ffff88018ebe99e0
[ 3117.697032] FS:  00007f5694636840(0000) GS:ffff8801f3b00000(0000) knlGS:0000000000000000
[ 3117.697280] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 3117.702357] CR2: 00007fe89bb1a000 CR3: 0000000191c22000 CR4: 00000000000006e0
[ 3117.707235] Call Trace:
[ 3117.712077]  locate_dirty_segment+0x189/0x190
[ 3117.716891]  f2fs_allocate_new_segments+0xa9/0xe0
[ 3117.721617]  recover_data+0x703/0x2c20
[ 3117.726316]  ? f2fs_recover_fsync_data+0x48f/0xd50
[ 3117.730957]  ? ksys_mount+0x7e/0xd0
[ 3117.735573]  ? policy_nodemask+0x1a/0x90
[ 3117.740198]  ? policy_node+0x56/0x70
[ 3117.744829]  ? add_fsync_inode+0xf0/0xf0
[ 3117.749487]  ? blk_finish_plug+0x44/0x60
[ 3117.754152]  ? f2fs_ra_meta_pages+0x38b/0x5e0
[ 3117.758831]  ? find_inode_fast+0xac/0xc0
[ 3117.763448]  ? f2fs_is_valid_blkaddr+0x320/0x320
[ 3117.768046]  ? __radix_tree_lookup+0x150/0x150
[ 3117.772603]  ? dqget+0x670/0x670
[ 3117.777159]  ? pagecache_get_page+0x29/0x410
[ 3117.781648]  ? kmem_cache_alloc+0x176/0x1e0
[ 3117.786067]  ? f2fs_is_valid_blkaddr+0x11d/0x320
[ 3117.790476]  f2fs_recover_fsync_data+0xc23/0xd50
[ 3117.794790]  ? f2fs_space_for_roll_forward+0x60/0x60
[ 3117.799086]  ? rb_insert_color+0x323/0x3d0
[ 3117.803304]  ? f2fs_recover_orphan_inodes+0xa5/0x700
[ 3117.807563]  ? proc_register+0x153/0x1d0
[ 3117.811766]  ? f2fs_remove_orphan_inode+0x10/0x10
[ 3117.815947]  ? f2fs_attr_store+0x50/0x50
[ 3117.820087]  ? proc_create_single_data+0x52/0x60
[ 3117.824262]  f2fs_fill_super+0x1d06/0x2b40
[ 3117.828367]  ? f2fs_commit_super+0x1a0/0x1a0
[ 3117.832432]  ? sget_userns+0x65e/0x690
[ 3117.836500]  ? set_blocksize+0x88/0x130
[ 3117.840501]  ? f2fs_commit_super+0x1a0/0x1a0
[ 3117.844420]  mount_bdev+0x1c0/0x200
[ 3117.848275]  mount_fs+0x5c/0x190
[ 3117.852053]  vfs_kern_mount+0x64/0x190
[ 3117.855810]  do_mount+0x2e4/0x1450
[ 3117.859441]  ? lockref_put_return+0x130/0x130
[ 3117.862996]  ? copy_mount_string+0x20/0x20
[ 3117.866417]  ? kasan_unpoison_shadow+0x31/0x40
[ 3117.869719]  ? kasan_kmalloc+0xa6/0xd0
[ 3117.872948]  ? memcg_kmem_put_cache+0x16/0x90
[ 3117.876121]  ? __kmalloc_track_caller+0x196/0x210
[ 3117.879333]  ? _copy_from_user+0x61/0x90
[ 3117.882467]  ? memdup_user+0x3e/0x60
[ 3117.885604]  ksys_mount+0x7e/0xd0
[ 3117.888700]  __x64_sys_mount+0x62/0x70
[ 3117.891742]  do_syscall_64+0x73/0x160
[ 3117.894692]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 3117.897669] RIP: 0033:0x7f5693f14b9a
[ 3117.900563] Code: 48 8b 0d 01 c3 2b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d ce c2 2b 00 f7 d8 64 89 01 48
[ 3117.906922] RSP: 002b:00007fff27346488 EFLAGS: 00000206 ORIG_RAX: 00000000000000a5
[ 3117.910159] RAX: ffffffffffffffda RBX: 00000000016e2030 RCX: 00007f5693f14b9a
[ 3117.913469] RDX: 00000000016e2210 RSI: 00000000016e3f30 RDI: 00000000016ee040
[ 3117.916764] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000013
[ 3117.920071] R10: 00000000c0ed0000 R11: 0000000000000206 R12: 00000000016ee040
[ 3117.923393] R13: 00000000016e2210 R14: 0000000000000000 R15: 0000000000000003
[ 3117.926680] Modules linked in: snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_pcm snd_timer joydev input_leds serio_raw snd soundcore mac_hid i2c_piix4 ib_iser rdma_cm iw_cm ib_cm ib_core configfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi btrfs zstd_decompress zstd_compress xxhash raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear 8139too qxl ttm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel psmouse aes_x86_64 8139cp crypto_simd cryptd mii glue_helper pata_acpi floppy
[ 3117.949979] CR2: 0000000000000000
[ 3117.954283] ---[ end trace a8e0d899985faf32 ]---
[ 3117.958575] RIP: 0010:__remove_dirty_segment+0xe2/0x1e0
[ 3117.962810] Code: c4 48 89 c7 e8 cf bb d7 ff 45 0f b6 24 24 41 83 e4 3f 44 88 64 24 07 41 83 e4 3f 4a 8d 7c e3 08 e8 b3 bc d7 ff 4a 8b 4c e3 08 <f0> 4c 0f b3 29 0f 82 94 00 00 00 48 8d bd 20 04 00 00 e8 97 bb d7
[ 3117.971789] RSP: 0018:ffff88018eb67638 EFLAGS: 00010292
[ 3117.976333] RAX: 0000000000000000 RBX: ffff88018f0a6300 RCX: 0000000000000000
[ 3117.980926] RDX: 0000000000000000 RSI: 0000000000000297 RDI: 0000000000000297
[ 3117.985497] RBP: ffff88018ebe9980 R08: ffffed003e743ebb R09: ffffed003e743ebb
[ 3117.990098] R10: 0000000000000001 R11: ffffed003e743eba R12: 0000000000000019
[ 3117.994761] R13: 0000000000000014 R14: 0000000000000320 R15: ffff88018ebe99e0
[ 3117.999392] FS:  00007f5694636840(0000) GS:ffff8801f3b00000(0000) knlGS:0000000000000000
[ 3118.004096] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 3118.008816] CR2: 00007fe89bb1a000 CR3: 0000000191c22000 CR4: 00000000000006e0

- Location
https://elixir.bootlin.com/linux/v4.18-rc3/source/fs/f2fs/segment.c#L775
		if (test_and_clear_bit(segno, dirty_i->dirty_segmap[t]))
			dirty_i->nr_dirty[t]--;
Here dirty_i->dirty_segmap[t] can be NULL which leads to crash in test_and_clear_bit()

Reported-by Wen Xu <wen.xu@gatech.edu>
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
[bwh: Backported to 4.14: The function is called sanity_check_ckpt()]
Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 fs/f2fs/checkpoint.c |  8 ++++----
 fs/f2fs/super.c      | 12 ++++++++++++
 2 files changed, 16 insertions(+), 4 deletions(-)

diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c
index c81cd5057b8e..624817eeb25e 100644
--- a/fs/f2fs/checkpoint.c
+++ b/fs/f2fs/checkpoint.c
@@ -825,15 +825,15 @@ int get_valid_checkpoint(struct f2fs_sb_info *sbi)
 	cp_block = (struct f2fs_checkpoint *)page_address(cur_page);
 	memcpy(sbi->ckpt, cp_block, blk_size);
 
-	/* Sanity checking of checkpoint */
-	if (sanity_check_ckpt(sbi))
-		goto free_fail_no_cp;
-
 	if (cur_page == cp1)
 		sbi->cur_cp_pack = 1;
 	else
 		sbi->cur_cp_pack = 2;
 
+	/* Sanity checking of checkpoint */
+	if (sanity_check_ckpt(sbi))
+		goto free_fail_no_cp;
+
 	if (cp_blks <= 1)
 		goto done;
 
diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
index 9fafb1404f39..de4de4ebe64c 100644
--- a/fs/f2fs/super.c
+++ b/fs/f2fs/super.c
@@ -1957,6 +1957,7 @@ int sanity_check_ckpt(struct f2fs_sb_info *sbi)
 	unsigned int sit_bitmap_size, nat_bitmap_size;
 	unsigned int log_blocks_per_seg;
 	unsigned int segment_count_main;
+	unsigned int cp_pack_start_sum, cp_payload;
 	block_t user_block_count;
 	int i;
 
@@ -2017,6 +2018,17 @@ int sanity_check_ckpt(struct f2fs_sb_info *sbi)
 		return 1;
 	}
 
+	cp_pack_start_sum = __start_sum_addr(sbi);
+	cp_payload = __cp_payload(sbi);
+	if (cp_pack_start_sum < cp_payload + 1 ||
+		cp_pack_start_sum > blocks_per_seg - 1 -
+			NR_CURSEG_TYPE) {
+		f2fs_msg(sbi->sb, KERN_ERR,
+			"Wrong cp_pack_start_sum: %u",
+			cp_pack_start_sum);
+		return 1;
+	}
+
 	if (unlikely(f2fs_cp_error(sbi))) {
 		f2fs_msg(sbi->sb, KERN_ERR, "A bug case: need to run fsck");
 		return 1;
-- 
2.17.1




^ permalink raw reply related	[flat|nested] 166+ messages in thread

* [PATCH 4.14 057/146] xfs: dont fail when converting shortform attr to long form during ATTR_REPLACE
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (55 preceding siblings ...)
  2018-12-04 10:49 ` [PATCH 4.14 056/146] f2fs: fix to do sanity check with cp_pack_start_sum Greg Kroah-Hartman
@ 2018-12-04 10:49 ` Greg Kroah-Hartman
  2018-12-04 10:49 ` [PATCH 4.14 058/146] Revert "wlcore: Add missing PM call for wlcore_cmd_wait_for_event_or_timeout()" Greg Kroah-Hartman
                   ` (93 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:49 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, kanda.motohiro, Dave Chinner,
	Christoph Hellwig, Darrick J. Wong, Ben Hutchings, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

commit 7b38460dc8e4eafba06c78f8e37099d3b34d473c upstream.

Kanda Motohiro reported that expanding a tiny xattr into a large xattr
fails on XFS because we remove the tiny xattr from a shortform fork and
then try to re-add it after converting the fork to extents format having
not removed the ATTR_REPLACE flag.  This fails because the attr is no
longer present, causing a fs shutdown.

This is derived from the patch in his bug report, but we really
shouldn't ignore a nonzero retval from the remove call.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=199119
Reported-by: kanda.motohiro@gmail.com
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 fs/xfs/libxfs/xfs_attr.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
index 6249c92671de..ea66f04f46f7 100644
--- a/fs/xfs/libxfs/xfs_attr.c
+++ b/fs/xfs/libxfs/xfs_attr.c
@@ -501,7 +501,14 @@ xfs_attr_shortform_addname(xfs_da_args_t *args)
 		if (args->flags & ATTR_CREATE)
 			return retval;
 		retval = xfs_attr_shortform_remove(args);
-		ASSERT(retval == 0);
+		if (retval)
+			return retval;
+		/*
+		 * Since we have removed the old attr, clear ATTR_REPLACE so
+		 * that the leaf format add routine won't trip over the attr
+		 * not being around.
+		 */
+		args->flags &= ~ATTR_REPLACE;
 	}
 
 	if (args->namelen >= XFS_ATTR_SF_ENTSIZE_MAX ||
-- 
2.17.1




^ permalink raw reply related	[flat|nested] 166+ messages in thread

* [PATCH 4.14 058/146] Revert "wlcore: Add missing PM call for wlcore_cmd_wait_for_event_or_timeout()"
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (56 preceding siblings ...)
  2018-12-04 10:49 ` [PATCH 4.14 057/146] xfs: dont fail when converting shortform attr to long form during ATTR_REPLACE Greg Kroah-Hartman
@ 2018-12-04 10:49 ` Greg Kroah-Hartman
  2018-12-04 10:49 ` [PATCH 4.14 059/146] net: skb_scrub_packet(): Scrub offload_fwd_mark Greg Kroah-Hartman
                   ` (92 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:49 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

This reverts commit e87efc44dd36ba3db59847c418354711ebad779b which was
upstream commit 4ec7cece87b3ed21ffcd407c62fb2f151a366bc1.

>From Dietmar May's report on the stable mailing list
(https://www.spinics.net/lists/stable/msg272201.html):

> I've run into some problems which appear due to (a) recent patch(es) on
> the wlcore wifi driver.
>
> 4.4.160 - commit 3fdd34643ffc378b5924941fad40352c04610294
> 4.9.131 - commit afeeecc764436f31d4447575bb9007732333818c
>
> Earlier versions (4.9.130 and 4.4.159 - tested back to 4.4.49) do not
> exhibit this problem. It is still present in 4.9.141.
>
> master as of 4.20.0-rc4 does not exhibit this problem.
>
> Basically, during client association when in AP mode (running hostapd),
> handshake may or may not complete following a noticeable delay. If
> successful, then the driver fails consistently in warn_slowpath_null
> during disassociation. If unsuccessful, the wifi client attempts multiple
> times, sometimes failing repeatedly. I've had clients unable to connect
> for 3-5 minutes during testing, with the syslog filled with dozens of
> backtraces. syslog details are below.
>
> I'm working on an embedded device with a TI 3352 ARM processor and a
> murata wl1271 module in sdio mode. We're running a fully patched ubuntu
> 18.04 ARM build, with a kernel built from kernel.org's stable/linux repo <https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=linux-4.9.y&id=afeeecc764436f31d4447575bb9007732333818c>.
> Relevant parts of the kernel config are included below.
>
> The commit message states:
>
> > /I've only seen this few times with the runtime PM patches enabled so
> > this one is probably not needed before that. This seems to work
> > currently based on the current PM implementation timer. Let's apply
> > this separately though in case others are hitting this issue./
> We're not doing anything explicit with power management. The device is an
> IoT edge gateway with battery backup, normally running on wall power. The
> battery is currently used solely to shut down the system cleanly to avoid
> filesystem corruption.
>
> The device tree is configured to keep power in suspend; but the device
> should never suspend, so in our case, there is no need to call
> wl1271_ps_elp_wakeup() or wl1271_ps_elp_sleep(), as occurs in the patch.

Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/wireless/ti/wlcore/cmd.c | 6 ------
 1 file changed, 6 deletions(-)

diff --git a/drivers/net/wireless/ti/wlcore/cmd.c b/drivers/net/wireless/ti/wlcore/cmd.c
index f48c3f62966d..761cf8573a80 100644
--- a/drivers/net/wireless/ti/wlcore/cmd.c
+++ b/drivers/net/wireless/ti/wlcore/cmd.c
@@ -35,7 +35,6 @@
 #include "wl12xx_80211.h"
 #include "cmd.h"
 #include "event.h"
-#include "ps.h"
 #include "tx.h"
 #include "hw_ops.h"
 
@@ -192,10 +191,6 @@ int wlcore_cmd_wait_for_event_or_timeout(struct wl1271 *wl,
 
 	timeout_time = jiffies + msecs_to_jiffies(WL1271_EVENT_TIMEOUT);
 
-	ret = wl1271_ps_elp_wakeup(wl);
-	if (ret < 0)
-		return ret;
-
 	do {
 		if (time_after(jiffies, timeout_time)) {
 			wl1271_debug(DEBUG_CMD, "timeout waiting for event %d",
@@ -227,7 +222,6 @@ int wlcore_cmd_wait_for_event_or_timeout(struct wl1271 *wl,
 	} while (!event);
 
 out:
-	wl1271_ps_elp_sleep(wl);
 	kfree(events_vector);
 	return ret;
 }
-- 
2.17.1




^ permalink raw reply related	[flat|nested] 166+ messages in thread

* [PATCH 4.14 059/146] net: skb_scrub_packet(): Scrub offload_fwd_mark
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (57 preceding siblings ...)
  2018-12-04 10:49 ` [PATCH 4.14 058/146] Revert "wlcore: Add missing PM call for wlcore_cmd_wait_for_event_or_timeout()" Greg Kroah-Hartman
@ 2018-12-04 10:49 ` Greg Kroah-Hartman
  2018-12-04 10:49 ` [PATCH 4.14 060/146] net: thunderx: set xdp_prog to NULL if bpf_prog_add fails Greg Kroah-Hartman
                   ` (91 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:49 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Petr Machata, Ido Schimmel,
	Jiri Pirko, David S. Miller

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Petr Machata <petrm@mellanox.com>

[ Upstream commit b5dd186d10ba59e6b5ba60e42b3b083df56df6f3 ]

When a packet is trapped and the corresponding SKB marked as
already-forwarded, it retains this marking even after it is forwarded
across veth links into another bridge. There, since it ingresses the
bridge over veth, which doesn't have offload_fwd_mark, it triggers a
warning in nbp_switchdev_frame_mark().

Then nbp_switchdev_allowed_egress() decides not to allow egress from
this bridge through another veth, because the SKB is already marked, and
the mark (of 0) of course matches. Thus the packet is incorrectly
blocked.

Solve by resetting offload_fwd_mark() in skb_scrub_packet(). That
function is called from tunnels and also from veth, and thus catches the
cases where traffic is forwarded between bridges and transformed in a
way that invalidates the marking.

Fixes: 6bc506b4fb06 ("bridge: switchdev: Add forward mark support for stacked devices")
Fixes: abf4bb6b63d0 ("skbuff: Add the offload_mr_fwd_mark field")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Suggested-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/core/skbuff.c |    4 ++++
 1 file changed, 4 insertions(+)

--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -4882,6 +4882,10 @@ void skb_scrub_packet(struct sk_buff *sk
 	nf_reset(skb);
 	nf_reset_trace(skb);
 
+#ifdef CONFIG_NET_SWITCHDEV
+	skb->offload_fwd_mark = 0;
+#endif
+
 	if (!xnet)
 		return;
 



^ permalink raw reply	[flat|nested] 166+ messages in thread

* [PATCH 4.14 060/146] net: thunderx: set xdp_prog to NULL if bpf_prog_add fails
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (58 preceding siblings ...)
  2018-12-04 10:49 ` [PATCH 4.14 059/146] net: skb_scrub_packet(): Scrub offload_fwd_mark Greg Kroah-Hartman
@ 2018-12-04 10:49 ` Greg Kroah-Hartman
  2018-12-04 10:49 ` [PATCH 4.14 061/146] virtio-net: disable guest csum during XDP set Greg Kroah-Hartman
                   ` (90 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:49 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Lorenzo Bianconi, David S. Miller

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>

[ Upstream commit 6d0f60b0f8588fd4380ea5df9601e12fddd55ce2 ]

Set xdp_prog pointer to NULL if bpf_prog_add fails since that routine
reports the error code instead of NULL in case of failure and xdp_prog
pointer value is used in the driver to verify if XDP is currently
enabled.
Moreover report the error code to userspace if nicvf_xdp_setup fails

Fixes: 05c773f52b96 ("net: thunderx: Add basic XDP support")
Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/net/ethernet/cavium/thunder/nicvf_main.c |    9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

--- a/drivers/net/ethernet/cavium/thunder/nicvf_main.c
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_main.c
@@ -1691,6 +1691,7 @@ static int nicvf_xdp_setup(struct nicvf
 	bool if_up = netif_running(nic->netdev);
 	struct bpf_prog *old_prog;
 	bool bpf_attached = false;
+	int ret = 0;
 
 	/* For now just support only the usual MTU sized frames */
 	if (prog && (dev->mtu > 1500)) {
@@ -1724,8 +1725,12 @@ static int nicvf_xdp_setup(struct nicvf
 	if (nic->xdp_prog) {
 		/* Attach BPF program */
 		nic->xdp_prog = bpf_prog_add(nic->xdp_prog, nic->rx_queues - 1);
-		if (!IS_ERR(nic->xdp_prog))
+		if (!IS_ERR(nic->xdp_prog)) {
 			bpf_attached = true;
+		} else {
+			ret = PTR_ERR(nic->xdp_prog);
+			nic->xdp_prog = NULL;
+		}
 	}
 
 	/* Calculate Tx queues needed for XDP and network stack */
@@ -1737,7 +1742,7 @@ static int nicvf_xdp_setup(struct nicvf
 		netif_trans_update(nic->netdev);
 	}
 
-	return 0;
+	return ret;
 }
 
 static int nicvf_xdp(struct net_device *netdev, struct netdev_xdp *xdp)



^ permalink raw reply	[flat|nested] 166+ messages in thread

* [PATCH 4.14 061/146] virtio-net: disable guest csum during XDP set
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (59 preceding siblings ...)
  2018-12-04 10:49 ` [PATCH 4.14 060/146] net: thunderx: set xdp_prog to NULL if bpf_prog_add fails Greg Kroah-Hartman
@ 2018-12-04 10:49 ` Greg Kroah-Hartman
  2018-12-04 10:49 ` [PATCH 4.14 062/146] virtio-net: fail XDP set if guest csum is negotiated Greg Kroah-Hartman
                   ` (89 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:49 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Jesper Dangaard Brouer, Pavel Popa,
	David Ahern, Jason Wang, David S. Miller

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Jason Wang <jasowang@redhat.com>

[ Upstream commit e59ff2c49ae16e1d179de679aca81405829aee6c ]

We don't disable VIRTIO_NET_F_GUEST_CSUM if XDP was set. This means we
can receive partial csumed packets with metadata kept in the
vnet_hdr. This may have several side effects:

- It could be overridden by header adjustment, thus is might be not
  correct after XDP processing.
- There's no way to pass such metadata information through
  XDP_REDIRECT to another driver.
- XDP does not support checksum offload right now.

So simply disable guest csum if possible in this the case of XDP.

Fixes: 3f93522ffab2d ("virtio-net: switch off offloads on demand if possible on XDP set")
Reported-by: Jesper Dangaard Brouer <brouer@redhat.com>
Cc: Jesper Dangaard Brouer <brouer@redhat.com>
Cc: Pavel Popa <pashinho1990@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/net/virtio_net.c |    8 ++------
 1 file changed, 2 insertions(+), 6 deletions(-)

--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -61,7 +61,8 @@ static const unsigned long guest_offload
 	VIRTIO_NET_F_GUEST_TSO4,
 	VIRTIO_NET_F_GUEST_TSO6,
 	VIRTIO_NET_F_GUEST_ECN,
-	VIRTIO_NET_F_GUEST_UFO
+	VIRTIO_NET_F_GUEST_UFO,
+	VIRTIO_NET_F_GUEST_CSUM
 };
 
 struct virtnet_stats {
@@ -1939,9 +1940,6 @@ static int virtnet_clear_guest_offloads(
 	if (!vi->guest_offloads)
 		return 0;
 
-	if (virtio_has_feature(vi->vdev, VIRTIO_NET_F_GUEST_CSUM))
-		offloads = 1ULL << VIRTIO_NET_F_GUEST_CSUM;
-
 	return virtnet_set_guest_offloads(vi, offloads);
 }
 
@@ -1951,8 +1949,6 @@ static int virtnet_restore_guest_offload
 
 	if (!vi->guest_offloads)
 		return 0;
-	if (virtio_has_feature(vi->vdev, VIRTIO_NET_F_GUEST_CSUM))
-		offloads |= 1ULL << VIRTIO_NET_F_GUEST_CSUM;
 
 	return virtnet_set_guest_offloads(vi, offloads);
 }



^ permalink raw reply	[flat|nested] 166+ messages in thread

* [PATCH 4.14 062/146] virtio-net: fail XDP set if guest csum is negotiated
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (60 preceding siblings ...)
  2018-12-04 10:49 ` [PATCH 4.14 061/146] virtio-net: disable guest csum during XDP set Greg Kroah-Hartman
@ 2018-12-04 10:49 ` Greg Kroah-Hartman
  2018-12-04 10:49 ` [PATCH 4.14 063/146] net: thunderx: set tso_hdrs pointer to NULL in nicvf_free_snd_queue Greg Kroah-Hartman
                   ` (88 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:49 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Jesper Dangaard Brouer, Pavel Popa,
	David Ahern, Jason Wang, David S. Miller

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Jason Wang <jasowang@redhat.com>

[ Upstream commit 18ba58e1c234ea1a2d9835ac8c1735d965ce4640 ]

We don't support partial csumed packet since its metadata will be lost
or incorrect during XDP processing. So fail the XDP set if guest_csum
feature is negotiated.

Fixes: f600b6905015 ("virtio_net: Add XDP support")
Reported-by: Jesper Dangaard Brouer <brouer@redhat.com>
Cc: Jesper Dangaard Brouer <brouer@redhat.com>
Cc: Pavel Popa <pashinho1990@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/net/virtio_net.c |    5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -1966,8 +1966,9 @@ static int virtnet_xdp_set(struct net_de
 	    && (virtio_has_feature(vi->vdev, VIRTIO_NET_F_GUEST_TSO4) ||
 	        virtio_has_feature(vi->vdev, VIRTIO_NET_F_GUEST_TSO6) ||
 	        virtio_has_feature(vi->vdev, VIRTIO_NET_F_GUEST_ECN) ||
-		virtio_has_feature(vi->vdev, VIRTIO_NET_F_GUEST_UFO))) {
-		NL_SET_ERR_MSG_MOD(extack, "Can't set XDP while host is implementing LRO, disable LRO first");
+		virtio_has_feature(vi->vdev, VIRTIO_NET_F_GUEST_UFO) ||
+		virtio_has_feature(vi->vdev, VIRTIO_NET_F_GUEST_CSUM))) {
+		NL_SET_ERR_MSG_MOD(extack, "Can't set XDP while host is implementing LRO/CSUM, disable LRO/CSUM first");
 		return -EOPNOTSUPP;
 	}
 



^ permalink raw reply	[flat|nested] 166+ messages in thread

* [PATCH 4.14 063/146] net: thunderx: set tso_hdrs pointer to NULL in nicvf_free_snd_queue
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (61 preceding siblings ...)
  2018-12-04 10:49 ` [PATCH 4.14 062/146] virtio-net: fail XDP set if guest csum is negotiated Greg Kroah-Hartman
@ 2018-12-04 10:49 ` Greg Kroah-Hartman
  2018-12-04 10:49 ` [PATCH 4.14 064/146] packet: copy user buffers before orphan or clone Greg Kroah-Hartman
                   ` (87 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:49 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Lorenzo Bianconi, David S. Miller

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>

[ Upstream commit ef2a7cf1d8831535b8991459567b385661eb4a36 ]

Reset snd_queue tso_hdrs pointer to NULL in nicvf_free_snd_queue routine
since it is used to check if tso dma descriptor queue has been previously
allocated. The issue can be triggered with the following reproducer:

$ip link set dev enP2p1s0v0 xdpdrv obj xdp_dummy.o
$ip link set dev enP2p1s0v0 xdpdrv off

[  341.467649] WARNING: CPU: 74 PID: 2158 at mm/vmalloc.c:1511 __vunmap+0x98/0xe0
[  341.515010] Hardware name: GIGABYTE H270-T70/MT70-HD0, BIOS T49 02/02/2018
[  341.521874] pstate: 60400005 (nZCv daif +PAN -UAO)
[  341.526654] pc : __vunmap+0x98/0xe0
[  341.530132] lr : __vunmap+0x98/0xe0
[  341.533609] sp : ffff00001c5db860
[  341.536913] x29: ffff00001c5db860 x28: 0000000000020000
[  341.542214] x27: ffff810feb5090b0 x26: ffff000017e57000
[  341.547515] x25: 0000000000000000 x24: 00000000fbd00000
[  341.552816] x23: 0000000000000000 x22: ffff810feb5090b0
[  341.558117] x21: 0000000000000000 x20: 0000000000000000
[  341.563418] x19: ffff000017e57000 x18: 0000000000000000
[  341.568719] x17: 0000000000000000 x16: 0000000000000000
[  341.574020] x15: 0000000000000010 x14: ffffffffffffffff
[  341.579321] x13: ffff00008985eb27 x12: ffff00000985eb2f
[  341.584622] x11: ffff0000096b3000 x10: ffff00001c5db510
[  341.589923] x9 : 00000000ffffffd0 x8 : ffff0000086868e8
[  341.595224] x7 : 3430303030303030 x6 : 00000000000006ef
[  341.600525] x5 : 00000000003fffff x4 : 0000000000000000
[  341.605825] x3 : 0000000000000000 x2 : ffffffffffffffff
[  341.611126] x1 : ffff0000096b3728 x0 : 0000000000000038
[  341.616428] Call trace:
[  341.618866]  __vunmap+0x98/0xe0
[  341.621997]  vunmap+0x3c/0x50
[  341.624961]  arch_dma_free+0x68/0xa0
[  341.628534]  dma_direct_free+0x50/0x80
[  341.632285]  nicvf_free_resources+0x160/0x2d8 [nicvf]
[  341.637327]  nicvf_config_data_transfer+0x174/0x5e8 [nicvf]
[  341.642890]  nicvf_stop+0x298/0x340 [nicvf]
[  341.647066]  __dev_close_many+0x9c/0x108
[  341.650977]  dev_close_many+0xa4/0x158
[  341.654720]  rollback_registered_many+0x140/0x530
[  341.659414]  rollback_registered+0x54/0x80
[  341.663499]  unregister_netdevice_queue+0x9c/0xe8
[  341.668192]  unregister_netdev+0x28/0x38
[  341.672106]  nicvf_remove+0xa4/0xa8 [nicvf]
[  341.676280]  nicvf_shutdown+0x20/0x30 [nicvf]
[  341.680630]  pci_device_shutdown+0x44/0x88
[  341.684720]  device_shutdown+0x144/0x250
[  341.688640]  kernel_restart_prepare+0x44/0x50
[  341.692986]  kernel_restart+0x20/0x68
[  341.696638]  __se_sys_reboot+0x210/0x238
[  341.700550]  __arm64_sys_reboot+0x24/0x30
[  341.704555]  el0_svc_handler+0x94/0x110
[  341.708382]  el0_svc+0x8/0xc
[  341.711252] ---[ end trace 3f4019c8439959c9 ]---
[  341.715874] page:ffff7e0003ef4000 count:0 mapcount:0 mapping:0000000000000000 index:0x4
[  341.723872] flags: 0x1fffe000000000()
[  341.727527] raw: 001fffe000000000 ffff7e0003f1a008 ffff7e0003ef4048 0000000000000000
[  341.735263] raw: 0000000000000004 0000000000000000 00000000ffffffff 0000000000000000
[  341.742994] page dumped because: VM_BUG_ON_PAGE(page_ref_count(page) == 0)

where xdp_dummy.c is a simple bpf program that forwards the incoming
frames to the network stack (available here:
https://github.com/altoor/xdp_walkthrough_examples/blob/master/sample_1/xdp_dummy.c)

Fixes: 05c773f52b96 ("net: thunderx: Add basic XDP support")
Fixes: 4863dea3fab0 ("net: Adding support for Cavium ThunderX network controller")
Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/net/ethernet/cavium/thunder/nicvf_queues.c |    4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

--- a/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
@@ -585,10 +585,12 @@ static void nicvf_free_snd_queue(struct
 	if (!sq->dmem.base)
 		return;
 
-	if (sq->tso_hdrs)
+	if (sq->tso_hdrs) {
 		dma_free_coherent(&nic->pdev->dev,
 				  sq->dmem.q_len * TSO_HEADER_SIZE,
 				  sq->tso_hdrs, sq->tso_hdrs_phys);
+		sq->tso_hdrs = NULL;
+	}
 
 	/* Free pending skbs in the queue */
 	smp_rmb();



^ permalink raw reply	[flat|nested] 166+ messages in thread

* [PATCH 4.14 064/146] packet: copy user buffers before orphan or clone
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (62 preceding siblings ...)
  2018-12-04 10:49 ` [PATCH 4.14 063/146] net: thunderx: set tso_hdrs pointer to NULL in nicvf_free_snd_queue Greg Kroah-Hartman
@ 2018-12-04 10:49 ` Greg Kroah-Hartman
  2018-12-04 10:49 ` [PATCH 4.14 065/146] rapidio/rionet: do not free skb before reading its length Greg Kroah-Hartman
                   ` (86 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:49 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Anand H. Krishnan, Willem de Bruijn,
	David S. Miller

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Willem de Bruijn <willemb@google.com>

[ Upstream commit 5cd8d46ea1562be80063f53c7c6a5f40224de623 ]

tpacket_snd sends packets with user pages linked into skb frags. It
notifies that pages can be reused when the skb is released by setting
skb->destructor to tpacket_destruct_skb.

This can cause data corruption if the skb is orphaned (e.g., on
transmit through veth) or cloned (e.g., on mirror to another psock).

Create a kernel-private copy of data in these cases, same as tun/tap
zerocopy transmission. Reuse that infrastructure: mark the skb as
SKBTX_ZEROCOPY_FRAG, which will trigger copy in skb_orphan_frags(_rx).

Unlike other zerocopy packets, do not set shinfo destructor_arg to
struct ubuf_info. tpacket_destruct_skb already uses that ptr to notify
when the original skb is released and a timestamp is recorded. Do not
change this timestamp behavior. The ubuf_info->callback is not needed
anyway, as no zerocopy notification is expected.

Mark destructor_arg as not-a-uarg by setting the lower bit to 1. The
resulting value is not a valid ubuf_info pointer, nor a valid
tpacket_snd frame address. Add skb_zcopy_.._nouarg helpers for this.

The fix relies on features introduced in commit 52267790ef52 ("sock:
add MSG_ZEROCOPY"), so can be backported as is only to 4.14.

Tested with from `./in_netns.sh ./txring_overwrite` from
http://github.com/wdebruij/kerneltools/tests

Fixes: 69e3c75f4d54 ("net: TX_RING and packet mmap")
Reported-by: Anand H. Krishnan <anandhkrishnan@gmail.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 include/linux/skbuff.h |   18 +++++++++++++++++-
 net/packet/af_packet.c |    4 ++--
 2 files changed, 19 insertions(+), 3 deletions(-)

--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -1288,6 +1288,22 @@ static inline void skb_zcopy_set(struct
 	}
 }
 
+static inline void skb_zcopy_set_nouarg(struct sk_buff *skb, void *val)
+{
+	skb_shinfo(skb)->destructor_arg = (void *)((uintptr_t) val | 0x1UL);
+	skb_shinfo(skb)->tx_flags |= SKBTX_ZEROCOPY_FRAG;
+}
+
+static inline bool skb_zcopy_is_nouarg(struct sk_buff *skb)
+{
+	return (uintptr_t) skb_shinfo(skb)->destructor_arg & 0x1UL;
+}
+
+static inline void *skb_zcopy_get_nouarg(struct sk_buff *skb)
+{
+	return (void *)((uintptr_t) skb_shinfo(skb)->destructor_arg & ~0x1UL);
+}
+
 /* Release a reference on a zerocopy structure */
 static inline void skb_zcopy_clear(struct sk_buff *skb, bool zerocopy)
 {
@@ -1297,7 +1313,7 @@ static inline void skb_zcopy_clear(struc
 		if (uarg->callback == sock_zerocopy_callback) {
 			uarg->zerocopy = uarg->zerocopy && zerocopy;
 			sock_zerocopy_put(uarg);
-		} else {
+		} else if (!skb_zcopy_is_nouarg(skb)) {
 			uarg->callback(uarg, zerocopy);
 		}
 
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -2433,7 +2433,7 @@ static void tpacket_destruct_skb(struct
 		void *ph;
 		__u32 ts;
 
-		ph = skb_shinfo(skb)->destructor_arg;
+		ph = skb_zcopy_get_nouarg(skb);
 		packet_dec_pending(&po->tx_ring);
 
 		ts = __packet_set_timestamp(po, ph, skb);
@@ -2499,7 +2499,7 @@ static int tpacket_fill_skb(struct packe
 	skb->priority = po->sk.sk_priority;
 	skb->mark = po->sk.sk_mark;
 	sock_tx_timestamp(&po->sk, sockc->tsflags, &skb_shinfo(skb)->tx_flags);
-	skb_shinfo(skb)->destructor_arg = ph.raw;
+	skb_zcopy_set_nouarg(skb, ph.raw);
 
 	skb_reserve(skb, hlen);
 	skb_reset_network_header(skb);



^ permalink raw reply	[flat|nested] 166+ messages in thread

* [PATCH 4.14 065/146] rapidio/rionet: do not free skb before reading its length
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (63 preceding siblings ...)
  2018-12-04 10:49 ` [PATCH 4.14 064/146] packet: copy user buffers before orphan or clone Greg Kroah-Hartman
@ 2018-12-04 10:49 ` Greg Kroah-Hartman
  2018-12-04 10:49 ` [PATCH 4.14 066/146] s390/qeth: fix length check in SNMP processing Greg Kroah-Hartman
                   ` (85 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:49 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Pan Bian, David S. Miller

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Pan Bian <bianpan2016@163.com>

[ Upstream commit cfc435198f53a6fa1f656d98466b24967ff457d0 ]

skb is freed via dev_kfree_skb_any, however, skb->len is read then. This
may result in a use-after-free bug.

Fixes: e6161d64263 ("rapidio/rionet: rework driver initialization and removal")
Signed-off-by: Pan Bian <bianpan2016@163.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/net/rionet.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/net/rionet.c
+++ b/drivers/net/rionet.c
@@ -216,9 +216,9 @@ static int rionet_start_xmit(struct sk_b
 			 * it just report sending a packet to the target
 			 * (without actual packet transfer).
 			 */
-			dev_kfree_skb_any(skb);
 			ndev->stats.tx_packets++;
 			ndev->stats.tx_bytes += skb->len;
+			dev_kfree_skb_any(skb);
 		}
 	}
 



^ permalink raw reply	[flat|nested] 166+ messages in thread

* [PATCH 4.14 066/146] s390/qeth: fix length check in SNMP processing
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (64 preceding siblings ...)
  2018-12-04 10:49 ` [PATCH 4.14 065/146] rapidio/rionet: do not free skb before reading its length Greg Kroah-Hartman
@ 2018-12-04 10:49 ` Greg Kroah-Hartman
  2018-12-04 10:49 ` [PATCH 4.14 067/146] usbnet: ipheth: fix potential recvmsg bug and recvmsg bug 2 Greg Kroah-Hartman
                   ` (84 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:49 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Julian Wiedmann, Ursula Braun,
	David S. Miller

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Julian Wiedmann <jwi@linux.ibm.com>

[ Upstream commit 9a764c1e59684c0358e16ccaafd870629f2cfe67 ]

The response for a SNMP request can consist of multiple parts, which
the cmd callback stages into a kernel buffer until all parts have been
received. If the callback detects that the staging buffer provides
insufficient space, it bails out with error.
This processing is buggy for the first part of the response - while it
initially checks for a length of 'data_len', it later copies an
additional amount of 'offsetof(struct qeth_snmp_cmd, data)' bytes.

Fix the calculation of 'data_len' for the first part of the response.
This also nicely cleans up the memcpy code.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Reviewed-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/s390/net/qeth_core_main.c |   27 ++++++++++++---------------
 1 file changed, 12 insertions(+), 15 deletions(-)

--- a/drivers/s390/net/qeth_core_main.c
+++ b/drivers/s390/net/qeth_core_main.c
@@ -4545,8 +4545,8 @@ static int qeth_snmp_command_cb(struct q
 {
 	struct qeth_ipa_cmd *cmd;
 	struct qeth_arp_query_info *qinfo;
-	struct qeth_snmp_cmd *snmp;
 	unsigned char *data;
+	void *snmp_data;
 	__u16 data_len;
 
 	QETH_CARD_TEXT(card, 3, "snpcmdcb");
@@ -4554,7 +4554,6 @@ static int qeth_snmp_command_cb(struct q
 	cmd = (struct qeth_ipa_cmd *) sdata;
 	data = (unsigned char *)((char *)cmd - reply->offset);
 	qinfo = (struct qeth_arp_query_info *) reply->param;
-	snmp = &cmd->data.setadapterparms.data.snmp;
 
 	if (cmd->hdr.return_code) {
 		QETH_CARD_TEXT_(card, 4, "scer1%x", cmd->hdr.return_code);
@@ -4567,10 +4566,15 @@ static int qeth_snmp_command_cb(struct q
 		return 0;
 	}
 	data_len = *((__u16 *)QETH_IPA_PDU_LEN_PDU1(data));
-	if (cmd->data.setadapterparms.hdr.seq_no == 1)
-		data_len -= (__u16)((char *)&snmp->data - (char *)cmd);
-	else
-		data_len -= (__u16)((char *)&snmp->request - (char *)cmd);
+	if (cmd->data.setadapterparms.hdr.seq_no == 1) {
+		snmp_data = &cmd->data.setadapterparms.data.snmp;
+		data_len -= offsetof(struct qeth_ipa_cmd,
+				     data.setadapterparms.data.snmp);
+	} else {
+		snmp_data = &cmd->data.setadapterparms.data.snmp.request;
+		data_len -= offsetof(struct qeth_ipa_cmd,
+				     data.setadapterparms.data.snmp.request);
+	}
 
 	/* check if there is enough room in userspace */
 	if ((qinfo->udata_len - qinfo->udata_offset) < data_len) {
@@ -4583,16 +4587,9 @@ static int qeth_snmp_command_cb(struct q
 	QETH_CARD_TEXT_(card, 4, "sseqn%i",
 		cmd->data.setadapterparms.hdr.seq_no);
 	/*copy entries to user buffer*/
-	if (cmd->data.setadapterparms.hdr.seq_no == 1) {
-		memcpy(qinfo->udata + qinfo->udata_offset,
-		       (char *)snmp,
-		       data_len + offsetof(struct qeth_snmp_cmd, data));
-		qinfo->udata_offset += offsetof(struct qeth_snmp_cmd, data);
-	} else {
-		memcpy(qinfo->udata + qinfo->udata_offset,
-		       (char *)&snmp->request, data_len);
-	}
+	memcpy(qinfo->udata + qinfo->udata_offset, snmp_data, data_len);
 	qinfo->udata_offset += data_len;
+
 	/* check if all replies received ... */
 		QETH_CARD_TEXT_(card, 4, "srtot%i",
 			       cmd->data.setadapterparms.hdr.used_total);



^ permalink raw reply	[flat|nested] 166+ messages in thread

* [PATCH 4.14 067/146] usbnet: ipheth: fix potential recvmsg bug and recvmsg bug 2
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (65 preceding siblings ...)
  2018-12-04 10:49 ` [PATCH 4.14 066/146] s390/qeth: fix length check in SNMP processing Greg Kroah-Hartman
@ 2018-12-04 10:49 ` Greg Kroah-Hartman
  2018-12-04 10:49 ` [PATCH 4.14 068/146] sched/core: Fix cpu.max vs. cpuhotplug deadlock Greg Kroah-Hartman
                   ` (83 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:49 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Oliver Zweigle, Bernd Eckstein,
	David S. Miller

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Bernd Eckstein <3erndeckstein@gmail.com>

[ Upstream commit 45611c61dd503454b2edae00aabe1e429ec49ebe ]

The bug is not easily reproducable, as it may occur very infrequently
(we had machines with 20minutes heavy downloading before it occurred)
However, on a virual machine (VMWare on Windows 10 host) it occurred
pretty frequently (1-2 seconds after a speedtest was started)

dev->tx_skb mab be freed via dev_kfree_skb_irq on a callback
before it is set.

This causes the following problems:
- double free of the skb or potential memory leak
- in dmesg: 'recvmsg bug' and 'recvmsg bug 2' and eventually
  general protection fault

Example dmesg output:
[  134.841986] ------------[ cut here ]------------
[  134.841987] recvmsg bug: copied 9C24A555 seq 9C24B557 rcvnxt 9C25A6B3 fl 0
[  134.841993] WARNING: CPU: 7 PID: 2629 at /build/linux-hwe-On9fm7/linux-hwe-4.15.0/net/ipv4/tcp.c:1865 tcp_recvmsg+0x44d/0xab0
[  134.841994] Modules linked in: ipheth(OE) kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel aes_x86_64 crypto_simd glue_helper cryptd vmw_balloon intel_rapl_perf joydev input_leds serio_raw vmw_vsock_vmci_transport vsock shpchp i2c_piix4 mac_hid binfmt_misc vmw_vmci parport_pc ppdev lp parport autofs4 vmw_pvscsi vmxnet3 hid_generic usbhid hid vmwgfx ttm drm_kms_helper syscopyarea sysfillrect mptspi mptscsih sysimgblt ahci psmouse fb_sys_fops pata_acpi mptbase libahci e1000 drm scsi_transport_spi
[  134.842046] CPU: 7 PID: 2629 Comm: python Tainted: G        W  OE    4.15.0-34-generic #37~16.04.1-Ubuntu
[  134.842046] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 05/19/2017
[  134.842048] RIP: 0010:tcp_recvmsg+0x44d/0xab0
[  134.842048] RSP: 0018:ffffa6630422bcc8 EFLAGS: 00010286
[  134.842049] RAX: 0000000000000000 RBX: ffff997616f4f200 RCX: 0000000000000006
[  134.842049] RDX: 0000000000000007 RSI: 0000000000000082 RDI: ffff9976257d6490
[  134.842050] RBP: ffffa6630422bd98 R08: 0000000000000001 R09: 000000000004bba4
[  134.842050] R10: 0000000001e00c6f R11: 000000000004bba4 R12: ffff99760dee3000
[  134.842051] R13: 0000000000000000 R14: ffff99760dee3514 R15: 0000000000000000
[  134.842051] FS:  00007fe332347700(0000) GS:ffff9976257c0000(0000) knlGS:0000000000000000
[  134.842052] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  134.842053] CR2: 0000000001e41000 CR3: 000000020e9b4006 CR4: 00000000003606e0
[  134.842055] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  134.842055] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  134.842057] Call Trace:
[  134.842060]  ? aa_sk_perm+0x53/0x1a0
[  134.842064]  inet_recvmsg+0x51/0xc0
[  134.842066]  sock_recvmsg+0x43/0x50
[  134.842070]  SYSC_recvfrom+0xe4/0x160
[  134.842072]  ? __schedule+0x3de/0x8b0
[  134.842075]  ? ktime_get_ts64+0x4c/0xf0
[  134.842079]  SyS_recvfrom+0xe/0x10
[  134.842082]  do_syscall_64+0x73/0x130
[  134.842086]  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[  134.842086] RIP: 0033:0x7fe331f5a81d
[  134.842088] RSP: 002b:00007ffe8da98398 EFLAGS: 00000246 ORIG_RAX: 000000000000002d
[  134.842090] RAX: ffffffffffffffda RBX: ffffffffffffffff RCX: 00007fe331f5a81d
[  134.842094] RDX: 00000000000003fb RSI: 0000000001e00874 RDI: 0000000000000003
[  134.842095] RBP: 00007fe32f642c70 R08: 0000000000000000 R09: 0000000000000000
[  134.842097] R10: 0000000000000000 R11: 0000000000000246 R12: 00007fe332347698
[  134.842099] R13: 0000000001b7e0a0 R14: 0000000001e00874 R15: 0000000000000000
[  134.842103] Code: 24 fd ff ff e9 cc fe ff ff 48 89 d8 41 8b 8c 24 10 05 00 00 44 8b 45 80 48 c7 c7 08 bd 59 8b 48 89 85 68 ff ff ff e8 b3 c4 7d ff <0f> 0b 48 8b 85 68 ff ff ff e9 e9 fe ff ff 41 8b 8c 24 10 05 00
[  134.842126] ---[ end trace b7138fc08c83147f ]---
[  134.842144] general protection fault: 0000 [#1] SMP PTI
[  134.842145] Modules linked in: ipheth(OE) kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel aes_x86_64 crypto_simd glue_helper cryptd vmw_balloon intel_rapl_perf joydev input_leds serio_raw vmw_vsock_vmci_transport vsock shpchp i2c_piix4 mac_hid binfmt_misc vmw_vmci parport_pc ppdev lp parport autofs4 vmw_pvscsi vmxnet3 hid_generic usbhid hid vmwgfx ttm drm_kms_helper syscopyarea sysfillrect mptspi mptscsih sysimgblt ahci psmouse fb_sys_fops pata_acpi mptbase libahci e1000 drm scsi_transport_spi
[  134.842161] CPU: 7 PID: 2629 Comm: python Tainted: G        W  OE    4.15.0-34-generic #37~16.04.1-Ubuntu
[  134.842162] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 05/19/2017
[  134.842164] RIP: 0010:tcp_close+0x2c6/0x440
[  134.842165] RSP: 0018:ffffa6630422bde8 EFLAGS: 00010202
[  134.842167] RAX: 0000000000000000 RBX: ffff99760dee3000 RCX: 0000000180400034
[  134.842168] RDX: 5c4afd407207a6c4 RSI: ffffe868495bd300 RDI: ffff997616f4f200
[  134.842169] RBP: ffffa6630422be08 R08: 0000000016f4d401 R09: 0000000180400034
[  134.842169] R10: ffffa6630422bd98 R11: 0000000000000000 R12: 000000000000600c
[  134.842170] R13: 0000000000000000 R14: ffff99760dee30c8 R15: ffff9975bd44fe00
[  134.842171] FS:  00007fe332347700(0000) GS:ffff9976257c0000(0000) knlGS:0000000000000000
[  134.842173] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  134.842174] CR2: 0000000001e41000 CR3: 000000020e9b4006 CR4: 00000000003606e0
[  134.842177] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  134.842178] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  134.842179] Call Trace:
[  134.842181]  inet_release+0x42/0x70
[  134.842183]  __sock_release+0x42/0xb0
[  134.842184]  sock_close+0x15/0x20
[  134.842187]  __fput+0xea/0x220
[  134.842189]  ____fput+0xe/0x10
[  134.842191]  task_work_run+0x8a/0xb0
[  134.842193]  exit_to_usermode_loop+0xc4/0xd0
[  134.842195]  do_syscall_64+0xf4/0x130
[  134.842197]  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[  134.842197] RIP: 0033:0x7fe331f5a560
[  134.842198] RSP: 002b:00007ffe8da982e8 EFLAGS: 00000246 ORIG_RAX: 0000000000000003
[  134.842200] RAX: 0000000000000000 RBX: 00007fe32f642c70 RCX: 00007fe331f5a560
[  134.842201] RDX: 00000000008f5320 RSI: 0000000001cd4b50 RDI: 0000000000000003
[  134.842202] RBP: 00007fe32f6500f8 R08: 000000000000003c R09: 00000000009343c0
[  134.842203] R10: 0000000000000000 R11: 0000000000000246 R12: 00007fe32f6500d0
[  134.842204] R13: 00000000008f5320 R14: 00000000008f5320 R15: 0000000001cd4770
[  134.842205] Code: c8 00 00 00 45 31 e4 49 39 fe 75 4d eb 50 83 ab d8 00 00 00 01 48 8b 17 48 8b 47 08 48 c7 07 00 00 00 00 48 c7 47 08 00 00 00 00 <48> 89 42 08 48 89 10 0f b6 57 34 8b 47 2c 2b 47 28 83 e2 01 80
[  134.842226] RIP: tcp_close+0x2c6/0x440 RSP: ffffa6630422bde8
[  134.842227] ---[ end trace b7138fc08c831480 ]---

The proposed patch eliminates a potential racing condition.
Before, usb_submit_urb was called and _after_ that, the skb was attached
(dev->tx_skb). So, on a callback it was possible, however unlikely that the
skb was freed before it was set. That way (because dev->tx_skb was not set
to NULL after it was freed), it could happen that a skb from a earlier
transmission was freed a second time (and the skb we should have freed did
not get freed at all)

Now we free the skb directly in ipheth_tx(). It is not passed to the
callback anymore, eliminating the posibility of a double free of the same
skb. Depending on the retval of usb_submit_urb() we use dev_kfree_skb_any()
respectively dev_consume_skb_any() to free the skb.

Signed-off-by: Oliver Zweigle <Oliver.Zweigle@faro.com>
Signed-off-by: Bernd Eckstein <3ernd.Eckstein@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/net/usb/ipheth.c |   10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)

--- a/drivers/net/usb/ipheth.c
+++ b/drivers/net/usb/ipheth.c
@@ -140,7 +140,6 @@ struct ipheth_device {
 	struct usb_device *udev;
 	struct usb_interface *intf;
 	struct net_device *net;
-	struct sk_buff *tx_skb;
 	struct urb *tx_urb;
 	struct urb *rx_urb;
 	unsigned char *tx_buf;
@@ -229,6 +228,7 @@ static void ipheth_rcvbulk_callback(stru
 	case -ENOENT:
 	case -ECONNRESET:
 	case -ESHUTDOWN:
+	case -EPROTO:
 		return;
 	case 0:
 		break;
@@ -280,7 +280,6 @@ static void ipheth_sndbulk_callback(stru
 		dev_err(&dev->intf->dev, "%s: urb status: %d\n",
 		__func__, status);
 
-	dev_kfree_skb_irq(dev->tx_skb);
 	netif_wake_queue(dev->net);
 }
 
@@ -410,7 +409,7 @@ static int ipheth_tx(struct sk_buff *skb
 	if (skb->len > IPHETH_BUF_SIZE) {
 		WARN(1, "%s: skb too large: %d bytes\n", __func__, skb->len);
 		dev->net->stats.tx_dropped++;
-		dev_kfree_skb_irq(skb);
+		dev_kfree_skb_any(skb);
 		return NETDEV_TX_OK;
 	}
 
@@ -430,12 +429,11 @@ static int ipheth_tx(struct sk_buff *skb
 		dev_err(&dev->intf->dev, "%s: usb_submit_urb: %d\n",
 			__func__, retval);
 		dev->net->stats.tx_errors++;
-		dev_kfree_skb_irq(skb);
+		dev_kfree_skb_any(skb);
 	} else {
-		dev->tx_skb = skb;
-
 		dev->net->stats.tx_packets++;
 		dev->net->stats.tx_bytes += skb->len;
+		dev_consume_skb_any(skb);
 		netif_stop_queue(net);
 	}
 



^ permalink raw reply	[flat|nested] 166+ messages in thread

* [PATCH 4.14 068/146] sched/core: Fix cpu.max vs. cpuhotplug deadlock
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (66 preceding siblings ...)
  2018-12-04 10:49 ` [PATCH 4.14 067/146] usbnet: ipheth: fix potential recvmsg bug and recvmsg bug 2 Greg Kroah-Hartman
@ 2018-12-04 10:49 ` Greg Kroah-Hartman
  2018-12-04 10:49 ` [PATCH 4.14 069/146] x86/bugs: Add AMDs variant of SSB_NO Greg Kroah-Hartman
                   ` (82 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:49 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Tejun Heo, Peter Zijlstra (Intel),
	Linus Torvalds, Thomas Gleixner, Ingo Molnar

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Peter Zijlstra peterz@infradead.org

commit ce48c146495a1a50e48cdbfbfaba3e708be7c07c upstream

Tejun reported the following cpu-hotplug lock (percpu-rwsem) read recursion:

  tg_set_cfs_bandwidth()
    get_online_cpus()
      cpus_read_lock()

    cfs_bandwidth_usage_inc()
      static_key_slow_inc()
        cpus_read_lock()

Reported-by: Tejun Heo <tj@kernel.org>
Tested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20180122215328.GP3397@worktop
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 include/linux/jump_label.h |    7 +++++++
 kernel/jump_label.c        |   12 +++++++++---
 kernel/sched/fair.c        |    4 ++--
 3 files changed, 18 insertions(+), 5 deletions(-)

--- a/include/linux/jump_label.h
+++ b/include/linux/jump_label.h
@@ -160,6 +160,8 @@ extern void arch_jump_label_transform_st
 extern int jump_label_text_reserved(void *start, void *end);
 extern void static_key_slow_inc(struct static_key *key);
 extern void static_key_slow_dec(struct static_key *key);
+extern void static_key_slow_inc_cpuslocked(struct static_key *key);
+extern void static_key_slow_dec_cpuslocked(struct static_key *key);
 extern void jump_label_apply_nops(struct module *mod);
 extern int static_key_count(struct static_key *key);
 extern void static_key_enable(struct static_key *key);
@@ -222,6 +224,9 @@ static inline void static_key_slow_dec(s
 	atomic_dec(&key->enabled);
 }
 
+#define static_key_slow_inc_cpuslocked(key) static_key_slow_inc(key)
+#define static_key_slow_dec_cpuslocked(key) static_key_slow_dec(key)
+
 static inline int jump_label_text_reserved(void *start, void *end)
 {
 	return 0;
@@ -416,6 +421,8 @@ extern bool ____wrong_branch_error(void)
 
 #define static_branch_inc(x)		static_key_slow_inc(&(x)->key)
 #define static_branch_dec(x)		static_key_slow_dec(&(x)->key)
+#define static_branch_inc_cpuslocked(x)	static_key_slow_inc_cpuslocked(&(x)->key)
+#define static_branch_dec_cpuslocked(x)	static_key_slow_dec_cpuslocked(&(x)->key)
 
 /*
  * Normal usage; boolean enable/disable.
--- a/kernel/jump_label.c
+++ b/kernel/jump_label.c
@@ -79,7 +79,7 @@ int static_key_count(struct static_key *
 }
 EXPORT_SYMBOL_GPL(static_key_count);
 
-static void static_key_slow_inc_cpuslocked(struct static_key *key)
+void static_key_slow_inc_cpuslocked(struct static_key *key)
 {
 	int v, v1;
 
@@ -180,7 +180,7 @@ void static_key_disable(struct static_ke
 }
 EXPORT_SYMBOL_GPL(static_key_disable);
 
-static void static_key_slow_dec_cpuslocked(struct static_key *key,
+static void __static_key_slow_dec_cpuslocked(struct static_key *key,
 					   unsigned long rate_limit,
 					   struct delayed_work *work)
 {
@@ -211,7 +211,7 @@ static void __static_key_slow_dec(struct
 				  struct delayed_work *work)
 {
 	cpus_read_lock();
-	static_key_slow_dec_cpuslocked(key, rate_limit, work);
+	__static_key_slow_dec_cpuslocked(key, rate_limit, work);
 	cpus_read_unlock();
 }
 
@@ -229,6 +229,12 @@ void static_key_slow_dec(struct static_k
 }
 EXPORT_SYMBOL_GPL(static_key_slow_dec);
 
+void static_key_slow_dec_cpuslocked(struct static_key *key)
+{
+	STATIC_KEY_CHECK_USE();
+	__static_key_slow_dec_cpuslocked(key, 0, NULL);
+}
+
 void static_key_slow_dec_deferred(struct static_key_deferred *key)
 {
 	STATIC_KEY_CHECK_USE();
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -4040,12 +4040,12 @@ static inline bool cfs_bandwidth_used(vo
 
 void cfs_bandwidth_usage_inc(void)
 {
-	static_key_slow_inc(&__cfs_bandwidth_used);
+	static_key_slow_inc_cpuslocked(&__cfs_bandwidth_used);
 }
 
 void cfs_bandwidth_usage_dec(void)
 {
-	static_key_slow_dec(&__cfs_bandwidth_used);
+	static_key_slow_dec_cpuslocked(&__cfs_bandwidth_used);
 }
 #else /* HAVE_JUMP_LABEL */
 static bool cfs_bandwidth_used(void)



^ permalink raw reply	[flat|nested] 166+ messages in thread

* [PATCH 4.14 069/146] x86/bugs: Add AMDs variant of SSB_NO
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (67 preceding siblings ...)
  2018-12-04 10:49 ` [PATCH 4.14 068/146] sched/core: Fix cpu.max vs. cpuhotplug deadlock Greg Kroah-Hartman
@ 2018-12-04 10:49 ` Greg Kroah-Hartman
  2018-12-04 10:49 ` [PATCH 4.14 070/146] x86/bugs: Add AMDs SPEC_CTRL MSR usage Greg Kroah-Hartman
                   ` (81 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:49 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Konrad Rzeszutek Wilk,
	Thomas Gleixner, Tom Lendacky, Janakarajan Natarajan, kvm,
	andrew.cooper3, Andy Lutomirski, H. Peter Anvin, Borislav Petkov,
	David Woodhouse

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Konrad Rzeszutek Wilk konrad.wilk@oracle.com

commit 24809860012e0130fbafe536709e08a22b3e959e upstream

The AMD document outlining the SSBD handling
124441_AMD64_SpeculativeStoreBypassDisable_Whitepaper_final.pdf
mentions that the CPUID 8000_0008.EBX[26] will mean that the
speculative store bypass disable is no longer needed.

A copy of this document is available at:
    https://bugzilla.kernel.org/show_bug.cgi?id=199889

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Janakarajan Natarajan <Janakarajan.Natarajan@amd.com>
Cc: kvm@vger.kernel.org
Cc: andrew.cooper3@citrix.com
Cc: Andy Lutomirski <luto@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Link: https://lkml.kernel.org/r/20180601145921.9500-2-konrad.wilk@oracle.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/x86/include/asm/cpufeatures.h |    1 +
 arch/x86/kernel/cpu/common.c       |    3 ++-
 arch/x86/kvm/cpuid.c               |    2 +-
 3 files changed, 4 insertions(+), 2 deletions(-)

--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -285,6 +285,7 @@
 #define X86_FEATURE_AMD_IBRS		(13*32+14) /* "" Indirect Branch Restricted Speculation */
 #define X86_FEATURE_AMD_STIBP		(13*32+15) /* "" Single Thread Indirect Branch Predictors */
 #define X86_FEATURE_VIRT_SSBD		(13*32+25) /* Virtualized Speculative Store Bypass Disable */
+#define X86_FEATURE_AMD_SSB_NO		(13*32+26) /* "" Speculative Store Bypass is fixed in hardware. */
 
 /* Thermal and Power Management Leaf, CPUID level 0x00000006 (EAX), word 14 */
 #define X86_FEATURE_DTHERM		(14*32+ 0) /* Digital Thermal Sensor */
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -958,7 +958,8 @@ static void __init cpu_set_bug_bits(stru
 		rdmsrl(MSR_IA32_ARCH_CAPABILITIES, ia32_cap);
 
 	if (!x86_match_cpu(cpu_no_spec_store_bypass) &&
-	   !(ia32_cap & ARCH_CAP_SSB_NO))
+	   !(ia32_cap & ARCH_CAP_SSB_NO) &&
+	   !cpu_has(c, X86_FEATURE_AMD_SSB_NO))
 		setup_force_cpu_bug(X86_BUG_SPEC_STORE_BYPASS);
 
 	if (x86_match_cpu(cpu_no_speculation))
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -367,7 +367,7 @@ static inline int __do_cpuid_ent(struct
 
 	/* cpuid 0x80000008.ebx */
 	const u32 kvm_cpuid_8000_0008_ebx_x86_features =
-		F(AMD_IBPB) | F(AMD_IBRS) | F(VIRT_SSBD);
+		F(AMD_IBPB) | F(AMD_IBRS) | F(VIRT_SSBD) | F(AMD_SSB_NO);
 
 	/* cpuid 0xC0000001.edx */
 	const u32 kvm_cpuid_C000_0001_edx_x86_features =



^ permalink raw reply	[flat|nested] 166+ messages in thread

* [PATCH 4.14 070/146] x86/bugs: Add AMDs SPEC_CTRL MSR usage
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (68 preceding siblings ...)
  2018-12-04 10:49 ` [PATCH 4.14 069/146] x86/bugs: Add AMDs variant of SSB_NO Greg Kroah-Hartman
@ 2018-12-04 10:49 ` Greg Kroah-Hartman
  2018-12-04 10:49 ` [PATCH 4.14 071/146] x86/bugs: Switch the selection of mitigation from CPU vendor to CPU features Greg Kroah-Hartman
                   ` (80 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:49 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Konrad Rzeszutek Wilk,
	Thomas Gleixner, Tom Lendacky, Janakarajan Natarajan, kvm,
	KarimAllah Ahmed, andrew.cooper3, Joerg Roedel,
	Radim Krčmář,
	Andy Lutomirski, H. Peter Anvin, Paolo Bonzini, Borislav Petkov,
	David Woodhouse, Kees Cook

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Konrad Rzeszutek Wilk konrad.wilk@oracle.com

commit 6ac2f49edb1ef5446089c7c660017732886d62d6 upstream

The AMD document outlining the SSBD handling
124441_AMD64_SpeculativeStoreBypassDisable_Whitepaper_final.pdf
mentions that if CPUID 8000_0008.EBX[24] is set we should be using
the SPEC_CTRL MSR (0x48) over the VIRT SPEC_CTRL MSR (0xC001_011f)
for speculative store bypass disable.

This in effect means we should clear the X86_FEATURE_VIRT_SSBD
flag so that we would prefer the SPEC_CTRL MSR.

See the document titled:
   124441_AMD64_SpeculativeStoreBypassDisable_Whitepaper_final.pdf

A copy of this document is available at
   https://bugzilla.kernel.org/show_bug.cgi?id=199889

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Janakarajan Natarajan <Janakarajan.Natarajan@amd.com>
Cc: kvm@vger.kernel.org
Cc: KarimAllah Ahmed <karahmed@amazon.de>
Cc: andrew.cooper3@citrix.com
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: Kees Cook <keescook@chromium.org>
Link: https://lkml.kernel.org/r/20180601145921.9500-3-konrad.wilk@oracle.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/x86/include/asm/cpufeatures.h |    1 +
 arch/x86/kernel/cpu/bugs.c         |   12 +++++++-----
 arch/x86/kernel/cpu/common.c       |    6 ++++++
 arch/x86/kvm/cpuid.c               |   10 ++++++++--
 arch/x86/kvm/svm.c                 |    8 +++++---
 5 files changed, 27 insertions(+), 10 deletions(-)

--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -284,6 +284,7 @@
 #define X86_FEATURE_AMD_IBPB		(13*32+12) /* "" Indirect Branch Prediction Barrier */
 #define X86_FEATURE_AMD_IBRS		(13*32+14) /* "" Indirect Branch Restricted Speculation */
 #define X86_FEATURE_AMD_STIBP		(13*32+15) /* "" Single Thread Indirect Branch Predictors */
+#define X86_FEATURE_AMD_SSBD		(13*32+24) /* "" Speculative Store Bypass Disable */
 #define X86_FEATURE_VIRT_SSBD		(13*32+25) /* Virtualized Speculative Store Bypass Disable */
 #define X86_FEATURE_AMD_SSB_NO		(13*32+26) /* "" Speculative Store Bypass is fixed in hardware. */
 
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -532,18 +532,20 @@ static enum ssb_mitigation __init __ssb_
 	if (mode == SPEC_STORE_BYPASS_DISABLE) {
 		setup_force_cpu_cap(X86_FEATURE_SPEC_STORE_BYPASS_DISABLE);
 		/*
-		 * Intel uses the SPEC CTRL MSR Bit(2) for this, while AMD uses
-		 * a completely different MSR and bit dependent on family.
+		 * Intel uses the SPEC CTRL MSR Bit(2) for this, while AMD may
+		 * use a completely different MSR and bit dependent on family.
 		 */
 		switch (boot_cpu_data.x86_vendor) {
 		case X86_VENDOR_INTEL:
+		case X86_VENDOR_AMD:
+			if (!static_cpu_has(X86_FEATURE_MSR_SPEC_CTRL)) {
+				x86_amd_ssb_disable();
+				break;
+			}
 			x86_spec_ctrl_base |= SPEC_CTRL_SSBD;
 			x86_spec_ctrl_mask |= SPEC_CTRL_SSBD;
 			wrmsrl(MSR_IA32_SPEC_CTRL, x86_spec_ctrl_base);
 			break;
-		case X86_VENDOR_AMD:
-			x86_amd_ssb_disable();
-			break;
 		}
 	}
 
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -760,6 +760,12 @@ static void init_speculation_control(str
 		set_cpu_cap(c, X86_FEATURE_STIBP);
 		set_cpu_cap(c, X86_FEATURE_MSR_SPEC_CTRL);
 	}
+
+	if (cpu_has(c, X86_FEATURE_AMD_SSBD)) {
+		set_cpu_cap(c, X86_FEATURE_SSBD);
+		set_cpu_cap(c, X86_FEATURE_MSR_SPEC_CTRL);
+		clear_cpu_cap(c, X86_FEATURE_VIRT_SSBD);
+	}
 }
 
 void get_cpu_cap(struct cpuinfo_x86 *c)
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -367,7 +367,8 @@ static inline int __do_cpuid_ent(struct
 
 	/* cpuid 0x80000008.ebx */
 	const u32 kvm_cpuid_8000_0008_ebx_x86_features =
-		F(AMD_IBPB) | F(AMD_IBRS) | F(VIRT_SSBD) | F(AMD_SSB_NO);
+		F(AMD_IBPB) | F(AMD_IBRS) | F(AMD_SSBD) | F(VIRT_SSBD) |
+		F(AMD_SSB_NO);
 
 	/* cpuid 0xC0000001.edx */
 	const u32 kvm_cpuid_C000_0001_edx_x86_features =
@@ -649,7 +650,12 @@ static inline int __do_cpuid_ent(struct
 			entry->ebx |= F(VIRT_SSBD);
 		entry->ebx &= kvm_cpuid_8000_0008_ebx_x86_features;
 		cpuid_mask(&entry->ebx, CPUID_8000_0008_EBX);
-		if (boot_cpu_has(X86_FEATURE_LS_CFG_SSBD))
+		/*
+		 * The preference is to use SPEC CTRL MSR instead of the
+		 * VIRT_SPEC MSR.
+		 */
+		if (boot_cpu_has(X86_FEATURE_LS_CFG_SSBD) &&
+		    !boot_cpu_has(X86_FEATURE_AMD_SSBD))
 			entry->ebx |= F(VIRT_SSBD);
 		break;
 	}
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -3644,7 +3644,8 @@ static int svm_get_msr(struct kvm_vcpu *
 		break;
 	case MSR_IA32_SPEC_CTRL:
 		if (!msr_info->host_initiated &&
-		    !guest_cpuid_has(vcpu, X86_FEATURE_AMD_IBRS))
+		    !guest_cpuid_has(vcpu, X86_FEATURE_AMD_IBRS) &&
+		    !guest_cpuid_has(vcpu, X86_FEATURE_AMD_SSBD))
 			return 1;
 
 		msr_info->data = svm->spec_ctrl;
@@ -3749,11 +3750,12 @@ static int svm_set_msr(struct kvm_vcpu *
 		break;
 	case MSR_IA32_SPEC_CTRL:
 		if (!msr->host_initiated &&
-		    !guest_cpuid_has(vcpu, X86_FEATURE_AMD_IBRS))
+		    !guest_cpuid_has(vcpu, X86_FEATURE_AMD_IBRS) &&
+		    !guest_cpuid_has(vcpu, X86_FEATURE_AMD_SSBD))
 			return 1;
 
 		/* The STIBP bit doesn't fault even if it's not advertised */
-		if (data & ~(SPEC_CTRL_IBRS | SPEC_CTRL_STIBP))
+		if (data & ~(SPEC_CTRL_IBRS | SPEC_CTRL_STIBP | SPEC_CTRL_SSBD))
 			return 1;
 
 		svm->spec_ctrl = data;



^ permalink raw reply	[flat|nested] 166+ messages in thread

* [PATCH 4.14 071/146] x86/bugs: Switch the selection of mitigation from CPU vendor to CPU features
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (69 preceding siblings ...)
  2018-12-04 10:49 ` [PATCH 4.14 070/146] x86/bugs: Add AMDs SPEC_CTRL MSR usage Greg Kroah-Hartman
@ 2018-12-04 10:49 ` Greg Kroah-Hartman
  2018-12-04 10:49 ` [PATCH 4.14 072/146] x86/bugs: Update when to check for the LS_CFG SSBD mitigation Greg Kroah-Hartman
                   ` (79 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:49 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Konrad Rzeszutek Wilk,
	Thomas Gleixner, Kees Cook, kvm, KarimAllah Ahmed,
	andrew.cooper3, H. Peter Anvin, Borislav Petkov, David Woodhouse

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Konrad Rzeszutek Wilk konrad.wilk@oracle.com

commit 108fab4b5c8f12064ef86e02cb0459992affb30f upstream

Both AMD and Intel can have SPEC_CTRL_MSR for SSBD.

However AMD also has two more other ways of doing it - which
are !SPEC_CTRL MSR ways.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Kees Cook <keescook@chromium.org>
Cc: kvm@vger.kernel.org
Cc: KarimAllah Ahmed <karahmed@amazon.de>
Cc: andrew.cooper3@citrix.com
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Link: https://lkml.kernel.org/r/20180601145921.9500-4-konrad.wilk@oracle.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/x86/kernel/cpu/bugs.c |   11 +++--------
 1 file changed, 3 insertions(+), 8 deletions(-)

--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -535,17 +535,12 @@ static enum ssb_mitigation __init __ssb_
 		 * Intel uses the SPEC CTRL MSR Bit(2) for this, while AMD may
 		 * use a completely different MSR and bit dependent on family.
 		 */
-		switch (boot_cpu_data.x86_vendor) {
-		case X86_VENDOR_INTEL:
-		case X86_VENDOR_AMD:
-			if (!static_cpu_has(X86_FEATURE_MSR_SPEC_CTRL)) {
-				x86_amd_ssb_disable();
-				break;
-			}
+		if (!static_cpu_has(X86_FEATURE_MSR_SPEC_CTRL))
+			x86_amd_ssb_disable();
+		else {
 			x86_spec_ctrl_base |= SPEC_CTRL_SSBD;
 			x86_spec_ctrl_mask |= SPEC_CTRL_SSBD;
 			wrmsrl(MSR_IA32_SPEC_CTRL, x86_spec_ctrl_base);
-			break;
 		}
 	}
 



^ permalink raw reply	[flat|nested] 166+ messages in thread

* [PATCH 4.14 072/146] x86/bugs: Update when to check for the LS_CFG SSBD mitigation
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (70 preceding siblings ...)
  2018-12-04 10:49 ` [PATCH 4.14 071/146] x86/bugs: Switch the selection of mitigation from CPU vendor to CPU features Greg Kroah-Hartman
@ 2018-12-04 10:49 ` Greg Kroah-Hartman
  2018-12-04 10:49 ` [PATCH 4.14 073/146] x86/bugs: Fix the AMD SSBD usage of the SPEC_CTRL MSR Greg Kroah-Hartman
                   ` (78 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:49 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Tom Lendacky, Borislav Petkov,
	David Woodhouse, Konrad Rzeszutek Wilk, Linus Torvalds,
	Peter Zijlstra, Thomas Gleixner, Ingo Molnar

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Tom Lendacky thomas.lendacky@amd.com

commit 845d382bb15c6e7dc5026c0ff919c5b13fc7e11b upstream

If either the X86_FEATURE_AMD_SSBD or X86_FEATURE_VIRT_SSBD features are
present, then there is no need to perform the check for the LS_CFG SSBD
mitigation support.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20180702213553.29202.21089.stgit@tlendack-t1.amdoffice.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/x86/kernel/cpu/amd.c |    4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -554,7 +554,9 @@ static void bsp_init_amd(struct cpuinfo_
 		nodes_per_socket = ((value >> 3) & 7) + 1;
 	}
 
-	if (c->x86 >= 0x15 && c->x86 <= 0x17) {
+	if (!boot_cpu_has(X86_FEATURE_AMD_SSBD) &&
+	    !boot_cpu_has(X86_FEATURE_VIRT_SSBD) &&
+	    c->x86 >= 0x15 && c->x86 <= 0x17) {
 		unsigned int bit;
 
 		switch (c->x86) {



^ permalink raw reply	[flat|nested] 166+ messages in thread

* [PATCH 4.14 073/146] x86/bugs: Fix the AMD SSBD usage of the SPEC_CTRL MSR
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (71 preceding siblings ...)
  2018-12-04 10:49 ` [PATCH 4.14 072/146] x86/bugs: Update when to check for the LS_CFG SSBD mitigation Greg Kroah-Hartman
@ 2018-12-04 10:49 ` Greg Kroah-Hartman
  2018-12-04 10:49 ` [PATCH 4.14 074/146] x86/speculation: Enable cross-hyperthread spectre v2 STIBP mitigation Greg Kroah-Hartman
                   ` (77 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:49 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Tom Lendacky, Borislav Petkov,
	David Woodhouse, Konrad Rzeszutek Wilk, Linus Torvalds,
	Peter Zijlstra, Thomas Gleixner, Ingo Molnar

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Tom Lendacky thomas.lendacky@amd.com

commit 612bc3b3d4be749f73a513a17d9b3ee1330d3487 upstream

On AMD, the presence of the MSR_SPEC_CTRL feature does not imply that the
SSBD mitigation support should use the SPEC_CTRL MSR. Other features could
have caused the MSR_SPEC_CTRL feature to be set, while a different SSBD
mitigation option is in place.

Update the SSBD support to check for the actual SSBD features that will
use the SPEC_CTRL MSR.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 6ac2f49edb1e ("x86/bugs: Add AMD's SPEC_CTRL MSR usage")
Link: http://lkml.kernel.org/r/20180702213602.29202.33151.stgit@tlendack-t1.amdoffice.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/x86/kernel/cpu/bugs.c |    8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -166,7 +166,8 @@ x86_virt_spec_ctrl(u64 guest_spec_ctrl,
 		guestval |= guest_spec_ctrl & x86_spec_ctrl_mask;
 
 		/* SSBD controlled in MSR_SPEC_CTRL */
-		if (static_cpu_has(X86_FEATURE_SPEC_CTRL_SSBD))
+		if (static_cpu_has(X86_FEATURE_SPEC_CTRL_SSBD) ||
+		    static_cpu_has(X86_FEATURE_AMD_SSBD))
 			hostval |= ssbd_tif_to_spec_ctrl(ti->flags);
 
 		if (hostval != guestval) {
@@ -535,9 +536,10 @@ static enum ssb_mitigation __init __ssb_
 		 * Intel uses the SPEC CTRL MSR Bit(2) for this, while AMD may
 		 * use a completely different MSR and bit dependent on family.
 		 */
-		if (!static_cpu_has(X86_FEATURE_MSR_SPEC_CTRL))
+		if (!static_cpu_has(X86_FEATURE_SPEC_CTRL_SSBD) &&
+		    !static_cpu_has(X86_FEATURE_AMD_SSBD)) {
 			x86_amd_ssb_disable();
-		else {
+		} else {
 			x86_spec_ctrl_base |= SPEC_CTRL_SSBD;
 			x86_spec_ctrl_mask |= SPEC_CTRL_SSBD;
 			wrmsrl(MSR_IA32_SPEC_CTRL, x86_spec_ctrl_base);



^ permalink raw reply	[flat|nested] 166+ messages in thread

* [PATCH 4.14 074/146] x86/speculation: Enable cross-hyperthread spectre v2 STIBP mitigation
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (72 preceding siblings ...)
  2018-12-04 10:49 ` [PATCH 4.14 073/146] x86/bugs: Fix the AMD SSBD usage of the SPEC_CTRL MSR Greg Kroah-Hartman
@ 2018-12-04 10:49 ` Greg Kroah-Hartman
  2018-12-04 10:49 ` [PATCH 4.14 075/146] x86/speculation: Apply IBPB more strictly to avoid cross-process data leak Greg Kroah-Hartman
                   ` (76 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:49 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Jiri Kosina, Thomas Gleixner,
	Peter Zijlstra, Josh Poimboeuf, Andrea Arcangeli, WoodhouseDavid,
	Andi Kleen, Tim Chen, SchauflerCasey

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Jiri Kosina jkosina@suse.cz

commit 53c613fe6349994f023245519265999eed75957f upstream

STIBP is a feature provided by certain Intel ucodes / CPUs. This feature
(once enabled) prevents cross-hyperthread control of decisions made by
indirect branch predictors.

Enable this feature if

- the CPU is vulnerable to spectre v2
- the CPU supports SMT and has SMT siblings online
- spectre_v2 mitigation autoselection is enabled (default)

After some previous discussion, this leaves STIBP on all the time, as wrmsr
on crossing kernel boundary is a no-no. This could perhaps later be a bit
more optimized (like disabling it in NOHZ, experiment with disabling it in
idle, etc) if needed.

Note that the synchronization of the mask manipulation via newly added
spec_ctrl_mutex is currently not strictly needed, as the only updater is
already being serialized by cpu_add_remove_lock, but let's make this a
little bit more future-proof.

Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc:  "WoodhouseDavid" <dwmw@amazon.co.uk>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc:  "SchauflerCasey" <casey.schaufler@intel.com>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/nycvar.YFH.7.76.1809251438240.15880@cbobk.fhfr.pm
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/x86/kernel/cpu/bugs.c |   57 ++++++++++++++++++++++++++++++++++++++++-----
 kernel/cpu.c               |   11 +++++++-
 2 files changed, 61 insertions(+), 7 deletions(-)

--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -34,12 +34,10 @@ static void __init spectre_v2_select_mit
 static void __init ssb_select_mitigation(void);
 static void __init l1tf_select_mitigation(void);
 
-/*
- * Our boot-time value of the SPEC_CTRL MSR. We read it once so that any
- * writes to SPEC_CTRL contain whatever reserved bits have been set.
- */
-u64 __ro_after_init x86_spec_ctrl_base;
+/* The base value of the SPEC_CTRL MSR that always has to be preserved. */
+u64 x86_spec_ctrl_base;
 EXPORT_SYMBOL_GPL(x86_spec_ctrl_base);
+static DEFINE_MUTEX(spec_ctrl_mutex);
 
 /*
  * The vendor and possibly platform specific bits which can be modified in
@@ -324,6 +322,46 @@ static enum spectre_v2_mitigation_cmd __
 	return cmd;
 }
 
+static bool stibp_needed(void)
+{
+	if (spectre_v2_enabled == SPECTRE_V2_NONE)
+		return false;
+
+	if (!boot_cpu_has(X86_FEATURE_STIBP))
+		return false;
+
+	return true;
+}
+
+static void update_stibp_msr(void *info)
+{
+	wrmsrl(MSR_IA32_SPEC_CTRL, x86_spec_ctrl_base);
+}
+
+void arch_smt_update(void)
+{
+	u64 mask;
+
+	if (!stibp_needed())
+		return;
+
+	mutex_lock(&spec_ctrl_mutex);
+	mask = x86_spec_ctrl_base;
+	if (cpu_smt_control == CPU_SMT_ENABLED)
+		mask |= SPEC_CTRL_STIBP;
+	else
+		mask &= ~SPEC_CTRL_STIBP;
+
+	if (mask != x86_spec_ctrl_base) {
+		pr_info("Spectre v2 cross-process SMT mitigation: %s STIBP\n",
+				cpu_smt_control == CPU_SMT_ENABLED ?
+				"Enabling" : "Disabling");
+		x86_spec_ctrl_base = mask;
+		on_each_cpu(update_stibp_msr, NULL, 1);
+	}
+	mutex_unlock(&spec_ctrl_mutex);
+}
+
 static void __init spectre_v2_select_mitigation(void)
 {
 	enum spectre_v2_mitigation_cmd cmd = spectre_v2_parse_cmdline();
@@ -423,6 +461,9 @@ specv2_set_mode:
 		setup_force_cpu_cap(X86_FEATURE_USE_IBRS_FW);
 		pr_info("Enabling Restricted Speculation for firmware calls\n");
 	}
+
+	/* Enable STIBP if appropriate */
+	arch_smt_update();
 }
 
 #undef pr_fmt
@@ -813,6 +854,8 @@ static ssize_t l1tf_show_state(char *buf
 static ssize_t cpu_show_common(struct device *dev, struct device_attribute *attr,
 			       char *buf, unsigned int bug)
 {
+	int ret;
+
 	if (!boot_cpu_has_bug(bug))
 		return sprintf(buf, "Not affected\n");
 
@@ -827,10 +870,12 @@ static ssize_t cpu_show_common(struct de
 		return sprintf(buf, "Mitigation: __user pointer sanitization\n");
 
 	case X86_BUG_SPECTRE_V2:
-		return sprintf(buf, "%s%s%s%s\n", spectre_v2_strings[spectre_v2_enabled],
+		ret = sprintf(buf, "%s%s%s%s%s\n", spectre_v2_strings[spectre_v2_enabled],
 			       boot_cpu_has(X86_FEATURE_USE_IBPB) ? ", IBPB" : "",
 			       boot_cpu_has(X86_FEATURE_USE_IBRS_FW) ? ", IBRS_FW" : "",
+			       (x86_spec_ctrl_base & SPEC_CTRL_STIBP) ? ", STIBP" : "",
 			       spectre_v2_module_string());
+		return ret;
 
 	case X86_BUG_SPEC_STORE_BYPASS:
 		return sprintf(buf, "%s\n", ssb_strings[ssb_mode]);
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -2045,6 +2045,12 @@ static void cpuhp_online_cpu_device(unsi
 	kobject_uevent(&dev->kobj, KOBJ_ONLINE);
 }
 
+/*
+ * Architectures that need SMT-specific errata handling during SMT hotplug
+ * should override this.
+ */
+void __weak arch_smt_update(void) { };
+
 static int cpuhp_smt_disable(enum cpuhp_smt_control ctrlval)
 {
 	int cpu, ret = 0;
@@ -2071,8 +2077,10 @@ static int cpuhp_smt_disable(enum cpuhp_
 		 */
 		cpuhp_offline_cpu_device(cpu);
 	}
-	if (!ret)
+	if (!ret) {
 		cpu_smt_control = ctrlval;
+		arch_smt_update();
+	}
 	cpu_maps_update_done();
 	return ret;
 }
@@ -2083,6 +2091,7 @@ static int cpuhp_smt_enable(void)
 
 	cpu_maps_update_begin();
 	cpu_smt_control = CPU_SMT_ENABLED;
+	arch_smt_update();
 	for_each_present_cpu(cpu) {
 		/* Skip online CPUs and CPUs on offline nodes */
 		if (cpu_online(cpu) || !node_online(cpu_to_node(cpu)))



^ permalink raw reply	[flat|nested] 166+ messages in thread

* [PATCH 4.14 075/146] x86/speculation: Apply IBPB more strictly to avoid cross-process data leak
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (73 preceding siblings ...)
  2018-12-04 10:49 ` [PATCH 4.14 074/146] x86/speculation: Enable cross-hyperthread spectre v2 STIBP mitigation Greg Kroah-Hartman
@ 2018-12-04 10:49 ` Greg Kroah-Hartman
  2018-12-04 10:49 ` [PATCH 4.14 076/146] x86/speculation: Propagate information about RSB filling mitigation to sysfs Greg Kroah-Hartman
                   ` (75 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:49 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Jiri Kosina, Thomas Gleixner,
	Peter Zijlstra, Josh Poimboeuf, Andrea Arcangeli, WoodhouseDavid,
	Andi Kleen, SchauflerCasey

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Jiri Kosina jkosina@suse.cz

commit dbfe2953f63c640463c630746cd5d9de8b2f63ae upstream

Currently, IBPB is only issued in cases when switching into a non-dumpable
process, the rationale being to protect such 'important and security
sensitive' processess (such as GPG) from data leaking into a different
userspace process via spectre v2.

This is however completely insufficient to provide proper userspace-to-userpace
spectrev2 protection, as any process can poison branch buffers before being
scheduled out, and the newly scheduled process immediately becomes spectrev2
victim.

In order to minimize the performance impact (for usecases that do require
spectrev2 protection), issue the barrier only in cases when switching between
processess where the victim can't be ptraced by the potential attacker (as in
such cases, the attacker doesn't have to bother with branch buffers at all).

[ tglx: Split up PTRACE_MODE_NOACCESS_CHK into PTRACE_MODE_SCHED and
  PTRACE_MODE_IBPB to be able to do ptrace() context tracking reasonably
  fine-grained ]

Fixes: 18bf3c3ea8 ("x86/speculation: Use Indirect Branch Prediction Barrier in context switch")
Originally-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc:  "WoodhouseDavid" <dwmw@amazon.co.uk>
Cc: Andi Kleen <ak@linux.intel.com>
Cc:  "SchauflerCasey" <casey.schaufler@intel.com>
Link: https://lkml.kernel.org/r/nycvar.YFH.7.76.1809251437340.15880@cbobk.fhfr.pm
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/x86/mm/tlb.c      |   31 ++++++++++++++++++++-----------
 include/linux/ptrace.h |   21 +++++++++++++++++++--
 kernel/ptrace.c        |   10 ++++++++++
 3 files changed, 49 insertions(+), 13 deletions(-)

--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -7,6 +7,7 @@
 #include <linux/export.h>
 #include <linux/cpu.h>
 #include <linux/debugfs.h>
+#include <linux/ptrace.h>
 
 #include <asm/tlbflush.h>
 #include <asm/mmu_context.h>
@@ -180,6 +181,19 @@ static void sync_current_stack_to_mm(str
 	}
 }
 
+static bool ibpb_needed(struct task_struct *tsk, u64 last_ctx_id)
+{
+	/*
+	 * Check if the current (previous) task has access to the memory
+	 * of the @tsk (next) task. If access is denied, make sure to
+	 * issue a IBPB to stop user->user Spectre-v2 attacks.
+	 *
+	 * Note: __ptrace_may_access() returns 0 or -ERRNO.
+	 */
+	return (tsk && tsk->mm && tsk->mm->context.ctx_id != last_ctx_id &&
+		ptrace_may_access_sched(tsk, PTRACE_MODE_SPEC_IBPB));
+}
+
 void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next,
 			struct task_struct *tsk)
 {
@@ -256,18 +270,13 @@ void switch_mm_irqs_off(struct mm_struct
 		 * one process from doing Spectre-v2 attacks on another.
 		 *
 		 * As an optimization, flush indirect branches only when
-		 * switching into processes that disable dumping. This
-		 * protects high value processes like gpg, without having
-		 * too high performance overhead. IBPB is *expensive*!
-		 *
-		 * This will not flush branches when switching into kernel
-		 * threads. It will also not flush if we switch to idle
-		 * thread and back to the same process. It will flush if we
-		 * switch to a different non-dumpable process.
+		 * switching into a processes that can't be ptrace by the
+		 * current one (as in such case, attacker has much more
+		 * convenient way how to tamper with the next process than
+		 * branch buffer poisoning).
 		 */
-		if (tsk && tsk->mm &&
-		    tsk->mm->context.ctx_id != last_ctx_id &&
-		    get_dumpable(tsk->mm) != SUID_DUMP_USER)
+		if (static_cpu_has(X86_FEATURE_USE_IBPB) &&
+				ibpb_needed(tsk, last_ctx_id))
 			indirect_branch_prediction_barrier();
 
 		if (IS_ENABLED(CONFIG_VMAP_STACK)) {
--- a/include/linux/ptrace.h
+++ b/include/linux/ptrace.h
@@ -62,14 +62,17 @@ extern void exit_ptrace(struct task_stru
 #define PTRACE_MODE_READ	0x01
 #define PTRACE_MODE_ATTACH	0x02
 #define PTRACE_MODE_NOAUDIT	0x04
-#define PTRACE_MODE_FSCREDS 0x08
-#define PTRACE_MODE_REALCREDS 0x10
+#define PTRACE_MODE_FSCREDS	0x08
+#define PTRACE_MODE_REALCREDS	0x10
+#define PTRACE_MODE_SCHED	0x20
+#define PTRACE_MODE_IBPB	0x40
 
 /* shorthands for READ/ATTACH and FSCREDS/REALCREDS combinations */
 #define PTRACE_MODE_READ_FSCREDS (PTRACE_MODE_READ | PTRACE_MODE_FSCREDS)
 #define PTRACE_MODE_READ_REALCREDS (PTRACE_MODE_READ | PTRACE_MODE_REALCREDS)
 #define PTRACE_MODE_ATTACH_FSCREDS (PTRACE_MODE_ATTACH | PTRACE_MODE_FSCREDS)
 #define PTRACE_MODE_ATTACH_REALCREDS (PTRACE_MODE_ATTACH | PTRACE_MODE_REALCREDS)
+#define PTRACE_MODE_SPEC_IBPB (PTRACE_MODE_ATTACH_REALCREDS | PTRACE_MODE_IBPB)
 
 /**
  * ptrace_may_access - check whether the caller is permitted to access
@@ -87,6 +90,20 @@ extern void exit_ptrace(struct task_stru
  */
 extern bool ptrace_may_access(struct task_struct *task, unsigned int mode);
 
+/**
+ * ptrace_may_access - check whether the caller is permitted to access
+ * a target task.
+ * @task: target task
+ * @mode: selects type of access and caller credentials
+ *
+ * Returns true on success, false on denial.
+ *
+ * Similar to ptrace_may_access(). Only to be called from context switch
+ * code. Does not call into audit and the regular LSM hooks due to locking
+ * constraints.
+ */
+extern bool ptrace_may_access_sched(struct task_struct *task, unsigned int mode);
+
 static inline int ptrace_reparented(struct task_struct *child)
 {
 	return !same_thread_group(child->real_parent, child->parent);
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -261,6 +261,9 @@ static int ptrace_check_attach(struct ta
 
 static int ptrace_has_cap(struct user_namespace *ns, unsigned int mode)
 {
+	if (mode & PTRACE_MODE_SCHED)
+		return false;
+
 	if (mode & PTRACE_MODE_NOAUDIT)
 		return has_ns_capability_noaudit(current, ns, CAP_SYS_PTRACE);
 	else
@@ -328,9 +331,16 @@ ok:
 	     !ptrace_has_cap(mm->user_ns, mode)))
 	    return -EPERM;
 
+	if (mode & PTRACE_MODE_SCHED)
+		return 0;
 	return security_ptrace_access_check(task, mode);
 }
 
+bool ptrace_may_access_sched(struct task_struct *task, unsigned int mode)
+{
+	return __ptrace_may_access(task, mode | PTRACE_MODE_SCHED);
+}
+
 bool ptrace_may_access(struct task_struct *task, unsigned int mode)
 {
 	int err;



^ permalink raw reply	[flat|nested] 166+ messages in thread

* [PATCH 4.14 076/146] x86/speculation: Propagate information about RSB filling mitigation to sysfs
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (74 preceding siblings ...)
  2018-12-04 10:49 ` [PATCH 4.14 075/146] x86/speculation: Apply IBPB more strictly to avoid cross-process data leak Greg Kroah-Hartman
@ 2018-12-04 10:49 ` Greg Kroah-Hartman
  2018-12-04 10:49 ` [PATCH 4.14 077/146] x86/speculation: Add RETPOLINE_AMD support to the inline asm CALL_NOSPEC variant Greg Kroah-Hartman
                   ` (74 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:49 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Jiri Kosina, Thomas Gleixner,
	Peter Zijlstra, Josh Poimboeuf, Andrea Arcangeli, WoodhouseDavid,
	Andi Kleen, Tim Chen, SchauflerCasey

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Jiri Kosina jkosina@suse.cz

commit bb4b3b7762735cdaba5a40fd94c9303d9ffa147a upstream

If spectrev2 mitigation has been enabled, RSB is filled on context switch
in order to protect from various classes of spectrev2 attacks.

If this mitigation is enabled, say so in sysfs for spectrev2.

Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc:  "WoodhouseDavid" <dwmw@amazon.co.uk>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc:  "SchauflerCasey" <casey.schaufler@intel.com>
Link: https://lkml.kernel.org/r/nycvar.YFH.7.76.1809251438580.15880@cbobk.fhfr.pm
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/x86/kernel/cpu/bugs.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -870,10 +870,11 @@ static ssize_t cpu_show_common(struct de
 		return sprintf(buf, "Mitigation: __user pointer sanitization\n");
 
 	case X86_BUG_SPECTRE_V2:
-		ret = sprintf(buf, "%s%s%s%s%s\n", spectre_v2_strings[spectre_v2_enabled],
+		ret = sprintf(buf, "%s%s%s%s%s%s\n", spectre_v2_strings[spectre_v2_enabled],
 			       boot_cpu_has(X86_FEATURE_USE_IBPB) ? ", IBPB" : "",
 			       boot_cpu_has(X86_FEATURE_USE_IBRS_FW) ? ", IBRS_FW" : "",
 			       (x86_spec_ctrl_base & SPEC_CTRL_STIBP) ? ", STIBP" : "",
+			       boot_cpu_has(X86_FEATURE_RSB_CTXSW) ? ", RSB filling" : "",
 			       spectre_v2_module_string());
 		return ret;
 



^ permalink raw reply	[flat|nested] 166+ messages in thread

* [PATCH 4.14 077/146] x86/speculation: Add RETPOLINE_AMD support to the inline asm CALL_NOSPEC variant
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (75 preceding siblings ...)
  2018-12-04 10:49 ` [PATCH 4.14 076/146] x86/speculation: Propagate information about RSB filling mitigation to sysfs Greg Kroah-Hartman
@ 2018-12-04 10:49 ` Greg Kroah-Hartman
  2018-12-04 10:49 ` [PATCH 4.14 078/146] x86/retpoline: Make CONFIG_RETPOLINE depend on compiler support Greg Kroah-Hartman
                   ` (73 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:49 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Zhenzhong Duan, Borislav Petkov,
	Daniel Borkmann, David Woodhouse, H. Peter Anvin, Ingo Molnar,
	Konrad Rzeszutek Wilk, Peter Zijlstra, Thomas Gleixner,
	Wang YanQing, dhaval.giani, srinivas.eeda

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Zhenzhong Duan zhenzhong.duan@oracle.com

commit 0cbb76d6285794f30953bfa3ab831714b59dd700 upstream

..so that they match their asm counterpart.

Add the missing ANNOTATE_NOSPEC_ALTERNATIVE in CALL_NOSPEC, while at it.

Signed-off-by: Zhenzhong Duan <zhenzhong.duan@oracle.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Wang YanQing <udknight@gmail.com>
Cc: dhaval.giani@oracle.com
Cc: srinivas.eeda@oracle.com
Link: http://lkml.kernel.org/r/c3975665-173e-4d70-8dee-06c926ac26ee@default
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/x86/include/asm/nospec-branch.h |   17 +++++++++++++----
 1 file changed, 13 insertions(+), 4 deletions(-)

--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -170,11 +170,15 @@
  */
 # define CALL_NOSPEC						\
 	ANNOTATE_NOSPEC_ALTERNATIVE				\
-	ALTERNATIVE(						\
+	ALTERNATIVE_2(						\
 	ANNOTATE_RETPOLINE_SAFE					\
 	"call *%[thunk_target]\n",				\
 	"call __x86_indirect_thunk_%V[thunk_target]\n",		\
-	X86_FEATURE_RETPOLINE)
+	X86_FEATURE_RETPOLINE,					\
+	"lfence;\n"						\
+	ANNOTATE_RETPOLINE_SAFE					\
+	"call *%[thunk_target]\n",				\
+	X86_FEATURE_RETPOLINE_AMD)
 # define THUNK_TARGET(addr) [thunk_target] "r" (addr)
 
 #elif defined(CONFIG_X86_32) && defined(CONFIG_RETPOLINE)
@@ -184,7 +188,8 @@
  * here, anyway.
  */
 # define CALL_NOSPEC						\
-	ALTERNATIVE(						\
+	ANNOTATE_NOSPEC_ALTERNATIVE				\
+	ALTERNATIVE_2(						\
 	ANNOTATE_RETPOLINE_SAFE					\
 	"call *%[thunk_target]\n",				\
 	"       jmp    904f;\n"					\
@@ -199,7 +204,11 @@
 	"       ret;\n"						\
 	"       .align 16\n"					\
 	"904:	call   901b;\n",				\
-	X86_FEATURE_RETPOLINE)
+	X86_FEATURE_RETPOLINE,					\
+	"lfence;\n"						\
+	ANNOTATE_RETPOLINE_SAFE					\
+	"call *%[thunk_target]\n",				\
+	X86_FEATURE_RETPOLINE_AMD)
 
 # define THUNK_TARGET(addr) [thunk_target] "rm" (addr)
 #else /* No retpoline for C / inline asm */



^ permalink raw reply	[flat|nested] 166+ messages in thread

* [PATCH 4.14 078/146] x86/retpoline: Make CONFIG_RETPOLINE depend on compiler support
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (76 preceding siblings ...)
  2018-12-04 10:49 ` [PATCH 4.14 077/146] x86/speculation: Add RETPOLINE_AMD support to the inline asm CALL_NOSPEC variant Greg Kroah-Hartman
@ 2018-12-04 10:49 ` Greg Kroah-Hartman
  2018-12-04 10:49 ` [PATCH 4.14 079/146] x86/retpoline: Remove minimal retpoline support Greg Kroah-Hartman
                   ` (72 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:49 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Peter Zijlstra, Zhenzhong Duan,
	Thomas Gleixner, David Woodhouse, Borislav Petkov,
	Daniel Borkmann, H. Peter Anvin, Konrad Rzeszutek Wilk,
	Andy Lutomirski, Masahiro Yamada, Michal Marek, srinivas.eeda

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Zhenzhong Duan zhenzhong.duan@oracle.com

commit 4cd24de3a0980bf3100c9dcb08ef65ca7c31af48 upstream

Since retpoline capable compilers are widely available, make
CONFIG_RETPOLINE hard depend on the compiler capability.

Break the build when CONFIG_RETPOLINE is enabled and the compiler does not
support it. Emit an error message in that case:

 "arch/x86/Makefile:226: *** You are building kernel with non-retpoline
  compiler, please update your compiler..  Stop."

[dwmw: Fail the build with non-retpoline compiler]

Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: Borislav Petkov <bp@suse.de>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Michal Marek <michal.lkml@markovi.net>
Cc: <srinivas.eeda@oracle.com>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/cca0cb20-f9e2-4094-840b-fb0f8810cd34@default
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/x86/Kconfig                     |    4 ----
 arch/x86/Makefile                    |    5 +++--
 arch/x86/include/asm/nospec-branch.h |   10 ++++++----
 arch/x86/kernel/cpu/bugs.c           |    2 +-
 scripts/Makefile.build               |    2 --
 5 files changed, 10 insertions(+), 13 deletions(-)

--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -440,10 +440,6 @@ config RETPOLINE
 	  branches. Requires a compiler with -mindirect-branch=thunk-extern
 	  support for full protection. The kernel may run slower.
 
-	  Without compiler support, at least indirect branches in assembler
-	  code are eliminated. Since this includes the syscall entry path,
-	  it is not entirely pointless.
-
 config INTEL_RDT
 	bool "Intel Resource Director Technology support"
 	default n
--- a/arch/x86/Makefile
+++ b/arch/x86/Makefile
@@ -241,9 +241,10 @@ KBUILD_CFLAGS += -fno-asynchronous-unwin
 
 # Avoid indirect branches in kernel to deal with Spectre
 ifdef CONFIG_RETPOLINE
-ifneq ($(RETPOLINE_CFLAGS),)
-  KBUILD_CFLAGS += $(RETPOLINE_CFLAGS) -DRETPOLINE
+ifeq ($(RETPOLINE_CFLAGS),)
+  $(error You are building kernel with non-retpoline compiler, please update your compiler.)
 endif
+  KBUILD_CFLAGS += $(RETPOLINE_CFLAGS)
 endif
 
 archscripts: scripts_basic
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -162,11 +162,12 @@
 	_ASM_PTR " 999b\n\t"					\
 	".popsection\n\t"
 
-#if defined(CONFIG_X86_64) && defined(RETPOLINE)
+#ifdef CONFIG_RETPOLINE
+#ifdef CONFIG_X86_64
 
 /*
- * Since the inline asm uses the %V modifier which is only in newer GCC,
- * the 64-bit one is dependent on RETPOLINE not CONFIG_RETPOLINE.
+ * Inline asm uses the %V modifier which is only in newer GCC
+ * which is ensured when CONFIG_RETPOLINE is defined.
  */
 # define CALL_NOSPEC						\
 	ANNOTATE_NOSPEC_ALTERNATIVE				\
@@ -181,7 +182,7 @@
 	X86_FEATURE_RETPOLINE_AMD)
 # define THUNK_TARGET(addr) [thunk_target] "r" (addr)
 
-#elif defined(CONFIG_X86_32) && defined(CONFIG_RETPOLINE)
+#else /* CONFIG_X86_32 */
 /*
  * For i386 we use the original ret-equivalent retpoline, because
  * otherwise we'll run out of registers. We don't care about CET
@@ -211,6 +212,7 @@
 	X86_FEATURE_RETPOLINE_AMD)
 
 # define THUNK_TARGET(addr) [thunk_target] "rm" (addr)
+#endif
 #else /* No retpoline for C / inline asm */
 # define CALL_NOSPEC "call *%[thunk_target]\n"
 # define THUNK_TARGET(addr) [thunk_target] "rm" (addr)
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -251,7 +251,7 @@ static void __init spec2_print_if_secure
 
 static inline bool retp_compiler(void)
 {
-	return __is_defined(RETPOLINE);
+	return __is_defined(CONFIG_RETPOLINE);
 }
 
 static inline bool match_option(const char *arg, int arglen, const char *opt)
--- a/scripts/Makefile.build
+++ b/scripts/Makefile.build
@@ -272,10 +272,8 @@ else
 objtool_args += $(call cc-ifversion, -lt, 0405, --no-unreachable)
 endif
 ifdef CONFIG_RETPOLINE
-ifneq ($(RETPOLINE_CFLAGS),)
   objtool_args += --retpoline
 endif
-endif
 
 
 ifdef CONFIG_MODVERSIONS



^ permalink raw reply	[flat|nested] 166+ messages in thread

* [PATCH 4.14 079/146] x86/retpoline: Remove minimal retpoline support
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (77 preceding siblings ...)
  2018-12-04 10:49 ` [PATCH 4.14 078/146] x86/retpoline: Make CONFIG_RETPOLINE depend on compiler support Greg Kroah-Hartman
@ 2018-12-04 10:49 ` Greg Kroah-Hartman
  2018-12-04 10:49 ` [PATCH 4.14 080/146] x86/speculation: Update the TIF_SSBD comment Greg Kroah-Hartman
                   ` (71 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:49 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Peter Zijlstra, Zhenzhong Duan,
	Thomas Gleixner, David Woodhouse, Borislav Petkov,
	H. Peter Anvin, Konrad Rzeszutek Wilk, srinivas.eeda

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Zhenzhong Duan zhenzhong.duan@oracle.com

commit ef014aae8f1cd2793e4e014bbb102bed53f852b7 upstream

Now that CONFIG_RETPOLINE hard depends on compiler support, there is no
reason to keep the minimal retpoline support around which only provided
basic protection in the assembly files.

Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: Borislav Petkov <bp@suse.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: <srinivas.eeda@oracle.com>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/f06f0a89-5587-45db-8ed2-0a9d6638d5c0@default
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/x86/include/asm/nospec-branch.h |    3 ---
 arch/x86/kernel/cpu/bugs.c           |   13 ++-----------
 2 files changed, 2 insertions(+), 14 deletions(-)

--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -221,11 +221,8 @@
 /* The Spectre V2 mitigation variants */
 enum spectre_v2_mitigation {
 	SPECTRE_V2_NONE,
-	SPECTRE_V2_RETPOLINE_MINIMAL,
-	SPECTRE_V2_RETPOLINE_MINIMAL_AMD,
 	SPECTRE_V2_RETPOLINE_GENERIC,
 	SPECTRE_V2_RETPOLINE_AMD,
-	SPECTRE_V2_IBRS,
 	SPECTRE_V2_IBRS_ENHANCED,
 };
 
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -134,8 +134,6 @@ enum spectre_v2_mitigation_cmd {
 
 static const char *spectre_v2_strings[] = {
 	[SPECTRE_V2_NONE]			= "Vulnerable",
-	[SPECTRE_V2_RETPOLINE_MINIMAL]		= "Vulnerable: Minimal generic ASM retpoline",
-	[SPECTRE_V2_RETPOLINE_MINIMAL_AMD]	= "Vulnerable: Minimal AMD ASM retpoline",
 	[SPECTRE_V2_RETPOLINE_GENERIC]		= "Mitigation: Full generic retpoline",
 	[SPECTRE_V2_RETPOLINE_AMD]		= "Mitigation: Full AMD retpoline",
 	[SPECTRE_V2_IBRS_ENHANCED]		= "Mitigation: Enhanced IBRS",
@@ -249,11 +247,6 @@ static void __init spec2_print_if_secure
 		pr_info("%s selected on command line.\n", reason);
 }
 
-static inline bool retp_compiler(void)
-{
-	return __is_defined(CONFIG_RETPOLINE);
-}
-
 static inline bool match_option(const char *arg, int arglen, const char *opt)
 {
 	int len = strlen(opt);
@@ -414,14 +407,12 @@ retpoline_auto:
 			pr_err("Spectre mitigation: LFENCE not serializing, switching to generic retpoline\n");
 			goto retpoline_generic;
 		}
-		mode = retp_compiler() ? SPECTRE_V2_RETPOLINE_AMD :
-					 SPECTRE_V2_RETPOLINE_MINIMAL_AMD;
+		mode = SPECTRE_V2_RETPOLINE_AMD;
 		setup_force_cpu_cap(X86_FEATURE_RETPOLINE_AMD);
 		setup_force_cpu_cap(X86_FEATURE_RETPOLINE);
 	} else {
 	retpoline_generic:
-		mode = retp_compiler() ? SPECTRE_V2_RETPOLINE_GENERIC :
-					 SPECTRE_V2_RETPOLINE_MINIMAL;
+		mode = SPECTRE_V2_RETPOLINE_GENERIC;
 		setup_force_cpu_cap(X86_FEATURE_RETPOLINE);
 	}
 



^ permalink raw reply	[flat|nested] 166+ messages in thread

* [PATCH 4.14 080/146] x86/speculation: Update the TIF_SSBD comment
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (78 preceding siblings ...)
  2018-12-04 10:49 ` [PATCH 4.14 079/146] x86/retpoline: Remove minimal retpoline support Greg Kroah-Hartman
@ 2018-12-04 10:49 ` Greg Kroah-Hartman
  2018-12-04 10:49 ` [PATCH 4.14 081/146] x86/speculation: Clean up spectre_v2_parse_cmdline() Greg Kroah-Hartman
                   ` (70 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:49 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Tim Chen, Thomas Gleixner,
	Ingo Molnar, Peter Zijlstra, Andy Lutomirski, Linus Torvalds,
	Jiri Kosina, Tom Lendacky, Josh Poimboeuf, Andrea Arcangeli,
	David Woodhouse, Andi Kleen, Dave Hansen, Casey Schaufler,
	Asit Mallick, Arjan van de Ven, Jon Masters, Waiman Long,
	Dave Stewart, Kees Cook

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Tim Chen tim.c.chen@linux.intel.com

commit 8eb729b77faf83ac4c1f363a9ad68d042415f24c upstream

"Reduced Data Speculation" is an obsolete term. The correct new name is
"Speculative store bypass disable" - which is abbreviated into SSBD.

Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Casey Schaufler <casey.schaufler@intel.com>
Cc: Asit Mallick <asit.k.mallick@intel.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Jon Masters <jcm@redhat.com>
Cc: Waiman Long <longman9394@gmail.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Dave Stewart <david.c.stewart@intel.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20181125185003.593893901@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/x86/include/asm/thread_info.h |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/arch/x86/include/asm/thread_info.h
+++ b/arch/x86/include/asm/thread_info.h
@@ -81,7 +81,7 @@ struct thread_info {
 #define TIF_SIGPENDING		2	/* signal pending */
 #define TIF_NEED_RESCHED	3	/* rescheduling necessary */
 #define TIF_SINGLESTEP		4	/* reenable singlestep on user return*/
-#define TIF_SSBD			5	/* Reduced data speculation */
+#define TIF_SSBD		5	/* Speculative store bypass disable */
 #define TIF_SYSCALL_EMU		6	/* syscall emulation active */
 #define TIF_SYSCALL_AUDIT	7	/* syscall auditing active */
 #define TIF_SECCOMP		8	/* secure computing */



^ permalink raw reply	[flat|nested] 166+ messages in thread

* [PATCH 4.14 081/146] x86/speculation: Clean up spectre_v2_parse_cmdline()
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (79 preceding siblings ...)
  2018-12-04 10:49 ` [PATCH 4.14 080/146] x86/speculation: Update the TIF_SSBD comment Greg Kroah-Hartman
@ 2018-12-04 10:49 ` Greg Kroah-Hartman
  2018-12-04 10:49 ` [PATCH 4.14 082/146] x86/speculation: Remove unnecessary ret variable in cpu_show_common() Greg Kroah-Hartman
                   ` (69 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:49 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Tim Chen, Thomas Gleixner,
	Ingo Molnar, Peter Zijlstra, Andy Lutomirski, Linus Torvalds,
	Jiri Kosina, Tom Lendacky, Josh Poimboeuf, Andrea Arcangeli,
	David Woodhouse, Andi Kleen, Dave Hansen, Casey Schaufler,
	Asit Mallick, Arjan van de Ven, Jon Masters, Waiman Long,
	Dave Stewart, Kees Cook

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Tim Chen tim.c.chen@linux.intel.com

commit 24848509aa55eac39d524b587b051f4e86df3c12 upstream

Remove the unnecessary 'else' statement in spectre_v2_parse_cmdline()
to save an indentation level.

Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Casey Schaufler <casey.schaufler@intel.com>
Cc: Asit Mallick <asit.k.mallick@intel.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Jon Masters <jcm@redhat.com>
Cc: Waiman Long <longman9394@gmail.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Dave Stewart <david.c.stewart@intel.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20181125185003.688010903@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/x86/kernel/cpu/bugs.c |   27 +++++++++++++--------------
 1 file changed, 13 insertions(+), 14 deletions(-)

--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -275,22 +275,21 @@ static enum spectre_v2_mitigation_cmd __
 
 	if (cmdline_find_option_bool(boot_command_line, "nospectre_v2"))
 		return SPECTRE_V2_CMD_NONE;
-	else {
-		ret = cmdline_find_option(boot_command_line, "spectre_v2", arg, sizeof(arg));
-		if (ret < 0)
-			return SPECTRE_V2_CMD_AUTO;
 
-		for (i = 0; i < ARRAY_SIZE(mitigation_options); i++) {
-			if (!match_option(arg, ret, mitigation_options[i].option))
-				continue;
-			cmd = mitigation_options[i].cmd;
-			break;
-		}
+	ret = cmdline_find_option(boot_command_line, "spectre_v2", arg, sizeof(arg));
+	if (ret < 0)
+		return SPECTRE_V2_CMD_AUTO;
 
-		if (i >= ARRAY_SIZE(mitigation_options)) {
-			pr_err("unknown option (%s). Switching to AUTO select\n", arg);
-			return SPECTRE_V2_CMD_AUTO;
-		}
+	for (i = 0; i < ARRAY_SIZE(mitigation_options); i++) {
+		if (!match_option(arg, ret, mitigation_options[i].option))
+			continue;
+		cmd = mitigation_options[i].cmd;
+		break;
+	}
+
+	if (i >= ARRAY_SIZE(mitigation_options)) {
+		pr_err("unknown option (%s). Switching to AUTO select\n", arg);
+		return SPECTRE_V2_CMD_AUTO;
 	}
 
 	if ((cmd == SPECTRE_V2_CMD_RETPOLINE ||



^ permalink raw reply	[flat|nested] 166+ messages in thread

* [PATCH 4.14 082/146] x86/speculation: Remove unnecessary ret variable in cpu_show_common()
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (80 preceding siblings ...)
  2018-12-04 10:49 ` [PATCH 4.14 081/146] x86/speculation: Clean up spectre_v2_parse_cmdline() Greg Kroah-Hartman
@ 2018-12-04 10:49 ` Greg Kroah-Hartman
  2018-12-04 10:49 ` [PATCH 4.14 083/146] x86/speculation: Move STIPB/IBPB string conditionals out of cpu_show_common() Greg Kroah-Hartman
                   ` (68 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:49 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Tim Chen, Thomas Gleixner,
	Ingo Molnar, Peter Zijlstra, Andy Lutomirski, Linus Torvalds,
	Jiri Kosina, Tom Lendacky, Josh Poimboeuf, Andrea Arcangeli,
	David Woodhouse, Andi Kleen, Dave Hansen, Casey Schaufler,
	Asit Mallick, Arjan van de Ven, Jon Masters, Waiman Long,
	Dave Stewart, Kees Cook

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Tim Chen tim.c.chen@linux.intel.com

commit b86bda0426853bfe8a3506c7d2a5b332760ae46b upstream

Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Casey Schaufler <casey.schaufler@intel.com>
Cc: Asit Mallick <asit.k.mallick@intel.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Jon Masters <jcm@redhat.com>
Cc: Waiman Long <longman9394@gmail.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Dave Stewart <david.c.stewart@intel.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20181125185003.783903657@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/x86/kernel/cpu/bugs.c |    5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -844,8 +844,6 @@ static ssize_t l1tf_show_state(char *buf
 static ssize_t cpu_show_common(struct device *dev, struct device_attribute *attr,
 			       char *buf, unsigned int bug)
 {
-	int ret;
-
 	if (!boot_cpu_has_bug(bug))
 		return sprintf(buf, "Not affected\n");
 
@@ -860,13 +858,12 @@ static ssize_t cpu_show_common(struct de
 		return sprintf(buf, "Mitigation: __user pointer sanitization\n");
 
 	case X86_BUG_SPECTRE_V2:
-		ret = sprintf(buf, "%s%s%s%s%s%s\n", spectre_v2_strings[spectre_v2_enabled],
+		return sprintf(buf, "%s%s%s%s%s%s\n", spectre_v2_strings[spectre_v2_enabled],
 			       boot_cpu_has(X86_FEATURE_USE_IBPB) ? ", IBPB" : "",
 			       boot_cpu_has(X86_FEATURE_USE_IBRS_FW) ? ", IBRS_FW" : "",
 			       (x86_spec_ctrl_base & SPEC_CTRL_STIBP) ? ", STIBP" : "",
 			       boot_cpu_has(X86_FEATURE_RSB_CTXSW) ? ", RSB filling" : "",
 			       spectre_v2_module_string());
-		return ret;
 
 	case X86_BUG_SPEC_STORE_BYPASS:
 		return sprintf(buf, "%s\n", ssb_strings[ssb_mode]);



^ permalink raw reply	[flat|nested] 166+ messages in thread

* [PATCH 4.14 083/146] x86/speculation: Move STIPB/IBPB string conditionals out of cpu_show_common()
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (81 preceding siblings ...)
  2018-12-04 10:49 ` [PATCH 4.14 082/146] x86/speculation: Remove unnecessary ret variable in cpu_show_common() Greg Kroah-Hartman
@ 2018-12-04 10:49 ` Greg Kroah-Hartman
  2018-12-04 10:49 ` [PATCH 4.14 084/146] x86/speculation: Disable STIBP when enhanced IBRS is in use Greg Kroah-Hartman
                   ` (67 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:49 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Tim Chen, Thomas Gleixner,
	Ingo Molnar, Peter Zijlstra, Andy Lutomirski, Linus Torvalds,
	Jiri Kosina, Tom Lendacky, Josh Poimboeuf, Andrea Arcangeli,
	David Woodhouse, Andi Kleen, Dave Hansen, Casey Schaufler,
	Asit Mallick, Arjan van de Ven, Jon Masters, Waiman Long,
	Dave Stewart, Kees Cook

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Tim Chen tim.c.chen@linux.intel.com

commit a8f76ae41cd633ac00be1b3019b1eb4741be3828 upstream

The Spectre V2 printout in cpu_show_common() handles conditionals for the
various mitigation methods directly in the sprintf() argument list. That's
hard to read and will become unreadable if more complex decisions need to
be made for a particular method.

Move the conditionals for STIBP and IBPB string selection into helper
functions, so they can be extended later on.

Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Casey Schaufler <casey.schaufler@intel.com>
Cc: Asit Mallick <asit.k.mallick@intel.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Jon Masters <jcm@redhat.com>
Cc: Waiman Long <longman9394@gmail.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Dave Stewart <david.c.stewart@intel.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20181125185003.874479208@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/x86/kernel/cpu/bugs.c |   20 ++++++++++++++++++--
 1 file changed, 18 insertions(+), 2 deletions(-)

--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -841,6 +841,22 @@ static ssize_t l1tf_show_state(char *buf
 }
 #endif
 
+static char *stibp_state(void)
+{
+	if (x86_spec_ctrl_base & SPEC_CTRL_STIBP)
+		return ", STIBP";
+	else
+		return "";
+}
+
+static char *ibpb_state(void)
+{
+	if (boot_cpu_has(X86_FEATURE_USE_IBPB))
+		return ", IBPB";
+	else
+		return "";
+}
+
 static ssize_t cpu_show_common(struct device *dev, struct device_attribute *attr,
 			       char *buf, unsigned int bug)
 {
@@ -859,9 +875,9 @@ static ssize_t cpu_show_common(struct de
 
 	case X86_BUG_SPECTRE_V2:
 		return sprintf(buf, "%s%s%s%s%s%s\n", spectre_v2_strings[spectre_v2_enabled],
-			       boot_cpu_has(X86_FEATURE_USE_IBPB) ? ", IBPB" : "",
+			       ibpb_state(),
 			       boot_cpu_has(X86_FEATURE_USE_IBRS_FW) ? ", IBRS_FW" : "",
-			       (x86_spec_ctrl_base & SPEC_CTRL_STIBP) ? ", STIBP" : "",
+			       stibp_state(),
 			       boot_cpu_has(X86_FEATURE_RSB_CTXSW) ? ", RSB filling" : "",
 			       spectre_v2_module_string());
 



^ permalink raw reply	[flat|nested] 166+ messages in thread

* [PATCH 4.14 084/146] x86/speculation: Disable STIBP when enhanced IBRS is in use
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (82 preceding siblings ...)
  2018-12-04 10:49 ` [PATCH 4.14 083/146] x86/speculation: Move STIPB/IBPB string conditionals out of cpu_show_common() Greg Kroah-Hartman
@ 2018-12-04 10:49 ` Greg Kroah-Hartman
  2018-12-04 10:49 ` [PATCH 4.14 085/146] x86/speculation: Rename SSBD update functions Greg Kroah-Hartman
                   ` (66 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:49 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Tim Chen, Thomas Gleixner,
	Ingo Molnar, Peter Zijlstra, Andy Lutomirski, Linus Torvalds,
	Jiri Kosina, Tom Lendacky, Josh Poimboeuf, Andrea Arcangeli,
	David Woodhouse, Andi Kleen, Dave Hansen, Casey Schaufler,
	Asit Mallick, Arjan van de Ven, Jon Masters, Waiman Long,
	Dave Stewart, Kees Cook

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Tim Chen tim.c.chen@linux.intel.com

commit 34bce7c9690b1d897686aac89604ba7adc365556 upstream

If enhanced IBRS is active, STIBP is redundant for mitigating Spectre v2
user space exploits from hyperthread sibling.

Disable STIBP when enhanced IBRS is used.

Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Casey Schaufler <casey.schaufler@intel.com>
Cc: Asit Mallick <asit.k.mallick@intel.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Jon Masters <jcm@redhat.com>
Cc: Waiman Long <longman9394@gmail.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Dave Stewart <david.c.stewart@intel.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20181125185003.966801480@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/x86/kernel/cpu/bugs.c |    7 +++++++
 1 file changed, 7 insertions(+)

--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -319,6 +319,10 @@ static bool stibp_needed(void)
 	if (spectre_v2_enabled == SPECTRE_V2_NONE)
 		return false;
 
+	/* Enhanced IBRS makes using STIBP unnecessary. */
+	if (spectre_v2_enabled == SPECTRE_V2_IBRS_ENHANCED)
+		return false;
+
 	if (!boot_cpu_has(X86_FEATURE_STIBP))
 		return false;
 
@@ -843,6 +847,9 @@ static ssize_t l1tf_show_state(char *buf
 
 static char *stibp_state(void)
 {
+	if (spectre_v2_enabled == SPECTRE_V2_IBRS_ENHANCED)
+		return "";
+
 	if (x86_spec_ctrl_base & SPEC_CTRL_STIBP)
 		return ", STIBP";
 	else



^ permalink raw reply	[flat|nested] 166+ messages in thread

* [PATCH 4.14 085/146] x86/speculation: Rename SSBD update functions
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (83 preceding siblings ...)
  2018-12-04 10:49 ` [PATCH 4.14 084/146] x86/speculation: Disable STIBP when enhanced IBRS is in use Greg Kroah-Hartman
@ 2018-12-04 10:49 ` Greg Kroah-Hartman
  2018-12-04 10:49 ` [PATCH 4.14 086/146] x86/speculation: Reorganize speculation control MSRs update Greg Kroah-Hartman
                   ` (65 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:49 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Tim Chen, Thomas Gleixner,
	Ingo Molnar, Peter Zijlstra, Andy Lutomirski, Linus Torvalds,
	Jiri Kosina, Tom Lendacky, Josh Poimboeuf, Andrea Arcangeli,
	David Woodhouse, Andi Kleen, Dave Hansen, Casey Schaufler,
	Asit Mallick, Arjan van de Ven, Jon Masters, Waiman Long,
	Dave Stewart, Kees Cook

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Thomas Gleixner tglx@linutronix.de

commit 26c4d75b234040c11728a8acb796b3a85ba7507c upstream

During context switch, the SSBD bit in SPEC_CTRL MSR is updated according
to changes of the TIF_SSBD flag in the current and next running task.

Currently, only the bit controlling speculative store bypass disable in
SPEC_CTRL MSR is updated and the related update functions all have
"speculative_store" or "ssb" in their names.

For enhanced mitigation control other bits in SPEC_CTRL MSR need to be
updated as well, which makes the SSB names inadequate.

Rename the "speculative_store*" functions to a more generic name. No
functional change.

Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Casey Schaufler <casey.schaufler@intel.com>
Cc: Asit Mallick <asit.k.mallick@intel.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Jon Masters <jcm@redhat.com>
Cc: Waiman Long <longman9394@gmail.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Dave Stewart <david.c.stewart@intel.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20181125185004.058866968@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/x86/include/asm/spec-ctrl.h |    6 +++---
 arch/x86/kernel/cpu/bugs.c       |    4 ++--
 arch/x86/kernel/process.c        |   12 ++++++------
 3 files changed, 11 insertions(+), 11 deletions(-)

--- a/arch/x86/include/asm/spec-ctrl.h
+++ b/arch/x86/include/asm/spec-ctrl.h
@@ -70,11 +70,11 @@ extern void speculative_store_bypass_ht_
 static inline void speculative_store_bypass_ht_init(void) { }
 #endif
 
-extern void speculative_store_bypass_update(unsigned long tif);
+extern void speculation_ctrl_update(unsigned long tif);
 
-static inline void speculative_store_bypass_update_current(void)
+static inline void speculation_ctrl_update_current(void)
 {
-	speculative_store_bypass_update(current_thread_info()->flags);
+	speculation_ctrl_update(current_thread_info()->flags);
 }
 
 #endif
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -199,7 +199,7 @@ x86_virt_spec_ctrl(u64 guest_spec_ctrl,
 		tif = setguest ? ssbd_spec_ctrl_to_tif(guestval) :
 				 ssbd_spec_ctrl_to_tif(hostval);
 
-		speculative_store_bypass_update(tif);
+		speculation_ctrl_update(tif);
 	}
 }
 EXPORT_SYMBOL_GPL(x86_virt_spec_ctrl);
@@ -629,7 +629,7 @@ static int ssb_prctl_set(struct task_str
 	 * mitigation until it is next scheduled.
 	 */
 	if (task == current && update)
-		speculative_store_bypass_update_current();
+		speculation_ctrl_update_current();
 
 	return 0;
 }
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -398,27 +398,27 @@ static __always_inline void amd_set_ssb_
 	wrmsrl(MSR_AMD64_VIRT_SPEC_CTRL, ssbd_tif_to_spec_ctrl(tifn));
 }
 
-static __always_inline void intel_set_ssb_state(unsigned long tifn)
+static __always_inline void spec_ctrl_update_msr(unsigned long tifn)
 {
 	u64 msr = x86_spec_ctrl_base | ssbd_tif_to_spec_ctrl(tifn);
 
 	wrmsrl(MSR_IA32_SPEC_CTRL, msr);
 }
 
-static __always_inline void __speculative_store_bypass_update(unsigned long tifn)
+static __always_inline void __speculation_ctrl_update(unsigned long tifn)
 {
 	if (static_cpu_has(X86_FEATURE_VIRT_SSBD))
 		amd_set_ssb_virt_state(tifn);
 	else if (static_cpu_has(X86_FEATURE_LS_CFG_SSBD))
 		amd_set_core_ssb_state(tifn);
 	else
-		intel_set_ssb_state(tifn);
+		spec_ctrl_update_msr(tifn);
 }
 
-void speculative_store_bypass_update(unsigned long tif)
+void speculation_ctrl_update(unsigned long tif)
 {
 	preempt_disable();
-	__speculative_store_bypass_update(tif);
+	__speculation_ctrl_update(tif);
 	preempt_enable();
 }
 
@@ -455,7 +455,7 @@ void __switch_to_xtra(struct task_struct
 		set_cpuid_faulting(!!(tifn & _TIF_NOCPUID));
 
 	if ((tifp ^ tifn) & _TIF_SSBD)
-		__speculative_store_bypass_update(tifn);
+		__speculation_ctrl_update(tifn);
 }
 
 /*



^ permalink raw reply	[flat|nested] 166+ messages in thread

* [PATCH 4.14 086/146] x86/speculation: Reorganize speculation control MSRs update
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (84 preceding siblings ...)
  2018-12-04 10:49 ` [PATCH 4.14 085/146] x86/speculation: Rename SSBD update functions Greg Kroah-Hartman
@ 2018-12-04 10:49 ` Greg Kroah-Hartman
  2018-12-04 10:49 ` [PATCH 4.14 087/146] sched/smt: Make sched_smt_present track topology Greg Kroah-Hartman
                   ` (64 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:49 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Tim Chen, Thomas Gleixner,
	Ingo Molnar, Peter Zijlstra, Andy Lutomirski, Linus Torvalds,
	Jiri Kosina, Tom Lendacky, Josh Poimboeuf, Andrea Arcangeli,
	David Woodhouse, Andi Kleen, Dave Hansen, Casey Schaufler,
	Asit Mallick, Arjan van de Ven, Jon Masters, Waiman Long,
	Dave Stewart, Kees Cook

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Tim Chen tim.c.chen@linux.intel.com

commit 01daf56875ee0cd50ed496a09b20eb369b45dfa5 upstream

The logic to detect whether there's a change in the previous and next
task's flag relevant to update speculation control MSRs is spread out
across multiple functions.

Consolidate all checks needed for updating speculation control MSRs into
the new __speculation_ctrl_update() helper function.

This makes it easy to pick the right speculation control MSR and the bits
in MSR_IA32_SPEC_CTRL that need updating based on TIF flags changes.

Originally-by: Thomas Lendacky <Thomas.Lendacky@amd.com>
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Casey Schaufler <casey.schaufler@intel.com>
Cc: Asit Mallick <asit.k.mallick@intel.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Jon Masters <jcm@redhat.com>
Cc: Waiman Long <longman9394@gmail.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Dave Stewart <david.c.stewart@intel.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20181125185004.151077005@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/x86/kernel/process.c |   42 +++++++++++++++++++++++++++---------------
 1 file changed, 27 insertions(+), 15 deletions(-)

--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -398,27 +398,40 @@ static __always_inline void amd_set_ssb_
 	wrmsrl(MSR_AMD64_VIRT_SPEC_CTRL, ssbd_tif_to_spec_ctrl(tifn));
 }
 
-static __always_inline void spec_ctrl_update_msr(unsigned long tifn)
+/*
+ * Update the MSRs managing speculation control, during context switch.
+ *
+ * tifp: Previous task's thread flags
+ * tifn: Next task's thread flags
+ */
+static __always_inline void __speculation_ctrl_update(unsigned long tifp,
+						      unsigned long tifn)
 {
-	u64 msr = x86_spec_ctrl_base | ssbd_tif_to_spec_ctrl(tifn);
+	u64 msr = x86_spec_ctrl_base;
+	bool updmsr = false;
 
-	wrmsrl(MSR_IA32_SPEC_CTRL, msr);
-}
+	/* If TIF_SSBD is different, select the proper mitigation method */
+	if ((tifp ^ tifn) & _TIF_SSBD) {
+		if (static_cpu_has(X86_FEATURE_VIRT_SSBD)) {
+			amd_set_ssb_virt_state(tifn);
+		} else if (static_cpu_has(X86_FEATURE_LS_CFG_SSBD)) {
+			amd_set_core_ssb_state(tifn);
+		} else if (static_cpu_has(X86_FEATURE_SPEC_CTRL_SSBD) ||
+			   static_cpu_has(X86_FEATURE_AMD_SSBD)) {
+			msr |= ssbd_tif_to_spec_ctrl(tifn);
+			updmsr  = true;
+		}
+	}
 
-static __always_inline void __speculation_ctrl_update(unsigned long tifn)
-{
-	if (static_cpu_has(X86_FEATURE_VIRT_SSBD))
-		amd_set_ssb_virt_state(tifn);
-	else if (static_cpu_has(X86_FEATURE_LS_CFG_SSBD))
-		amd_set_core_ssb_state(tifn);
-	else
-		spec_ctrl_update_msr(tifn);
+	if (updmsr)
+		wrmsrl(MSR_IA32_SPEC_CTRL, msr);
 }
 
 void speculation_ctrl_update(unsigned long tif)
 {
+	/* Forced update. Make sure all relevant TIF flags are different */
 	preempt_disable();
-	__speculation_ctrl_update(tif);
+	__speculation_ctrl_update(~tif, tif);
 	preempt_enable();
 }
 
@@ -454,8 +467,7 @@ void __switch_to_xtra(struct task_struct
 	if ((tifp ^ tifn) & _TIF_NOCPUID)
 		set_cpuid_faulting(!!(tifn & _TIF_NOCPUID));
 
-	if ((tifp ^ tifn) & _TIF_SSBD)
-		__speculation_ctrl_update(tifn);
+	__speculation_ctrl_update(tifp, tifn);
 }
 
 /*



^ permalink raw reply	[flat|nested] 166+ messages in thread

* [PATCH 4.14 087/146] sched/smt: Make sched_smt_present track topology
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (85 preceding siblings ...)
  2018-12-04 10:49 ` [PATCH 4.14 086/146] x86/speculation: Reorganize speculation control MSRs update Greg Kroah-Hartman
@ 2018-12-04 10:49 ` Greg Kroah-Hartman
  2018-12-04 10:49 ` [PATCH 4.14 088/146] x86/Kconfig: Select SCHED_SMT if SMP enabled Greg Kroah-Hartman
                   ` (63 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:49 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Peter Zijlstra (Intel),
	Thomas Gleixner, Ingo Molnar, Andy Lutomirski, Linus Torvalds,
	Jiri Kosina, Tom Lendacky, Josh Poimboeuf, Andrea Arcangeli,
	David Woodhouse, Tim Chen, Andi Kleen, Dave Hansen,
	Casey Schaufler, Asit Mallick, Arjan van de Ven, Jon Masters,
	Waiman Long, Dave Stewart, Kees Cook

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Peter Zijlstra (Intel) peterz@infradead.org

commit c5511d03ec090980732e929c318a7a6374b5550e upstream

Currently the 'sched_smt_present' static key is enabled when at CPU bringup
SMT topology is observed, but it is never disabled. However there is demand
to also disable the key when the topology changes such that there is no SMT
present anymore.

Implement this by making the key count the number of cores that have SMT
enabled.

In particular, the SMT topology bits are set before interrrupts are enabled
and similarly, are cleared after interrupts are disabled for the last time
and the CPU dies.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Casey Schaufler <casey.schaufler@intel.com>
Cc: Asit Mallick <asit.k.mallick@intel.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Jon Masters <jcm@redhat.com>
Cc: Waiman Long <longman9394@gmail.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Dave Stewart <david.c.stewart@intel.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20181125185004.246110444@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 kernel/sched/core.c |   19 +++++++++++--------
 1 file changed, 11 insertions(+), 8 deletions(-)

--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -5617,15 +5617,10 @@ int sched_cpu_activate(unsigned int cpu)
 
 #ifdef CONFIG_SCHED_SMT
 	/*
-	 * The sched_smt_present static key needs to be evaluated on every
-	 * hotplug event because at boot time SMT might be disabled when
-	 * the number of booted CPUs is limited.
-	 *
-	 * If then later a sibling gets hotplugged, then the key would stay
-	 * off and SMT scheduling would never be functional.
+	 * When going up, increment the number of cores with SMT present.
 	 */
-	if (cpumask_weight(cpu_smt_mask(cpu)) > 1)
-		static_branch_enable_cpuslocked(&sched_smt_present);
+	if (cpumask_weight(cpu_smt_mask(cpu)) == 2)
+		static_branch_inc_cpuslocked(&sched_smt_present);
 #endif
 	set_cpu_active(cpu, true);
 
@@ -5669,6 +5664,14 @@ int sched_cpu_deactivate(unsigned int cp
 	 */
 	synchronize_rcu_mult(call_rcu, call_rcu_sched);
 
+#ifdef CONFIG_SCHED_SMT
+	/*
+	 * When going down, decrement the number of cores with SMT present.
+	 */
+	if (cpumask_weight(cpu_smt_mask(cpu)) == 2)
+		static_branch_dec_cpuslocked(&sched_smt_present);
+#endif
+
 	if (!sched_smp_initialized)
 		return 0;
 



^ permalink raw reply	[flat|nested] 166+ messages in thread

* [PATCH 4.14 088/146] x86/Kconfig: Select SCHED_SMT if SMP enabled
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (86 preceding siblings ...)
  2018-12-04 10:49 ` [PATCH 4.14 087/146] sched/smt: Make sched_smt_present track topology Greg Kroah-Hartman
@ 2018-12-04 10:49 ` Greg Kroah-Hartman
  2018-12-04 10:49 ` [PATCH 4.14 089/146] sched/smt: Expose sched_smt_present static key Greg Kroah-Hartman
                   ` (62 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:49 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Thomas Gleixner, Ingo Molnar,
	Peter Zijlstra, Andy Lutomirski, Linus Torvalds, Jiri Kosina,
	Tom Lendacky, Josh Poimboeuf, Andrea Arcangeli, David Woodhouse,
	Tim Chen, Andi Kleen, Dave Hansen, Casey Schaufler, Asit Mallick,
	Arjan van de Ven, Jon Masters, Waiman Long, Dave Stewart,
	Kees Cook

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Thomas Gleixner tglx@linutronix.de

commit dbe733642e01dd108f71436aaea7b328cb28fd87 upstream

CONFIG_SCHED_SMT is enabled by all distros, so there is not a real point to
have it configurable. The runtime overhead in the core scheduler code is
minimal because the actual SMT scheduling parts are conditional on a static
key.

This allows to expose the scheduler's SMT state static key to the
speculation control code. Alternatively the scheduler's static key could be
made always available when CONFIG_SMP is enabled, but that's just adding an
unused static key to every other architecture for nothing.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Casey Schaufler <casey.schaufler@intel.com>
Cc: Asit Mallick <asit.k.mallick@intel.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Jon Masters <jcm@redhat.com>
Cc: Waiman Long <longman9394@gmail.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Dave Stewart <david.c.stewart@intel.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20181125185004.337452245@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/x86/Kconfig |    8 +-------
 1 file changed, 1 insertion(+), 7 deletions(-)

--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -955,13 +955,7 @@ config NR_CPUS
 	  approximately eight kilobytes to the kernel image.
 
 config SCHED_SMT
-	bool "SMT (Hyperthreading) scheduler support"
-	depends on SMP
-	---help---
-	  SMT scheduler support improves the CPU scheduler's decision making
-	  when dealing with Intel Pentium 4 chips with HyperThreading at a
-	  cost of slightly increased overhead in some places. If unsure say
-	  N here.
+	def_bool y if SMP
 
 config SCHED_MC
 	def_bool y



^ permalink raw reply	[flat|nested] 166+ messages in thread

* [PATCH 4.14 089/146] sched/smt: Expose sched_smt_present static key
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (87 preceding siblings ...)
  2018-12-04 10:49 ` [PATCH 4.14 088/146] x86/Kconfig: Select SCHED_SMT if SMP enabled Greg Kroah-Hartman
@ 2018-12-04 10:49 ` Greg Kroah-Hartman
  2018-12-04 10:49 ` [PATCH 4.14 090/146] x86/speculation: Rework SMT state change Greg Kroah-Hartman
                   ` (61 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:49 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Thomas Gleixner, Ingo Molnar,
	Peter Zijlstra, Andy Lutomirski, Linus Torvalds, Jiri Kosina,
	Tom Lendacky, Josh Poimboeuf, Andrea Arcangeli, David Woodhouse,
	Tim Chen, Andi Kleen, Dave Hansen, Casey Schaufler, Asit Mallick,
	Arjan van de Ven, Jon Masters, Waiman Long, Dave Stewart,
	Kees Cook

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Thomas Gleixner tglx@linutronix.de

commit 321a874a7ef85655e93b3206d0f36b4a6097f948 upstream

Make the scheduler's 'sched_smt_present' static key globaly available, so
it can be used in the x86 speculation control code.

Provide a query function and a stub for the CONFIG_SMP=n case.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Casey Schaufler <casey.schaufler@intel.com>
Cc: Asit Mallick <asit.k.mallick@intel.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Jon Masters <jcm@redhat.com>
Cc: Waiman Long <longman9394@gmail.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Dave Stewart <david.c.stewart@intel.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20181125185004.430168326@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 include/linux/sched/smt.h |   18 ++++++++++++++++++
 kernel/sched/sched.h      |    4 +---
 2 files changed, 19 insertions(+), 3 deletions(-)

--- /dev/null
+++ b/include/linux/sched/smt.h
@@ -0,0 +1,18 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LINUX_SCHED_SMT_H
+#define _LINUX_SCHED_SMT_H
+
+#include <linux/static_key.h>
+
+#ifdef CONFIG_SCHED_SMT
+extern struct static_key_false sched_smt_present;
+
+static __always_inline bool sched_smt_active(void)
+{
+	return static_branch_likely(&sched_smt_present);
+}
+#else
+static inline bool sched_smt_active(void) { return false; }
+#endif
+
+#endif
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -20,6 +20,7 @@
 #include <linux/sched/task_stack.h>
 #include <linux/sched/cputime.h>
 #include <linux/sched/init.h>
+#include <linux/sched/smt.h>
 
 #include <linux/u64_stats_sync.h>
 #include <linux/kernel_stat.h>
@@ -825,9 +826,6 @@ static inline int cpu_of(struct rq *rq)
 
 
 #ifdef CONFIG_SCHED_SMT
-
-extern struct static_key_false sched_smt_present;
-
 extern void __update_idle_core(struct rq *rq);
 
 static inline void update_idle_core(struct rq *rq)



^ permalink raw reply	[flat|nested] 166+ messages in thread

* [PATCH 4.14 090/146] x86/speculation: Rework SMT state change
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (88 preceding siblings ...)
  2018-12-04 10:49 ` [PATCH 4.14 089/146] sched/smt: Expose sched_smt_present static key Greg Kroah-Hartman
@ 2018-12-04 10:49 ` Greg Kroah-Hartman
  2018-12-04 10:49 ` [PATCH 4.14 091/146] x86/l1tf: Show actual SMT state Greg Kroah-Hartman
                   ` (60 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:49 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Thomas Gleixner, Ingo Molnar,
	Peter Zijlstra, Andy Lutomirski, Linus Torvalds, Jiri Kosina,
	Tom Lendacky, Josh Poimboeuf, Andrea Arcangeli, David Woodhouse,
	Tim Chen, Andi Kleen, Dave Hansen, Casey Schaufler, Asit Mallick,
	Arjan van de Ven, Jon Masters, Waiman Long, Dave Stewart,
	Kees Cook

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Thomas Gleixner tglx@linutronix.de

commit a74cfffb03b73d41e08f84c2e5c87dec0ce3db9f upstream

arch_smt_update() is only called when the sysfs SMT control knob is
changed. This means that when SMT is enabled in the sysfs control knob the
system is considered to have SMT active even if all siblings are offline.

To allow finegrained control of the speculation mitigations, the actual SMT
state is more interesting than the fact that siblings could be enabled.

Rework the code, so arch_smt_update() is invoked from each individual CPU
hotplug function, and simplify the update function while at it.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Casey Schaufler <casey.schaufler@intel.com>
Cc: Asit Mallick <asit.k.mallick@intel.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Jon Masters <jcm@redhat.com>
Cc: Waiman Long <longman9394@gmail.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Dave Stewart <david.c.stewart@intel.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20181125185004.521974984@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/x86/kernel/cpu/bugs.c |   11 +++++------
 include/linux/sched/smt.h  |    2 ++
 kernel/cpu.c               |   15 +++++++++------
 3 files changed, 16 insertions(+), 12 deletions(-)

--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -14,6 +14,7 @@
 #include <linux/module.h>
 #include <linux/nospec.h>
 #include <linux/prctl.h>
+#include <linux/sched/smt.h>
 
 #include <asm/spec-ctrl.h>
 #include <asm/cmdline.h>
@@ -342,16 +343,14 @@ void arch_smt_update(void)
 		return;
 
 	mutex_lock(&spec_ctrl_mutex);
-	mask = x86_spec_ctrl_base;
-	if (cpu_smt_control == CPU_SMT_ENABLED)
+
+	mask = x86_spec_ctrl_base & ~SPEC_CTRL_STIBP;
+	if (sched_smt_active())
 		mask |= SPEC_CTRL_STIBP;
-	else
-		mask &= ~SPEC_CTRL_STIBP;
 
 	if (mask != x86_spec_ctrl_base) {
 		pr_info("Spectre v2 cross-process SMT mitigation: %s STIBP\n",
-				cpu_smt_control == CPU_SMT_ENABLED ?
-				"Enabling" : "Disabling");
+			mask & SPEC_CTRL_STIBP ? "Enabling" : "Disabling");
 		x86_spec_ctrl_base = mask;
 		on_each_cpu(update_stibp_msr, NULL, 1);
 	}
--- a/include/linux/sched/smt.h
+++ b/include/linux/sched/smt.h
@@ -15,4 +15,6 @@ static __always_inline bool sched_smt_ac
 static inline bool sched_smt_active(void) { return false; }
 #endif
 
+void arch_smt_update(void);
+
 #endif
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -10,6 +10,7 @@
 #include <linux/sched/signal.h>
 #include <linux/sched/hotplug.h>
 #include <linux/sched/task.h>
+#include <linux/sched/smt.h>
 #include <linux/unistd.h>
 #include <linux/cpu.h>
 #include <linux/oom.h>
@@ -347,6 +348,12 @@ void cpu_hotplug_enable(void)
 EXPORT_SYMBOL_GPL(cpu_hotplug_enable);
 #endif	/* CONFIG_HOTPLUG_CPU */
 
+/*
+ * Architectures that need SMT-specific errata handling during SMT hotplug
+ * should override this.
+ */
+void __weak arch_smt_update(void) { }
+
 #ifdef CONFIG_HOTPLUG_SMT
 enum cpuhp_smt_control cpu_smt_control __read_mostly = CPU_SMT_ENABLED;
 EXPORT_SYMBOL_GPL(cpu_smt_control);
@@ -998,6 +1005,7 @@ out:
 	 * concurrent CPU hotplug via cpu_add_remove_lock.
 	 */
 	lockup_detector_cleanup();
+	arch_smt_update();
 	return ret;
 }
 
@@ -1126,6 +1134,7 @@ static int _cpu_up(unsigned int cpu, int
 	ret = cpuhp_up_callbacks(cpu, st, target);
 out:
 	cpus_write_unlock();
+	arch_smt_update();
 	return ret;
 }
 
@@ -2045,12 +2054,6 @@ static void cpuhp_online_cpu_device(unsi
 	kobject_uevent(&dev->kobj, KOBJ_ONLINE);
 }
 
-/*
- * Architectures that need SMT-specific errata handling during SMT hotplug
- * should override this.
- */
-void __weak arch_smt_update(void) { };
-
 static int cpuhp_smt_disable(enum cpuhp_smt_control ctrlval)
 {
 	int cpu, ret = 0;



^ permalink raw reply	[flat|nested] 166+ messages in thread

* [PATCH 4.14 091/146] x86/l1tf: Show actual SMT state
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (89 preceding siblings ...)
  2018-12-04 10:49 ` [PATCH 4.14 090/146] x86/speculation: Rework SMT state change Greg Kroah-Hartman
@ 2018-12-04 10:49 ` Greg Kroah-Hartman
  2018-12-04 10:49 ` [PATCH 4.14 092/146] x86/speculation: Reorder the spec_v2 code Greg Kroah-Hartman
                   ` (59 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:49 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Thomas Gleixner, Ingo Molnar,
	Peter Zijlstra, Andy Lutomirski, Linus Torvalds, Jiri Kosina,
	Tom Lendacky, Josh Poimboeuf, Andrea Arcangeli, David Woodhouse,
	Tim Chen, Andi Kleen, Dave Hansen, Casey Schaufler, Asit Mallick,
	Arjan van de Ven, Jon Masters, Waiman Long, Dave Stewart,
	Kees Cook

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Thomas Gleixner tglx@linutronix.de

commit 130d6f946f6f2a972ee3ec8540b7243ab99abe97 upstream

Use the now exposed real SMT state, not the SMT sysfs control knob
state. This reflects the state of the system when the mitigation status is
queried.

This does not change the warning in the VMX launch code. There the
dependency on the control knob makes sense because siblings could be
brought online anytime after launching the VM.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Casey Schaufler <casey.schaufler@intel.com>
Cc: Asit Mallick <asit.k.mallick@intel.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Jon Masters <jcm@redhat.com>
Cc: Waiman Long <longman9394@gmail.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Dave Stewart <david.c.stewart@intel.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20181125185004.613357354@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/x86/kernel/cpu/bugs.c |    5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -829,13 +829,14 @@ static ssize_t l1tf_show_state(char *buf
 
 	if (l1tf_vmx_mitigation == VMENTER_L1D_FLUSH_EPT_DISABLED ||
 	    (l1tf_vmx_mitigation == VMENTER_L1D_FLUSH_NEVER &&
-	     cpu_smt_control == CPU_SMT_ENABLED))
+	     sched_smt_active())) {
 		return sprintf(buf, "%s; VMX: %s\n", L1TF_DEFAULT_MSG,
 			       l1tf_vmx_states[l1tf_vmx_mitigation]);
+	}
 
 	return sprintf(buf, "%s; VMX: %s, SMT %s\n", L1TF_DEFAULT_MSG,
 		       l1tf_vmx_states[l1tf_vmx_mitigation],
-		       cpu_smt_control == CPU_SMT_ENABLED ? "vulnerable" : "disabled");
+		       sched_smt_active() ? "vulnerable" : "disabled");
 }
 #else
 static ssize_t l1tf_show_state(char *buf)



^ permalink raw reply	[flat|nested] 166+ messages in thread

* [PATCH 4.14 092/146] x86/speculation: Reorder the spec_v2 code
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (90 preceding siblings ...)
  2018-12-04 10:49 ` [PATCH 4.14 091/146] x86/l1tf: Show actual SMT state Greg Kroah-Hartman
@ 2018-12-04 10:49 ` Greg Kroah-Hartman
  2018-12-04 10:49 ` [PATCH 4.14 093/146] x86/speculation: Mark string arrays const correctly Greg Kroah-Hartman
                   ` (58 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:49 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Thomas Gleixner, Ingo Molnar,
	Peter Zijlstra, Andy Lutomirski, Linus Torvalds, Jiri Kosina,
	Tom Lendacky, Josh Poimboeuf, Andrea Arcangeli, David Woodhouse,
	Tim Chen, Andi Kleen, Dave Hansen, Casey Schaufler, Asit Mallick,
	Arjan van de Ven, Jon Masters, Waiman Long, Dave Stewart,
	Kees Cook

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Thomas Gleixner tglx@linutronix.de

commit 15d6b7aab0793b2de8a05d8a828777dd24db424e upstream

Reorder the code so it is better grouped. No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Casey Schaufler <casey.schaufler@intel.com>
Cc: Asit Mallick <asit.k.mallick@intel.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Jon Masters <jcm@redhat.com>
Cc: Waiman Long <longman9394@gmail.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Dave Stewart <david.c.stewart@intel.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20181125185004.707122879@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/x86/kernel/cpu/bugs.c |  168 ++++++++++++++++++++++-----------------------
 1 file changed, 84 insertions(+), 84 deletions(-)

--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -123,29 +123,6 @@ void __init check_bugs(void)
 #endif
 }
 
-/* The kernel command line selection */
-enum spectre_v2_mitigation_cmd {
-	SPECTRE_V2_CMD_NONE,
-	SPECTRE_V2_CMD_AUTO,
-	SPECTRE_V2_CMD_FORCE,
-	SPECTRE_V2_CMD_RETPOLINE,
-	SPECTRE_V2_CMD_RETPOLINE_GENERIC,
-	SPECTRE_V2_CMD_RETPOLINE_AMD,
-};
-
-static const char *spectre_v2_strings[] = {
-	[SPECTRE_V2_NONE]			= "Vulnerable",
-	[SPECTRE_V2_RETPOLINE_GENERIC]		= "Mitigation: Full generic retpoline",
-	[SPECTRE_V2_RETPOLINE_AMD]		= "Mitigation: Full AMD retpoline",
-	[SPECTRE_V2_IBRS_ENHANCED]		= "Mitigation: Enhanced IBRS",
-};
-
-#undef pr_fmt
-#define pr_fmt(fmt)     "Spectre V2 : " fmt
-
-static enum spectre_v2_mitigation spectre_v2_enabled __ro_after_init =
-	SPECTRE_V2_NONE;
-
 void
 x86_virt_spec_ctrl(u64 guest_spec_ctrl, u64 guest_virt_spec_ctrl, bool setguest)
 {
@@ -215,6 +192,12 @@ static void x86_amd_ssb_disable(void)
 		wrmsrl(MSR_AMD64_LS_CFG, msrval);
 }
 
+#undef pr_fmt
+#define pr_fmt(fmt)     "Spectre V2 : " fmt
+
+static enum spectre_v2_mitigation spectre_v2_enabled __ro_after_init =
+	SPECTRE_V2_NONE;
+
 #ifdef RETPOLINE
 static bool spectre_v2_bad_module;
 
@@ -236,18 +219,6 @@ static inline const char *spectre_v2_mod
 static inline const char *spectre_v2_module_string(void) { return ""; }
 #endif
 
-static void __init spec2_print_if_insecure(const char *reason)
-{
-	if (boot_cpu_has_bug(X86_BUG_SPECTRE_V2))
-		pr_info("%s selected on command line.\n", reason);
-}
-
-static void __init spec2_print_if_secure(const char *reason)
-{
-	if (!boot_cpu_has_bug(X86_BUG_SPECTRE_V2))
-		pr_info("%s selected on command line.\n", reason);
-}
-
 static inline bool match_option(const char *arg, int arglen, const char *opt)
 {
 	int len = strlen(opt);
@@ -255,24 +226,53 @@ static inline bool match_option(const ch
 	return len == arglen && !strncmp(arg, opt, len);
 }
 
+/* The kernel command line selection for spectre v2 */
+enum spectre_v2_mitigation_cmd {
+	SPECTRE_V2_CMD_NONE,
+	SPECTRE_V2_CMD_AUTO,
+	SPECTRE_V2_CMD_FORCE,
+	SPECTRE_V2_CMD_RETPOLINE,
+	SPECTRE_V2_CMD_RETPOLINE_GENERIC,
+	SPECTRE_V2_CMD_RETPOLINE_AMD,
+};
+
+static const char *spectre_v2_strings[] = {
+	[SPECTRE_V2_NONE]			= "Vulnerable",
+	[SPECTRE_V2_RETPOLINE_GENERIC]		= "Mitigation: Full generic retpoline",
+	[SPECTRE_V2_RETPOLINE_AMD]		= "Mitigation: Full AMD retpoline",
+	[SPECTRE_V2_IBRS_ENHANCED]		= "Mitigation: Enhanced IBRS",
+};
+
 static const struct {
 	const char *option;
 	enum spectre_v2_mitigation_cmd cmd;
 	bool secure;
 } mitigation_options[] = {
-	{ "off",               SPECTRE_V2_CMD_NONE,              false },
-	{ "on",                SPECTRE_V2_CMD_FORCE,             true },
-	{ "retpoline",         SPECTRE_V2_CMD_RETPOLINE,         false },
-	{ "retpoline,amd",     SPECTRE_V2_CMD_RETPOLINE_AMD,     false },
-	{ "retpoline,generic", SPECTRE_V2_CMD_RETPOLINE_GENERIC, false },
-	{ "auto",              SPECTRE_V2_CMD_AUTO,              false },
+	{ "off",		SPECTRE_V2_CMD_NONE,		  false },
+	{ "on",			SPECTRE_V2_CMD_FORCE,		  true  },
+	{ "retpoline",		SPECTRE_V2_CMD_RETPOLINE,	  false },
+	{ "retpoline,amd",	SPECTRE_V2_CMD_RETPOLINE_AMD,	  false },
+	{ "retpoline,generic",	SPECTRE_V2_CMD_RETPOLINE_GENERIC, false },
+	{ "auto",		SPECTRE_V2_CMD_AUTO,		  false },
 };
 
+static void __init spec2_print_if_insecure(const char *reason)
+{
+	if (boot_cpu_has_bug(X86_BUG_SPECTRE_V2))
+		pr_info("%s selected on command line.\n", reason);
+}
+
+static void __init spec2_print_if_secure(const char *reason)
+{
+	if (!boot_cpu_has_bug(X86_BUG_SPECTRE_V2))
+		pr_info("%s selected on command line.\n", reason);
+}
+
 static enum spectre_v2_mitigation_cmd __init spectre_v2_parse_cmdline(void)
 {
+	enum spectre_v2_mitigation_cmd cmd = SPECTRE_V2_CMD_AUTO;
 	char arg[20];
 	int ret, i;
-	enum spectre_v2_mitigation_cmd cmd = SPECTRE_V2_CMD_AUTO;
 
 	if (cmdline_find_option_bool(boot_command_line, "nospectre_v2"))
 		return SPECTRE_V2_CMD_NONE;
@@ -315,48 +315,6 @@ static enum spectre_v2_mitigation_cmd __
 	return cmd;
 }
 
-static bool stibp_needed(void)
-{
-	if (spectre_v2_enabled == SPECTRE_V2_NONE)
-		return false;
-
-	/* Enhanced IBRS makes using STIBP unnecessary. */
-	if (spectre_v2_enabled == SPECTRE_V2_IBRS_ENHANCED)
-		return false;
-
-	if (!boot_cpu_has(X86_FEATURE_STIBP))
-		return false;
-
-	return true;
-}
-
-static void update_stibp_msr(void *info)
-{
-	wrmsrl(MSR_IA32_SPEC_CTRL, x86_spec_ctrl_base);
-}
-
-void arch_smt_update(void)
-{
-	u64 mask;
-
-	if (!stibp_needed())
-		return;
-
-	mutex_lock(&spec_ctrl_mutex);
-
-	mask = x86_spec_ctrl_base & ~SPEC_CTRL_STIBP;
-	if (sched_smt_active())
-		mask |= SPEC_CTRL_STIBP;
-
-	if (mask != x86_spec_ctrl_base) {
-		pr_info("Spectre v2 cross-process SMT mitigation: %s STIBP\n",
-			mask & SPEC_CTRL_STIBP ? "Enabling" : "Disabling");
-		x86_spec_ctrl_base = mask;
-		on_each_cpu(update_stibp_msr, NULL, 1);
-	}
-	mutex_unlock(&spec_ctrl_mutex);
-}
-
 static void __init spectre_v2_select_mitigation(void)
 {
 	enum spectre_v2_mitigation_cmd cmd = spectre_v2_parse_cmdline();
@@ -459,6 +417,48 @@ specv2_set_mode:
 	arch_smt_update();
 }
 
+static bool stibp_needed(void)
+{
+	if (spectre_v2_enabled == SPECTRE_V2_NONE)
+		return false;
+
+	/* Enhanced IBRS makes using STIBP unnecessary. */
+	if (spectre_v2_enabled == SPECTRE_V2_IBRS_ENHANCED)
+		return false;
+
+	if (!boot_cpu_has(X86_FEATURE_STIBP))
+		return false;
+
+	return true;
+}
+
+static void update_stibp_msr(void *info)
+{
+	wrmsrl(MSR_IA32_SPEC_CTRL, x86_spec_ctrl_base);
+}
+
+void arch_smt_update(void)
+{
+	u64 mask;
+
+	if (!stibp_needed())
+		return;
+
+	mutex_lock(&spec_ctrl_mutex);
+
+	mask = x86_spec_ctrl_base & ~SPEC_CTRL_STIBP;
+	if (sched_smt_active())
+		mask |= SPEC_CTRL_STIBP;
+
+	if (mask != x86_spec_ctrl_base) {
+		pr_info("Spectre v2 cross-process SMT mitigation: %s STIBP\n",
+			mask & SPEC_CTRL_STIBP ? "Enabling" : "Disabling");
+		x86_spec_ctrl_base = mask;
+		on_each_cpu(update_stibp_msr, NULL, 1);
+	}
+	mutex_unlock(&spec_ctrl_mutex);
+}
+
 #undef pr_fmt
 #define pr_fmt(fmt)	"Speculative Store Bypass: " fmt
 



^ permalink raw reply	[flat|nested] 166+ messages in thread

* [PATCH 4.14 093/146] x86/speculation: Mark string arrays const correctly
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (91 preceding siblings ...)
  2018-12-04 10:49 ` [PATCH 4.14 092/146] x86/speculation: Reorder the spec_v2 code Greg Kroah-Hartman
@ 2018-12-04 10:49 ` Greg Kroah-Hartman
  2018-12-04 10:49 ` [PATCH 4.14 094/146] x86/speculataion: Mark command line parser data __initdata Greg Kroah-Hartman
                   ` (57 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:49 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Thomas Gleixner, Ingo Molnar,
	Peter Zijlstra, Andy Lutomirski, Linus Torvalds, Jiri Kosina,
	Tom Lendacky, Josh Poimboeuf, Andrea Arcangeli, David Woodhouse,
	Tim Chen, Andi Kleen, Dave Hansen, Casey Schaufler, Asit Mallick,
	Arjan van de Ven, Jon Masters, Waiman Long, Dave Stewart,
	Kees Cook

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Thomas Gleixner tglx@linutronix.de

commit 8770709f411763884535662744a3786a1806afd3 upstream

checkpatch.pl muttered when reshuffling the code:
 WARNING: static const char * array should probably be static const char * const

Fix up all the string arrays.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Casey Schaufler <casey.schaufler@intel.com>
Cc: Asit Mallick <asit.k.mallick@intel.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Jon Masters <jcm@redhat.com>
Cc: Waiman Long <longman9394@gmail.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Dave Stewart <david.c.stewart@intel.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20181125185004.800018931@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/x86/kernel/cpu/bugs.c |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -236,7 +236,7 @@ enum spectre_v2_mitigation_cmd {
 	SPECTRE_V2_CMD_RETPOLINE_AMD,
 };
 
-static const char *spectre_v2_strings[] = {
+static const char * const spectre_v2_strings[] = {
 	[SPECTRE_V2_NONE]			= "Vulnerable",
 	[SPECTRE_V2_RETPOLINE_GENERIC]		= "Mitigation: Full generic retpoline",
 	[SPECTRE_V2_RETPOLINE_AMD]		= "Mitigation: Full AMD retpoline",
@@ -473,7 +473,7 @@ enum ssb_mitigation_cmd {
 	SPEC_STORE_BYPASS_CMD_SECCOMP,
 };
 
-static const char *ssb_strings[] = {
+static const char * const ssb_strings[] = {
 	[SPEC_STORE_BYPASS_NONE]	= "Vulnerable",
 	[SPEC_STORE_BYPASS_DISABLE]	= "Mitigation: Speculative Store Bypass disabled",
 	[SPEC_STORE_BYPASS_PRCTL]	= "Mitigation: Speculative Store Bypass disabled via prctl",
@@ -813,7 +813,7 @@ early_param("l1tf", l1tf_cmdline);
 #define L1TF_DEFAULT_MSG "Mitigation: PTE Inversion"
 
 #if IS_ENABLED(CONFIG_KVM_INTEL)
-static const char *l1tf_vmx_states[] = {
+static const char * const l1tf_vmx_states[] = {
 	[VMENTER_L1D_FLUSH_AUTO]		= "auto",
 	[VMENTER_L1D_FLUSH_NEVER]		= "vulnerable",
 	[VMENTER_L1D_FLUSH_COND]		= "conditional cache flushes",



^ permalink raw reply	[flat|nested] 166+ messages in thread

* [PATCH 4.14 094/146] x86/speculataion: Mark command line parser data __initdata
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (92 preceding siblings ...)
  2018-12-04 10:49 ` [PATCH 4.14 093/146] x86/speculation: Mark string arrays const correctly Greg Kroah-Hartman
@ 2018-12-04 10:49 ` Greg Kroah-Hartman
  2018-12-04 10:49 ` [PATCH 4.14 095/146] x86/speculation: Unify conditional spectre v2 print functions Greg Kroah-Hartman
                   ` (56 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:49 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Thomas Gleixner, Ingo Molnar,
	Peter Zijlstra, Andy Lutomirski, Linus Torvalds, Jiri Kosina,
	Tom Lendacky, Josh Poimboeuf, Andrea Arcangeli, David Woodhouse,
	Tim Chen, Andi Kleen, Dave Hansen, Casey Schaufler, Asit Mallick,
	Arjan van de Ven, Jon Masters, Waiman Long, Dave Stewart,
	Kees Cook

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Thomas Gleixner tglx@linutronix.de

commit 30ba72a990f5096ae08f284de17986461efcc408 upstream

No point to keep that around.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Casey Schaufler <casey.schaufler@intel.com>
Cc: Asit Mallick <asit.k.mallick@intel.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Jon Masters <jcm@redhat.com>
Cc: Waiman Long <longman9394@gmail.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Dave Stewart <david.c.stewart@intel.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20181125185004.893886356@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/x86/kernel/cpu/bugs.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -247,7 +247,7 @@ static const struct {
 	const char *option;
 	enum spectre_v2_mitigation_cmd cmd;
 	bool secure;
-} mitigation_options[] = {
+} mitigation_options[] __initdata = {
 	{ "off",		SPECTRE_V2_CMD_NONE,		  false },
 	{ "on",			SPECTRE_V2_CMD_FORCE,		  true  },
 	{ "retpoline",		SPECTRE_V2_CMD_RETPOLINE,	  false },
@@ -483,7 +483,7 @@ static const char * const ssb_strings[]
 static const struct {
 	const char *option;
 	enum ssb_mitigation_cmd cmd;
-} ssb_mitigation_options[] = {
+} ssb_mitigation_options[]  __initdata = {
 	{ "auto",	SPEC_STORE_BYPASS_CMD_AUTO },    /* Platform decides */
 	{ "on",		SPEC_STORE_BYPASS_CMD_ON },      /* Disable Speculative Store Bypass */
 	{ "off",	SPEC_STORE_BYPASS_CMD_NONE },    /* Don't touch Speculative Store Bypass */



^ permalink raw reply	[flat|nested] 166+ messages in thread

* [PATCH 4.14 095/146] x86/speculation: Unify conditional spectre v2 print functions
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (93 preceding siblings ...)
  2018-12-04 10:49 ` [PATCH 4.14 094/146] x86/speculataion: Mark command line parser data __initdata Greg Kroah-Hartman
@ 2018-12-04 10:49 ` Greg Kroah-Hartman
  2018-12-04 10:49 ` [PATCH 4.14 096/146] x86/speculation: Add command line control for indirect branch speculation Greg Kroah-Hartman
                   ` (55 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:49 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Thomas Gleixner, Ingo Molnar,
	Peter Zijlstra, Andy Lutomirski, Linus Torvalds, Jiri Kosina,
	Tom Lendacky, Josh Poimboeuf, Andrea Arcangeli, David Woodhouse,
	Tim Chen, Andi Kleen, Dave Hansen, Casey Schaufler, Asit Mallick,
	Arjan van de Ven, Jon Masters, Waiman Long, Dave Stewart,
	Kees Cook

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Thomas Gleixner tglx@linutronix.de

commit 495d470e9828500e0155027f230449ac5e29c025 upstream

There is no point in having two functions and a conditional at the call
site.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Casey Schaufler <casey.schaufler@intel.com>
Cc: Asit Mallick <asit.k.mallick@intel.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Jon Masters <jcm@redhat.com>
Cc: Waiman Long <longman9394@gmail.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Dave Stewart <david.c.stewart@intel.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20181125185004.986890749@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/x86/kernel/cpu/bugs.c |   17 ++++-------------
 1 file changed, 4 insertions(+), 13 deletions(-)

--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -256,15 +256,9 @@ static const struct {
 	{ "auto",		SPECTRE_V2_CMD_AUTO,		  false },
 };
 
-static void __init spec2_print_if_insecure(const char *reason)
+static void __init spec_v2_print_cond(const char *reason, bool secure)
 {
-	if (boot_cpu_has_bug(X86_BUG_SPECTRE_V2))
-		pr_info("%s selected on command line.\n", reason);
-}
-
-static void __init spec2_print_if_secure(const char *reason)
-{
-	if (!boot_cpu_has_bug(X86_BUG_SPECTRE_V2))
+	if (boot_cpu_has_bug(X86_BUG_SPECTRE_V2) != secure)
 		pr_info("%s selected on command line.\n", reason);
 }
 
@@ -307,11 +301,8 @@ static enum spectre_v2_mitigation_cmd __
 		return SPECTRE_V2_CMD_AUTO;
 	}
 
-	if (mitigation_options[i].secure)
-		spec2_print_if_secure(mitigation_options[i].option);
-	else
-		spec2_print_if_insecure(mitigation_options[i].option);
-
+	spec_v2_print_cond(mitigation_options[i].option,
+			   mitigation_options[i].secure);
 	return cmd;
 }
 



^ permalink raw reply	[flat|nested] 166+ messages in thread

* [PATCH 4.14 096/146] x86/speculation: Add command line control for indirect branch speculation
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (94 preceding siblings ...)
  2018-12-04 10:49 ` [PATCH 4.14 095/146] x86/speculation: Unify conditional spectre v2 print functions Greg Kroah-Hartman
@ 2018-12-04 10:49 ` Greg Kroah-Hartman
  2018-12-04 10:49 ` [PATCH 4.14 097/146] x86/speculation: Prepare for per task indirect branch speculation control Greg Kroah-Hartman
                   ` (54 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:49 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Thomas Gleixner, Ingo Molnar,
	Peter Zijlstra, Andy Lutomirski, Linus Torvalds, Jiri Kosina,
	Tom Lendacky, Josh Poimboeuf, Andrea Arcangeli, David Woodhouse,
	Andi Kleen, Dave Hansen, Casey Schaufler, Asit Mallick,
	Arjan van de Ven, Jon Masters, Waiman Long, Dave Stewart,
	Kees Cook

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Thomas Gleixner tglx@linutronix.de

commit fa1202ef224391b6f5b26cdd44cc50495e8fab54 upstream

Add command line control for user space indirect branch speculation
mitigations. The new option is: spectre_v2_user=

The initial options are:

    -  on:   Unconditionally enabled
    - off:   Unconditionally disabled
    -auto:   Kernel selects mitigation (default off for now)

When the spectre_v2= command line argument is either 'on' or 'off' this
implies that the application to application control follows that state even
if a contradicting spectre_v2_user= argument is supplied.

Originally-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Casey Schaufler <casey.schaufler@intel.com>
Cc: Asit Mallick <asit.k.mallick@intel.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Jon Masters <jcm@redhat.com>
Cc: Waiman Long <longman9394@gmail.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Dave Stewart <david.c.stewart@intel.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20181125185005.082720373@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 Documentation/admin-guide/kernel-parameters.txt |   32 +++++
 arch/x86/include/asm/nospec-branch.h            |   10 +
 arch/x86/kernel/cpu/bugs.c                      |  133 ++++++++++++++++++++----
 3 files changed, 156 insertions(+), 19 deletions(-)

--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -3994,9 +3994,13 @@
 
 	spectre_v2=	[X86] Control mitigation of Spectre variant 2
 			(indirect branch speculation) vulnerability.
+			The default operation protects the kernel from
+			user space attacks.
 
-			on   - unconditionally enable
-			off  - unconditionally disable
+			on   - unconditionally enable, implies
+			       spectre_v2_user=on
+			off  - unconditionally disable, implies
+			       spectre_v2_user=off
 			auto - kernel detects whether your CPU model is
 			       vulnerable
 
@@ -4006,6 +4010,12 @@
 			CONFIG_RETPOLINE configuration option, and the
 			compiler with which the kernel was built.
 
+			Selecting 'on' will also enable the mitigation
+			against user space to user space task attacks.
+
+			Selecting 'off' will disable both the kernel and
+			the user space protections.
+
 			Specific mitigations can also be selected manually:
 
 			retpoline	  - replace indirect branches
@@ -4015,6 +4025,24 @@
 			Not specifying this option is equivalent to
 			spectre_v2=auto.
 
+	spectre_v2_user=
+			[X86] Control mitigation of Spectre variant 2
+		        (indirect branch speculation) vulnerability between
+		        user space tasks
+
+			on	- Unconditionally enable mitigations. Is
+				  enforced by spectre_v2=on
+
+			off     - Unconditionally disable mitigations. Is
+				  enforced by spectre_v2=off
+
+			auto    - Kernel selects the mitigation depending on
+				  the available CPU features and vulnerability.
+				  Default is off.
+
+			Not specifying this option is equivalent to
+			spectre_v2_user=auto.
+
 	spec_store_bypass_disable=
 			[HW] Control Speculative Store Bypass (SSB) Disable mitigation
 			(Speculative Store Bypass vulnerability)
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -3,6 +3,8 @@
 #ifndef _ASM_X86_NOSPEC_BRANCH_H_
 #define _ASM_X86_NOSPEC_BRANCH_H_
 
+#include <linux/static_key.h>
+
 #include <asm/alternative.h>
 #include <asm/alternative-asm.h>
 #include <asm/cpufeatures.h>
@@ -226,6 +228,12 @@ enum spectre_v2_mitigation {
 	SPECTRE_V2_IBRS_ENHANCED,
 };
 
+/* The indirect branch speculation control variants */
+enum spectre_v2_user_mitigation {
+	SPECTRE_V2_USER_NONE,
+	SPECTRE_V2_USER_STRICT,
+};
+
 /* The Speculative Store Bypass disable variants */
 enum ssb_mitigation {
 	SPEC_STORE_BYPASS_NONE,
@@ -303,6 +311,8 @@ do {									\
 	preempt_enable();						\
 } while (0)
 
+DECLARE_STATIC_KEY_FALSE(switch_to_cond_stibp);
+
 #endif /* __ASSEMBLY__ */
 
 /*
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -53,6 +53,9 @@ static u64 __ro_after_init x86_spec_ctrl
 u64 __ro_after_init x86_amd_ls_cfg_base;
 u64 __ro_after_init x86_amd_ls_cfg_ssbd_mask;
 
+/* Control conditional STIPB in switch_to() */
+DEFINE_STATIC_KEY_FALSE(switch_to_cond_stibp);
+
 void __init check_bugs(void)
 {
 	identify_boot_cpu();
@@ -198,6 +201,9 @@ static void x86_amd_ssb_disable(void)
 static enum spectre_v2_mitigation spectre_v2_enabled __ro_after_init =
 	SPECTRE_V2_NONE;
 
+static enum spectre_v2_user_mitigation spectre_v2_user __ro_after_init =
+	SPECTRE_V2_USER_NONE;
+
 #ifdef RETPOLINE
 static bool spectre_v2_bad_module;
 
@@ -236,6 +242,104 @@ enum spectre_v2_mitigation_cmd {
 	SPECTRE_V2_CMD_RETPOLINE_AMD,
 };
 
+enum spectre_v2_user_cmd {
+	SPECTRE_V2_USER_CMD_NONE,
+	SPECTRE_V2_USER_CMD_AUTO,
+	SPECTRE_V2_USER_CMD_FORCE,
+};
+
+static const char * const spectre_v2_user_strings[] = {
+	[SPECTRE_V2_USER_NONE]		= "User space: Vulnerable",
+	[SPECTRE_V2_USER_STRICT]	= "User space: Mitigation: STIBP protection",
+};
+
+static const struct {
+	const char			*option;
+	enum spectre_v2_user_cmd	cmd;
+	bool				secure;
+} v2_user_options[] __initdata = {
+	{ "auto",	SPECTRE_V2_USER_CMD_AUTO,	false },
+	{ "off",	SPECTRE_V2_USER_CMD_NONE,	false },
+	{ "on",		SPECTRE_V2_USER_CMD_FORCE,	true  },
+};
+
+static void __init spec_v2_user_print_cond(const char *reason, bool secure)
+{
+	if (boot_cpu_has_bug(X86_BUG_SPECTRE_V2) != secure)
+		pr_info("spectre_v2_user=%s forced on command line.\n", reason);
+}
+
+static enum spectre_v2_user_cmd __init
+spectre_v2_parse_user_cmdline(enum spectre_v2_mitigation_cmd v2_cmd)
+{
+	char arg[20];
+	int ret, i;
+
+	switch (v2_cmd) {
+	case SPECTRE_V2_CMD_NONE:
+		return SPECTRE_V2_USER_CMD_NONE;
+	case SPECTRE_V2_CMD_FORCE:
+		return SPECTRE_V2_USER_CMD_FORCE;
+	default:
+		break;
+	}
+
+	ret = cmdline_find_option(boot_command_line, "spectre_v2_user",
+				  arg, sizeof(arg));
+	if (ret < 0)
+		return SPECTRE_V2_USER_CMD_AUTO;
+
+	for (i = 0; i < ARRAY_SIZE(v2_user_options); i++) {
+		if (match_option(arg, ret, v2_user_options[i].option)) {
+			spec_v2_user_print_cond(v2_user_options[i].option,
+						v2_user_options[i].secure);
+			return v2_user_options[i].cmd;
+		}
+	}
+
+	pr_err("Unknown user space protection option (%s). Switching to AUTO select\n", arg);
+	return SPECTRE_V2_USER_CMD_AUTO;
+}
+
+static void __init
+spectre_v2_user_select_mitigation(enum spectre_v2_mitigation_cmd v2_cmd)
+{
+	enum spectre_v2_user_mitigation mode = SPECTRE_V2_USER_NONE;
+	bool smt_possible = IS_ENABLED(CONFIG_SMP);
+
+	if (!boot_cpu_has(X86_FEATURE_IBPB) && !boot_cpu_has(X86_FEATURE_STIBP))
+		return;
+
+	if (cpu_smt_control == CPU_SMT_FORCE_DISABLED ||
+	    cpu_smt_control == CPU_SMT_NOT_SUPPORTED)
+		smt_possible = false;
+
+	switch (spectre_v2_parse_user_cmdline(v2_cmd)) {
+	case SPECTRE_V2_USER_CMD_AUTO:
+	case SPECTRE_V2_USER_CMD_NONE:
+		goto set_mode;
+	case SPECTRE_V2_USER_CMD_FORCE:
+		mode = SPECTRE_V2_USER_STRICT;
+		break;
+	}
+
+	/* Initialize Indirect Branch Prediction Barrier */
+	if (boot_cpu_has(X86_FEATURE_IBPB)) {
+		setup_force_cpu_cap(X86_FEATURE_USE_IBPB);
+		pr_info("Spectre v2 mitigation: Enabling Indirect Branch Prediction Barrier\n");
+	}
+
+	/* If enhanced IBRS is enabled no STIPB required */
+	if (spectre_v2_enabled == SPECTRE_V2_IBRS_ENHANCED)
+		return;
+
+set_mode:
+	spectre_v2_user = mode;
+	/* Only print the STIBP mode when SMT possible */
+	if (smt_possible)
+		pr_info("%s\n", spectre_v2_user_strings[mode]);
+}
+
 static const char * const spectre_v2_strings[] = {
 	[SPECTRE_V2_NONE]			= "Vulnerable",
 	[SPECTRE_V2_RETPOLINE_GENERIC]		= "Mitigation: Full generic retpoline",
@@ -382,12 +486,6 @@ specv2_set_mode:
 	setup_force_cpu_cap(X86_FEATURE_RSB_CTXSW);
 	pr_info("Spectre v2 / SpectreRSB mitigation: Filling RSB on context switch\n");
 
-	/* Initialize Indirect Branch Prediction Barrier if supported */
-	if (boot_cpu_has(X86_FEATURE_IBPB)) {
-		setup_force_cpu_cap(X86_FEATURE_USE_IBPB);
-		pr_info("Spectre v2 mitigation: Enabling Indirect Branch Prediction Barrier\n");
-	}
-
 	/*
 	 * Retpoline means the kernel is safe because it has no indirect
 	 * branches. Enhanced IBRS protects firmware too, so, enable restricted
@@ -404,23 +502,21 @@ specv2_set_mode:
 		pr_info("Enabling Restricted Speculation for firmware calls\n");
 	}
 
+	/* Set up IBPB and STIBP depending on the general spectre V2 command */
+	spectre_v2_user_select_mitigation(cmd);
+
 	/* Enable STIBP if appropriate */
 	arch_smt_update();
 }
 
 static bool stibp_needed(void)
 {
-	if (spectre_v2_enabled == SPECTRE_V2_NONE)
-		return false;
-
 	/* Enhanced IBRS makes using STIBP unnecessary. */
 	if (spectre_v2_enabled == SPECTRE_V2_IBRS_ENHANCED)
 		return false;
 
-	if (!boot_cpu_has(X86_FEATURE_STIBP))
-		return false;
-
-	return true;
+	/* Check for strict user mitigation mode */
+	return spectre_v2_user == SPECTRE_V2_USER_STRICT;
 }
 
 static void update_stibp_msr(void *info)
@@ -841,10 +937,13 @@ static char *stibp_state(void)
 	if (spectre_v2_enabled == SPECTRE_V2_IBRS_ENHANCED)
 		return "";
 
-	if (x86_spec_ctrl_base & SPEC_CTRL_STIBP)
-		return ", STIBP";
-	else
-		return "";
+	switch (spectre_v2_user) {
+	case SPECTRE_V2_USER_NONE:
+		return ", STIBP: disabled";
+	case SPECTRE_V2_USER_STRICT:
+		return ", STIBP: forced";
+	}
+	return "";
 }
 
 static char *ibpb_state(void)



^ permalink raw reply	[flat|nested] 166+ messages in thread

* [PATCH 4.14 097/146] x86/speculation: Prepare for per task indirect branch speculation control
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (95 preceding siblings ...)
  2018-12-04 10:49 ` [PATCH 4.14 096/146] x86/speculation: Add command line control for indirect branch speculation Greg Kroah-Hartman
@ 2018-12-04 10:49 ` Greg Kroah-Hartman
  2018-12-04 10:49 ` [PATCH 4.14 098/146] x86/process: Consolidate and simplify switch_to_xtra() code Greg Kroah-Hartman
                   ` (53 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:49 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Tim Chen, Thomas Gleixner,
	Ingo Molnar, Peter Zijlstra, Andy Lutomirski, Linus Torvalds,
	Jiri Kosina, Tom Lendacky, Josh Poimboeuf, Andrea Arcangeli,
	David Woodhouse, Andi Kleen, Dave Hansen, Casey Schaufler,
	Asit Mallick, Arjan van de Ven, Jon Masters, Waiman Long,
	Dave Stewart, Kees Cook

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Tim Chen tim.c.chen@linux.intel.com

commit 5bfbe3ad5840d941b89bcac54b821ba14f50a0ba upstream

To avoid the overhead of STIBP always on, it's necessary to allow per task
control of STIBP.

Add a new task flag TIF_SPEC_IB and evaluate it during context switch if
SMT is active and flag evaluation is enabled by the speculation control
code. Add the conditional evaluation to x86_virt_spec_ctrl() as well so the
guest/host switch works properly.

This has no effect because TIF_SPEC_IB cannot be set yet and the static key
which controls evaluation is off. Preparatory patch for adding the control
code.

[ tglx: Simplify the context switch logic and make the TIF evaluation
  	depend on SMP=y and on the static key controlling the conditional
  	update. Rename it to TIF_SPEC_IB because it controls both STIBP and
  	IBPB ]

Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Casey Schaufler <casey.schaufler@intel.com>
Cc: Asit Mallick <asit.k.mallick@intel.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Jon Masters <jcm@redhat.com>
Cc: Waiman Long <longman9394@gmail.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Dave Stewart <david.c.stewart@intel.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20181125185005.176917199@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/x86/include/asm/msr-index.h   |    5 +++--
 arch/x86/include/asm/spec-ctrl.h   |   12 ++++++++++++
 arch/x86/include/asm/thread_info.h |    5 ++++-
 arch/x86/kernel/cpu/bugs.c         |    4 ++++
 arch/x86/kernel/process.c          |   20 ++++++++++++++++++--
 5 files changed, 41 insertions(+), 5 deletions(-)

--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -41,9 +41,10 @@
 
 #define MSR_IA32_SPEC_CTRL		0x00000048 /* Speculation Control */
 #define SPEC_CTRL_IBRS			(1 << 0)   /* Indirect Branch Restricted Speculation */
-#define SPEC_CTRL_STIBP			(1 << 1)   /* Single Thread Indirect Branch Predictors */
+#define SPEC_CTRL_STIBP_SHIFT		1	   /* Single Thread Indirect Branch Predictor (STIBP) bit */
+#define SPEC_CTRL_STIBP			(1 << SPEC_CTRL_STIBP_SHIFT)	/* STIBP mask */
 #define SPEC_CTRL_SSBD_SHIFT		2	   /* Speculative Store Bypass Disable bit */
-#define SPEC_CTRL_SSBD			(1 << SPEC_CTRL_SSBD_SHIFT)   /* Speculative Store Bypass Disable */
+#define SPEC_CTRL_SSBD			(1 << SPEC_CTRL_SSBD_SHIFT)	/* Speculative Store Bypass Disable */
 
 #define MSR_IA32_PRED_CMD		0x00000049 /* Prediction Command */
 #define PRED_CMD_IBPB			(1 << 0)   /* Indirect Branch Prediction Barrier */
--- a/arch/x86/include/asm/spec-ctrl.h
+++ b/arch/x86/include/asm/spec-ctrl.h
@@ -53,12 +53,24 @@ static inline u64 ssbd_tif_to_spec_ctrl(
 	return (tifn & _TIF_SSBD) >> (TIF_SSBD - SPEC_CTRL_SSBD_SHIFT);
 }
 
+static inline u64 stibp_tif_to_spec_ctrl(u64 tifn)
+{
+	BUILD_BUG_ON(TIF_SPEC_IB < SPEC_CTRL_STIBP_SHIFT);
+	return (tifn & _TIF_SPEC_IB) >> (TIF_SPEC_IB - SPEC_CTRL_STIBP_SHIFT);
+}
+
 static inline unsigned long ssbd_spec_ctrl_to_tif(u64 spec_ctrl)
 {
 	BUILD_BUG_ON(TIF_SSBD < SPEC_CTRL_SSBD_SHIFT);
 	return (spec_ctrl & SPEC_CTRL_SSBD) << (TIF_SSBD - SPEC_CTRL_SSBD_SHIFT);
 }
 
+static inline unsigned long stibp_spec_ctrl_to_tif(u64 spec_ctrl)
+{
+	BUILD_BUG_ON(TIF_SPEC_IB < SPEC_CTRL_STIBP_SHIFT);
+	return (spec_ctrl & SPEC_CTRL_STIBP) << (TIF_SPEC_IB - SPEC_CTRL_STIBP_SHIFT);
+}
+
 static inline u64 ssbd_tif_to_amd_ls_cfg(u64 tifn)
 {
 	return (tifn & _TIF_SSBD) ? x86_amd_ls_cfg_ssbd_mask : 0ULL;
--- a/arch/x86/include/asm/thread_info.h
+++ b/arch/x86/include/asm/thread_info.h
@@ -85,6 +85,7 @@ struct thread_info {
 #define TIF_SYSCALL_EMU		6	/* syscall emulation active */
 #define TIF_SYSCALL_AUDIT	7	/* syscall auditing active */
 #define TIF_SECCOMP		8	/* secure computing */
+#define TIF_SPEC_IB		9	/* Indirect branch speculation mitigation */
 #define TIF_USER_RETURN_NOTIFY	11	/* notify kernel of userspace return */
 #define TIF_UPROBE		12	/* breakpointed or singlestepping */
 #define TIF_PATCH_PENDING	13	/* pending live patching update */
@@ -112,6 +113,7 @@ struct thread_info {
 #define _TIF_SYSCALL_EMU	(1 << TIF_SYSCALL_EMU)
 #define _TIF_SYSCALL_AUDIT	(1 << TIF_SYSCALL_AUDIT)
 #define _TIF_SECCOMP		(1 << TIF_SECCOMP)
+#define _TIF_SPEC_IB		(1 << TIF_SPEC_IB)
 #define _TIF_USER_RETURN_NOTIFY	(1 << TIF_USER_RETURN_NOTIFY)
 #define _TIF_UPROBE		(1 << TIF_UPROBE)
 #define _TIF_PATCH_PENDING	(1 << TIF_PATCH_PENDING)
@@ -148,7 +150,8 @@ struct thread_info {
 
 /* flags to check in __switch_to() */
 #define _TIF_WORK_CTXSW							\
-	(_TIF_IO_BITMAP|_TIF_NOCPUID|_TIF_NOTSC|_TIF_BLOCKSTEP|_TIF_SSBD)
+	(_TIF_IO_BITMAP|_TIF_NOCPUID|_TIF_NOTSC|_TIF_BLOCKSTEP|		\
+	 _TIF_SSBD|_TIF_SPEC_IB)
 
 #define _TIF_WORK_CTXSW_PREV (_TIF_WORK_CTXSW|_TIF_USER_RETURN_NOTIFY)
 #define _TIF_WORK_CTXSW_NEXT (_TIF_WORK_CTXSW)
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -147,6 +147,10 @@ x86_virt_spec_ctrl(u64 guest_spec_ctrl,
 		    static_cpu_has(X86_FEATURE_AMD_SSBD))
 			hostval |= ssbd_tif_to_spec_ctrl(ti->flags);
 
+		/* Conditional STIBP enabled? */
+		if (static_branch_unlikely(&switch_to_cond_stibp))
+			hostval |= stibp_tif_to_spec_ctrl(ti->flags);
+
 		if (hostval != guestval) {
 			msrval = setguest ? guestval : hostval;
 			wrmsrl(MSR_IA32_SPEC_CTRL, msrval);
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -407,11 +407,17 @@ static __always_inline void amd_set_ssb_
 static __always_inline void __speculation_ctrl_update(unsigned long tifp,
 						      unsigned long tifn)
 {
+	unsigned long tif_diff = tifp ^ tifn;
 	u64 msr = x86_spec_ctrl_base;
 	bool updmsr = false;
 
-	/* If TIF_SSBD is different, select the proper mitigation method */
-	if ((tifp ^ tifn) & _TIF_SSBD) {
+	/*
+	 * If TIF_SSBD is different, select the proper mitigation
+	 * method. Note that if SSBD mitigation is disabled or permanentely
+	 * enabled this branch can't be taken because nothing can set
+	 * TIF_SSBD.
+	 */
+	if (tif_diff & _TIF_SSBD) {
 		if (static_cpu_has(X86_FEATURE_VIRT_SSBD)) {
 			amd_set_ssb_virt_state(tifn);
 		} else if (static_cpu_has(X86_FEATURE_LS_CFG_SSBD)) {
@@ -423,6 +429,16 @@ static __always_inline void __speculatio
 		}
 	}
 
+	/*
+	 * Only evaluate TIF_SPEC_IB if conditional STIBP is enabled,
+	 * otherwise avoid the MSR write.
+	 */
+	if (IS_ENABLED(CONFIG_SMP) &&
+	    static_branch_unlikely(&switch_to_cond_stibp)) {
+		updmsr |= !!(tif_diff & _TIF_SPEC_IB);
+		msr |= stibp_tif_to_spec_ctrl(tifn);
+	}
+
 	if (updmsr)
 		wrmsrl(MSR_IA32_SPEC_CTRL, msr);
 }



^ permalink raw reply	[flat|nested] 166+ messages in thread

* [PATCH 4.14 098/146] x86/process: Consolidate and simplify switch_to_xtra() code
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (96 preceding siblings ...)
  2018-12-04 10:49 ` [PATCH 4.14 097/146] x86/speculation: Prepare for per task indirect branch speculation control Greg Kroah-Hartman
@ 2018-12-04 10:49 ` Greg Kroah-Hartman
  2018-12-04 11:14   ` Thomas Gleixner
  2018-12-04 10:49 ` [PATCH 4.14 099/146] x86/speculation: Avoid __switch_to_xtra() calls Greg Kroah-Hartman
                   ` (52 subsequent siblings)
  150 siblings, 1 reply; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:49 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Thomas Gleixner, Ingo Molnar,
	Peter Zijlstra, Andy Lutomirski, Linus Torvalds, Jiri Kosina,
	Tom Lendacky, Josh Poimboeuf, Andrea Arcangeli, David Woodhouse,
	Tim Chen, Andi Kleen, Dave Hansen, Casey Schaufler, Asit Mallick,
	Arjan van de Ven, Jon Masters, Waiman Long, Dave Stewart,
	Kees Cook

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Thomas Gleixner tglx@linutronix.de

commit ff16701a29cba3aafa0bd1656d766813b2d0a811 upstream

Move the conditional invocation of __switch_to_xtra() into an inline
function so the logic can be shared between 32 and 64 bit.

Remove the handthrough of the TSS pointer and retrieve the pointer directly
in the bitmap handling function. Use this_cpu_ptr() instead of the
per_cpu() indirection.

This is a preparatory change so integration of conditional indirect branch
speculation optimization happens only in one place.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Casey Schaufler <casey.schaufler@intel.com>
Cc: Asit Mallick <asit.k.mallick@intel.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Jon Masters <jcm@redhat.com>
Cc: Waiman Long <longman9394@gmail.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Dave Stewart <david.c.stewart@intel.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20181125185005.280855518@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/x86/include/asm/switch_to.h |    3 ---
 arch/x86/kernel/process.c        |   12 +++++++-----
 arch/x86/kernel/process.h        |   24 ++++++++++++++++++++++++
 arch/x86/kernel/process_32.c     |    8 +-------
 arch/x86/kernel/process_64.c     |   10 +++-------
 5 files changed, 35 insertions(+), 22 deletions(-)

--- a/arch/x86/include/asm/switch_to.h
+++ b/arch/x86/include/asm/switch_to.h
@@ -11,9 +11,6 @@ struct task_struct *__switch_to_asm(stru
 
 __visible struct task_struct *__switch_to(struct task_struct *prev,
 					  struct task_struct *next);
-struct tss_struct;
-void __switch_to_xtra(struct task_struct *prev_p, struct task_struct *next_p,
-		      struct tss_struct *tss);
 
 /* This runs runs on the previous thread's stack. */
 static inline void prepare_switch_to(struct task_struct *prev,
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -41,6 +41,8 @@
 #include <asm/prctl.h>
 #include <asm/spec-ctrl.h>
 
+#include "process.h"
+
 /*
  * per-CPU TSS segments. Threads are completely 'soft' on Linux,
  * no more per-task TSS's. The TSS size is kept cacheline-aligned
@@ -255,11 +257,12 @@ void arch_setup_new_exec(void)
 		enable_cpuid();
 }
 
-static inline void switch_to_bitmap(struct tss_struct *tss,
-				    struct thread_struct *prev,
+static inline void switch_to_bitmap(struct thread_struct *prev,
 				    struct thread_struct *next,
 				    unsigned long tifp, unsigned long tifn)
 {
+	struct tss_struct *tss = this_cpu_ptr(&cpu_tss_rw);
+
 	if (tifn & _TIF_IO_BITMAP) {
 		/*
 		 * Copy the relevant range of the IO bitmap.
@@ -451,8 +454,7 @@ void speculation_ctrl_update(unsigned lo
 	preempt_enable();
 }
 
-void __switch_to_xtra(struct task_struct *prev_p, struct task_struct *next_p,
-		      struct tss_struct *tss)
+void __switch_to_xtra(struct task_struct *prev_p, struct task_struct *next_p)
 {
 	struct thread_struct *prev, *next;
 	unsigned long tifp, tifn;
@@ -462,7 +464,7 @@ void __switch_to_xtra(struct task_struct
 
 	tifn = READ_ONCE(task_thread_info(next_p)->flags);
 	tifp = READ_ONCE(task_thread_info(prev_p)->flags);
-	switch_to_bitmap(tss, prev, next, tifp, tifn);
+	switch_to_bitmap(prev, next, tifp, tifn);
 
 	propagate_user_return_notify(prev_p, next_p);
 
--- /dev/null
+++ b/arch/x86/kernel/process.h
@@ -0,0 +1,24 @@
+// SPDX-License-Identifier: GPL-2.0
+//
+// Code shared between 32 and 64 bit
+
+void __switch_to_xtra(struct task_struct *prev_p, struct task_struct *next_p);
+
+/*
+ * This needs to be inline to optimize for the common case where no extra
+ * work needs to be done.
+ */
+static inline void switch_to_extra(struct task_struct *prev,
+				   struct task_struct *next)
+{
+	unsigned long next_tif = task_thread_info(next)->flags;
+	unsigned long prev_tif = task_thread_info(prev)->flags;
+
+	/*
+	 * __switch_to_xtra() handles debug registers, i/o bitmaps,
+	 * speculation mitigations etc.
+	 */
+	if (unlikely(next_tif & _TIF_WORK_CTXSW_NEXT ||
+		     prev_tif & _TIF_WORK_CTXSW_PREV))
+		__switch_to_xtra(prev, next);
+}
--- a/arch/x86/kernel/process_32.c
+++ b/arch/x86/kernel/process_32.c
@@ -234,7 +234,6 @@ __switch_to(struct task_struct *prev_p,
 	struct fpu *prev_fpu = &prev->fpu;
 	struct fpu *next_fpu = &next->fpu;
 	int cpu = smp_processor_id();
-	struct tss_struct *tss = &per_cpu(cpu_tss_rw, cpu);
 
 	/* never put a printk in __switch_to... printk() calls wake_up*() indirectly */
 
@@ -266,12 +265,7 @@ __switch_to(struct task_struct *prev_p,
 	if (get_kernel_rpl() && unlikely(prev->iopl != next->iopl))
 		set_iopl_mask(next->iopl);
 
-	/*
-	 * Now maybe handle debug registers and/or IO bitmaps
-	 */
-	if (unlikely(task_thread_info(prev_p)->flags & _TIF_WORK_CTXSW_PREV ||
-		     task_thread_info(next_p)->flags & _TIF_WORK_CTXSW_NEXT))
-		__switch_to_xtra(prev_p, next_p, tss);
+	switch_to_extra(prev_p, next_p);
 
 	/*
 	 * Leave lazy mode, flushing any hypercalls made here.
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -59,6 +59,8 @@
 #include <asm/unistd_32_ia32.h>
 #endif
 
+#include "process.h"
+
 __visible DEFINE_PER_CPU(unsigned long, rsp_scratch);
 
 /* Prints also some state that isn't saved in the pt_regs */
@@ -400,7 +402,6 @@ __switch_to(struct task_struct *prev_p,
 	struct fpu *prev_fpu = &prev->fpu;
 	struct fpu *next_fpu = &next->fpu;
 	int cpu = smp_processor_id();
-	struct tss_struct *tss = &per_cpu(cpu_tss_rw, cpu);
 
 	WARN_ON_ONCE(IS_ENABLED(CONFIG_DEBUG_ENTRY) &&
 		     this_cpu_read(irq_count) != -1);
@@ -467,12 +468,7 @@ __switch_to(struct task_struct *prev_p,
 	/* Reload sp0. */
 	update_sp0(next_p);
 
-	/*
-	 * Now maybe reload the debug registers and handle I/O bitmaps
-	 */
-	if (unlikely(task_thread_info(next_p)->flags & _TIF_WORK_CTXSW_NEXT ||
-		     task_thread_info(prev_p)->flags & _TIF_WORK_CTXSW_PREV))
-		__switch_to_xtra(prev_p, next_p, tss);
+	__switch_to_xtra(prev_p, next_p);
 
 #ifdef CONFIG_XEN_PV
 	/*



^ permalink raw reply	[flat|nested] 166+ messages in thread

* [PATCH 4.14 099/146] x86/speculation: Avoid __switch_to_xtra() calls
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (97 preceding siblings ...)
  2018-12-04 10:49 ` [PATCH 4.14 098/146] x86/process: Consolidate and simplify switch_to_xtra() code Greg Kroah-Hartman
@ 2018-12-04 10:49 ` Greg Kroah-Hartman
  2018-12-04 10:49 ` [PATCH 4.14 100/146] x86/speculation: Prepare for conditional IBPB in switch_mm() Greg Kroah-Hartman
                   ` (51 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:49 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Thomas Gleixner, Ingo Molnar,
	Peter Zijlstra, Andy Lutomirski, Linus Torvalds, Jiri Kosina,
	Tom Lendacky, Josh Poimboeuf, Andrea Arcangeli, David Woodhouse,
	Tim Chen, Andi Kleen, Dave Hansen, Casey Schaufler, Asit Mallick,
	Arjan van de Ven, Jon Masters, Waiman Long, Dave Stewart,
	Kees Cook

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Thomas Gleixner tglx@linutronix.de

commit 5635d99953f04b550738f6f4c1c532667c3fd872 upstream

The TIF_SPEC_IB bit does not need to be evaluated in the decision to invoke
__switch_to_xtra() when:

 - CONFIG_SMP is disabled

 - The conditional STIPB mode is disabled

The TIF_SPEC_IB bit still controls IBPB in both cases so the TIF work mask
checks might invoke __switch_to_xtra() for nothing if TIF_SPEC_IB is the
only set bit in the work masks.

Optimize it out by masking the bit at compile time for CONFIG_SMP=n and at
run time when the static key controlling the conditional STIBP mode is
disabled.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Casey Schaufler <casey.schaufler@intel.com>
Cc: Asit Mallick <asit.k.mallick@intel.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Jon Masters <jcm@redhat.com>
Cc: Waiman Long <longman9394@gmail.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Dave Stewart <david.c.stewart@intel.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20181125185005.374062201@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/x86/include/asm/thread_info.h |   13 +++++++++++--
 arch/x86/kernel/process.h          |   15 +++++++++++++++
 2 files changed, 26 insertions(+), 2 deletions(-)

--- a/arch/x86/include/asm/thread_info.h
+++ b/arch/x86/include/asm/thread_info.h
@@ -149,9 +149,18 @@ struct thread_info {
 	 _TIF_FSCHECK)
 
 /* flags to check in __switch_to() */
-#define _TIF_WORK_CTXSW							\
+#define _TIF_WORK_CTXSW_BASE						\
 	(_TIF_IO_BITMAP|_TIF_NOCPUID|_TIF_NOTSC|_TIF_BLOCKSTEP|		\
-	 _TIF_SSBD|_TIF_SPEC_IB)
+	 _TIF_SSBD)
+
+/*
+ * Avoid calls to __switch_to_xtra() on UP as STIBP is not evaluated.
+ */
+#ifdef CONFIG_SMP
+# define _TIF_WORK_CTXSW	(_TIF_WORK_CTXSW_BASE | _TIF_SPEC_IB)
+#else
+# define _TIF_WORK_CTXSW	(_TIF_WORK_CTXSW_BASE)
+#endif
 
 #define _TIF_WORK_CTXSW_PREV (_TIF_WORK_CTXSW|_TIF_USER_RETURN_NOTIFY)
 #define _TIF_WORK_CTXSW_NEXT (_TIF_WORK_CTXSW)
--- a/arch/x86/kernel/process.h
+++ b/arch/x86/kernel/process.h
@@ -2,6 +2,8 @@
 //
 // Code shared between 32 and 64 bit
 
+#include <asm/spec-ctrl.h>
+
 void __switch_to_xtra(struct task_struct *prev_p, struct task_struct *next_p);
 
 /*
@@ -14,6 +16,19 @@ static inline void switch_to_extra(struc
 	unsigned long next_tif = task_thread_info(next)->flags;
 	unsigned long prev_tif = task_thread_info(prev)->flags;
 
+	if (IS_ENABLED(CONFIG_SMP)) {
+		/*
+		 * Avoid __switch_to_xtra() invocation when conditional
+		 * STIPB is disabled and the only different bit is
+		 * TIF_SPEC_IB. For CONFIG_SMP=n TIF_SPEC_IB is not
+		 * in the TIF_WORK_CTXSW masks.
+		 */
+		if (!static_branch_likely(&switch_to_cond_stibp)) {
+			prev_tif &= ~_TIF_SPEC_IB;
+			next_tif &= ~_TIF_SPEC_IB;
+		}
+	}
+
 	/*
 	 * __switch_to_xtra() handles debug registers, i/o bitmaps,
 	 * speculation mitigations etc.



^ permalink raw reply	[flat|nested] 166+ messages in thread

* [PATCH 4.14 100/146] x86/speculation: Prepare for conditional IBPB in switch_mm()
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (98 preceding siblings ...)
  2018-12-04 10:49 ` [PATCH 4.14 099/146] x86/speculation: Avoid __switch_to_xtra() calls Greg Kroah-Hartman
@ 2018-12-04 10:49 ` Greg Kroah-Hartman
  2018-12-04 10:49 ` [PATCH 4.14 101/146] ptrace: Remove unused ptrace_may_access_sched() and MODE_IBRS Greg Kroah-Hartman
                   ` (50 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:49 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Thomas Gleixner, Ingo Molnar,
	Peter Zijlstra, Andy Lutomirski, Linus Torvalds, Jiri Kosina,
	Tom Lendacky, Josh Poimboeuf, Andrea Arcangeli, David Woodhouse,
	Tim Chen, Andi Kleen, Dave Hansen, Casey Schaufler, Asit Mallick,
	Arjan van de Ven, Jon Masters, Waiman Long, Dave Stewart,
	Kees Cook

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Thomas Gleixner tglx@linutronix.de

commit 4c71a2b6fd7e42814aa68a6dec88abf3b42ea573 upstream

The IBPB speculation barrier is issued from switch_mm() when the kernel
switches to a user space task with a different mm than the user space task
which ran last on the same CPU.

An additional optimization is to avoid IBPB when the incoming task can be
ptraced by the outgoing task. This optimization only works when switching
directly between two user space tasks. When switching from a kernel task to
a user space task the optimization fails because the previous task cannot
be accessed anymore. So for quite some scenarios the optimization is just
adding overhead.

The upcoming conditional IBPB support will issue IBPB only for user space
tasks which have the TIF_SPEC_IB bit set. This requires to handle the
following cases:

  1) Switch from a user space task (potential attacker) which has
     TIF_SPEC_IB set to a user space task (potential victim) which has
     TIF_SPEC_IB not set.

  2) Switch from a user space task (potential attacker) which has
     TIF_SPEC_IB not set to a user space task (potential victim) which has
     TIF_SPEC_IB set.

This needs to be optimized for the case where the IBPB can be avoided when
only kernel threads ran in between user space tasks which belong to the
same process.

The current check whether two tasks belong to the same context is using the
tasks context id. While correct, it's simpler to use the mm pointer because
it allows to mangle the TIF_SPEC_IB bit into it. The context id based
mechanism requires extra storage, which creates worse code.

When a task is scheduled out its TIF_SPEC_IB bit is mangled as bit 0 into
the per CPU storage which is used to track the last user space mm which was
running on a CPU. This bit can be used together with the TIF_SPEC_IB bit of
the incoming task to make the decision whether IBPB needs to be issued or
not to cover the two cases above.

As conditional IBPB is going to be the default, remove the dubious ptrace
check for the IBPB always case and simply issue IBPB always when the
process changes.

Move the storage to a different place in the struct as the original one
created a hole.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Casey Schaufler <casey.schaufler@intel.com>
Cc: Asit Mallick <asit.k.mallick@intel.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Jon Masters <jcm@redhat.com>
Cc: Waiman Long <longman9394@gmail.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Dave Stewart <david.c.stewart@intel.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20181125185005.466447057@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/x86/include/asm/nospec-branch.h |    2 
 arch/x86/include/asm/tlbflush.h      |    8 +-
 arch/x86/kernel/cpu/bugs.c           |   29 +++++++-
 arch/x86/mm/tlb.c                    |  114 ++++++++++++++++++++++++++---------
 4 files changed, 118 insertions(+), 35 deletions(-)

--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -312,6 +312,8 @@ do {									\
 } while (0)
 
 DECLARE_STATIC_KEY_FALSE(switch_to_cond_stibp);
+DECLARE_STATIC_KEY_FALSE(switch_mm_cond_ibpb);
+DECLARE_STATIC_KEY_FALSE(switch_mm_always_ibpb);
 
 #endif /* __ASSEMBLY__ */
 
--- a/arch/x86/include/asm/tlbflush.h
+++ b/arch/x86/include/asm/tlbflush.h
@@ -185,10 +185,14 @@ struct tlb_state {
 
 #define LOADED_MM_SWITCHING ((struct mm_struct *)1)
 
+	/* Last user mm for optimizing IBPB */
+	union {
+		struct mm_struct	*last_user_mm;
+		unsigned long		last_user_mm_ibpb;
+	};
+
 	u16 loaded_mm_asid;
 	u16 next_asid;
-	/* last user mm's ctx id */
-	u64 last_ctx_id;
 
 	/*
 	 * We can be in one of several states:
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -55,6 +55,10 @@ u64 __ro_after_init x86_amd_ls_cfg_ssbd_
 
 /* Control conditional STIPB in switch_to() */
 DEFINE_STATIC_KEY_FALSE(switch_to_cond_stibp);
+/* Control conditional IBPB in switch_mm() */
+DEFINE_STATIC_KEY_FALSE(switch_mm_cond_ibpb);
+/* Control unconditional IBPB in switch_mm() */
+DEFINE_STATIC_KEY_FALSE(switch_mm_always_ibpb);
 
 void __init check_bugs(void)
 {
@@ -330,7 +334,17 @@ spectre_v2_user_select_mitigation(enum s
 	/* Initialize Indirect Branch Prediction Barrier */
 	if (boot_cpu_has(X86_FEATURE_IBPB)) {
 		setup_force_cpu_cap(X86_FEATURE_USE_IBPB);
-		pr_info("Spectre v2 mitigation: Enabling Indirect Branch Prediction Barrier\n");
+
+		switch (mode) {
+		case SPECTRE_V2_USER_STRICT:
+			static_branch_enable(&switch_mm_always_ibpb);
+			break;
+		default:
+			break;
+		}
+
+		pr_info("mitigation: Enabling %s Indirect Branch Prediction Barrier\n",
+			mode == SPECTRE_V2_USER_STRICT ? "always-on" : "conditional");
 	}
 
 	/* If enhanced IBRS is enabled no STIPB required */
@@ -952,10 +966,15 @@ static char *stibp_state(void)
 
 static char *ibpb_state(void)
 {
-	if (boot_cpu_has(X86_FEATURE_USE_IBPB))
-		return ", IBPB";
-	else
-		return "";
+	if (boot_cpu_has(X86_FEATURE_IBPB)) {
+		switch (spectre_v2_user) {
+		case SPECTRE_V2_USER_NONE:
+			return ", IBPB: disabled";
+		case SPECTRE_V2_USER_STRICT:
+			return ", IBPB: always-on";
+		}
+	}
+	return "";
 }
 
 static ssize_t cpu_show_common(struct device *dev, struct device_attribute *attr,
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -7,7 +7,6 @@
 #include <linux/export.h>
 #include <linux/cpu.h>
 #include <linux/debugfs.h>
-#include <linux/ptrace.h>
 
 #include <asm/tlbflush.h>
 #include <asm/mmu_context.h>
@@ -31,6 +30,12 @@
  */
 
 /*
+ * Use bit 0 to mangle the TIF_SPEC_IB state into the mm pointer which is
+ * stored in cpu_tlb_state.last_user_mm_ibpb.
+ */
+#define LAST_USER_MM_IBPB	0x1UL
+
+/*
  * We get here when we do something requiring a TLB invalidation
  * but could not go invalidate all of the contexts.  We do the
  * necessary invalidation by clearing out the 'ctx_id' which
@@ -181,17 +186,87 @@ static void sync_current_stack_to_mm(str
 	}
 }
 
-static bool ibpb_needed(struct task_struct *tsk, u64 last_ctx_id)
+static inline unsigned long mm_mangle_tif_spec_ib(struct task_struct *next)
+{
+	unsigned long next_tif = task_thread_info(next)->flags;
+	unsigned long ibpb = (next_tif >> TIF_SPEC_IB) & LAST_USER_MM_IBPB;
+
+	return (unsigned long)next->mm | ibpb;
+}
+
+static void cond_ibpb(struct task_struct *next)
 {
+	if (!next || !next->mm)
+		return;
+
 	/*
-	 * Check if the current (previous) task has access to the memory
-	 * of the @tsk (next) task. If access is denied, make sure to
-	 * issue a IBPB to stop user->user Spectre-v2 attacks.
-	 *
-	 * Note: __ptrace_may_access() returns 0 or -ERRNO.
+	 * Both, the conditional and the always IBPB mode use the mm
+	 * pointer to avoid the IBPB when switching between tasks of the
+	 * same process. Using the mm pointer instead of mm->context.ctx_id
+	 * opens a hypothetical hole vs. mm_struct reuse, which is more or
+	 * less impossible to control by an attacker. Aside of that it
+	 * would only affect the first schedule so the theoretically
+	 * exposed data is not really interesting.
 	 */
-	return (tsk && tsk->mm && tsk->mm->context.ctx_id != last_ctx_id &&
-		ptrace_may_access_sched(tsk, PTRACE_MODE_SPEC_IBPB));
+	if (static_branch_likely(&switch_mm_cond_ibpb)) {
+		unsigned long prev_mm, next_mm;
+
+		/*
+		 * This is a bit more complex than the always mode because
+		 * it has to handle two cases:
+		 *
+		 * 1) Switch from a user space task (potential attacker)
+		 *    which has TIF_SPEC_IB set to a user space task
+		 *    (potential victim) which has TIF_SPEC_IB not set.
+		 *
+		 * 2) Switch from a user space task (potential attacker)
+		 *    which has TIF_SPEC_IB not set to a user space task
+		 *    (potential victim) which has TIF_SPEC_IB set.
+		 *
+		 * This could be done by unconditionally issuing IBPB when
+		 * a task which has TIF_SPEC_IB set is either scheduled in
+		 * or out. Though that results in two flushes when:
+		 *
+		 * - the same user space task is scheduled out and later
+		 *   scheduled in again and only a kernel thread ran in
+		 *   between.
+		 *
+		 * - a user space task belonging to the same process is
+		 *   scheduled in after a kernel thread ran in between
+		 *
+		 * - a user space task belonging to the same process is
+		 *   scheduled in immediately.
+		 *
+		 * Optimize this with reasonably small overhead for the
+		 * above cases. Mangle the TIF_SPEC_IB bit into the mm
+		 * pointer of the incoming task which is stored in
+		 * cpu_tlbstate.last_user_mm_ibpb for comparison.
+		 */
+		next_mm = mm_mangle_tif_spec_ib(next);
+		prev_mm = this_cpu_read(cpu_tlbstate.last_user_mm_ibpb);
+
+		/*
+		 * Issue IBPB only if the mm's are different and one or
+		 * both have the IBPB bit set.
+		 */
+		if (next_mm != prev_mm &&
+		    (next_mm | prev_mm) & LAST_USER_MM_IBPB)
+			indirect_branch_prediction_barrier();
+
+		this_cpu_write(cpu_tlbstate.last_user_mm_ibpb, next_mm);
+	}
+
+	if (static_branch_unlikely(&switch_mm_always_ibpb)) {
+		/*
+		 * Only flush when switching to a user space task with a
+		 * different context than the user space task which ran
+		 * last on this CPU.
+		 */
+		if (this_cpu_read(cpu_tlbstate.last_user_mm) != next->mm) {
+			indirect_branch_prediction_barrier();
+			this_cpu_write(cpu_tlbstate.last_user_mm, next->mm);
+		}
+	}
 }
 
 void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next,
@@ -262,22 +337,13 @@ void switch_mm_irqs_off(struct mm_struct
 	} else {
 		u16 new_asid;
 		bool need_flush;
-		u64 last_ctx_id = this_cpu_read(cpu_tlbstate.last_ctx_id);
 
 		/*
 		 * Avoid user/user BTB poisoning by flushing the branch
 		 * predictor when switching between processes. This stops
 		 * one process from doing Spectre-v2 attacks on another.
-		 *
-		 * As an optimization, flush indirect branches only when
-		 * switching into a processes that can't be ptrace by the
-		 * current one (as in such case, attacker has much more
-		 * convenient way how to tamper with the next process than
-		 * branch buffer poisoning).
 		 */
-		if (static_cpu_has(X86_FEATURE_USE_IBPB) &&
-				ibpb_needed(tsk, last_ctx_id))
-			indirect_branch_prediction_barrier();
+		cond_ibpb(tsk);
 
 		if (IS_ENABLED(CONFIG_VMAP_STACK)) {
 			/*
@@ -327,14 +393,6 @@ void switch_mm_irqs_off(struct mm_struct
 			trace_tlb_flush_rcuidle(TLB_FLUSH_ON_TASK_SWITCH, 0);
 		}
 
-		/*
-		 * Record last user mm's context id, so we can avoid
-		 * flushing branch buffer with IBPB if we switch back
-		 * to the same user.
-		 */
-		if (next != &init_mm)
-			this_cpu_write(cpu_tlbstate.last_ctx_id, next->context.ctx_id);
-
 		/* Make sure we write CR3 before loaded_mm. */
 		barrier();
 
@@ -415,7 +473,7 @@ void initialize_tlbstate_and_flush(void)
 	write_cr3(build_cr3(mm->pgd, 0));
 
 	/* Reinitialize tlbstate. */
-	this_cpu_write(cpu_tlbstate.last_ctx_id, mm->context.ctx_id);
+	this_cpu_write(cpu_tlbstate.last_user_mm_ibpb, LAST_USER_MM_IBPB);
 	this_cpu_write(cpu_tlbstate.loaded_mm_asid, 0);
 	this_cpu_write(cpu_tlbstate.next_asid, 1);
 	this_cpu_write(cpu_tlbstate.ctxs[0].ctx_id, mm->context.ctx_id);



^ permalink raw reply	[flat|nested] 166+ messages in thread

* [PATCH 4.14 101/146] ptrace: Remove unused ptrace_may_access_sched() and MODE_IBRS
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (99 preceding siblings ...)
  2018-12-04 10:49 ` [PATCH 4.14 100/146] x86/speculation: Prepare for conditional IBPB in switch_mm() Greg Kroah-Hartman
@ 2018-12-04 10:49 ` Greg Kroah-Hartman
  2018-12-04 10:49 ` [PATCH 4.14 102/146] x86/speculation: Split out TIF update Greg Kroah-Hartman
                   ` (49 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:49 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Thomas Gleixner, Ingo Molnar,
	Peter Zijlstra, Andy Lutomirski, Linus Torvalds, Jiri Kosina,
	Tom Lendacky, Josh Poimboeuf, Andrea Arcangeli, David Woodhouse,
	Tim Chen, Andi Kleen, Dave Hansen, Casey Schaufler, Asit Mallick,
	Arjan van de Ven, Jon Masters, Waiman Long, Dave Stewart,
	Kees Cook

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Thomas Gleixner tglx@linutronix.de

commit 46f7ecb1e7359f183f5bbd1e08b90e10e52164f9 upstream

The IBPB control code in x86 removed the usage. Remove the functionality
which was introduced for this.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Casey Schaufler <casey.schaufler@intel.com>
Cc: Asit Mallick <asit.k.mallick@intel.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Jon Masters <jcm@redhat.com>
Cc: Waiman Long <longman9394@gmail.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Dave Stewart <david.c.stewart@intel.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20181125185005.559149393@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 include/linux/ptrace.h |   17 -----------------
 kernel/ptrace.c        |   10 ----------
 2 files changed, 27 deletions(-)

--- a/include/linux/ptrace.h
+++ b/include/linux/ptrace.h
@@ -64,15 +64,12 @@ extern void exit_ptrace(struct task_stru
 #define PTRACE_MODE_NOAUDIT	0x04
 #define PTRACE_MODE_FSCREDS	0x08
 #define PTRACE_MODE_REALCREDS	0x10
-#define PTRACE_MODE_SCHED	0x20
-#define PTRACE_MODE_IBPB	0x40
 
 /* shorthands for READ/ATTACH and FSCREDS/REALCREDS combinations */
 #define PTRACE_MODE_READ_FSCREDS (PTRACE_MODE_READ | PTRACE_MODE_FSCREDS)
 #define PTRACE_MODE_READ_REALCREDS (PTRACE_MODE_READ | PTRACE_MODE_REALCREDS)
 #define PTRACE_MODE_ATTACH_FSCREDS (PTRACE_MODE_ATTACH | PTRACE_MODE_FSCREDS)
 #define PTRACE_MODE_ATTACH_REALCREDS (PTRACE_MODE_ATTACH | PTRACE_MODE_REALCREDS)
-#define PTRACE_MODE_SPEC_IBPB (PTRACE_MODE_ATTACH_REALCREDS | PTRACE_MODE_IBPB)
 
 /**
  * ptrace_may_access - check whether the caller is permitted to access
@@ -90,20 +87,6 @@ extern void exit_ptrace(struct task_stru
  */
 extern bool ptrace_may_access(struct task_struct *task, unsigned int mode);
 
-/**
- * ptrace_may_access - check whether the caller is permitted to access
- * a target task.
- * @task: target task
- * @mode: selects type of access and caller credentials
- *
- * Returns true on success, false on denial.
- *
- * Similar to ptrace_may_access(). Only to be called from context switch
- * code. Does not call into audit and the regular LSM hooks due to locking
- * constraints.
- */
-extern bool ptrace_may_access_sched(struct task_struct *task, unsigned int mode);
-
 static inline int ptrace_reparented(struct task_struct *child)
 {
 	return !same_thread_group(child->real_parent, child->parent);
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -261,9 +261,6 @@ static int ptrace_check_attach(struct ta
 
 static int ptrace_has_cap(struct user_namespace *ns, unsigned int mode)
 {
-	if (mode & PTRACE_MODE_SCHED)
-		return false;
-
 	if (mode & PTRACE_MODE_NOAUDIT)
 		return has_ns_capability_noaudit(current, ns, CAP_SYS_PTRACE);
 	else
@@ -331,16 +328,9 @@ ok:
 	     !ptrace_has_cap(mm->user_ns, mode)))
 	    return -EPERM;
 
-	if (mode & PTRACE_MODE_SCHED)
-		return 0;
 	return security_ptrace_access_check(task, mode);
 }
 
-bool ptrace_may_access_sched(struct task_struct *task, unsigned int mode)
-{
-	return __ptrace_may_access(task, mode | PTRACE_MODE_SCHED);
-}
-
 bool ptrace_may_access(struct task_struct *task, unsigned int mode)
 {
 	int err;



^ permalink raw reply	[flat|nested] 166+ messages in thread

* [PATCH 4.14 102/146] x86/speculation: Split out TIF update
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (100 preceding siblings ...)
  2018-12-04 10:49 ` [PATCH 4.14 101/146] ptrace: Remove unused ptrace_may_access_sched() and MODE_IBRS Greg Kroah-Hartman
@ 2018-12-04 10:49 ` Greg Kroah-Hartman
  2018-12-04 10:49 ` [PATCH 4.14 103/146] x86/speculation: Prevent stale SPEC_CTRL msr content Greg Kroah-Hartman
                   ` (48 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:49 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Thomas Gleixner, Ingo Molnar,
	Peter Zijlstra, Andy Lutomirski, Linus Torvalds, Jiri Kosina,
	Tom Lendacky, Josh Poimboeuf, Andrea Arcangeli, David Woodhouse,
	Tim Chen, Andi Kleen, Dave Hansen, Casey Schaufler, Asit Mallick,
	Arjan van de Ven, Jon Masters, Waiman Long, Dave Stewart,
	Kees Cook

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Thomas Gleixner tglx@linutronix.de

commit e6da8bb6f9abb2628381904b24163c770e630bac upstream

The update of the TIF_SSBD flag and the conditional speculation control MSR
update is done in the ssb_prctl_set() function directly. The upcoming prctl
support for controlling indirect branch speculation via STIBP needs the
same mechanism.

Split the code out and make it reusable. Reword the comment about updates
for other tasks.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Casey Schaufler <casey.schaufler@intel.com>
Cc: Asit Mallick <asit.k.mallick@intel.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Jon Masters <jcm@redhat.com>
Cc: Waiman Long <longman9394@gmail.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Dave Stewart <david.c.stewart@intel.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20181125185005.652305076@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/x86/kernel/cpu/bugs.c |   35 +++++++++++++++++++++++------------
 1 file changed, 23 insertions(+), 12 deletions(-)

--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -699,10 +699,29 @@ static void ssb_select_mitigation(void)
 #undef pr_fmt
 #define pr_fmt(fmt)     "Speculation prctl: " fmt
 
-static int ssb_prctl_set(struct task_struct *task, unsigned long ctrl)
+static void task_update_spec_tif(struct task_struct *tsk, int tifbit, bool on)
 {
 	bool update;
 
+	if (on)
+		update = !test_and_set_tsk_thread_flag(tsk, tifbit);
+	else
+		update = test_and_clear_tsk_thread_flag(tsk, tifbit);
+
+	/*
+	 * Immediately update the speculation control MSRs for the current
+	 * task, but for a non-current task delay setting the CPU
+	 * mitigation until it is scheduled next.
+	 *
+	 * This can only happen for SECCOMP mitigation. For PRCTL it's
+	 * always the current task.
+	 */
+	if (tsk == current && update)
+		speculation_ctrl_update_current();
+}
+
+static int ssb_prctl_set(struct task_struct *task, unsigned long ctrl)
+{
 	if (ssb_mode != SPEC_STORE_BYPASS_PRCTL &&
 	    ssb_mode != SPEC_STORE_BYPASS_SECCOMP)
 		return -ENXIO;
@@ -713,28 +732,20 @@ static int ssb_prctl_set(struct task_str
 		if (task_spec_ssb_force_disable(task))
 			return -EPERM;
 		task_clear_spec_ssb_disable(task);
-		update = test_and_clear_tsk_thread_flag(task, TIF_SSBD);
+		task_update_spec_tif(task, TIF_SSBD, false);
 		break;
 	case PR_SPEC_DISABLE:
 		task_set_spec_ssb_disable(task);
-		update = !test_and_set_tsk_thread_flag(task, TIF_SSBD);
+		task_update_spec_tif(task, TIF_SSBD, true);
 		break;
 	case PR_SPEC_FORCE_DISABLE:
 		task_set_spec_ssb_disable(task);
 		task_set_spec_ssb_force_disable(task);
-		update = !test_and_set_tsk_thread_flag(task, TIF_SSBD);
+		task_update_spec_tif(task, TIF_SSBD, true);
 		break;
 	default:
 		return -ERANGE;
 	}
-
-	/*
-	 * If being set on non-current task, delay setting the CPU
-	 * mitigation until it is next scheduled.
-	 */
-	if (task == current && update)
-		speculation_ctrl_update_current();
-
 	return 0;
 }
 



^ permalink raw reply	[flat|nested] 166+ messages in thread

* [PATCH 4.14 103/146] x86/speculation: Prevent stale SPEC_CTRL msr content
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (101 preceding siblings ...)
  2018-12-04 10:49 ` [PATCH 4.14 102/146] x86/speculation: Split out TIF update Greg Kroah-Hartman
@ 2018-12-04 10:49 ` Greg Kroah-Hartman
  2018-12-04 10:49 ` [PATCH 4.14 104/146] x86/speculation: Prepare arch_smt_update() for PRCTL mode Greg Kroah-Hartman
                   ` (47 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:49 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Tim Chen, Thomas Gleixner,
	Peter Zijlstra, Andy Lutomirski, Linus Torvalds, Jiri Kosina,
	Tom Lendacky, Josh Poimboeuf, Andrea Arcangeli, David Woodhouse,
	Andi Kleen, Dave Hansen, Casey Schaufler, Asit Mallick,
	Arjan van de Ven, Jon Masters, Waiman Long, Dave Stewart,
	Kees Cook

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Thomas Gleixner tglx@linutronix.de

commit 6d991ba509ebcfcc908e009d1db51972a4f7a064 upstream

The seccomp speculation control operates on all tasks of a process, but
only the current task of a process can update the MSR immediately. For the
other threads the update is deferred to the next context switch.

This creates the following situation with Process A and B:

Process A task 2 and Process B task 1 are pinned on CPU1. Process A task 2
does not have the speculation control TIF bit set. Process B task 1 has the
speculation control TIF bit set.

CPU0					CPU1
					MSR bit is set
					ProcB.T1 schedules out
					ProcA.T2 schedules in
					MSR bit is cleared
ProcA.T1
  seccomp_update()
  set TIF bit on ProcA.T2
					ProcB.T1 schedules in
					MSR is not updated  <-- FAIL

This happens because the context switch code tries to avoid the MSR update
if the speculation control TIF bits of the incoming and the outgoing task
are the same. In the worst case ProcB.T1 and ProcA.T2 are the only tasks
scheduling back and forth on CPU1, which keeps the MSR stale forever.

In theory this could be remedied by IPIs, but chasing the remote task which
could be migrated is complex and full of races.

The straight forward solution is to avoid the asychronous update of the TIF
bit and defer it to the next context switch. The speculation control state
is stored in task_struct::atomic_flags by the prctl and seccomp updates
already.

Add a new TIF_SPEC_FORCE_UPDATE bit and set this after updating the
atomic_flags. Check the bit on context switch and force a synchronous
update of the speculation control if set. Use the same mechanism for
updating the current task.

Reported-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Casey Schaufler <casey.schaufler@intel.com>
Cc: Asit Mallick <asit.k.mallick@intel.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Jon Masters <jcm@redhat.com>
Cc: Waiman Long <longman9394@gmail.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Dave Stewart <david.c.stewart@intel.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/alpine.DEB.2.21.1811272247140.1875@nanos.tec.linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/x86/include/asm/spec-ctrl.h   |    6 +-----
 arch/x86/include/asm/thread_info.h |    4 +++-
 arch/x86/kernel/cpu/bugs.c         |   18 +++++++-----------
 arch/x86/kernel/process.c          |   30 +++++++++++++++++++++++++++++-
 4 files changed, 40 insertions(+), 18 deletions(-)

--- a/arch/x86/include/asm/spec-ctrl.h
+++ b/arch/x86/include/asm/spec-ctrl.h
@@ -83,10 +83,6 @@ static inline void speculative_store_byp
 #endif
 
 extern void speculation_ctrl_update(unsigned long tif);
-
-static inline void speculation_ctrl_update_current(void)
-{
-	speculation_ctrl_update(current_thread_info()->flags);
-}
+extern void speculation_ctrl_update_current(void);
 
 #endif
--- a/arch/x86/include/asm/thread_info.h
+++ b/arch/x86/include/asm/thread_info.h
@@ -86,6 +86,7 @@ struct thread_info {
 #define TIF_SYSCALL_AUDIT	7	/* syscall auditing active */
 #define TIF_SECCOMP		8	/* secure computing */
 #define TIF_SPEC_IB		9	/* Indirect branch speculation mitigation */
+#define TIF_SPEC_FORCE_UPDATE	10	/* Force speculation MSR update in context switch */
 #define TIF_USER_RETURN_NOTIFY	11	/* notify kernel of userspace return */
 #define TIF_UPROBE		12	/* breakpointed or singlestepping */
 #define TIF_PATCH_PENDING	13	/* pending live patching update */
@@ -114,6 +115,7 @@ struct thread_info {
 #define _TIF_SYSCALL_AUDIT	(1 << TIF_SYSCALL_AUDIT)
 #define _TIF_SECCOMP		(1 << TIF_SECCOMP)
 #define _TIF_SPEC_IB		(1 << TIF_SPEC_IB)
+#define _TIF_SPEC_FORCE_UPDATE	(1 << TIF_SPEC_FORCE_UPDATE)
 #define _TIF_USER_RETURN_NOTIFY	(1 << TIF_USER_RETURN_NOTIFY)
 #define _TIF_UPROBE		(1 << TIF_UPROBE)
 #define _TIF_PATCH_PENDING	(1 << TIF_PATCH_PENDING)
@@ -151,7 +153,7 @@ struct thread_info {
 /* flags to check in __switch_to() */
 #define _TIF_WORK_CTXSW_BASE						\
 	(_TIF_IO_BITMAP|_TIF_NOCPUID|_TIF_NOTSC|_TIF_BLOCKSTEP|		\
-	 _TIF_SSBD)
+	 _TIF_SSBD | _TIF_SPEC_FORCE_UPDATE)
 
 /*
  * Avoid calls to __switch_to_xtra() on UP as STIBP is not evaluated.
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -699,14 +699,10 @@ static void ssb_select_mitigation(void)
 #undef pr_fmt
 #define pr_fmt(fmt)     "Speculation prctl: " fmt
 
-static void task_update_spec_tif(struct task_struct *tsk, int tifbit, bool on)
+static void task_update_spec_tif(struct task_struct *tsk)
 {
-	bool update;
-
-	if (on)
-		update = !test_and_set_tsk_thread_flag(tsk, tifbit);
-	else
-		update = test_and_clear_tsk_thread_flag(tsk, tifbit);
+	/* Force the update of the real TIF bits */
+	set_tsk_thread_flag(tsk, TIF_SPEC_FORCE_UPDATE);
 
 	/*
 	 * Immediately update the speculation control MSRs for the current
@@ -716,7 +712,7 @@ static void task_update_spec_tif(struct
 	 * This can only happen for SECCOMP mitigation. For PRCTL it's
 	 * always the current task.
 	 */
-	if (tsk == current && update)
+	if (tsk == current)
 		speculation_ctrl_update_current();
 }
 
@@ -732,16 +728,16 @@ static int ssb_prctl_set(struct task_str
 		if (task_spec_ssb_force_disable(task))
 			return -EPERM;
 		task_clear_spec_ssb_disable(task);
-		task_update_spec_tif(task, TIF_SSBD, false);
+		task_update_spec_tif(task);
 		break;
 	case PR_SPEC_DISABLE:
 		task_set_spec_ssb_disable(task);
-		task_update_spec_tif(task, TIF_SSBD, true);
+		task_update_spec_tif(task);
 		break;
 	case PR_SPEC_FORCE_DISABLE:
 		task_set_spec_ssb_disable(task);
 		task_set_spec_ssb_force_disable(task);
-		task_update_spec_tif(task, TIF_SSBD, true);
+		task_update_spec_tif(task);
 		break;
 	default:
 		return -ERANGE;
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -446,6 +446,18 @@ static __always_inline void __speculatio
 		wrmsrl(MSR_IA32_SPEC_CTRL, msr);
 }
 
+static unsigned long speculation_ctrl_update_tif(struct task_struct *tsk)
+{
+	if (test_and_clear_tsk_thread_flag(tsk, TIF_SPEC_FORCE_UPDATE)) {
+		if (task_spec_ssb_disable(tsk))
+			set_tsk_thread_flag(tsk, TIF_SSBD);
+		else
+			clear_tsk_thread_flag(tsk, TIF_SSBD);
+	}
+	/* Return the updated threadinfo flags*/
+	return task_thread_info(tsk)->flags;
+}
+
 void speculation_ctrl_update(unsigned long tif)
 {
 	/* Forced update. Make sure all relevant TIF flags are different */
@@ -454,6 +466,14 @@ void speculation_ctrl_update(unsigned lo
 	preempt_enable();
 }
 
+/* Called from seccomp/prctl update */
+void speculation_ctrl_update_current(void)
+{
+	preempt_disable();
+	speculation_ctrl_update(speculation_ctrl_update_tif(current));
+	preempt_enable();
+}
+
 void __switch_to_xtra(struct task_struct *prev_p, struct task_struct *next_p)
 {
 	struct thread_struct *prev, *next;
@@ -485,7 +505,15 @@ void __switch_to_xtra(struct task_struct
 	if ((tifp ^ tifn) & _TIF_NOCPUID)
 		set_cpuid_faulting(!!(tifn & _TIF_NOCPUID));
 
-	__speculation_ctrl_update(tifp, tifn);
+	if (likely(!((tifp | tifn) & _TIF_SPEC_FORCE_UPDATE))) {
+		__speculation_ctrl_update(tifp, tifn);
+	} else {
+		speculation_ctrl_update_tif(prev_p);
+		tifn = speculation_ctrl_update_tif(next_p);
+
+		/* Enforce MSR update to ensure consistent state */
+		__speculation_ctrl_update(~tifn, tifn);
+	}
 }
 
 /*



^ permalink raw reply	[flat|nested] 166+ messages in thread

* [PATCH 4.14 104/146] x86/speculation: Prepare arch_smt_update() for PRCTL mode
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (102 preceding siblings ...)
  2018-12-04 10:49 ` [PATCH 4.14 103/146] x86/speculation: Prevent stale SPEC_CTRL msr content Greg Kroah-Hartman
@ 2018-12-04 10:49 ` Greg Kroah-Hartman
  2018-12-04 10:49 ` [PATCH 4.14 105/146] x86/speculation: Add prctl() control for indirect branch speculation Greg Kroah-Hartman
                   ` (46 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:49 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Thomas Gleixner, Ingo Molnar,
	Peter Zijlstra, Andy Lutomirski, Linus Torvalds, Jiri Kosina,
	Tom Lendacky, Josh Poimboeuf, Andrea Arcangeli, David Woodhouse,
	Tim Chen, Andi Kleen, Dave Hansen, Casey Schaufler, Asit Mallick,
	Arjan van de Ven, Jon Masters, Waiman Long, Dave Stewart,
	Kees Cook

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Thomas Gleixner tglx@linutronix.de

commit 6893a959d7fdebbab5f5aa112c277d5a44435ba1 upstream

The upcoming fine grained per task STIBP control needs to be updated on CPU
hotplug as well.

Split out the code which controls the strict mode so the prctl control code
can be added later. Mark the SMP function call argument __unused while at it.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Casey Schaufler <casey.schaufler@intel.com>
Cc: Asit Mallick <asit.k.mallick@intel.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Jon Masters <jcm@redhat.com>
Cc: Waiman Long <longman9394@gmail.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Dave Stewart <david.c.stewart@intel.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20181125185005.759457117@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/x86/kernel/cpu/bugs.c |   46 ++++++++++++++++++++++++---------------------
 1 file changed, 25 insertions(+), 21 deletions(-)

--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -527,40 +527,44 @@ specv2_set_mode:
 	arch_smt_update();
 }
 
-static bool stibp_needed(void)
+static void update_stibp_msr(void * __unused)
 {
-	/* Enhanced IBRS makes using STIBP unnecessary. */
-	if (spectre_v2_enabled == SPECTRE_V2_IBRS_ENHANCED)
-		return false;
-
-	/* Check for strict user mitigation mode */
-	return spectre_v2_user == SPECTRE_V2_USER_STRICT;
+	wrmsrl(MSR_IA32_SPEC_CTRL, x86_spec_ctrl_base);
 }
 
-static void update_stibp_msr(void *info)
+/* Update x86_spec_ctrl_base in case SMT state changed. */
+static void update_stibp_strict(void)
 {
-	wrmsrl(MSR_IA32_SPEC_CTRL, x86_spec_ctrl_base);
+	u64 mask = x86_spec_ctrl_base & ~SPEC_CTRL_STIBP;
+
+	if (sched_smt_active())
+		mask |= SPEC_CTRL_STIBP;
+
+	if (mask == x86_spec_ctrl_base)
+		return;
+
+	pr_info("Update user space SMT mitigation: STIBP %s\n",
+		mask & SPEC_CTRL_STIBP ? "always-on" : "off");
+	x86_spec_ctrl_base = mask;
+	on_each_cpu(update_stibp_msr, NULL, 1);
 }
 
 void arch_smt_update(void)
 {
-	u64 mask;
-
-	if (!stibp_needed())
+	/* Enhanced IBRS implies STIBP. No update required. */
+	if (spectre_v2_enabled == SPECTRE_V2_IBRS_ENHANCED)
 		return;
 
 	mutex_lock(&spec_ctrl_mutex);
 
-	mask = x86_spec_ctrl_base & ~SPEC_CTRL_STIBP;
-	if (sched_smt_active())
-		mask |= SPEC_CTRL_STIBP;
-
-	if (mask != x86_spec_ctrl_base) {
-		pr_info("Spectre v2 cross-process SMT mitigation: %s STIBP\n",
-			mask & SPEC_CTRL_STIBP ? "Enabling" : "Disabling");
-		x86_spec_ctrl_base = mask;
-		on_each_cpu(update_stibp_msr, NULL, 1);
+	switch (spectre_v2_user) {
+	case SPECTRE_V2_USER_NONE:
+		break;
+	case SPECTRE_V2_USER_STRICT:
+		update_stibp_strict();
+		break;
 	}
+
 	mutex_unlock(&spec_ctrl_mutex);
 }
 



^ permalink raw reply	[flat|nested] 166+ messages in thread

* [PATCH 4.14 105/146] x86/speculation: Add prctl() control for indirect branch speculation
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (103 preceding siblings ...)
  2018-12-04 10:49 ` [PATCH 4.14 104/146] x86/speculation: Prepare arch_smt_update() for PRCTL mode Greg Kroah-Hartman
@ 2018-12-04 10:49 ` Greg Kroah-Hartman
  2018-12-04 10:49 ` [PATCH 4.14 106/146] x86/speculation: Enable prctl mode for spectre_v2_user Greg Kroah-Hartman
                   ` (45 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:49 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Tim Chen, Thomas Gleixner,
	Ingo Molnar, Peter Zijlstra, Andy Lutomirski, Linus Torvalds,
	Jiri Kosina, Tom Lendacky, Josh Poimboeuf, Andrea Arcangeli,
	David Woodhouse, Andi Kleen, Dave Hansen, Casey Schaufler,
	Asit Mallick, Arjan van de Ven, Jon Masters, Waiman Long,
	Dave Stewart, Kees Cook

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Thomas Gleixner tglx@linutronix.de

commit 9137bb27e60e554dab694eafa4cca241fa3a694f upstream

Add the PR_SPEC_INDIRECT_BRANCH option for the PR_GET_SPECULATION_CTRL and
PR_SET_SPECULATION_CTRL prctls to allow fine grained per task control of
indirect branch speculation via STIBP and IBPB.

Invocations:
 Check indirect branch speculation status with
 - prctl(PR_GET_SPECULATION_CTRL, PR_SPEC_INDIRECT_BRANCH, 0, 0, 0);

 Enable indirect branch speculation with
 - prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_INDIRECT_BRANCH, PR_SPEC_ENABLE, 0, 0);

 Disable indirect branch speculation with
 - prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_INDIRECT_BRANCH, PR_SPEC_DISABLE, 0, 0);

 Force disable indirect branch speculation with
 - prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_INDIRECT_BRANCH, PR_SPEC_FORCE_DISABLE, 0, 0);

See Documentation/userspace-api/spec_ctrl.rst.

Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Casey Schaufler <casey.schaufler@intel.com>
Cc: Asit Mallick <asit.k.mallick@intel.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Jon Masters <jcm@redhat.com>
Cc: Waiman Long <longman9394@gmail.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Dave Stewart <david.c.stewart@intel.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20181125185005.866780996@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 Documentation/userspace-api/spec_ctrl.rst |    9 ++++
 arch/x86/include/asm/nospec-branch.h      |    1 
 arch/x86/kernel/cpu/bugs.c                |   67 ++++++++++++++++++++++++++++++
 arch/x86/kernel/process.c                 |    5 ++
 include/linux/sched.h                     |    9 ++++
 include/uapi/linux/prctl.h                |    1 
 6 files changed, 92 insertions(+)

--- a/Documentation/userspace-api/spec_ctrl.rst
+++ b/Documentation/userspace-api/spec_ctrl.rst
@@ -92,3 +92,12 @@ Speculation misfeature controls
    * prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_STORE_BYPASS, PR_SPEC_ENABLE, 0, 0);
    * prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_STORE_BYPASS, PR_SPEC_DISABLE, 0, 0);
    * prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_STORE_BYPASS, PR_SPEC_FORCE_DISABLE, 0, 0);
+
+- PR_SPEC_INDIR_BRANCH: Indirect Branch Speculation in User Processes
+                        (Mitigate Spectre V2 style attacks against user processes)
+
+  Invocations:
+   * prctl(PR_GET_SPECULATION_CTRL, PR_SPEC_INDIRECT_BRANCH, 0, 0, 0);
+   * prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_INDIRECT_BRANCH, PR_SPEC_ENABLE, 0, 0);
+   * prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_INDIRECT_BRANCH, PR_SPEC_DISABLE, 0, 0);
+   * prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_INDIRECT_BRANCH, PR_SPEC_FORCE_DISABLE, 0, 0);
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -232,6 +232,7 @@ enum spectre_v2_mitigation {
 enum spectre_v2_user_mitigation {
 	SPECTRE_V2_USER_NONE,
 	SPECTRE_V2_USER_STRICT,
+	SPECTRE_V2_USER_PRCTL,
 };
 
 /* The Speculative Store Bypass disable variants */
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -563,6 +563,8 @@ void arch_smt_update(void)
 	case SPECTRE_V2_USER_STRICT:
 		update_stibp_strict();
 		break;
+	case SPECTRE_V2_USER_PRCTL:
+		break;
 	}
 
 	mutex_unlock(&spec_ctrl_mutex);
@@ -749,12 +751,50 @@ static int ssb_prctl_set(struct task_str
 	return 0;
 }
 
+static int ib_prctl_set(struct task_struct *task, unsigned long ctrl)
+{
+	switch (ctrl) {
+	case PR_SPEC_ENABLE:
+		if (spectre_v2_user == SPECTRE_V2_USER_NONE)
+			return 0;
+		/*
+		 * Indirect branch speculation is always disabled in strict
+		 * mode.
+		 */
+		if (spectre_v2_user == SPECTRE_V2_USER_STRICT)
+			return -EPERM;
+		task_clear_spec_ib_disable(task);
+		task_update_spec_tif(task);
+		break;
+	case PR_SPEC_DISABLE:
+	case PR_SPEC_FORCE_DISABLE:
+		/*
+		 * Indirect branch speculation is always allowed when
+		 * mitigation is force disabled.
+		 */
+		if (spectre_v2_user == SPECTRE_V2_USER_NONE)
+			return -EPERM;
+		if (spectre_v2_user == SPECTRE_V2_USER_STRICT)
+			return 0;
+		task_set_spec_ib_disable(task);
+		if (ctrl == PR_SPEC_FORCE_DISABLE)
+			task_set_spec_ib_force_disable(task);
+		task_update_spec_tif(task);
+		break;
+	default:
+		return -ERANGE;
+	}
+	return 0;
+}
+
 int arch_prctl_spec_ctrl_set(struct task_struct *task, unsigned long which,
 			     unsigned long ctrl)
 {
 	switch (which) {
 	case PR_SPEC_STORE_BYPASS:
 		return ssb_prctl_set(task, ctrl);
+	case PR_SPEC_INDIRECT_BRANCH:
+		return ib_prctl_set(task, ctrl);
 	default:
 		return -ENODEV;
 	}
@@ -787,11 +827,34 @@ static int ssb_prctl_get(struct task_str
 	}
 }
 
+static int ib_prctl_get(struct task_struct *task)
+{
+	if (!boot_cpu_has_bug(X86_BUG_SPECTRE_V2))
+		return PR_SPEC_NOT_AFFECTED;
+
+	switch (spectre_v2_user) {
+	case SPECTRE_V2_USER_NONE:
+		return PR_SPEC_ENABLE;
+	case SPECTRE_V2_USER_PRCTL:
+		if (task_spec_ib_force_disable(task))
+			return PR_SPEC_PRCTL | PR_SPEC_FORCE_DISABLE;
+		if (task_spec_ib_disable(task))
+			return PR_SPEC_PRCTL | PR_SPEC_DISABLE;
+		return PR_SPEC_PRCTL | PR_SPEC_ENABLE;
+	case SPECTRE_V2_USER_STRICT:
+		return PR_SPEC_DISABLE;
+	default:
+		return PR_SPEC_NOT_AFFECTED;
+	}
+}
+
 int arch_prctl_spec_ctrl_get(struct task_struct *task, unsigned long which)
 {
 	switch (which) {
 	case PR_SPEC_STORE_BYPASS:
 		return ssb_prctl_get(task);
+	case PR_SPEC_INDIRECT_BRANCH:
+		return ib_prctl_get(task);
 	default:
 		return -ENODEV;
 	}
@@ -971,6 +1034,8 @@ static char *stibp_state(void)
 		return ", STIBP: disabled";
 	case SPECTRE_V2_USER_STRICT:
 		return ", STIBP: forced";
+	case SPECTRE_V2_USER_PRCTL:
+		return "";
 	}
 	return "";
 }
@@ -983,6 +1048,8 @@ static char *ibpb_state(void)
 			return ", IBPB: disabled";
 		case SPECTRE_V2_USER_STRICT:
 			return ", IBPB: always-on";
+		case SPECTRE_V2_USER_PRCTL:
+			return "";
 		}
 	}
 	return "";
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -453,6 +453,11 @@ static unsigned long speculation_ctrl_up
 			set_tsk_thread_flag(tsk, TIF_SSBD);
 		else
 			clear_tsk_thread_flag(tsk, TIF_SSBD);
+
+		if (task_spec_ib_disable(tsk))
+			set_tsk_thread_flag(tsk, TIF_SPEC_IB);
+		else
+			clear_tsk_thread_flag(tsk, TIF_SPEC_IB);
 	}
 	/* Return the updated threadinfo flags*/
 	return task_thread_info(tsk)->flags;
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1405,6 +1405,8 @@ static inline bool is_percpu_thread(void
 #define PFA_SPREAD_SLAB			2	/* Spread some slab caches over cpuset */
 #define PFA_SPEC_SSB_DISABLE		3	/* Speculative Store Bypass disabled */
 #define PFA_SPEC_SSB_FORCE_DISABLE	4	/* Speculative Store Bypass force disabled*/
+#define PFA_SPEC_IB_DISABLE		5	/* Indirect branch speculation restricted */
+#define PFA_SPEC_IB_FORCE_DISABLE	6	/* Indirect branch speculation permanently restricted */
 
 #define TASK_PFA_TEST(name, func)					\
 	static inline bool task_##func(struct task_struct *p)		\
@@ -1436,6 +1438,13 @@ TASK_PFA_CLEAR(SPEC_SSB_DISABLE, spec_ss
 TASK_PFA_TEST(SPEC_SSB_FORCE_DISABLE, spec_ssb_force_disable)
 TASK_PFA_SET(SPEC_SSB_FORCE_DISABLE, spec_ssb_force_disable)
 
+TASK_PFA_TEST(SPEC_IB_DISABLE, spec_ib_disable)
+TASK_PFA_SET(SPEC_IB_DISABLE, spec_ib_disable)
+TASK_PFA_CLEAR(SPEC_IB_DISABLE, spec_ib_disable)
+
+TASK_PFA_TEST(SPEC_IB_FORCE_DISABLE, spec_ib_force_disable)
+TASK_PFA_SET(SPEC_IB_FORCE_DISABLE, spec_ib_force_disable)
+
 static inline void
 current_restore_flags(unsigned long orig_flags, unsigned long flags)
 {
--- a/include/uapi/linux/prctl.h
+++ b/include/uapi/linux/prctl.h
@@ -203,6 +203,7 @@ struct prctl_mm_map {
 #define PR_SET_SPECULATION_CTRL		53
 /* Speculation control variants */
 # define PR_SPEC_STORE_BYPASS		0
+# define PR_SPEC_INDIRECT_BRANCH	1
 /* Return and control values for PR_SET/GET_SPECULATION_CTRL */
 # define PR_SPEC_NOT_AFFECTED		0
 # define PR_SPEC_PRCTL			(1UL << 0)



^ permalink raw reply	[flat|nested] 166+ messages in thread

* [PATCH 4.14 106/146] x86/speculation: Enable prctl mode for spectre_v2_user
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (104 preceding siblings ...)
  2018-12-04 10:49 ` [PATCH 4.14 105/146] x86/speculation: Add prctl() control for indirect branch speculation Greg Kroah-Hartman
@ 2018-12-04 10:49 ` Greg Kroah-Hartman
  2018-12-04 10:49 ` [PATCH 4.14 107/146] x86/speculation: Add seccomp Spectre v2 user space protection mode Greg Kroah-Hartman
                   ` (44 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:49 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Thomas Gleixner, Ingo Molnar,
	Peter Zijlstra, Andy Lutomirski, Linus Torvalds, Jiri Kosina,
	Tom Lendacky, Josh Poimboeuf, Andrea Arcangeli, David Woodhouse,
	Tim Chen, Andi Kleen, Dave Hansen, Casey Schaufler, Asit Mallick,
	Arjan van de Ven, Jon Masters, Waiman Long, Dave Stewart,
	Kees Cook

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Thomas Gleixner tglx@linutronix.de

commit 7cc765a67d8e04ef7d772425ca5a2a1e2b894c15 upstream

Now that all prerequisites are in place:

 - Add the prctl command line option

 - Default the 'auto' mode to 'prctl'

 - When SMT state changes, update the static key which controls the
   conditional STIBP evaluation on context switch.

 - At init update the static key which controls the conditional IBPB
   evaluation on context switch.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Casey Schaufler <casey.schaufler@intel.com>
Cc: Asit Mallick <asit.k.mallick@intel.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Jon Masters <jcm@redhat.com>
Cc: Waiman Long <longman9394@gmail.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Dave Stewart <david.c.stewart@intel.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20181125185005.958421388@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 Documentation/admin-guide/kernel-parameters.txt |    7 +++-
 arch/x86/kernel/cpu/bugs.c                      |   41 ++++++++++++++++++------
 2 files changed, 38 insertions(+), 10 deletions(-)

--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -4036,9 +4036,14 @@
 			off     - Unconditionally disable mitigations. Is
 				  enforced by spectre_v2=off
 
+			prctl   - Indirect branch speculation is enabled,
+				  but mitigation can be enabled via prctl
+				  per thread.  The mitigation control state
+				  is inherited on fork.
+
 			auto    - Kernel selects the mitigation depending on
 				  the available CPU features and vulnerability.
-				  Default is off.
+				  Default is prctl.
 
 			Not specifying this option is equivalent to
 			spectre_v2_user=auto.
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -254,11 +254,13 @@ enum spectre_v2_user_cmd {
 	SPECTRE_V2_USER_CMD_NONE,
 	SPECTRE_V2_USER_CMD_AUTO,
 	SPECTRE_V2_USER_CMD_FORCE,
+	SPECTRE_V2_USER_CMD_PRCTL,
 };
 
 static const char * const spectre_v2_user_strings[] = {
 	[SPECTRE_V2_USER_NONE]		= "User space: Vulnerable",
 	[SPECTRE_V2_USER_STRICT]	= "User space: Mitigation: STIBP protection",
+	[SPECTRE_V2_USER_PRCTL]		= "User space: Mitigation: STIBP via prctl",
 };
 
 static const struct {
@@ -269,6 +271,7 @@ static const struct {
 	{ "auto",	SPECTRE_V2_USER_CMD_AUTO,	false },
 	{ "off",	SPECTRE_V2_USER_CMD_NONE,	false },
 	{ "on",		SPECTRE_V2_USER_CMD_FORCE,	true  },
+	{ "prctl",	SPECTRE_V2_USER_CMD_PRCTL,	false },
 };
 
 static void __init spec_v2_user_print_cond(const char *reason, bool secure)
@@ -323,12 +326,15 @@ spectre_v2_user_select_mitigation(enum s
 		smt_possible = false;
 
 	switch (spectre_v2_parse_user_cmdline(v2_cmd)) {
-	case SPECTRE_V2_USER_CMD_AUTO:
 	case SPECTRE_V2_USER_CMD_NONE:
 		goto set_mode;
 	case SPECTRE_V2_USER_CMD_FORCE:
 		mode = SPECTRE_V2_USER_STRICT;
 		break;
+	case SPECTRE_V2_USER_CMD_AUTO:
+	case SPECTRE_V2_USER_CMD_PRCTL:
+		mode = SPECTRE_V2_USER_PRCTL;
+		break;
 	}
 
 	/* Initialize Indirect Branch Prediction Barrier */
@@ -339,6 +345,9 @@ spectre_v2_user_select_mitigation(enum s
 		case SPECTRE_V2_USER_STRICT:
 			static_branch_enable(&switch_mm_always_ibpb);
 			break;
+		case SPECTRE_V2_USER_PRCTL:
+			static_branch_enable(&switch_mm_cond_ibpb);
+			break;
 		default:
 			break;
 		}
@@ -351,6 +360,12 @@ spectre_v2_user_select_mitigation(enum s
 	if (spectre_v2_enabled == SPECTRE_V2_IBRS_ENHANCED)
 		return;
 
+	/*
+	 * If SMT is not possible or STIBP is not available clear the STIPB
+	 * mode.
+	 */
+	if (!smt_possible || !boot_cpu_has(X86_FEATURE_STIBP))
+		mode = SPECTRE_V2_USER_NONE;
 set_mode:
 	spectre_v2_user = mode;
 	/* Only print the STIBP mode when SMT possible */
@@ -549,6 +564,15 @@ static void update_stibp_strict(void)
 	on_each_cpu(update_stibp_msr, NULL, 1);
 }
 
+/* Update the static key controlling the evaluation of TIF_SPEC_IB */
+static void update_indir_branch_cond(void)
+{
+	if (sched_smt_active())
+		static_branch_enable(&switch_to_cond_stibp);
+	else
+		static_branch_disable(&switch_to_cond_stibp);
+}
+
 void arch_smt_update(void)
 {
 	/* Enhanced IBRS implies STIBP. No update required. */
@@ -564,6 +588,7 @@ void arch_smt_update(void)
 		update_stibp_strict();
 		break;
 	case SPECTRE_V2_USER_PRCTL:
+		update_indir_branch_cond();
 		break;
 	}
 
@@ -1035,7 +1060,8 @@ static char *stibp_state(void)
 	case SPECTRE_V2_USER_STRICT:
 		return ", STIBP: forced";
 	case SPECTRE_V2_USER_PRCTL:
-		return "";
+		if (static_key_enabled(&switch_to_cond_stibp))
+			return ", STIBP: conditional";
 	}
 	return "";
 }
@@ -1043,14 +1069,11 @@ static char *stibp_state(void)
 static char *ibpb_state(void)
 {
 	if (boot_cpu_has(X86_FEATURE_IBPB)) {
-		switch (spectre_v2_user) {
-		case SPECTRE_V2_USER_NONE:
-			return ", IBPB: disabled";
-		case SPECTRE_V2_USER_STRICT:
+		if (static_key_enabled(&switch_mm_always_ibpb))
 			return ", IBPB: always-on";
-		case SPECTRE_V2_USER_PRCTL:
-			return "";
-		}
+		if (static_key_enabled(&switch_mm_cond_ibpb))
+			return ", IBPB: conditional";
+		return ", IBPB: disabled";
 	}
 	return "";
 }



^ permalink raw reply	[flat|nested] 166+ messages in thread

* [PATCH 4.14 107/146] x86/speculation: Add seccomp Spectre v2 user space protection mode
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (105 preceding siblings ...)
  2018-12-04 10:49 ` [PATCH 4.14 106/146] x86/speculation: Enable prctl mode for spectre_v2_user Greg Kroah-Hartman
@ 2018-12-04 10:49 ` Greg Kroah-Hartman
  2018-12-04 10:49 ` [PATCH 4.14 108/146] x86/speculation: Provide IBPB always command line options Greg Kroah-Hartman
                   ` (43 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:49 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Jiri Kosina, Thomas Gleixner,
	Ingo Molnar, Peter Zijlstra, Andy Lutomirski, Linus Torvalds,
	Tom Lendacky, Josh Poimboeuf, Andrea Arcangeli, David Woodhouse,
	Tim Chen, Andi Kleen, Dave Hansen, Casey Schaufler, Asit Mallick,
	Arjan van de Ven, Jon Masters, Waiman Long, Dave Stewart,
	Kees Cook

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Thomas Gleixner tglx@linutronix.de

commit 6b3e64c237c072797a9ec918654a60e3a46488e2 upstream

If 'prctl' mode of user space protection from spectre v2 is selected
on the kernel command-line, STIBP and IBPB are applied on tasks which
restrict their indirect branch speculation via prctl.

SECCOMP enables the SSBD mitigation for sandboxed tasks already, so it
makes sense to prevent spectre v2 user space to user space attacks as
well.

The Intel mitigation guide documents how STIPB works:
    
   Setting bit 1 (STIBP) of the IA32_SPEC_CTRL MSR on a logical processor
   prevents the predicted targets of indirect branches on any logical
   processor of that core from being controlled by software that executes
   (or executed previously) on another logical processor of the same core.

Ergo setting STIBP protects the task itself from being attacked from a task
running on a different hyper-thread and protects the tasks running on
different hyper-threads from being attacked.

While the document suggests that the branch predictors are shielded between
the logical processors, the observed performance regressions suggest that
STIBP simply disables the branch predictor more or less completely. Of
course the document wording is vague, but the fact that there is also no
requirement for issuing IBPB when STIBP is used points clearly in that
direction. The kernel still issues IBPB even when STIBP is used until Intel
clarifies the whole mechanism.

IBPB is issued when the task switches out, so malicious sandbox code cannot
mistrain the branch predictor for the next user space task on the same
logical processor.

Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Casey Schaufler <casey.schaufler@intel.com>
Cc: Asit Mallick <asit.k.mallick@intel.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Jon Masters <jcm@redhat.com>
Cc: Waiman Long <longman9394@gmail.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Dave Stewart <david.c.stewart@intel.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20181125185006.051663132@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 Documentation/admin-guide/kernel-parameters.txt |    9 ++++++++-
 arch/x86/include/asm/nospec-branch.h            |    1 +
 arch/x86/kernel/cpu/bugs.c                      |   17 ++++++++++++++++-
 3 files changed, 25 insertions(+), 2 deletions(-)

--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -4041,9 +4041,16 @@
 				  per thread.  The mitigation control state
 				  is inherited on fork.
 
+			seccomp
+				- Same as "prctl" above, but all seccomp
+				  threads will enable the mitigation unless
+				  they explicitly opt out.
+
 			auto    - Kernel selects the mitigation depending on
 				  the available CPU features and vulnerability.
-				  Default is prctl.
+
+			Default mitigation:
+			If CONFIG_SECCOMP=y then "seccomp", otherwise "prctl"
 
 			Not specifying this option is equivalent to
 			spectre_v2_user=auto.
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -233,6 +233,7 @@ enum spectre_v2_user_mitigation {
 	SPECTRE_V2_USER_NONE,
 	SPECTRE_V2_USER_STRICT,
 	SPECTRE_V2_USER_PRCTL,
+	SPECTRE_V2_USER_SECCOMP,
 };
 
 /* The Speculative Store Bypass disable variants */
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -255,12 +255,14 @@ enum spectre_v2_user_cmd {
 	SPECTRE_V2_USER_CMD_AUTO,
 	SPECTRE_V2_USER_CMD_FORCE,
 	SPECTRE_V2_USER_CMD_PRCTL,
+	SPECTRE_V2_USER_CMD_SECCOMP,
 };
 
 static const char * const spectre_v2_user_strings[] = {
 	[SPECTRE_V2_USER_NONE]		= "User space: Vulnerable",
 	[SPECTRE_V2_USER_STRICT]	= "User space: Mitigation: STIBP protection",
 	[SPECTRE_V2_USER_PRCTL]		= "User space: Mitigation: STIBP via prctl",
+	[SPECTRE_V2_USER_SECCOMP]	= "User space: Mitigation: STIBP via seccomp and prctl",
 };
 
 static const struct {
@@ -272,6 +274,7 @@ static const struct {
 	{ "off",	SPECTRE_V2_USER_CMD_NONE,	false },
 	{ "on",		SPECTRE_V2_USER_CMD_FORCE,	true  },
 	{ "prctl",	SPECTRE_V2_USER_CMD_PRCTL,	false },
+	{ "seccomp",	SPECTRE_V2_USER_CMD_SECCOMP,	false },
 };
 
 static void __init spec_v2_user_print_cond(const char *reason, bool secure)
@@ -331,10 +334,16 @@ spectre_v2_user_select_mitigation(enum s
 	case SPECTRE_V2_USER_CMD_FORCE:
 		mode = SPECTRE_V2_USER_STRICT;
 		break;
-	case SPECTRE_V2_USER_CMD_AUTO:
 	case SPECTRE_V2_USER_CMD_PRCTL:
 		mode = SPECTRE_V2_USER_PRCTL;
 		break;
+	case SPECTRE_V2_USER_CMD_AUTO:
+	case SPECTRE_V2_USER_CMD_SECCOMP:
+		if (IS_ENABLED(CONFIG_SECCOMP))
+			mode = SPECTRE_V2_USER_SECCOMP;
+		else
+			mode = SPECTRE_V2_USER_PRCTL;
+		break;
 	}
 
 	/* Initialize Indirect Branch Prediction Barrier */
@@ -346,6 +355,7 @@ spectre_v2_user_select_mitigation(enum s
 			static_branch_enable(&switch_mm_always_ibpb);
 			break;
 		case SPECTRE_V2_USER_PRCTL:
+		case SPECTRE_V2_USER_SECCOMP:
 			static_branch_enable(&switch_mm_cond_ibpb);
 			break;
 		default:
@@ -588,6 +598,7 @@ void arch_smt_update(void)
 		update_stibp_strict();
 		break;
 	case SPECTRE_V2_USER_PRCTL:
+	case SPECTRE_V2_USER_SECCOMP:
 		update_indir_branch_cond();
 		break;
 	}
@@ -830,6 +841,8 @@ void arch_seccomp_spec_mitigate(struct t
 {
 	if (ssb_mode == SPEC_STORE_BYPASS_SECCOMP)
 		ssb_prctl_set(task, PR_SPEC_FORCE_DISABLE);
+	if (spectre_v2_user == SPECTRE_V2_USER_SECCOMP)
+		ib_prctl_set(task, PR_SPEC_FORCE_DISABLE);
 }
 #endif
 
@@ -861,6 +874,7 @@ static int ib_prctl_get(struct task_stru
 	case SPECTRE_V2_USER_NONE:
 		return PR_SPEC_ENABLE;
 	case SPECTRE_V2_USER_PRCTL:
+	case SPECTRE_V2_USER_SECCOMP:
 		if (task_spec_ib_force_disable(task))
 			return PR_SPEC_PRCTL | PR_SPEC_FORCE_DISABLE;
 		if (task_spec_ib_disable(task))
@@ -1060,6 +1074,7 @@ static char *stibp_state(void)
 	case SPECTRE_V2_USER_STRICT:
 		return ", STIBP: forced";
 	case SPECTRE_V2_USER_PRCTL:
+	case SPECTRE_V2_USER_SECCOMP:
 		if (static_key_enabled(&switch_to_cond_stibp))
 			return ", STIBP: conditional";
 	}



^ permalink raw reply	[flat|nested] 166+ messages in thread

* [PATCH 4.14 108/146] x86/speculation: Provide IBPB always command line options
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (106 preceding siblings ...)
  2018-12-04 10:49 ` [PATCH 4.14 107/146] x86/speculation: Add seccomp Spectre v2 user space protection mode Greg Kroah-Hartman
@ 2018-12-04 10:49 ` Greg Kroah-Hartman
  2018-12-04 10:49 ` [PATCH 4.14 109/146] kvm: mmu: Fix race in emulated page table writes Greg Kroah-Hartman
                   ` (42 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:49 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Thomas Gleixner, Ingo Molnar,
	Peter Zijlstra, Andy Lutomirski, Linus Torvalds, Jiri Kosina,
	Tom Lendacky, Josh Poimboeuf, Andrea Arcangeli, David Woodhouse,
	Tim Chen, Andi Kleen, Dave Hansen, Casey Schaufler, Asit Mallick,
	Arjan van de Ven, Jon Masters, Waiman Long, Dave Stewart,
	Kees Cook

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Thomas Gleixner tglx@linutronix.de

commit 55a974021ec952ee460dc31ca08722158639de72 upstream

Provide the possibility to enable IBPB always in combination with 'prctl'
and 'seccomp'.

Add the extra command line options and rework the IBPB selection to
evaluate the command instead of the mode selected by the STIPB switch case.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Casey Schaufler <casey.schaufler@intel.com>
Cc: Asit Mallick <asit.k.mallick@intel.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Jon Masters <jcm@redhat.com>
Cc: Waiman Long <longman9394@gmail.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Dave Stewart <david.c.stewart@intel.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20181125185006.144047038@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 Documentation/admin-guide/kernel-parameters.txt |   12 ++++++++
 arch/x86/kernel/cpu/bugs.c                      |   34 ++++++++++++++++--------
 2 files changed, 35 insertions(+), 11 deletions(-)

--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -4041,11 +4041,23 @@
 				  per thread.  The mitigation control state
 				  is inherited on fork.
 
+			prctl,ibpb
+				- Like "prctl" above, but only STIBP is
+				  controlled per thread. IBPB is issued
+				  always when switching between different user
+				  space processes.
+
 			seccomp
 				- Same as "prctl" above, but all seccomp
 				  threads will enable the mitigation unless
 				  they explicitly opt out.
 
+			seccomp,ibpb
+				- Like "seccomp" above, but only STIBP is
+				  controlled per thread. IBPB is issued
+				  always when switching between different
+				  user space processes.
+
 			auto    - Kernel selects the mitigation depending on
 				  the available CPU features and vulnerability.
 
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -255,7 +255,9 @@ enum spectre_v2_user_cmd {
 	SPECTRE_V2_USER_CMD_AUTO,
 	SPECTRE_V2_USER_CMD_FORCE,
 	SPECTRE_V2_USER_CMD_PRCTL,
+	SPECTRE_V2_USER_CMD_PRCTL_IBPB,
 	SPECTRE_V2_USER_CMD_SECCOMP,
+	SPECTRE_V2_USER_CMD_SECCOMP_IBPB,
 };
 
 static const char * const spectre_v2_user_strings[] = {
@@ -270,11 +272,13 @@ static const struct {
 	enum spectre_v2_user_cmd	cmd;
 	bool				secure;
 } v2_user_options[] __initdata = {
-	{ "auto",	SPECTRE_V2_USER_CMD_AUTO,	false },
-	{ "off",	SPECTRE_V2_USER_CMD_NONE,	false },
-	{ "on",		SPECTRE_V2_USER_CMD_FORCE,	true  },
-	{ "prctl",	SPECTRE_V2_USER_CMD_PRCTL,	false },
-	{ "seccomp",	SPECTRE_V2_USER_CMD_SECCOMP,	false },
+	{ "auto",		SPECTRE_V2_USER_CMD_AUTO,		false },
+	{ "off",		SPECTRE_V2_USER_CMD_NONE,		false },
+	{ "on",			SPECTRE_V2_USER_CMD_FORCE,		true  },
+	{ "prctl",		SPECTRE_V2_USER_CMD_PRCTL,		false },
+	{ "prctl,ibpb",		SPECTRE_V2_USER_CMD_PRCTL_IBPB,		false },
+	{ "seccomp",		SPECTRE_V2_USER_CMD_SECCOMP,		false },
+	{ "seccomp,ibpb",	SPECTRE_V2_USER_CMD_SECCOMP_IBPB,	false },
 };
 
 static void __init spec_v2_user_print_cond(const char *reason, bool secure)
@@ -320,6 +324,7 @@ spectre_v2_user_select_mitigation(enum s
 {
 	enum spectre_v2_user_mitigation mode = SPECTRE_V2_USER_NONE;
 	bool smt_possible = IS_ENABLED(CONFIG_SMP);
+	enum spectre_v2_user_cmd cmd;
 
 	if (!boot_cpu_has(X86_FEATURE_IBPB) && !boot_cpu_has(X86_FEATURE_STIBP))
 		return;
@@ -328,17 +333,20 @@ spectre_v2_user_select_mitigation(enum s
 	    cpu_smt_control == CPU_SMT_NOT_SUPPORTED)
 		smt_possible = false;
 
-	switch (spectre_v2_parse_user_cmdline(v2_cmd)) {
+	cmd = spectre_v2_parse_user_cmdline(v2_cmd);
+	switch (cmd) {
 	case SPECTRE_V2_USER_CMD_NONE:
 		goto set_mode;
 	case SPECTRE_V2_USER_CMD_FORCE:
 		mode = SPECTRE_V2_USER_STRICT;
 		break;
 	case SPECTRE_V2_USER_CMD_PRCTL:
+	case SPECTRE_V2_USER_CMD_PRCTL_IBPB:
 		mode = SPECTRE_V2_USER_PRCTL;
 		break;
 	case SPECTRE_V2_USER_CMD_AUTO:
 	case SPECTRE_V2_USER_CMD_SECCOMP:
+	case SPECTRE_V2_USER_CMD_SECCOMP_IBPB:
 		if (IS_ENABLED(CONFIG_SECCOMP))
 			mode = SPECTRE_V2_USER_SECCOMP;
 		else
@@ -350,12 +358,15 @@ spectre_v2_user_select_mitigation(enum s
 	if (boot_cpu_has(X86_FEATURE_IBPB)) {
 		setup_force_cpu_cap(X86_FEATURE_USE_IBPB);
 
-		switch (mode) {
-		case SPECTRE_V2_USER_STRICT:
+		switch (cmd) {
+		case SPECTRE_V2_USER_CMD_FORCE:
+		case SPECTRE_V2_USER_CMD_PRCTL_IBPB:
+		case SPECTRE_V2_USER_CMD_SECCOMP_IBPB:
 			static_branch_enable(&switch_mm_always_ibpb);
 			break;
-		case SPECTRE_V2_USER_PRCTL:
-		case SPECTRE_V2_USER_SECCOMP:
+		case SPECTRE_V2_USER_CMD_PRCTL:
+		case SPECTRE_V2_USER_CMD_AUTO:
+		case SPECTRE_V2_USER_CMD_SECCOMP:
 			static_branch_enable(&switch_mm_cond_ibpb);
 			break;
 		default:
@@ -363,7 +374,8 @@ spectre_v2_user_select_mitigation(enum s
 		}
 
 		pr_info("mitigation: Enabling %s Indirect Branch Prediction Barrier\n",
-			mode == SPECTRE_V2_USER_STRICT ? "always-on" : "conditional");
+			static_key_enabled(&switch_mm_always_ibpb) ?
+			"always-on" : "conditional");
 	}
 
 	/* If enhanced IBRS is enabled no STIPB required */



^ permalink raw reply	[flat|nested] 166+ messages in thread

* [PATCH 4.14 109/146] kvm: mmu: Fix race in emulated page table writes
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (107 preceding siblings ...)
  2018-12-04 10:49 ` [PATCH 4.14 108/146] x86/speculation: Provide IBPB always command line options Greg Kroah-Hartman
@ 2018-12-04 10:49 ` Greg Kroah-Hartman
  2018-12-04 10:49 ` [PATCH 4.14 110/146] kvm: svm: Ensure an IBPB on all affected CPUs when freeing a vmcb Greg Kroah-Hartman
                   ` (41 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:49 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Junaid Shahid, Wanpeng Li, Paolo Bonzini

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Junaid Shahid <junaids@google.com>

commit 0e0fee5c539b61fdd098332e0e2cc375d9073706 upstream.

When a guest page table is updated via an emulated write,
kvm_mmu_pte_write() is called to update the shadow PTE using the just
written guest PTE value. But if two emulated guest PTE writes happened
concurrently, it is possible that the guest PTE and the shadow PTE end
up being out of sync. Emulated writes do not mark the shadow page as
unsync-ed, so this inconsistency will not be resolved even by a guest TLB
flush (unless the page was marked as unsync-ed at some other point).

This is fixed by re-reading the current value of the guest PTE after the
MMU lock has been acquired instead of just using the value that was
written prior to calling kvm_mmu_pte_write().

Signed-off-by: Junaid Shahid <junaids@google.com>
Reviewed-by: Wanpeng Li <wanpengli@tencent.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/kvm/mmu.c |   27 +++++++++------------------
 1 file changed, 9 insertions(+), 18 deletions(-)

--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -4734,9 +4734,9 @@ static bool need_remote_flush(u64 old, u
 }
 
 static u64 mmu_pte_write_fetch_gpte(struct kvm_vcpu *vcpu, gpa_t *gpa,
-				    const u8 *new, int *bytes)
+				    int *bytes)
 {
-	u64 gentry;
+	u64 gentry = 0;
 	int r;
 
 	/*
@@ -4748,22 +4748,12 @@ static u64 mmu_pte_write_fetch_gpte(stru
 		/* Handle a 32-bit guest writing two halves of a 64-bit gpte */
 		*gpa &= ~(gpa_t)7;
 		*bytes = 8;
-		r = kvm_vcpu_read_guest(vcpu, *gpa, &gentry, 8);
-		if (r)
-			gentry = 0;
-		new = (const u8 *)&gentry;
 	}
 
-	switch (*bytes) {
-	case 4:
-		gentry = *(const u32 *)new;
-		break;
-	case 8:
-		gentry = *(const u64 *)new;
-		break;
-	default:
-		gentry = 0;
-		break;
+	if (*bytes == 4 || *bytes == 8) {
+		r = kvm_vcpu_read_guest_atomic(vcpu, *gpa, &gentry, *bytes);
+		if (r)
+			gentry = 0;
 	}
 
 	return gentry;
@@ -4876,8 +4866,6 @@ static void kvm_mmu_pte_write(struct kvm
 
 	pgprintk("%s: gpa %llx bytes %d\n", __func__, gpa, bytes);
 
-	gentry = mmu_pte_write_fetch_gpte(vcpu, &gpa, new, &bytes);
-
 	/*
 	 * No need to care whether allocation memory is successful
 	 * or not since pte prefetch is skiped if it does not have
@@ -4886,6 +4874,9 @@ static void kvm_mmu_pte_write(struct kvm
 	mmu_topup_memory_caches(vcpu);
 
 	spin_lock(&vcpu->kvm->mmu_lock);
+
+	gentry = mmu_pte_write_fetch_gpte(vcpu, &gpa, &bytes);
+
 	++vcpu->kvm->stat.mmu_pte_write;
 	kvm_mmu_audit(vcpu, AUDIT_PRE_PTE_WRITE);
 



^ permalink raw reply	[flat|nested] 166+ messages in thread

* [PATCH 4.14 110/146] kvm: svm: Ensure an IBPB on all affected CPUs when freeing a vmcb
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (108 preceding siblings ...)
  2018-12-04 10:49 ` [PATCH 4.14 109/146] kvm: mmu: Fix race in emulated page table writes Greg Kroah-Hartman
@ 2018-12-04 10:49 ` Greg Kroah-Hartman
  2018-12-04 10:49 ` [PATCH 4.14 111/146] KVM: x86: Fix kernel info-leak in KVM_HC_CLOCK_PAIRING hypercall Greg Kroah-Hartman
                   ` (40 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:49 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Neel Natu, Jim Mattson,
	Konrad Rzeszutek Wilk, Paolo Bonzini

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Jim Mattson <jmattson@google.com>

commit fd65d3142f734bc4376053c8d75670041903134d upstream.

Previously, we only called indirect_branch_prediction_barrier on the
logical CPU that freed a vmcb. This function should be called on all
logical CPUs that last loaded the vmcb in question.

Fixes: 15d45071523d ("KVM/x86: Add IBPB support")
Reported-by: Neel Natu <neelnatu@google.com>
Signed-off-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/kvm/svm.c |   20 +++++++++++++++-----
 1 file changed, 15 insertions(+), 5 deletions(-)

--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -1733,21 +1733,31 @@ out:
 	return ERR_PTR(err);
 }
 
+static void svm_clear_current_vmcb(struct vmcb *vmcb)
+{
+	int i;
+
+	for_each_online_cpu(i)
+		cmpxchg(&per_cpu(svm_data, i)->current_vmcb, vmcb, NULL);
+}
+
 static void svm_free_vcpu(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_svm *svm = to_svm(vcpu);
 
+	/*
+	 * The vmcb page can be recycled, causing a false negative in
+	 * svm_vcpu_load(). So, ensure that no logical CPU has this
+	 * vmcb page recorded as its current vmcb.
+	 */
+	svm_clear_current_vmcb(svm->vmcb);
+
 	__free_page(pfn_to_page(__sme_clr(svm->vmcb_pa) >> PAGE_SHIFT));
 	__free_pages(virt_to_page(svm->msrpm), MSRPM_ALLOC_ORDER);
 	__free_page(virt_to_page(svm->nested.hsave));
 	__free_pages(virt_to_page(svm->nested.msrpm), MSRPM_ALLOC_ORDER);
 	kvm_vcpu_uninit(vcpu);
 	kmem_cache_free(kvm_vcpu_cache, svm);
-	/*
-	 * The vmcb page can be recycled, causing a false negative in
-	 * svm_vcpu_load(). So do a full IBPB now.
-	 */
-	indirect_branch_prediction_barrier();
 }
 
 static void svm_vcpu_load(struct kvm_vcpu *vcpu, int cpu)



^ permalink raw reply	[flat|nested] 166+ messages in thread

* [PATCH 4.14 111/146] KVM: x86: Fix kernel info-leak in KVM_HC_CLOCK_PAIRING hypercall
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (109 preceding siblings ...)
  2018-12-04 10:49 ` [PATCH 4.14 110/146] kvm: svm: Ensure an IBPB on all affected CPUs when freeing a vmcb Greg Kroah-Hartman
@ 2018-12-04 10:49 ` Greg Kroah-Hartman
  2018-12-04 10:49 ` [PATCH 4.14 112/146] KVM: X86: Fix scan ioapic use-before-initialization Greg Kroah-Hartman
                   ` (39 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:49 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, syzbot+a8ef68d71211ba264f56,
	Mark Kanda, Liran Alon, Paolo Bonzini

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Liran Alon <liran.alon@oracle.com>

commit bcbfbd8ec21096027f1ee13ce6c185e8175166f6 upstream.

kvm_pv_clock_pairing() allocates local var
"struct kvm_clock_pairing clock_pairing" on stack and initializes
all it's fields besides padding (clock_pairing.pad[]).

Because clock_pairing var is written completely (including padding)
to guest memory, failure to init struct padding results in kernel
info-leak.

Fix the issue by making sure to also init the padding with zeroes.

Fixes: 55dd00a73a51 ("KVM: x86: add KVM_HC_CLOCK_PAIRING hypercall")
Reported-by: syzbot+a8ef68d71211ba264f56@syzkaller.appspotmail.com
Reviewed-by: Mark Kanda <mark.kanda@oracle.com>
Signed-off-by: Liran Alon <liran.alon@oracle.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/kvm/x86.c |    1 +
 1 file changed, 1 insertion(+)

--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -6378,6 +6378,7 @@ static int kvm_pv_clock_pairing(struct k
 	clock_pairing.nsec = ts.tv_nsec;
 	clock_pairing.tsc = kvm_read_l1_tsc(vcpu, cycle);
 	clock_pairing.flags = 0;
+	memset(&clock_pairing.pad, 0, sizeof(clock_pairing.pad));
 
 	ret = 0;
 	if (kvm_write_guest(vcpu->kvm, paddr, &clock_pairing,



^ permalink raw reply	[flat|nested] 166+ messages in thread

* [PATCH 4.14 112/146] KVM: X86: Fix scan ioapic use-before-initialization
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (110 preceding siblings ...)
  2018-12-04 10:49 ` [PATCH 4.14 111/146] KVM: x86: Fix kernel info-leak in KVM_HC_CLOCK_PAIRING hypercall Greg Kroah-Hartman
@ 2018-12-04 10:49 ` Greg Kroah-Hartman
  2018-12-04 10:49 ` [PATCH 4.14 113/146] xtensa: enable coprocessors that are being flushed Greg Kroah-Hartman
                   ` (38 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:49 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Wei Wu, Paolo Bonzini,
	Radim Krčmář,
	Wanpeng Li

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Wanpeng Li <wanpengli@tencent.com>

commit e97f852fd4561e77721bb9a4e0ea9d98305b1e93 upstream.

Reported by syzkaller:

 BUG: unable to handle kernel NULL pointer dereference at 00000000000001c8
 PGD 80000003ec4da067 P4D 80000003ec4da067 PUD 3f7bfa067 PMD 0
 Oops: 0000 [#1] PREEMPT SMP PTI
 CPU: 7 PID: 5059 Comm: debug Tainted: G           OE     4.19.0-rc5 #16
 RIP: 0010:__lock_acquire+0x1a6/0x1990
 Call Trace:
  lock_acquire+0xdb/0x210
  _raw_spin_lock+0x38/0x70
  kvm_ioapic_scan_entry+0x3e/0x110 [kvm]
  vcpu_enter_guest+0x167e/0x1910 [kvm]
  kvm_arch_vcpu_ioctl_run+0x35c/0x610 [kvm]
  kvm_vcpu_ioctl+0x3e9/0x6d0 [kvm]
  do_vfs_ioctl+0xa5/0x690
  ksys_ioctl+0x6d/0x80
  __x64_sys_ioctl+0x1a/0x20
  do_syscall_64+0x83/0x6e0
  entry_SYSCALL_64_after_hwframe+0x49/0xbe

The reason is that the testcase writes hyperv synic HV_X64_MSR_SINT6 msr
and triggers scan ioapic logic to load synic vectors into EOI exit bitmap.
However, irqchip is not initialized by this simple testcase, ioapic/apic
objects should not be accessed.
This can be triggered by the following program:

    #define _GNU_SOURCE

    #include <endian.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <sys/syscall.h>
    #include <sys/types.h>
    #include <unistd.h>

    uint64_t r[3] = {0xffffffffffffffff, 0xffffffffffffffff, 0xffffffffffffffff};

    int main(void)
    {
    	syscall(__NR_mmap, 0x20000000, 0x1000000, 3, 0x32, -1, 0);
    	long res = 0;
    	memcpy((void*)0x20000040, "/dev/kvm", 9);
    	res = syscall(__NR_openat, 0xffffffffffffff9c, 0x20000040, 0, 0);
    	if (res != -1)
    		r[0] = res;
    	res = syscall(__NR_ioctl, r[0], 0xae01, 0);
    	if (res != -1)
    		r[1] = res;
    	res = syscall(__NR_ioctl, r[1], 0xae41, 0);
    	if (res != -1)
    		r[2] = res;
    	memcpy(
    			(void*)0x20000080,
    			"\x01\x00\x00\x00\x00\x5b\x61\xbb\x96\x00\x00\x40\x00\x00\x00\x00\x01\x00"
    			"\x08\x00\x00\x00\x00\x00\x0b\x77\xd1\x78\x4d\xd8\x3a\xed\xb1\x5c\x2e\x43"
    			"\xaa\x43\x39\xd6\xff\xf5\xf0\xa8\x98\xf2\x3e\x37\x29\x89\xde\x88\xc6\x33"
    			"\xfc\x2a\xdb\xb7\xe1\x4c\xac\x28\x61\x7b\x9c\xa9\xbc\x0d\xa0\x63\xfe\xfe"
    			"\xe8\x75\xde\xdd\x19\x38\xdc\x34\xf5\xec\x05\xfd\xeb\x5d\xed\x2e\xaf\x22"
    			"\xfa\xab\xb7\xe4\x42\x67\xd0\xaf\x06\x1c\x6a\x35\x67\x10\x55\xcb",
    			106);
    	syscall(__NR_ioctl, r[2], 0x4008ae89, 0x20000080);
    	syscall(__NR_ioctl, r[2], 0xae80, 0);
    	return 0;
    }

This patch fixes it by bailing out scan ioapic if ioapic is not initialized in
kernel.

Reported-by: Wei Wu <ww9210@gmail.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Wei Wu <ww9210@gmail.com>
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/kvm/x86.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -6885,7 +6885,8 @@ static void vcpu_scan_ioapic(struct kvm_
 	else {
 		if (kvm_x86_ops->sync_pir_to_irr && vcpu->arch.apicv_active)
 			kvm_x86_ops->sync_pir_to_irr(vcpu);
-		kvm_ioapic_scan_entry(vcpu, vcpu->arch.ioapic_handled_vectors);
+		if (ioapic_in_kernel(vcpu->kvm))
+			kvm_ioapic_scan_entry(vcpu, vcpu->arch.ioapic_handled_vectors);
 	}
 	bitmap_or((ulong *)eoi_exit_bitmap, vcpu->arch.ioapic_handled_vectors,
 		  vcpu_to_synic(vcpu)->vec_bitmap, 256);



^ permalink raw reply	[flat|nested] 166+ messages in thread

* [PATCH 4.14 113/146] xtensa: enable coprocessors that are being flushed
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (111 preceding siblings ...)
  2018-12-04 10:49 ` [PATCH 4.14 112/146] KVM: X86: Fix scan ioapic use-before-initialization Greg Kroah-Hartman
@ 2018-12-04 10:49 ` Greg Kroah-Hartman
  2018-12-04 10:50 ` [PATCH 4.14 114/146] xtensa: fix coprocessor context offset definitions Greg Kroah-Hartman
                   ` (37 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:49 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Max Filippov

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Max Filippov <jcmvbkbc@gmail.com>

commit 2958b66694e018c552be0b60521fec27e8d12988 upstream.

coprocessor_flush_all may be called from a context of a thread that is
different from the thread being flushed. In that case contents of the
cpenable special register may not match ti->cpenable of the target
thread, resulting in unhandled coprocessor exception in the kernel
context.
Set cpenable special register to the ti->cpenable of the target register
for the duration of the flush and restore it afterwards.
This fixes the following crash caused by coprocessor register inspection
in native gdb:

  (gdb) p/x $w0
  Illegal instruction in kernel: sig: 9 [#1] PREEMPT
  Call Trace:
    ___might_sleep+0x184/0x1a4
    __might_sleep+0x41/0xac
    exit_signals+0x14/0x218
    do_exit+0xc9/0x8b8
    die+0x99/0xa0
    do_illegal_instruction+0x18/0x6c
    common_exception+0x77/0x77
    coprocessor_flush+0x16/0x3c
    arch_ptrace+0x46c/0x674
    sys_ptrace+0x2ce/0x3b4
    system_call+0x54/0x80
    common_exception+0x77/0x77
  note: gdb[100] exited with preempt_count 1
  Killed

Cc: stable@vger.kernel.org
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/xtensa/kernel/process.c |    5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

--- a/arch/xtensa/kernel/process.c
+++ b/arch/xtensa/kernel/process.c
@@ -88,18 +88,21 @@ void coprocessor_release_all(struct thre
 
 void coprocessor_flush_all(struct thread_info *ti)
 {
-	unsigned long cpenable;
+	unsigned long cpenable, old_cpenable;
 	int i;
 
 	preempt_disable();
 
+	RSR_CPENABLE(old_cpenable);
 	cpenable = ti->cpenable;
+	WSR_CPENABLE(cpenable);
 
 	for (i = 0; i < XCHAL_CP_MAX; i++) {
 		if ((cpenable & 1) != 0 && coprocessor_owner[i] == ti)
 			coprocessor_flush(ti, i);
 		cpenable >>= 1;
 	}
+	WSR_CPENABLE(old_cpenable);
 
 	preempt_enable();
 }



^ permalink raw reply	[flat|nested] 166+ messages in thread

* [PATCH 4.14 114/146] xtensa: fix coprocessor context offset definitions
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (112 preceding siblings ...)
  2018-12-04 10:49 ` [PATCH 4.14 113/146] xtensa: enable coprocessors that are being flushed Greg Kroah-Hartman
@ 2018-12-04 10:50 ` Greg Kroah-Hartman
  2018-12-04 10:50 ` [PATCH 4.14 115/146] xtensa: fix coprocessor part of ptrace_{get,set}xregs Greg Kroah-Hartman
                   ` (36 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:50 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Max Filippov

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Max Filippov <jcmvbkbc@gmail.com>

commit 03bc996af0cc71c7f30c384d8ce7260172423b34 upstream.

Coprocessor context offsets are used by the assembly code that moves
coprocessor context between the individual fields of the
thread_info::xtregs_cp structure and coprocessor registers.
This fixes coprocessor context clobbering on flushing and reloading
during normal user code execution and user process debugging in the
presence of more than one coprocessor in the core configuration.

Cc: stable@vger.kernel.org
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/xtensa/kernel/asm-offsets.c |   16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

--- a/arch/xtensa/kernel/asm-offsets.c
+++ b/arch/xtensa/kernel/asm-offsets.c
@@ -91,14 +91,14 @@ int main(void)
 	DEFINE(THREAD_SP, offsetof (struct task_struct, thread.sp));
 	DEFINE(THREAD_CPENABLE, offsetof (struct thread_info, cpenable));
 #if XTENSA_HAVE_COPROCESSORS
-	DEFINE(THREAD_XTREGS_CP0, offsetof (struct thread_info, xtregs_cp));
-	DEFINE(THREAD_XTREGS_CP1, offsetof (struct thread_info, xtregs_cp));
-	DEFINE(THREAD_XTREGS_CP2, offsetof (struct thread_info, xtregs_cp));
-	DEFINE(THREAD_XTREGS_CP3, offsetof (struct thread_info, xtregs_cp));
-	DEFINE(THREAD_XTREGS_CP4, offsetof (struct thread_info, xtregs_cp));
-	DEFINE(THREAD_XTREGS_CP5, offsetof (struct thread_info, xtregs_cp));
-	DEFINE(THREAD_XTREGS_CP6, offsetof (struct thread_info, xtregs_cp));
-	DEFINE(THREAD_XTREGS_CP7, offsetof (struct thread_info, xtregs_cp));
+	DEFINE(THREAD_XTREGS_CP0, offsetof(struct thread_info, xtregs_cp.cp0));
+	DEFINE(THREAD_XTREGS_CP1, offsetof(struct thread_info, xtregs_cp.cp1));
+	DEFINE(THREAD_XTREGS_CP2, offsetof(struct thread_info, xtregs_cp.cp2));
+	DEFINE(THREAD_XTREGS_CP3, offsetof(struct thread_info, xtregs_cp.cp3));
+	DEFINE(THREAD_XTREGS_CP4, offsetof(struct thread_info, xtregs_cp.cp4));
+	DEFINE(THREAD_XTREGS_CP5, offsetof(struct thread_info, xtregs_cp.cp5));
+	DEFINE(THREAD_XTREGS_CP6, offsetof(struct thread_info, xtregs_cp.cp6));
+	DEFINE(THREAD_XTREGS_CP7, offsetof(struct thread_info, xtregs_cp.cp7));
 #endif
 	DEFINE(THREAD_XTREGS_USER, offsetof (struct thread_info, xtregs_user));
 	DEFINE(XTREGS_USER_SIZE, sizeof(xtregs_user_t));



^ permalink raw reply	[flat|nested] 166+ messages in thread

* [PATCH 4.14 115/146] xtensa: fix coprocessor part of ptrace_{get,set}xregs
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (113 preceding siblings ...)
  2018-12-04 10:50 ` [PATCH 4.14 114/146] xtensa: fix coprocessor context offset definitions Greg Kroah-Hartman
@ 2018-12-04 10:50 ` Greg Kroah-Hartman
  2018-12-04 10:50 ` [PATCH 4.14 116/146] Btrfs: ensure path name is null terminated at btrfs_control_ioctl Greg Kroah-Hartman
                   ` (35 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:50 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Max Filippov

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Max Filippov <jcmvbkbc@gmail.com>

commit 38a35a78c5e270cbe53c4fef6b0d3c2da90dd849 upstream.

Layout of coprocessor registers in the elf_xtregs_t and
xtregs_coprocessor_t may be different due to alignment. Thus it is not
always possible to copy data between the xtregs_coprocessor_t structure
and the elf_xtregs_t and get correct values for all registers.
Use a table of offsets and sizes of individual coprocessor register
groups to do coprocessor context copying in the ptrace_getxregs and
ptrace_setxregs.
This fixes incorrect coprocessor register values reading from the user
process by the native gdb on an xtensa core with multiple coprocessors
and registers with high alignment requirements.

Cc: stable@vger.kernel.org
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/xtensa/kernel/ptrace.c |   42 ++++++++++++++++++++++++++++++++++++++----
 1 file changed, 38 insertions(+), 4 deletions(-)

--- a/arch/xtensa/kernel/ptrace.c
+++ b/arch/xtensa/kernel/ptrace.c
@@ -127,12 +127,37 @@ static int ptrace_setregs(struct task_st
 }
 
 
+#if XTENSA_HAVE_COPROCESSORS
+#define CP_OFFSETS(cp) \
+	{ \
+		.elf_xtregs_offset = offsetof(elf_xtregs_t, cp), \
+		.ti_offset = offsetof(struct thread_info, xtregs_cp.cp), \
+		.sz = sizeof(xtregs_ ## cp ## _t), \
+	}
+
+static const struct {
+	size_t elf_xtregs_offset;
+	size_t ti_offset;
+	size_t sz;
+} cp_offsets[] = {
+	CP_OFFSETS(cp0),
+	CP_OFFSETS(cp1),
+	CP_OFFSETS(cp2),
+	CP_OFFSETS(cp3),
+	CP_OFFSETS(cp4),
+	CP_OFFSETS(cp5),
+	CP_OFFSETS(cp6),
+	CP_OFFSETS(cp7),
+};
+#endif
+
 static int ptrace_getxregs(struct task_struct *child, void __user *uregs)
 {
 	struct pt_regs *regs = task_pt_regs(child);
 	struct thread_info *ti = task_thread_info(child);
 	elf_xtregs_t __user *xtregs = uregs;
 	int ret = 0;
+	int i __maybe_unused;
 
 	if (!access_ok(VERIFY_WRITE, uregs, sizeof(elf_xtregs_t)))
 		return -EIO;
@@ -140,8 +165,13 @@ static int ptrace_getxregs(struct task_s
 #if XTENSA_HAVE_COPROCESSORS
 	/* Flush all coprocessor registers to memory. */
 	coprocessor_flush_all(ti);
-	ret |= __copy_to_user(&xtregs->cp0, &ti->xtregs_cp,
-			      sizeof(xtregs_coprocessor_t));
+
+	for (i = 0; i < ARRAY_SIZE(cp_offsets); ++i)
+		ret |= __copy_to_user((char __user *)xtregs +
+				      cp_offsets[i].elf_xtregs_offset,
+				      (const char *)ti +
+				      cp_offsets[i].ti_offset,
+				      cp_offsets[i].sz);
 #endif
 	ret |= __copy_to_user(&xtregs->opt, &regs->xtregs_opt,
 			      sizeof(xtregs->opt));
@@ -157,6 +187,7 @@ static int ptrace_setxregs(struct task_s
 	struct pt_regs *regs = task_pt_regs(child);
 	elf_xtregs_t *xtregs = uregs;
 	int ret = 0;
+	int i __maybe_unused;
 
 	if (!access_ok(VERIFY_READ, uregs, sizeof(elf_xtregs_t)))
 		return -EFAULT;
@@ -166,8 +197,11 @@ static int ptrace_setxregs(struct task_s
 	coprocessor_flush_all(ti);
 	coprocessor_release_all(ti);
 
-	ret |= __copy_from_user(&ti->xtregs_cp, &xtregs->cp0,
-				sizeof(xtregs_coprocessor_t));
+	for (i = 0; i < ARRAY_SIZE(cp_offsets); ++i)
+		ret |= __copy_from_user((char *)ti + cp_offsets[i].ti_offset,
+					(const char __user *)xtregs +
+					cp_offsets[i].elf_xtregs_offset,
+					cp_offsets[i].sz);
 #endif
 	ret |= __copy_from_user(&regs->xtregs_opt, &xtregs->opt,
 				sizeof(xtregs->opt));



^ permalink raw reply	[flat|nested] 166+ messages in thread

* [PATCH 4.14 116/146] Btrfs: ensure path name is null terminated at btrfs_control_ioctl
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (114 preceding siblings ...)
  2018-12-04 10:50 ` [PATCH 4.14 115/146] xtensa: fix coprocessor part of ptrace_{get,set}xregs Greg Kroah-Hartman
@ 2018-12-04 10:50 ` Greg Kroah-Hartman
  2018-12-04 10:50 ` [PATCH 4.14 117/146] btrfs: relocation: set trans to be NULL after ending transaction Greg Kroah-Hartman
                   ` (34 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:50 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Anand Jain, Filipe Manana, David Sterba

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Filipe Manana <fdmanana@suse.com>

commit f505754fd6599230371cb01b9332754ddc104be1 upstream.

We were using the path name received from user space without checking that
it is null terminated. While btrfs-progs is well behaved and does proper
validation and null termination, someone could call the ioctl and pass
a non-null terminated patch, leading to buffer overrun problems in the
kernel.  The ioctl is protected by CAP_SYS_ADMIN.

So just set the last byte of the path to a null character, similar to what
we do in other ioctls (add/remove/resize device, snapshot creation, etc).

CC: stable@vger.kernel.org # 4.4+
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 fs/btrfs/super.c |    1 +
 1 file changed, 1 insertion(+)

--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -2176,6 +2176,7 @@ static long btrfs_control_ioctl(struct f
 	vol = memdup_user((void __user *)arg, sizeof(*vol));
 	if (IS_ERR(vol))
 		return PTR_ERR(vol);
+	vol->name[BTRFS_PATH_NAME_MAX] = '\0';
 
 	switch (cmd) {
 	case BTRFS_IOC_SCAN_DEV:



^ permalink raw reply	[flat|nested] 166+ messages in thread

* [PATCH 4.14 117/146] btrfs: relocation: set trans to be NULL after ending transaction
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (115 preceding siblings ...)
  2018-12-04 10:50 ` [PATCH 4.14 116/146] Btrfs: ensure path name is null terminated at btrfs_control_ioctl Greg Kroah-Hartman
@ 2018-12-04 10:50 ` Greg Kroah-Hartman
  2018-12-04 10:50 ` [PATCH 4.14 118/146] PCI: layerscape: Fix wrong invocation of outbound window disable accessor Greg Kroah-Hartman
                   ` (33 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:50 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Qu Wenruo, Pan Bian, David Sterba

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Pan Bian <bianpan2016@163.com>

commit 42a657f57628402c73237547f0134e083e2f6764 upstream.

The function relocate_block_group calls btrfs_end_transaction to release
trans when update_backref_cache returns 1, and then continues the loop
body. If btrfs_block_rsv_refill fails this time, it will jump out the
loop and the freed trans will be accessed. This may result in a
use-after-free bug. The patch assigns NULL to trans after trans is
released so that it will not be accessed.

Fixes: 0647bf564f1 ("Btrfs: improve forever loop when doing balance relocation")
CC: stable@vger.kernel.org # 4.4+
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Pan Bian <bianpan2016@163.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 fs/btrfs/relocation.c |    1 +
 1 file changed, 1 insertion(+)

--- a/fs/btrfs/relocation.c
+++ b/fs/btrfs/relocation.c
@@ -4048,6 +4048,7 @@ static noinline_for_stack int relocate_b
 restart:
 		if (update_backref_cache(trans, &rc->backref_cache)) {
 			btrfs_end_transaction(trans);
+			trans = NULL;
 			continue;
 		}
 



^ permalink raw reply	[flat|nested] 166+ messages in thread

* [PATCH 4.14 118/146] PCI: layerscape: Fix wrong invocation of outbound window disable accessor
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (116 preceding siblings ...)
  2018-12-04 10:50 ` [PATCH 4.14 117/146] btrfs: relocation: set trans to be NULL after ending transaction Greg Kroah-Hartman
@ 2018-12-04 10:50 ` Greg Kroah-Hartman
  2018-12-04 10:50 ` [PATCH 4.14 119/146] arm64: dts: rockchip: Fix PCIe reset polarity for rk3399-puma-haikou Greg Kroah-Hartman
                   ` (32 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:50 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Hou Zhiqiang, Lorenzo Pieralisi

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Hou Zhiqiang <Zhiqiang.Hou@nxp.com>

commit c6fd6fe9dea44732cdcd970f1130b8cc50ad685a upstream.

The order of parameters is not correct when invoking the outbound
window disable routine. Fix it.

Fixes: 4a2745d760fa ("PCI: layerscape: Disable outbound windows configured by bootloader")
Signed-off-by: Hou Zhiqiang <Zhiqiang.Hou@nxp.com>
[lorenzo.pieralisi@arm.com: commit log]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/pci/dwc/pci-layerscape.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/pci/dwc/pci-layerscape.c
+++ b/drivers/pci/dwc/pci-layerscape.c
@@ -89,7 +89,7 @@ static void ls_pcie_disable_outbound_atu
 	int i;
 
 	for (i = 0; i < PCIE_IATU_NUM; i++)
-		dw_pcie_disable_atu(pcie->pci, DW_PCIE_REGION_OUTBOUND, i);
+		dw_pcie_disable_atu(pcie->pci, i, DW_PCIE_REGION_OUTBOUND);
 }
 
 static int ls1021_pcie_link_up(struct dw_pcie *pci)



^ permalink raw reply	[flat|nested] 166+ messages in thread

* [PATCH 4.14 119/146] arm64: dts: rockchip: Fix PCIe reset polarity for rk3399-puma-haikou.
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (117 preceding siblings ...)
  2018-12-04 10:50 ` [PATCH 4.14 118/146] PCI: layerscape: Fix wrong invocation of outbound window disable accessor Greg Kroah-Hartman
@ 2018-12-04 10:50 ` Greg Kroah-Hartman
  2018-12-04 10:50 ` [PATCH 4.14 120/146] x86/MCE/AMD: Fix the thresholding machinery initialization order Greg Kroah-Hartman
                   ` (31 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:50 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Christoph Muellner, Heiko Stuebner

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Christoph Muellner <christoph.muellner@theobroma-systems.com>

commit c1d91f86a1b4c9c05854d59c6a0abd5d0f75b849 upstream.

This patch fixes the wrong polarity setting for the PCIe host driver's
pre-reset pin for rk3399-puma-haikou. Without this patch link training
will most likely fail.

Fixes: 60fd9f72ce8a ("arm64: dts: rockchip: add Haikou baseboard with RK3399-Q7 SoM")
Cc: stable@vger.kernel.org
Signed-off-by: Christoph Muellner <christoph.muellner@theobroma-systems.com>
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/arm64/boot/dts/rockchip/rk3399-puma-haikou.dts |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/arch/arm64/boot/dts/rockchip/rk3399-puma-haikou.dts
+++ b/arch/arm64/boot/dts/rockchip/rk3399-puma-haikou.dts
@@ -130,7 +130,7 @@
 };
 
 &pcie0 {
-	ep-gpios = <&gpio4 RK_PC6 GPIO_ACTIVE_LOW>;
+	ep-gpios = <&gpio4 RK_PC6 GPIO_ACTIVE_HIGH>;
 	num-lanes = <4>;
 	pinctrl-names = "default";
 	pinctrl-0 = <&pcie_clkreqn_cpm>;



^ permalink raw reply	[flat|nested] 166+ messages in thread

* [PATCH 4.14 120/146] x86/MCE/AMD: Fix the thresholding machinery initialization order
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (118 preceding siblings ...)
  2018-12-04 10:50 ` [PATCH 4.14 119/146] arm64: dts: rockchip: Fix PCIe reset polarity for rk3399-puma-haikou Greg Kroah-Hartman
@ 2018-12-04 10:50 ` Greg Kroah-Hartman
  2018-12-04 10:50 ` [PATCH 4.14 121/146] x86/fpu: Disable bottom halves while loading FPU registers Greg Kroah-Hartman
                   ` (30 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:50 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Rafał Miłecki,
	John Clemens, Borislav Petkov, John Clemens,
	Aravind Gopalakrishnan, linux-edac, Tony Luck, x86,
	Yazen Ghannam

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Borislav Petkov <bp@suse.de>

commit 60c8144afc287ef09ce8c1230c6aa972659ba1bb upstream.

Currently, the code sets up the thresholding interrupt vector and only
then goes about initializing the thresholding banks. Which is wrong,
because an early thresholding interrupt would cause a NULL pointer
dereference when accessing those banks and prevent the machine from
booting.

Therefore, set the thresholding interrupt vector only *after* having
initialized the banks successfully.

Fixes: 18807ddb7f88 ("x86/mce/AMD: Reset Threshold Limit after logging error")
Reported-by: Rafał Miłecki <rafal@milecki.pl>
Reported-by: John Clemens <clemej@gmail.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Tested-by: Rafał Miłecki <rafal@milecki.pl>
Tested-by: John Clemens <john@deater.net>
Cc: Aravind Gopalakrishnan <aravindksg.lkml@gmail.com>
Cc: linux-edac@vger.kernel.org
Cc: stable@vger.kernel.org
Cc: Tony Luck <tony.luck@intel.com>
Cc: x86@kernel.org
Cc: Yazen Ghannam <Yazen.Ghannam@amd.com>
Link: https://lkml.kernel.org/r/20181127101700.2964-1-zajec5@gmail.com
Link: https://bugzilla.kernel.org/show_bug.cgi?id=201291
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/kernel/cpu/mcheck/mce_amd.c |   19 ++++++-------------
 1 file changed, 6 insertions(+), 13 deletions(-)

--- a/arch/x86/kernel/cpu/mcheck/mce_amd.c
+++ b/arch/x86/kernel/cpu/mcheck/mce_amd.c
@@ -56,7 +56,7 @@
 /* Threshold LVT offset is at MSR0xC0000410[15:12] */
 #define SMCA_THR_LVT_OFF	0xF000
 
-static bool thresholding_en;
+static bool thresholding_irq_en;
 
 static const char * const th_names[] = {
 	"load_store",
@@ -533,9 +533,8 @@ prepare_threshold_block(unsigned int ban
 
 set_offset:
 	offset = setup_APIC_mce_threshold(offset, new);
-
-	if ((offset == new) && (mce_threshold_vector != amd_threshold_interrupt))
-		mce_threshold_vector = amd_threshold_interrupt;
+	if (offset == new)
+		thresholding_irq_en = true;
 
 done:
 	mce_threshold_block_init(&b, offset);
@@ -1356,9 +1355,6 @@ int mce_threshold_remove_device(unsigned
 {
 	unsigned int bank;
 
-	if (!thresholding_en)
-		return 0;
-
 	for (bank = 0; bank < mca_cfg.banks; ++bank) {
 		if (!(per_cpu(bank_map, cpu) & (1 << bank)))
 			continue;
@@ -1376,9 +1372,6 @@ int mce_threshold_create_device(unsigned
 	struct threshold_bank **bp;
 	int err = 0;
 
-	if (!thresholding_en)
-		return 0;
-
 	bp = per_cpu(threshold_banks, cpu);
 	if (bp)
 		return 0;
@@ -1407,9 +1400,6 @@ static __init int threshold_init_device(
 {
 	unsigned lcpu = 0;
 
-	if (mce_threshold_vector == amd_threshold_interrupt)
-		thresholding_en = true;
-
 	/* to hit CPUs online before the notifier is up */
 	for_each_online_cpu(lcpu) {
 		int err = mce_threshold_create_device(lcpu);
@@ -1418,6 +1408,9 @@ static __init int threshold_init_device(
 			return err;
 	}
 
+	if (thresholding_irq_en)
+		mce_threshold_vector = amd_threshold_interrupt;
+
 	return 0;
 }
 /*



^ permalink raw reply	[flat|nested] 166+ messages in thread

* [PATCH 4.14 121/146] x86/fpu: Disable bottom halves while loading FPU registers
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (119 preceding siblings ...)
  2018-12-04 10:50 ` [PATCH 4.14 120/146] x86/MCE/AMD: Fix the thresholding machinery initialization order Greg Kroah-Hartman
@ 2018-12-04 10:50 ` Greg Kroah-Hartman
  2018-12-05 16:26   ` Jari Ruusu
  2018-12-04 10:50 ` [PATCH 4.14 122/146] perf/x86/intel: Move branch tracing setup to the Intel-specific source file Greg Kroah-Hartman
                   ` (29 subsequent siblings)
  150 siblings, 1 reply; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:50 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Sebastian Andrzej Siewior,
	Borislav Petkov, Ingo Molnar, Thomas Gleixner, Andy Lutomirski,
	Dave Hansen, H. Peter Anvin, Jason A. Donenfeld, kvm ML,
	Paolo Bonzini, Radim Krčmář,
	Rik van Riel, x86-ml

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>

commit 68239654acafe6aad5a3c1dc7237e60accfebc03 upstream.

The sequence

  fpu->initialized = 1;		/* step A */
  preempt_disable();		/* step B */
  fpu__restore(fpu);
  preempt_enable();

in __fpu__restore_sig() is racy in regard to a context switch.

For 32bit frames, __fpu__restore_sig() prepares the FPU state within
fpu->state. To ensure that a context switch (switch_fpu_prepare() in
particular) does not modify fpu->state it uses fpu__drop() which sets
fpu->initialized to 0.

After fpu->initialized is cleared, the CPU's FPU state is not saved
to fpu->state during a context switch. The new state is loaded via
fpu__restore(). It gets loaded into fpu->state from userland and
ensured it is sane. fpu->initialized is then set to 1 in order to avoid
fpu__initialize() doing anything (overwrite the new state) which is part
of fpu__restore().

A context switch between step A and B above would save CPU's current FPU
registers to fpu->state and overwrite the newly prepared state. This
looks like a tiny race window but the Kernel Test Robot reported this
back in 2016 while we had lazy FPU support. Borislav Petkov made the
link between that report and another patch that has been posted. Since
the removal of the lazy FPU support, this race goes unnoticed because
the warning has been removed.

Disable bottom halves around the restore sequence to avoid the race. BH
need to be disabled because BH is allowed to run (even with preemption
disabled) and might invoke kernel_fpu_begin() by doing IPsec.

 [ bp: massage commit message a bit. ]

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>
Cc: kvm ML <kvm@vger.kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: stable@vger.kernel.org
Cc: x86-ml <x86@kernel.org>
Link: http://lkml.kernel.org/r/20181120102635.ddv3fvavxajjlfqk@linutronix.de
Link: https://lkml.kernel.org/r/20160226074940.GA28911@pd.tnic
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/kernel/fpu/signal.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -344,10 +344,10 @@ static int __fpu__restore_sig(void __use
 			sanitize_restored_xstate(tsk, &env, xfeatures, fx_only);
 		}
 
+		local_bh_disable();
 		fpu->initialized = 1;
-		preempt_disable();
 		fpu__restore(fpu);
-		preempt_enable();
+		local_bh_enable();
 
 		return err;
 	} else {



^ permalink raw reply	[flat|nested] 166+ messages in thread

* [PATCH 4.14 122/146] perf/x86/intel: Move branch tracing setup to the Intel-specific source file
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (120 preceding siblings ...)
  2018-12-04 10:50 ` [PATCH 4.14 121/146] x86/fpu: Disable bottom halves while loading FPU registers Greg Kroah-Hartman
@ 2018-12-04 10:50 ` Greg Kroah-Hartman
  2018-12-04 10:50 ` [PATCH 4.14 123/146] perf/x86/intel: Add generic branch tracing check to intel_pmu_has_bts() Greg Kroah-Hartman
                   ` (28 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:50 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Peter Zijlstra, Jiri Olsa,
	Peter Zijlstra, Alexander Shishkin, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Jiri Olsa, Linus Torvalds,
	Stephane Eranian, Thomas Gleixner, Vince Weaver, Ingo Molnar

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Jiri Olsa <jolsa@kernel.org>

commit ed6101bbf6266ee83e620b19faa7c6ad56bb41ab upstream.

Moving branch tracing setup to Intel core object into separate
intel_pmu_bts_config function, because it's Intel specific.

Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: <stable@vger.kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/20181121101612.16272-1-jolsa@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/events/core.c       |   20 --------------------
 arch/x86/events/intel/core.c |   41 ++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 40 insertions(+), 21 deletions(-)

--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -438,26 +438,6 @@ int x86_setup_perfctr(struct perf_event
 	if (config == -1LL)
 		return -EINVAL;
 
-	/*
-	 * Branch tracing:
-	 */
-	if (attr->config == PERF_COUNT_HW_BRANCH_INSTRUCTIONS &&
-	    !attr->freq && hwc->sample_period == 1) {
-		/* BTS is not supported by this architecture. */
-		if (!x86_pmu.bts_active)
-			return -EOPNOTSUPP;
-
-		/* BTS is currently only allowed for user-mode. */
-		if (!attr->exclude_kernel)
-			return -EOPNOTSUPP;
-
-		/* disallow bts if conflicting events are present */
-		if (x86_add_exclusive(x86_lbr_exclusive_lbr))
-			return -EBUSY;
-
-		event->destroy = hw_perf_lbr_event_destroy;
-	}
-
 	hwc->config |= config;
 
 	return 0;
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -2973,6 +2973,41 @@ static unsigned long intel_pmu_free_runn
 	return flags;
 }
 
+static int intel_pmu_bts_config(struct perf_event *event)
+{
+	struct perf_event_attr *attr = &event->attr;
+	struct hw_perf_event *hwc = &event->hw;
+
+	if (attr->config == PERF_COUNT_HW_BRANCH_INSTRUCTIONS &&
+	    !attr->freq && hwc->sample_period == 1) {
+		/* BTS is not supported by this architecture. */
+		if (!x86_pmu.bts_active)
+			return -EOPNOTSUPP;
+
+		/* BTS is currently only allowed for user-mode. */
+		if (!attr->exclude_kernel)
+			return -EOPNOTSUPP;
+
+		/* disallow bts if conflicting events are present */
+		if (x86_add_exclusive(x86_lbr_exclusive_lbr))
+			return -EBUSY;
+
+		event->destroy = hw_perf_lbr_event_destroy;
+	}
+
+	return 0;
+}
+
+static int core_pmu_hw_config(struct perf_event *event)
+{
+	int ret = x86_pmu_hw_config(event);
+
+	if (ret)
+		return ret;
+
+	return intel_pmu_bts_config(event);
+}
+
 static int intel_pmu_hw_config(struct perf_event *event)
 {
 	int ret = x86_pmu_hw_config(event);
@@ -2980,6 +3015,10 @@ static int intel_pmu_hw_config(struct pe
 	if (ret)
 		return ret;
 
+	ret = intel_pmu_bts_config(event);
+	if (ret)
+		return ret;
+
 	if (event->attr.precise_ip) {
 		if (!event->attr.freq) {
 			event->hw.flags |= PERF_X86_EVENT_AUTO_RELOAD;
@@ -3462,7 +3501,7 @@ static __initconst const struct x86_pmu
 	.enable_all		= core_pmu_enable_all,
 	.enable			= core_pmu_enable_event,
 	.disable		= x86_pmu_disable_event,
-	.hw_config		= x86_pmu_hw_config,
+	.hw_config		= core_pmu_hw_config,
 	.schedule_events	= x86_schedule_events,
 	.eventsel		= MSR_ARCH_PERFMON_EVENTSEL0,
 	.perfctr		= MSR_ARCH_PERFMON_PERFCTR0,



^ permalink raw reply	[flat|nested] 166+ messages in thread

* [PATCH 4.14 123/146] perf/x86/intel: Add generic branch tracing check to intel_pmu_has_bts()
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (121 preceding siblings ...)
  2018-12-04 10:50 ` [PATCH 4.14 122/146] perf/x86/intel: Move branch tracing setup to the Intel-specific source file Greg Kroah-Hartman
@ 2018-12-04 10:50 ` Greg Kroah-Hartman
  2018-12-04 10:50 ` [PATCH 4.14 124/146] fs: fix lost error code in dio_complete Greg Kroah-Hartman
                   ` (27 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:50 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Jiri Olsa, Peter Zijlstra,
	Alexander Shishkin, Arnaldo Carvalho de Melo,
	Arnaldo Carvalho de Melo, Jiri Olsa, Linus Torvalds,
	Peter Zijlstra, Stephane Eranian, Thomas Gleixner, Vince Weaver,
	Ingo Molnar

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Jiri Olsa <jolsa@kernel.org>

commit 67266c1080ad56c31af72b9c18355fde8ccc124a upstream.

Currently we check the branch tracing only by checking for the
PERF_COUNT_HW_BRANCH_INSTRUCTIONS event of PERF_TYPE_HARDWARE
type. But we can define the same event with the PERF_TYPE_RAW
type.

Changing the intel_pmu_has_bts() code to check on event's final
hw config value, so both HW types are covered.

Adding unlikely to intel_pmu_has_bts() condition calls, because
it was used in the original code in intel_bts_constraints.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: <stable@vger.kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/20181121101612.16272-2-jolsa@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/events/intel/core.c |   17 +++--------------
 arch/x86/events/perf_event.h |   13 +++++++++----
 2 files changed, 12 insertions(+), 18 deletions(-)

--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -2345,16 +2345,7 @@ done:
 static struct event_constraint *
 intel_bts_constraints(struct perf_event *event)
 {
-	struct hw_perf_event *hwc = &event->hw;
-	unsigned int hw_event, bts_event;
-
-	if (event->attr.freq)
-		return NULL;
-
-	hw_event = hwc->config & INTEL_ARCH_EVENT_MASK;
-	bts_event = x86_pmu.event_map(PERF_COUNT_HW_BRANCH_INSTRUCTIONS);
-
-	if (unlikely(hw_event == bts_event && hwc->sample_period == 1))
+	if (unlikely(intel_pmu_has_bts(event)))
 		return &bts_constraint;
 
 	return NULL;
@@ -2976,10 +2967,8 @@ static unsigned long intel_pmu_free_runn
 static int intel_pmu_bts_config(struct perf_event *event)
 {
 	struct perf_event_attr *attr = &event->attr;
-	struct hw_perf_event *hwc = &event->hw;
 
-	if (attr->config == PERF_COUNT_HW_BRANCH_INSTRUCTIONS &&
-	    !attr->freq && hwc->sample_period == 1) {
+	if (unlikely(intel_pmu_has_bts(event))) {
 		/* BTS is not supported by this architecture. */
 		if (!x86_pmu.bts_active)
 			return -EOPNOTSUPP;
@@ -3038,7 +3027,7 @@ static int intel_pmu_hw_config(struct pe
 		/*
 		 * BTS is set up earlier in this path, so don't account twice
 		 */
-		if (!intel_pmu_has_bts(event)) {
+		if (!unlikely(intel_pmu_has_bts(event))) {
 			/* disallow lbr if conflicting events are present */
 			if (x86_add_exclusive(x86_lbr_exclusive_lbr))
 				return -EBUSY;
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -850,11 +850,16 @@ static inline int amd_pmu_init(void)
 
 static inline bool intel_pmu_has_bts(struct perf_event *event)
 {
-	if (event->attr.config == PERF_COUNT_HW_BRANCH_INSTRUCTIONS &&
-	    !event->attr.freq && event->hw.sample_period == 1)
-		return true;
+	struct hw_perf_event *hwc = &event->hw;
+	unsigned int hw_event, bts_event;
 
-	return false;
+	if (event->attr.freq)
+		return false;
+
+	hw_event = hwc->config & INTEL_ARCH_EVENT_MASK;
+	bts_event = x86_pmu.event_map(PERF_COUNT_HW_BRANCH_INSTRUCTIONS);
+
+	return hw_event == bts_event && hwc->sample_period == 1;
 }
 
 int intel_pmu_save_and_restart(struct perf_event *event);



^ permalink raw reply	[flat|nested] 166+ messages in thread

* [PATCH 4.14 124/146] fs: fix lost error code in dio_complete
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (122 preceding siblings ...)
  2018-12-04 10:50 ` [PATCH 4.14 123/146] perf/x86/intel: Add generic branch tracing check to intel_pmu_has_bts() Greg Kroah-Hartman
@ 2018-12-04 10:50 ` Greg Kroah-Hartman
  2018-12-04 10:50 ` [PATCH 4.14 125/146] ALSA: wss: Fix invalid snd_free_pages() at error path Greg Kroah-Hartman
                   ` (26 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:50 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Ravi Nankani, Maximilian Heyne,
	Christoph Hellwig, Torsten Mehlan, Uwe Dannowski, Amit Shah,
	David Woodhouse, Jens Axboe

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Maximilian Heyne <mheyne@amazon.de>

commit 41e817bca3acd3980efe5dd7d28af0e6f4ab9247 upstream.

commit e259221763a40403d5bb232209998e8c45804ab8 ("fs: simplify the
generic_write_sync prototype") reworked callers of generic_write_sync(),
and ended up dropping the error return for the directio path. Prior to
that commit, in dio_complete(), an error would be bubbled up the stack,
but after that commit, errors passed on to dio_complete were eaten up.

This was reported on the list earlier, and a fix was proposed in
https://lore.kernel.org/lkml/20160921141539.GA17898@infradead.org/, but
never followed up with.  We recently hit this bug in our testing where
fencing io errors, which were previously erroring out with EIO, were
being returned as success operations after this commit.

The fix proposed on the list earlier was a little short -- it would have
still called generic_write_sync() in case `ret` already contained an
error. This fix ensures generic_write_sync() is only called when there's
no pending error in the write. Additionally, transferred is replaced
with ret to bring this code in line with other callers.

Fixes: e259221763a4 ("fs: simplify the generic_write_sync prototype")
Reported-by: Ravi Nankani <rnankani@amazon.com>
Signed-off-by: Maximilian Heyne <mheyne@amazon.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
CC: Torsten Mehlan <tomeh@amazon.de>
CC: Uwe Dannowski <uwed@amazon.de>
CC: Amit Shah <aams@amazon.de>
CC: David Woodhouse <dwmw@amazon.co.uk>
CC: stable@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 fs/direct-io.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/fs/direct-io.c
+++ b/fs/direct-io.c
@@ -304,8 +304,8 @@ static ssize_t dio_complete(struct dio *
 		 */
 		dio->iocb->ki_pos += transferred;
 
-		if (dio->op == REQ_OP_WRITE)
-			ret = generic_write_sync(dio->iocb,  transferred);
+		if (ret > 0 && dio->op == REQ_OP_WRITE)
+			ret = generic_write_sync(dio->iocb, ret);
 		dio->iocb->ki_complete(dio->iocb, ret, 0);
 	}
 



^ permalink raw reply	[flat|nested] 166+ messages in thread

* [PATCH 4.14 125/146] ALSA: wss: Fix invalid snd_free_pages() at error path
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (123 preceding siblings ...)
  2018-12-04 10:50 ` [PATCH 4.14 124/146] fs: fix lost error code in dio_complete Greg Kroah-Hartman
@ 2018-12-04 10:50 ` Greg Kroah-Hartman
  2018-12-04 10:50 ` [PATCH 4.14 126/146] ALSA: ac97: Fix incorrect bit shift at AC97-SPSA control write Greg Kroah-Hartman
                   ` (25 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:50 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Takashi Sakamoto, Takashi Iwai

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Takashi Iwai <tiwai@suse.de>

commit 7b69154171b407844c273ab4c10b5f0ddcd6aa29 upstream.

Some spurious calls of snd_free_pages() have been overlooked and
remain in the error paths of wss driver code.  Since runtime->dma_area
is managed by the PCM core helper, we shouldn't release manually.

Drop the superfluous calls.

Reviewed-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 sound/isa/wss/wss_lib.c |    2 --
 1 file changed, 2 deletions(-)

--- a/sound/isa/wss/wss_lib.c
+++ b/sound/isa/wss/wss_lib.c
@@ -1531,7 +1531,6 @@ static int snd_wss_playback_open(struct
 	if (err < 0) {
 		if (chip->release_dma)
 			chip->release_dma(chip, chip->dma_private_data, chip->dma1);
-		snd_free_pages(runtime->dma_area, runtime->dma_bytes);
 		return err;
 	}
 	chip->playback_substream = substream;
@@ -1572,7 +1571,6 @@ static int snd_wss_capture_open(struct s
 	if (err < 0) {
 		if (chip->release_dma)
 			chip->release_dma(chip, chip->dma_private_data, chip->dma2);
-		snd_free_pages(runtime->dma_area, runtime->dma_bytes);
 		return err;
 	}
 	chip->capture_substream = substream;



^ permalink raw reply	[flat|nested] 166+ messages in thread

* [PATCH 4.14 126/146] ALSA: ac97: Fix incorrect bit shift at AC97-SPSA control write
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (124 preceding siblings ...)
  2018-12-04 10:50 ` [PATCH 4.14 125/146] ALSA: wss: Fix invalid snd_free_pages() at error path Greg Kroah-Hartman
@ 2018-12-04 10:50 ` Greg Kroah-Hartman
  2018-12-04 10:50 ` [PATCH 4.14 127/146] ALSA: control: Fix race between adding and removing a user element Greg Kroah-Hartman
                   ` (24 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:50 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Takashi Iwai

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Takashi Iwai <tiwai@suse.de>

commit 7194eda1ba0872d917faf3b322540b4f57f11ba5 upstream.

The function snd_ac97_put_spsa() gets the bit shift value from the
associated private_value, but it extracts too much; the current code
extracts 8 bit values in bits 8-15, but this is a combination of two
nibbles (bits 8-11 and bits 12-15) for left and right shifts.
Due to the incorrect bits extraction, the actual shift may go beyond
the 32bit value, as spotted recently by UBSAN check:
 UBSAN: Undefined behaviour in sound/pci/ac97/ac97_codec.c:836:7
 shift exponent 68 is too large for 32-bit type 'int'

This patch fixes the shift value extraction by masking the properly
with 0x0f instead of 0xff.

Reported-and-tested-by: Meelis Roos <mroos@linux.ee>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 sound/pci/ac97/ac97_codec.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/sound/pci/ac97/ac97_codec.c
+++ b/sound/pci/ac97/ac97_codec.c
@@ -824,7 +824,7 @@ static int snd_ac97_put_spsa(struct snd_
 {
 	struct snd_ac97 *ac97 = snd_kcontrol_chip(kcontrol);
 	int reg = kcontrol->private_value & 0xff;
-	int shift = (kcontrol->private_value >> 8) & 0xff;
+	int shift = (kcontrol->private_value >> 8) & 0x0f;
 	int mask = (kcontrol->private_value >> 16) & 0xff;
 	// int invert = (kcontrol->private_value >> 24) & 0xff;
 	unsigned short value, old, new;



^ permalink raw reply	[flat|nested] 166+ messages in thread

* [PATCH 4.14 127/146] ALSA: control: Fix race between adding and removing a user element
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (125 preceding siblings ...)
  2018-12-04 10:50 ` [PATCH 4.14 126/146] ALSA: ac97: Fix incorrect bit shift at AC97-SPSA control write Greg Kroah-Hartman
@ 2018-12-04 10:50 ` Greg Kroah-Hartman
  2018-12-04 10:50 ` [PATCH 4.14 128/146] ALSA: sparc: Fix invalid snd_free_pages() at error path Greg Kroah-Hartman
                   ` (23 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:50 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, syzbot+dc09047bce3820621ba2, Takashi Iwai

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Takashi Iwai <tiwai@suse.de>

commit e1a7bfe3807974e66f971f2589d4e0197ec0fced upstream.

The procedure for adding a user control element has some window opened
for race against the concurrent removal of a user element.  This was
caught by syzkaller, hitting a KASAN use-after-free error.

This patch addresses the bug by wrapping the whole procedure to add a
user control element with the card->controls_rwsem, instead of only
around the increment of card->user_ctl_count.

This required a slight code refactoring, too.  The function
snd_ctl_add() is split to two parts: a core function to add the
control element and a part calling it.  The former is called from the
function for adding a user control element inside the controls_rwsem.

One change to be noted is that snd_ctl_notify() for adding a control
element gets called inside the controls_rwsem as well while it was
called outside the rwsem.  But this should be OK, as snd_ctl_notify()
takes another (finer) rwlock instead of rwsem, and the call of
snd_ctl_notify() inside rwsem is already done in another code path.

Reported-by: syzbot+dc09047bce3820621ba2@syzkaller.appspotmail.com
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 sound/core/control.c |   80 ++++++++++++++++++++++++++++-----------------------
 1 file changed, 45 insertions(+), 35 deletions(-)

--- a/sound/core/control.c
+++ b/sound/core/control.c
@@ -347,6 +347,40 @@ static int snd_ctl_find_hole(struct snd_
 	return 0;
 }
 
+/* add a new kcontrol object; call with card->controls_rwsem locked */
+static int __snd_ctl_add(struct snd_card *card, struct snd_kcontrol *kcontrol)
+{
+	struct snd_ctl_elem_id id;
+	unsigned int idx;
+	unsigned int count;
+
+	id = kcontrol->id;
+	if (id.index > UINT_MAX - kcontrol->count)
+		return -EINVAL;
+
+	if (snd_ctl_find_id(card, &id)) {
+		dev_err(card->dev,
+			"control %i:%i:%i:%s:%i is already present\n",
+			id.iface, id.device, id.subdevice, id.name, id.index);
+		return -EBUSY;
+	}
+
+	if (snd_ctl_find_hole(card, kcontrol->count) < 0)
+		return -ENOMEM;
+
+	list_add_tail(&kcontrol->list, &card->controls);
+	card->controls_count += kcontrol->count;
+	kcontrol->id.numid = card->last_numid + 1;
+	card->last_numid += kcontrol->count;
+
+	id = kcontrol->id;
+	count = kcontrol->count;
+	for (idx = 0; idx < count; idx++, id.index++, id.numid++)
+		snd_ctl_notify(card, SNDRV_CTL_EVENT_MASK_ADD, &id);
+
+	return 0;
+}
+
 /**
  * snd_ctl_add - add the control instance to the card
  * @card: the card instance
@@ -363,45 +397,18 @@ static int snd_ctl_find_hole(struct snd_
  */
 int snd_ctl_add(struct snd_card *card, struct snd_kcontrol *kcontrol)
 {
-	struct snd_ctl_elem_id id;
-	unsigned int idx;
-	unsigned int count;
 	int err = -EINVAL;
 
 	if (! kcontrol)
 		return err;
 	if (snd_BUG_ON(!card || !kcontrol->info))
 		goto error;
-	id = kcontrol->id;
-	if (id.index > UINT_MAX - kcontrol->count)
-		goto error;
 
 	down_write(&card->controls_rwsem);
-	if (snd_ctl_find_id(card, &id)) {
-		up_write(&card->controls_rwsem);
-		dev_err(card->dev, "control %i:%i:%i:%s:%i is already present\n",
-					id.iface,
-					id.device,
-					id.subdevice,
-					id.name,
-					id.index);
-		err = -EBUSY;
-		goto error;
-	}
-	if (snd_ctl_find_hole(card, kcontrol->count) < 0) {
-		up_write(&card->controls_rwsem);
-		err = -ENOMEM;
-		goto error;
-	}
-	list_add_tail(&kcontrol->list, &card->controls);
-	card->controls_count += kcontrol->count;
-	kcontrol->id.numid = card->last_numid + 1;
-	card->last_numid += kcontrol->count;
-	id = kcontrol->id;
-	count = kcontrol->count;
+	err = __snd_ctl_add(card, kcontrol);
 	up_write(&card->controls_rwsem);
-	for (idx = 0; idx < count; idx++, id.index++, id.numid++)
-		snd_ctl_notify(card, SNDRV_CTL_EVENT_MASK_ADD, &id);
+	if (err < 0)
+		goto error;
 	return 0;
 
  error:
@@ -1360,9 +1367,12 @@ static int snd_ctl_elem_add(struct snd_c
 		kctl->tlv.c = snd_ctl_elem_user_tlv;
 
 	/* This function manage to free the instance on failure. */
-	err = snd_ctl_add(card, kctl);
-	if (err < 0)
-		return err;
+	down_write(&card->controls_rwsem);
+	err = __snd_ctl_add(card, kctl);
+	if (err < 0) {
+		snd_ctl_free_one(kctl);
+		goto unlock;
+	}
 	offset = snd_ctl_get_ioff(kctl, &info->id);
 	snd_ctl_build_ioff(&info->id, kctl, offset);
 	/*
@@ -1373,10 +1383,10 @@ static int snd_ctl_elem_add(struct snd_c
 	 * which locks the element.
 	 */
 
-	down_write(&card->controls_rwsem);
 	card->user_ctl_count++;
-	up_write(&card->controls_rwsem);
 
+ unlock:
+	up_write(&card->controls_rwsem);
 	return 0;
 }
 



^ permalink raw reply	[flat|nested] 166+ messages in thread

* [PATCH 4.14 128/146] ALSA: sparc: Fix invalid snd_free_pages() at error path
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (126 preceding siblings ...)
  2018-12-04 10:50 ` [PATCH 4.14 127/146] ALSA: control: Fix race between adding and removing a user element Greg Kroah-Hartman
@ 2018-12-04 10:50 ` Greg Kroah-Hartman
  2018-12-04 10:50 ` [PATCH 4.14 129/146] ALSA: hda/realtek - Support ALC300 Greg Kroah-Hartman
                   ` (22 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:50 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Takashi Sakamoto, Takashi Iwai

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Takashi Iwai <tiwai@suse.de>

commit 9a20332ab373b1f8f947e0a9c923652b32dab031 upstream.

Some spurious calls of snd_free_pages() have been overlooked and
remain in the error paths of sparc cs4231 driver code.  Since
runtime->dma_area is managed by the PCM core helper, we shouldn't
release manually.

Drop the superfluous calls.

Reviewed-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 sound/sparc/cs4231.c |    8 ++------
 1 file changed, 2 insertions(+), 6 deletions(-)

--- a/sound/sparc/cs4231.c
+++ b/sound/sparc/cs4231.c
@@ -1146,10 +1146,8 @@ static int snd_cs4231_playback_open(stru
 	runtime->hw = snd_cs4231_playback;
 
 	err = snd_cs4231_open(chip, CS4231_MODE_PLAY);
-	if (err < 0) {
-		snd_free_pages(runtime->dma_area, runtime->dma_bytes);
+	if (err < 0)
 		return err;
-	}
 	chip->playback_substream = substream;
 	chip->p_periods_sent = 0;
 	snd_pcm_set_sync(substream);
@@ -1167,10 +1165,8 @@ static int snd_cs4231_capture_open(struc
 	runtime->hw = snd_cs4231_capture;
 
 	err = snd_cs4231_open(chip, CS4231_MODE_RECORD);
-	if (err < 0) {
-		snd_free_pages(runtime->dma_area, runtime->dma_bytes);
+	if (err < 0)
 		return err;
-	}
 	chip->capture_substream = substream;
 	chip->c_periods_sent = 0;
 	snd_pcm_set_sync(substream);



^ permalink raw reply	[flat|nested] 166+ messages in thread

* [PATCH 4.14 129/146] ALSA: hda/realtek - Support ALC300
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (127 preceding siblings ...)
  2018-12-04 10:50 ` [PATCH 4.14 128/146] ALSA: sparc: Fix invalid snd_free_pages() at error path Greg Kroah-Hartman
@ 2018-12-04 10:50 ` Greg Kroah-Hartman
  2018-12-04 10:50 ` [PATCH 4.14 130/146] ALSA: hda/realtek - fix headset mic detection for MSI MS-B171 Greg Kroah-Hartman
                   ` (21 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:50 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Kailang Yang, Takashi Iwai

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Kailang Yang <kailang@realtek.com>

commit 1078bef0cd9291355a20369b21cd823026ab8eaa upstream.

This patch will enable ALC300.

[ It's almost equivalent with other ALC269-compatible ones, and
  apparently has no loopback mixer -- tiwai ]

Signed-off-by: Kailang Yang <kailang@realtek.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 sound/pci/hda/patch_realtek.c |    8 ++++++++
 1 file changed, 8 insertions(+)

--- a/sound/pci/hda/patch_realtek.c
+++ b/sound/pci/hda/patch_realtek.c
@@ -343,6 +343,7 @@ static void alc_fill_eapd_coef(struct hd
 	case 0x10ec0285:
 	case 0x10ec0298:
 	case 0x10ec0289:
+	case 0x10ec0300:
 		alc_update_coef_idx(codec, 0x10, 1<<9, 0);
 		break;
 	case 0x10ec0275:
@@ -2758,6 +2759,7 @@ enum {
 	ALC269_TYPE_ALC215,
 	ALC269_TYPE_ALC225,
 	ALC269_TYPE_ALC294,
+	ALC269_TYPE_ALC300,
 	ALC269_TYPE_ALC700,
 };
 
@@ -2792,6 +2794,7 @@ static int alc269_parse_auto_config(stru
 	case ALC269_TYPE_ALC215:
 	case ALC269_TYPE_ALC225:
 	case ALC269_TYPE_ALC294:
+	case ALC269_TYPE_ALC300:
 	case ALC269_TYPE_ALC700:
 		ssids = alc269_ssids;
 		break;
@@ -7089,6 +7092,10 @@ static int patch_alc269(struct hda_codec
 		spec->gen.mixer_nid = 0; /* ALC2x4 does not have any loopback mixer path */
 		alc_update_coef_idx(codec, 0x6b, 0x0018, (1<<4) | (1<<3)); /* UAJ MIC Vref control by verb */
 		break;
+	case 0x10ec0300:
+		spec->codec_variant = ALC269_TYPE_ALC300;
+		spec->gen.mixer_nid = 0; /* no loopback on ALC300 */
+		break;
 	case 0x10ec0700:
 	case 0x10ec0701:
 	case 0x10ec0703:
@@ -8160,6 +8167,7 @@ static const struct hda_device_id snd_hd
 	HDA_CODEC_ENTRY(0x10ec0295, "ALC295", patch_alc269),
 	HDA_CODEC_ENTRY(0x10ec0298, "ALC298", patch_alc269),
 	HDA_CODEC_ENTRY(0x10ec0299, "ALC299", patch_alc269),
+	HDA_CODEC_ENTRY(0x10ec0300, "ALC300", patch_alc269),
 	HDA_CODEC_REV_ENTRY(0x10ec0861, 0x100340, "ALC660", patch_alc861),
 	HDA_CODEC_ENTRY(0x10ec0660, "ALC660-VD", patch_alc861vd),
 	HDA_CODEC_ENTRY(0x10ec0861, "ALC861", patch_alc861),



^ permalink raw reply	[flat|nested] 166+ messages in thread

* [PATCH 4.14 130/146] ALSA: hda/realtek - fix headset mic detection for MSI MS-B171
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (128 preceding siblings ...)
  2018-12-04 10:50 ` [PATCH 4.14 129/146] ALSA: hda/realtek - Support ALC300 Greg Kroah-Hartman
@ 2018-12-04 10:50 ` Greg Kroah-Hartman
  2018-12-04 10:50 ` [PATCH 4.14 131/146] ext2: fix potential use after free Greg Kroah-Hartman
                   ` (20 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:50 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Anisse Astier, Takashi Iwai

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Anisse Astier <anisse@astier.eu>

commit 8cd65271f8e545ddeed10ecc2e417936bdff168e upstream.

MSI Cubi N 8GL (MS-B171) needs the same fixup as its older model, the
MS-B120, in order for the headset mic to be properly detected.

They both use a single 3-way jack for both mic and headset with an
ALC283 codec, with the same pins used.

Cc: stable@vger.kernel.org
Signed-off-by: Anisse Astier <anisse@astier.eu>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 sound/pci/hda/patch_realtek.c |    1 +
 1 file changed, 1 insertion(+)

--- a/sound/pci/hda/patch_realtek.c
+++ b/sound/pci/hda/patch_realtek.c
@@ -6411,6 +6411,7 @@ static const struct snd_pci_quirk alc269
 	SND_PCI_QUIRK(0x144d, 0xc740, "Samsung Ativ book 8 (NP870Z5G)", ALC269_FIXUP_ATIV_BOOK_8),
 	SND_PCI_QUIRK(0x1458, 0xfa53, "Gigabyte BXBT-2807", ALC283_FIXUP_HEADSET_MIC),
 	SND_PCI_QUIRK(0x1462, 0xb120, "MSI Cubi MS-B120", ALC283_FIXUP_HEADSET_MIC),
+	SND_PCI_QUIRK(0x1462, 0xb171, "Cubi N 8GL (MS-B171)", ALC283_FIXUP_HEADSET_MIC),
 	SND_PCI_QUIRK(0x17aa, 0x1036, "Lenovo P520", ALC233_FIXUP_LENOVO_MULTI_CODECS),
 	SND_PCI_QUIRK(0x17aa, 0x20f2, "Thinkpad SL410/510", ALC269_FIXUP_SKU_IGNORE),
 	SND_PCI_QUIRK(0x17aa, 0x215e, "Thinkpad L512", ALC269_FIXUP_SKU_IGNORE),



^ permalink raw reply	[flat|nested] 166+ messages in thread

* [PATCH 4.14 131/146] ext2: fix potential use after free
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (129 preceding siblings ...)
  2018-12-04 10:50 ` [PATCH 4.14 130/146] ALSA: hda/realtek - fix headset mic detection for MSI MS-B171 Greg Kroah-Hartman
@ 2018-12-04 10:50 ` Greg Kroah-Hartman
  2018-12-04 10:50 ` [PATCH 4.14 132/146] ARM: dts: rockchip: Remove @0 from the veyron memory node Greg Kroah-Hartman
                   ` (19 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:50 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Pan Bian, Jan Kara

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Pan Bian <bianpan2016@163.com>

commit ecebf55d27a11538ea84aee0be643dd953f830d5 upstream.

The function ext2_xattr_set calls brelse(bh) to drop the reference count
of bh. After that, bh may be freed. However, following brelse(bh),
it reads bh->b_data via macro HDR(bh). This may result in a
use-after-free bug. This patch moves brelse(bh) after reading field.

CC: stable@vger.kernel.org
Signed-off-by: Pan Bian <bianpan2016@163.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 fs/ext2/xattr.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/fs/ext2/xattr.c
+++ b/fs/ext2/xattr.c
@@ -612,9 +612,9 @@ skip_replace:
 	}
 
 cleanup:
-	brelse(bh);
 	if (!(bh && header == HDR(bh)))
 		kfree(header);
+	brelse(bh);
 	up_write(&EXT2_I(inode)->xattr_sem);
 
 	return error;



^ permalink raw reply	[flat|nested] 166+ messages in thread

* [PATCH 4.14 132/146] ARM: dts: rockchip: Remove @0 from the veyron memory node
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (130 preceding siblings ...)
  2018-12-04 10:50 ` [PATCH 4.14 131/146] ext2: fix potential use after free Greg Kroah-Hartman
@ 2018-12-04 10:50 ` Greg Kroah-Hartman
  2018-12-04 10:50 ` [PATCH 4.14 133/146] dmaengine: at_hdmac: fix memory leak in at_dma_xlate() Greg Kroah-Hartman
                   ` (18 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:50 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Heikki Lindholm, Heiko Stuebner

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Heiko Stuebner <heiko@sntech.de>

commit 672e60b72bbe7aace88721db55b380b6a51fb8f9 upstream.

The Coreboot version on veyron ChromeOS devices seems to ignore
memory@0 nodes when updating the available memory and instead
inserts another memory node without the address.

This leads to 4GB systems only ever be using 2GB as the memory@0
node takes precedence. So remove the @0 for veyron devices.

Fixes: 0b639b815f15 ("ARM: dts: rockchip: Add missing unit name to memory nodes in rk3288 boards")
Cc: stable@vger.kernel.org
Reported-by: Heikki Lindholm <holin@iki.fi>
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/arm/boot/dts/rk3288-veyron.dtsi |    6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

--- a/arch/arm/boot/dts/rk3288-veyron.dtsi
+++ b/arch/arm/boot/dts/rk3288-veyron.dtsi
@@ -47,7 +47,11 @@
 #include "rk3288.dtsi"
 
 / {
-	memory@0 {
+	/*
+	 * The default coreboot on veyron devices ignores memory@0 nodes
+	 * and would instead create another memory node.
+	 */
+	memory {
 		device_type = "memory";
 		reg = <0x0 0x0 0x0 0x80000000>;
 	};



^ permalink raw reply	[flat|nested] 166+ messages in thread

* [PATCH 4.14 133/146] dmaengine: at_hdmac: fix memory leak in at_dma_xlate()
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (131 preceding siblings ...)
  2018-12-04 10:50 ` [PATCH 4.14 132/146] ARM: dts: rockchip: Remove @0 from the veyron memory node Greg Kroah-Hartman
@ 2018-12-04 10:50 ` Greg Kroah-Hartman
  2018-12-04 10:50 ` [PATCH 4.14 134/146] dmaengine: at_hdmac: fix module unloading Greg Kroah-Hartman
                   ` (17 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:50 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Mario Forner, Alexandre Belloni,
	Ludovic Desroches, Richard Genoud, Vinod Koul

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Richard Genoud <richard.genoud@gmail.com>

commit 98f5f932254b88ce828bc8e4d1642d14e5854caa upstream.

The leak was found when opening/closing a serial port a great number of
time, increasing kmalloc-32 in slabinfo.

Each time the port was opened, dma_request_slave_channel() was called.
Then, in at_dma_xlate(), atslave was allocated with devm_kzalloc() and
never freed. (Well, it was free at module unload, but that's not what we
want).
So, here, kzalloc is more suited for the job since it has to be freed in
atc_free_chan_resources().

Cc: stable@vger.kernel.org
Fixes: bbe89c8e3d59 ("at_hdmac: move to generic DMA binding")
Reported-by: Mario Forner <m.forner@be4energy.com>
Suggested-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Acked-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Acked-by: Ludovic Desroches <ludovic.desroches@microchip.com>
Signed-off-by: Richard Genoud <richard.genoud@gmail.com>
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/dma/at_hdmac.c |    8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

--- a/drivers/dma/at_hdmac.c
+++ b/drivers/dma/at_hdmac.c
@@ -1641,6 +1641,12 @@ static void atc_free_chan_resources(stru
 	atchan->descs_allocated = 0;
 	atchan->status = 0;
 
+	/*
+	 * Free atslave allocated in at_dma_xlate()
+	 */
+	kfree(chan->private);
+	chan->private = NULL;
+
 	dev_vdbg(chan2dev(chan), "free_chan_resources: done\n");
 }
 
@@ -1675,7 +1681,7 @@ static struct dma_chan *at_dma_xlate(str
 	dma_cap_zero(mask);
 	dma_cap_set(DMA_SLAVE, mask);
 
-	atslave = devm_kzalloc(&dmac_pdev->dev, sizeof(*atslave), GFP_KERNEL);
+	atslave = kzalloc(sizeof(*atslave), GFP_KERNEL);
 	if (!atslave)
 		return NULL;
 



^ permalink raw reply	[flat|nested] 166+ messages in thread

* [PATCH 4.14 134/146] dmaengine: at_hdmac: fix module unloading
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (132 preceding siblings ...)
  2018-12-04 10:50 ` [PATCH 4.14 133/146] dmaengine: at_hdmac: fix memory leak in at_dma_xlate() Greg Kroah-Hartman
@ 2018-12-04 10:50 ` Greg Kroah-Hartman
  2018-12-04 10:50 ` [PATCH 4.14 135/146] btrfs: release metadata before running delayed refs Greg Kroah-Hartman
                   ` (16 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:50 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Ludovic Desroches, Richard Genoud,
	Vinod Koul

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Richard Genoud <richard.genoud@gmail.com>

commit 77e75fda94d2ebb86aa9d35fb1860f6395bf95de upstream.

of_dma_controller_free() was not called on module onloading.
This lead to a soft lockup:
watchdog: BUG: soft lockup - CPU#0 stuck for 23s!
Modules linked in: at_hdmac [last unloaded: at_hdmac]
when of_dma_request_slave_channel() tried to call ofdma->of_dma_xlate().

Cc: stable@vger.kernel.org
Fixes: bbe89c8e3d59 ("at_hdmac: move to generic DMA binding")
Acked-by: Ludovic Desroches <ludovic.desroches@microchip.com>
Signed-off-by: Richard Genoud <richard.genoud@gmail.com>
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/dma/at_hdmac.c |    2 ++
 1 file changed, 2 insertions(+)

--- a/drivers/dma/at_hdmac.c
+++ b/drivers/dma/at_hdmac.c
@@ -2006,6 +2006,8 @@ static int at_dma_remove(struct platform
 	struct resource		*io;
 
 	at_dma_off(atdma);
+	if (pdev->dev.of_node)
+		of_dma_controller_free(pdev->dev.of_node);
 	dma_async_device_unregister(&atdma->dma_common);
 
 	dma_pool_destroy(atdma->memset_pool);



^ permalink raw reply	[flat|nested] 166+ messages in thread

* [PATCH 4.14 135/146] btrfs: release metadata before running delayed refs
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (133 preceding siblings ...)
  2018-12-04 10:50 ` [PATCH 4.14 134/146] dmaengine: at_hdmac: fix module unloading Greg Kroah-Hartman
@ 2018-12-04 10:50 ` Greg Kroah-Hartman
  2018-12-04 10:50 ` [PATCH 4.14 136/146] staging: vchiq_arm: fix compat VCHIQ_IOC_AWAIT_COMPLETION Greg Kroah-Hartman
                   ` (15 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:50 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Omar Sandoval, Liu Bo,
	Nikolay Borisov, Josef Bacik, David Sterba, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

We want to release the unused reservation we have since it refills the
delayed refs reserve, which will make everything go smoother when
running the delayed refs if we're short on our reservation.

CC: stable@vger.kernel.org # 4.4+
Reviewed-by: Omar Sandoval <osandov@fb.com>
Reviewed-by: Liu Bo <bo.liu@linux.alibaba.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 fs/btrfs/transaction.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
index f74005ca8f08..73c1fbca0c35 100644
--- a/fs/btrfs/transaction.c
+++ b/fs/btrfs/transaction.c
@@ -1955,6 +1955,9 @@ int btrfs_commit_transaction(struct btrfs_trans_handle *trans)
 		return ret;
 	}
 
+	btrfs_trans_release_metadata(trans, fs_info);
+	trans->block_rsv = NULL;
+
 	/* make a pass through all the delayed refs we have so far
 	 * any runnings procs may add more while we are here
 	 */
@@ -1964,9 +1967,6 @@ int btrfs_commit_transaction(struct btrfs_trans_handle *trans)
 		return ret;
 	}
 
-	btrfs_trans_release_metadata(trans, fs_info);
-	trans->block_rsv = NULL;
-
 	cur_trans = trans->transaction;
 
 	/*
-- 
2.17.1




^ permalink raw reply related	[flat|nested] 166+ messages in thread

* [PATCH 4.14 136/146] staging: vchiq_arm: fix compat VCHIQ_IOC_AWAIT_COMPLETION
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (134 preceding siblings ...)
  2018-12-04 10:50 ` [PATCH 4.14 135/146] btrfs: release metadata before running delayed refs Greg Kroah-Hartman
@ 2018-12-04 10:50 ` Greg Kroah-Hartman
  2018-12-04 10:50 ` [PATCH 4.14 137/146] staging: rtl8723bs: Add missing return for cfg80211_rtw_get_station Greg Kroah-Hartman
                   ` (14 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:50 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Ben Wolsieffer, Stefan Wahren

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Ben Wolsieffer <benwolsieffer@gmail.com>

commit 5a96b2d38dc054c0bbcbcd585b116566cbd877fe upstream.

The compatibility ioctl wrapper for VCHIQ_IOC_AWAIT_COMPLETION assumes that
the native ioctl always uses a message buffer and decrements msgbufcount.
Certain message types do not use a message buffer and in this case
msgbufcount is not decremented, and completion->header for the message is
NULL. Because the wrapper unconditionally decrements msgbufcount, the
calling process may assume that a message buffer has been used even when
it has not.

This results in a memory leak in the userspace code that interfaces with
this driver. When msgbufcount is decremented, the userspace code assumes
that the buffer can be freed though the reference in completion->header,
which cannot happen when the reference is NULL.

This patch causes the wrapper to only decrement msgbufcount when the
native ioctl decrements it. Note that we cannot simply copy the native
ioctl's value of msgbufcount, because the wrapper only retrieves messages
from the native ioctl one at a time, while userspace may request multiple
messages.

See https://github.com/raspberrypi/linux/pull/2703 for more discussion of
this patch.

Fixes: 5569a1260933 ("staging: vchiq_arm: Add compatibility wrappers for ioctls")
Signed-off-by: Ben Wolsieffer <benwolsieffer@gmail.com>
Acked-by: Stefan Wahren <stefan.wahren@i2se.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/staging/vc04_services/interface/vchiq_arm/vchiq_arm.c |    7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

--- a/drivers/staging/vc04_services/interface/vchiq_arm/vchiq_arm.c
+++ b/drivers/staging/vc04_services/interface/vchiq_arm/vchiq_arm.c
@@ -1461,6 +1461,7 @@ vchiq_compat_ioctl_await_completion(stru
 	struct vchiq_await_completion32 args32;
 	struct vchiq_completion_data32 completion32;
 	unsigned int *msgbufcount32;
+	unsigned int msgbufcount_native;
 	compat_uptr_t msgbuf32;
 	void *msgbuf;
 	void **msgbufptr;
@@ -1572,7 +1573,11 @@ vchiq_compat_ioctl_await_completion(stru
 			 sizeof(completion32)))
 		return -EFAULT;
 
-	args32.msgbufcount--;
+	if (get_user(msgbufcount_native, &args->msgbufcount))
+		return -EFAULT;
+
+	if (!msgbufcount_native)
+		args32.msgbufcount--;
 
 	msgbufcount32 =
 		&((struct vchiq_await_completion32 __user *)arg)->msgbufcount;



^ permalink raw reply	[flat|nested] 166+ messages in thread

* [PATCH 4.14 137/146] staging: rtl8723bs: Add missing return for cfg80211_rtw_get_station
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (135 preceding siblings ...)
  2018-12-04 10:50 ` [PATCH 4.14 136/146] staging: vchiq_arm: fix compat VCHIQ_IOC_AWAIT_COMPLETION Greg Kroah-Hartman
@ 2018-12-04 10:50 ` Greg Kroah-Hartman
  2018-12-04 10:50 ` [PATCH 4.14 138/146] USB: usb-storage: Add new IDs to ums-realtek Greg Kroah-Hartman
                   ` (13 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:50 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, linux-wireless, youling257, Larry Finger

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Larry Finger <Larry.Finger@lwfinger.net>

commit 8561fb31a1f9594e2807681f5c0721894e367f19 upstream.

With Androidx86 8.1, wificond returns "failed to get
nl80211_sta_info_tx_failed" and wificondControl returns "Invalid signal
poll result from wificond". The fix is to OR sinfo->filled with
BIT_ULL(NL80211_STA_INFO_TX_FAILED).

This missing bit is apparently not needed with NetworkManager, but it
does no harm in that case.

Reported-and-Tested-by: youling257 <youling257@gmail.com>
Cc: linux-wireless@vger.kernel.org
Cc: youling257 <youling257@gmail.com>
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/staging/rtl8723bs/os_dep/ioctl_cfg80211.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/staging/rtl8723bs/os_dep/ioctl_cfg80211.c
+++ b/drivers/staging/rtl8723bs/os_dep/ioctl_cfg80211.c
@@ -1293,7 +1293,7 @@ static int cfg80211_rtw_get_station(stru
 
 		sinfo->filled |= BIT(NL80211_STA_INFO_TX_PACKETS);
 		sinfo->tx_packets = psta->sta_stats.tx_pkts;
-
+		sinfo->filled |= BIT_ULL(NL80211_STA_INFO_TX_FAILED);
 	}
 
 	/* for Ad-Hoc/AP mode */



^ permalink raw reply	[flat|nested] 166+ messages in thread

* [PATCH 4.14 138/146] USB: usb-storage: Add new IDs to ums-realtek
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (136 preceding siblings ...)
  2018-12-04 10:50 ` [PATCH 4.14 137/146] staging: rtl8723bs: Add missing return for cfg80211_rtw_get_station Greg Kroah-Hartman
@ 2018-12-04 10:50 ` Greg Kroah-Hartman
  2018-12-04 10:50 ` [PATCH 4.14 139/146] usb: core: quirks: add RESET_RESUME quirk for Cherry G230 Stream series Greg Kroah-Hartman
                   ` (12 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:50 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Kai-Heng Feng, Alan Stern

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Kai-Heng Feng <kai.heng.feng@canonical.com>

commit a84a1bcc992f0545a51d2e120b8ca2ef20e2ea97 upstream.

There are two new Realtek card readers require ums-realtek to work
correctly.

Add the new IDs to support them.

Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/usb/storage/unusual_realtek.h |   10 ++++++++++
 1 file changed, 10 insertions(+)

--- a/drivers/usb/storage/unusual_realtek.h
+++ b/drivers/usb/storage/unusual_realtek.h
@@ -39,4 +39,14 @@ UNUSUAL_DEV(0x0bda, 0x0159, 0x0000, 0x99
 		"USB Card Reader",
 		USB_SC_DEVICE, USB_PR_DEVICE, init_realtek_cr, 0),
 
+UNUSUAL_DEV(0x0bda, 0x0177, 0x0000, 0x9999,
+		"Realtek",
+		"USB Card Reader",
+		USB_SC_DEVICE, USB_PR_DEVICE, init_realtek_cr, 0),
+
+UNUSUAL_DEV(0x0bda, 0x0184, 0x0000, 0x9999,
+		"Realtek",
+		"USB Card Reader",
+		USB_SC_DEVICE, USB_PR_DEVICE, init_realtek_cr, 0),
+
 #endif  /* defined(CONFIG_USB_STORAGE_REALTEK) || ... */



^ permalink raw reply	[flat|nested] 166+ messages in thread

* [PATCH 4.14 139/146] usb: core: quirks: add RESET_RESUME quirk for Cherry G230 Stream series
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (137 preceding siblings ...)
  2018-12-04 10:50 ` [PATCH 4.14 138/146] USB: usb-storage: Add new IDs to ums-realtek Greg Kroah-Hartman
@ 2018-12-04 10:50 ` Greg Kroah-Hartman
  2018-12-04 10:50 ` [PATCH 4.14 140/146] Revert "usb: dwc3: gadget: skip Set/Clear Halt when invalid" Greg Kroah-Hartman
                   ` (11 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:50 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Michael Niewöhner

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Michael Niewöhner <linux@mniewoehner.de>

commit effd14f66cc1ef6701a19c5a56e39c35f4d395a5 upstream.

Cherry G230 Stream 2.0 (G85-231) and 3.0 (G85-232) need this quirk to
function correctly. This fixes a but where double pressing numlock locks
up the device completely with need to replug the keyboard.

Signed-off-by: Michael Niewöhner <linux@mniewoehner.de>
Tested-by: Michael Niewöhner <linux@mniewoehner.de>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/usb/core/quirks.c |    3 +++
 1 file changed, 3 insertions(+)

--- a/drivers/usb/core/quirks.c
+++ b/drivers/usb/core/quirks.c
@@ -64,6 +64,9 @@ static const struct usb_device_id usb_qu
 	/* Microsoft LifeCam-VX700 v2.0 */
 	{ USB_DEVICE(0x045e, 0x0770), .driver_info = USB_QUIRK_RESET_RESUME },
 
+	/* Cherry Stream G230 2.0 (G85-231) and 3.0 (G85-232) */
+	{ USB_DEVICE(0x046a, 0x0023), .driver_info = USB_QUIRK_RESET_RESUME },
+
 	/* Logitech HD Pro Webcams C920, C920-C, C925e and C930e */
 	{ USB_DEVICE(0x046d, 0x082d), .driver_info = USB_QUIRK_DELAY_INIT },
 	{ USB_DEVICE(0x046d, 0x0841), .driver_info = USB_QUIRK_DELAY_INIT },



^ permalink raw reply	[flat|nested] 166+ messages in thread

* [PATCH 4.14 140/146] Revert "usb: dwc3: gadget: skip Set/Clear Halt when invalid"
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (138 preceding siblings ...)
  2018-12-04 10:50 ` [PATCH 4.14 139/146] usb: core: quirks: add RESET_RESUME quirk for Cherry G230 Stream series Greg Kroah-Hartman
@ 2018-12-04 10:50 ` Greg Kroah-Hartman
  2018-12-04 10:50 ` [PATCH 4.14 141/146] iio:st_magn: Fix enable device after trigger Greg Kroah-Hartman
                   ` (10 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:50 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Felipe Balbi

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Felipe Balbi <felipe.balbi@linux.intel.com>

commit 38317f5c0f2faae5110854f36edad810f841d62f upstream.

This reverts commit ffb80fc672c3a7b6afd0cefcb1524fb99917b2f3.

Turns out that commit is wrong. Host controllers are allowed to use
Clear Feature HALT as means to sync data toggle between host and
periperal.

Cc: <stable@vger.kernel.org>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/usb/dwc3/gadget.c |    5 -----
 1 file changed, 5 deletions(-)

--- a/drivers/usb/dwc3/gadget.c
+++ b/drivers/usb/dwc3/gadget.c
@@ -1511,9 +1511,6 @@ int __dwc3_gadget_ep_set_halt(struct dwc
 		unsigned transfer_in_flight;
 		unsigned started;
 
-		if (dep->flags & DWC3_EP_STALL)
-			return 0;
-
 		if (dep->number > 1)
 			trb = dwc3_ep_prev_trb(dep, dep->trb_enqueue);
 		else
@@ -1535,8 +1532,6 @@ int __dwc3_gadget_ep_set_halt(struct dwc
 		else
 			dep->flags |= DWC3_EP_STALL;
 	} else {
-		if (!(dep->flags & DWC3_EP_STALL))
-			return 0;
 
 		ret = dwc3_send_clear_stall_ep_cmd(dep);
 		if (ret)



^ permalink raw reply	[flat|nested] 166+ messages in thread

* [PATCH 4.14 141/146] iio:st_magn: Fix enable device after trigger
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (139 preceding siblings ...)
  2018-12-04 10:50 ` [PATCH 4.14 140/146] Revert "usb: dwc3: gadget: skip Set/Clear Halt when invalid" Greg Kroah-Hartman
@ 2018-12-04 10:50 ` Greg Kroah-Hartman
  2018-12-04 10:50 ` [PATCH 4.14 142/146] lib/test_kmod.c: fix rmmod double free Greg Kroah-Hartman
                   ` (9 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:50 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Martin Kelly, Denis Ciocca, Stable,
	Jonathan Cameron

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Martin Kelly <martin@martingkelly.com>

commit fe5192ac81ad0d4dfe1395d11f393f0513c15f7f upstream.

Currently, we enable the device before we enable the device trigger. At
high frequencies, this can cause interrupts that don't yet have a poll
function associated with them and are thus treated as spurious. At high
frequencies with level interrupts, this can even cause an interrupt storm
of repeated spurious interrupts (~100,000 on my Beagleboard with the
LSM9DS1 magnetometer). If these repeat too much, the interrupt will get
disabled and the device will stop functioning.

To prevent these problems, enable the device prior to enabling the device
trigger, and disable the divec prior to disabling the trigger. This means
there's no window of time during which the device creates interrupts but we
have no trigger to answer them.

Fixes: 90efe055629 ("iio: st_sensors: harden interrupt handling")
Signed-off-by: Martin Kelly <martin@martingkelly.com>
Tested-by: Denis Ciocca <denis.ciocca@st.com>
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/iio/magnetometer/st_magn_buffer.c |   12 +++---------
 1 file changed, 3 insertions(+), 9 deletions(-)

--- a/drivers/iio/magnetometer/st_magn_buffer.c
+++ b/drivers/iio/magnetometer/st_magn_buffer.c
@@ -30,11 +30,6 @@ int st_magn_trig_set_state(struct iio_tr
 	return st_sensors_set_dataready_irq(indio_dev, state);
 }
 
-static int st_magn_buffer_preenable(struct iio_dev *indio_dev)
-{
-	return st_sensors_set_enable(indio_dev, true);
-}
-
 static int st_magn_buffer_postenable(struct iio_dev *indio_dev)
 {
 	int err;
@@ -50,7 +45,7 @@ static int st_magn_buffer_postenable(str
 	if (err < 0)
 		goto st_magn_buffer_postenable_error;
 
-	return err;
+	return st_sensors_set_enable(indio_dev, true);
 
 st_magn_buffer_postenable_error:
 	kfree(mdata->buffer_data);
@@ -63,11 +58,11 @@ static int st_magn_buffer_predisable(str
 	int err;
 	struct st_sensor_data *mdata = iio_priv(indio_dev);
 
-	err = iio_triggered_buffer_predisable(indio_dev);
+	err = st_sensors_set_enable(indio_dev, false);
 	if (err < 0)
 		goto st_magn_buffer_predisable_error;
 
-	err = st_sensors_set_enable(indio_dev, false);
+	err = iio_triggered_buffer_predisable(indio_dev);
 
 st_magn_buffer_predisable_error:
 	kfree(mdata->buffer_data);
@@ -75,7 +70,6 @@ st_magn_buffer_predisable_error:
 }
 
 static const struct iio_buffer_setup_ops st_magn_buffer_setup_ops = {
-	.preenable = &st_magn_buffer_preenable,
 	.postenable = &st_magn_buffer_postenable,
 	.predisable = &st_magn_buffer_predisable,
 };



^ permalink raw reply	[flat|nested] 166+ messages in thread

* [PATCH 4.14 142/146] lib/test_kmod.c: fix rmmod double free
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (140 preceding siblings ...)
  2018-12-04 10:50 ` [PATCH 4.14 141/146] iio:st_magn: Fix enable device after trigger Greg Kroah-Hartman
@ 2018-12-04 10:50 ` Greg Kroah-Hartman
  2018-12-04 10:50 ` [PATCH 4.14 143/146] mm: use swp_offset as key in shmem_replace_page() Greg Kroah-Hartman
                   ` (8 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:50 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Luis Chamberlain, Randy Dunlap,
	Andrew Morton, Linus Torvalds

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Luis Chamberlain <mcgrof@kernel.org>

commit 5618cf031fecda63847cafd1091e7b8bd626cdb1 upstream.

We free the misc device string twice on rmmod; fix this.  Without this
we cannot remove the module without crashing.

Link: http://lkml.kernel.org/r/20181124050500.5257-1-mcgrof@kernel.org
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: <stable@vger.kernel.org>	[4.12+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 lib/test_kmod.c |    1 -
 1 file changed, 1 deletion(-)

--- a/lib/test_kmod.c
+++ b/lib/test_kmod.c
@@ -1221,7 +1221,6 @@ void unregister_test_dev_kmod(struct kmo
 
 	dev_info(test_dev->dev, "removing interface\n");
 	misc_deregister(&test_dev->misc_dev);
-	kfree(&test_dev->misc_dev.name);
 
 	mutex_unlock(&test_dev->config_mutex);
 	mutex_unlock(&test_dev->trigger_mutex);



^ permalink raw reply	[flat|nested] 166+ messages in thread

* [PATCH 4.14 143/146] mm: use swp_offset as key in shmem_replace_page()
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (141 preceding siblings ...)
  2018-12-04 10:50 ` [PATCH 4.14 142/146] lib/test_kmod.c: fix rmmod double free Greg Kroah-Hartman
@ 2018-12-04 10:50 ` Greg Kroah-Hartman
  2018-12-04 10:50 ` [PATCH 4.14 144/146] Drivers: hv: vmbus: check the creation_status in vmbus_establish_gpadl() Greg Kroah-Hartman
                   ` (7 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:50 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Yu Zhao, Matthew Wilcox,
	Hugh Dickins, Andrew Morton, Linus Torvalds

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Yu Zhao <yuzhao@google.com>

commit c1cb20d43728aa9b5393bd8d489bc85c142949b2 upstream.

We changed the key of swap cache tree from swp_entry_t.val to
swp_offset.  We need to do so in shmem_replace_page() as well.

Hugh said:
 "shmem_replace_page() has been wrong since the day I wrote it: good
  enough to work on swap "type" 0, which is all most people ever use
  (especially those few who need shmem_replace_page() at all), but
  broken once there are any non-0 swp_type bits set in the higher order
  bits"

Link: http://lkml.kernel.org/r/20181121215442.138545-1-yuzhao@google.com
Fixes: f6ab1f7f6b2d ("mm, swap: use offset of swap entry as key of swap cache")
Signed-off-by: Yu Zhao <yuzhao@google.com>
Reviewed-by: Matthew Wilcox <willy@infradead.org>
Acked-by: Hugh Dickins <hughd@google.com>
Cc: <stable@vger.kernel.org>	[4.9+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 mm/shmem.c |    6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -1532,11 +1532,13 @@ static int shmem_replace_page(struct pag
 {
 	struct page *oldpage, *newpage;
 	struct address_space *swap_mapping;
+	swp_entry_t entry;
 	pgoff_t swap_index;
 	int error;
 
 	oldpage = *pagep;
-	swap_index = page_private(oldpage);
+	entry.val = page_private(oldpage);
+	swap_index = swp_offset(entry);
 	swap_mapping = page_mapping(oldpage);
 
 	/*
@@ -1555,7 +1557,7 @@ static int shmem_replace_page(struct pag
 	__SetPageLocked(newpage);
 	__SetPageSwapBacked(newpage);
 	SetPageUptodate(newpage);
-	set_page_private(newpage, swap_index);
+	set_page_private(newpage, entry.val);
 	SetPageSwapCache(newpage);
 
 	/*



^ permalink raw reply	[flat|nested] 166+ messages in thread

* [PATCH 4.14 144/146] Drivers: hv: vmbus: check the creation_status in vmbus_establish_gpadl()
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (142 preceding siblings ...)
  2018-12-04 10:50 ` [PATCH 4.14 143/146] mm: use swp_offset as key in shmem_replace_page() Greg Kroah-Hartman
@ 2018-12-04 10:50 ` Greg Kroah-Hartman
  2018-12-04 10:50 ` [PATCH 4.14 145/146] misc: mic/scif: fix copy-paste error in scif_create_remote_lookup Greg Kroah-Hartman
                   ` (6 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:50 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Stephen Hemminger, K. Y. Srinivasan,
	Haiyang Zhang, Dexuan Cui

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Dexuan Cui <decui@microsoft.com>

commit eceb05965489784f24bbf4d61ba60e475a983016 upstream.

This is a longstanding issue: if the vmbus upper-layer drivers try to
consume too many GPADLs, the host may return with an error
0xC0000044 (STATUS_QUOTA_EXCEEDED), but currently we forget to check
the creation_status, and hence we can pass an invalid GPADL handle
into the OPEN_CHANNEL message, and get an error code 0xc0000225 in
open_info->response.open_result.status, and finally we hang in
vmbus_open() -> "goto error_free_info" -> vmbus_teardown_gpadl().

With this patch, we can exit gracefully on STATUS_QUOTA_EXCEEDED.

Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: K. Y. Srinivasan <kys@microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: stable@vger.kernel.org
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/hv/channel.c |    8 ++++++++
 1 file changed, 8 insertions(+)

--- a/drivers/hv/channel.c
+++ b/drivers/hv/channel.c
@@ -454,6 +454,14 @@ int vmbus_establish_gpadl(struct vmbus_c
 	}
 	wait_for_completion(&msginfo->waitevent);
 
+	if (msginfo->response.gpadl_created.creation_status != 0) {
+		pr_err("Failed to establish GPADL: err = 0x%x\n",
+		       msginfo->response.gpadl_created.creation_status);
+
+		ret = -EDQUOT;
+		goto cleanup;
+	}
+
 	if (channel->rescind) {
 		ret = -ENODEV;
 		goto cleanup;



^ permalink raw reply	[flat|nested] 166+ messages in thread

* [PATCH 4.14 145/146] misc: mic/scif: fix copy-paste error in scif_create_remote_lookup
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (143 preceding siblings ...)
  2018-12-04 10:50 ` [PATCH 4.14 144/146] Drivers: hv: vmbus: check the creation_status in vmbus_establish_gpadl() Greg Kroah-Hartman
@ 2018-12-04 10:50 ` Greg Kroah-Hartman
  2018-12-04 10:50 ` [PATCH 4.14 146/146] binder: fix race that allows malicious free of live buffer Greg Kroah-Hartman
                   ` (5 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:50 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, YueHaibing

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: YueHaibing <yuehaibing@huawei.com>

commit 6484a677294aa5d08c0210f2f387ebb9be646115 upstream.

gcc '-Wunused-but-set-variable' warning:

drivers/misc/mic/scif/scif_rma.c: In function 'scif_create_remote_lookup':
drivers/misc/mic/scif/scif_rma.c:373:25: warning:
 variable 'vmalloc_num_pages' set but not used [-Wunused-but-set-variable]

'vmalloc_num_pages' should be used to determine if the address is
within the vmalloc range.

Fixes: ba612aa8b487 ("misc: mic: SCIF memory registration and unregistration")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/misc/mic/scif/scif_rma.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/misc/mic/scif/scif_rma.c
+++ b/drivers/misc/mic/scif/scif_rma.c
@@ -417,7 +417,7 @@ static int scif_create_remote_lookup(str
 		if (err)
 			goto error_window;
 		err = scif_map_page(&window->num_pages_lookup.lookup[j],
-				    vmalloc_dma_phys ?
+				    vmalloc_num_pages ?
 				    vmalloc_to_page(&window->num_pages[i]) :
 				    virt_to_page(&window->num_pages[i]),
 				    remote_dev);



^ permalink raw reply	[flat|nested] 166+ messages in thread

* [PATCH 4.14 146/146] binder: fix race that allows malicious free of live buffer
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (144 preceding siblings ...)
  2018-12-04 10:50 ` [PATCH 4.14 145/146] misc: mic/scif: fix copy-paste error in scif_create_remote_lookup Greg Kroah-Hartman
@ 2018-12-04 10:50 ` Greg Kroah-Hartman
  2018-12-04 17:32 ` [PATCH 4.14 000/146] 4.14.86-stable review kernelci.org bot
                   ` (4 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 10:50 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Todd Kjos, Arve Hjønnevåg

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Todd Kjos <tkjos@android.com>

commit 7bada55ab50697861eee6bb7d60b41e68a961a9c upstream.

Malicious code can attempt to free buffers using the BC_FREE_BUFFER
ioctl to binder. There are protections against a user freeing a buffer
while in use by the kernel, however there was a window where
BC_FREE_BUFFER could be used to free a recently allocated buffer that
was not completely initialized. This resulted in a use-after-free
detected by KASAN with a malicious test program.

This window is closed by setting the buffer's allow_user_free attribute
to 0 when the buffer is allocated or when the user has previously freed
it instead of waiting for the caller to set it. The problem was that
when the struct buffer was recycled, allow_user_free was stale and set
to 1 allowing a free to go through.

Signed-off-by: Todd Kjos <tkjos@google.com>
Acked-by: Arve Hjønnevåg <arve@android.com>
Cc: stable <stable@vger.kernel.org> # 4.14
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


---
 drivers/android/binder.c       |   21 ++++++++++++---------
 drivers/android/binder_alloc.c |   14 ++++++--------
 drivers/android/binder_alloc.h |    3 +--
 3 files changed, 19 insertions(+), 19 deletions(-)

--- a/drivers/android/binder.c
+++ b/drivers/android/binder.c
@@ -2918,7 +2918,6 @@ static void binder_transaction(struct bi
 		t->buffer = NULL;
 		goto err_binder_alloc_buf_failed;
 	}
-	t->buffer->allow_user_free = 0;
 	t->buffer->debug_id = t->debug_id;
 	t->buffer->transaction = t;
 	t->buffer->target_node = target_node;
@@ -3407,14 +3406,18 @@ static int binder_thread_write(struct bi
 
 			buffer = binder_alloc_prepare_to_free(&proc->alloc,
 							      data_ptr);
-			if (buffer == NULL) {
-				binder_user_error("%d:%d BC_FREE_BUFFER u%016llx no match\n",
-					proc->pid, thread->pid, (u64)data_ptr);
-				break;
-			}
-			if (!buffer->allow_user_free) {
-				binder_user_error("%d:%d BC_FREE_BUFFER u%016llx matched unreturned buffer\n",
-					proc->pid, thread->pid, (u64)data_ptr);
+			if (IS_ERR_OR_NULL(buffer)) {
+				if (PTR_ERR(buffer) == -EPERM) {
+					binder_user_error(
+						"%d:%d BC_FREE_BUFFER u%016llx matched unreturned or currently freeing buffer\n",
+						proc->pid, thread->pid,
+						(u64)data_ptr);
+				} else {
+					binder_user_error(
+						"%d:%d BC_FREE_BUFFER u%016llx no match\n",
+						proc->pid, thread->pid,
+						(u64)data_ptr);
+				}
 				break;
 			}
 			binder_debug(BINDER_DEBUG_FREE_BUFFER,
--- a/drivers/android/binder_alloc.c
+++ b/drivers/android/binder_alloc.c
@@ -149,14 +149,12 @@ static struct binder_buffer *binder_allo
 		else {
 			/*
 			 * Guard against user threads attempting to
-			 * free the buffer twice
+			 * free the buffer when in use by kernel or
+			 * after it's already been freed.
 			 */
-			if (buffer->free_in_progress) {
-				pr_err("%d:%d FREE_BUFFER u%016llx user freed buffer twice\n",
-				       alloc->pid, current->pid, (u64)user_ptr);
-				return NULL;
-			}
-			buffer->free_in_progress = 1;
+			if (!buffer->allow_user_free)
+				return ERR_PTR(-EPERM);
+			buffer->allow_user_free = 0;
 			return buffer;
 		}
 	}
@@ -486,7 +484,7 @@ struct binder_buffer *binder_alloc_new_b
 
 	rb_erase(best_fit, &alloc->free_buffers);
 	buffer->free = 0;
-	buffer->free_in_progress = 0;
+	buffer->allow_user_free = 0;
 	binder_insert_allocated_buffer_locked(alloc, buffer);
 	binder_alloc_debug(BINDER_DEBUG_BUFFER_ALLOC,
 		     "%d: binder_alloc_buf size %zd got %pK\n",
--- a/drivers/android/binder_alloc.h
+++ b/drivers/android/binder_alloc.h
@@ -50,8 +50,7 @@ struct binder_buffer {
 	unsigned free:1;
 	unsigned allow_user_free:1;
 	unsigned async_transaction:1;
-	unsigned free_in_progress:1;
-	unsigned debug_id:28;
+	unsigned debug_id:29;
 
 	struct binder_transaction *transaction;
 



^ permalink raw reply	[flat|nested] 166+ messages in thread

* Re: [PATCH 4.14 098/146] x86/process: Consolidate and simplify switch_to_xtra() code
  2018-12-04 10:49 ` [PATCH 4.14 098/146] x86/process: Consolidate and simplify switch_to_xtra() code Greg Kroah-Hartman
@ 2018-12-04 11:14   ` Thomas Gleixner
  2018-12-04 13:39     ` Greg Kroah-Hartman
  0 siblings, 1 reply; 166+ messages in thread
From: Thomas Gleixner @ 2018-12-04 11:14 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: linux-kernel, stable, Ingo Molnar, Peter Zijlstra,
	Andy Lutomirski, Linus Torvalds, Jiri Kosina, Tom Lendacky,
	Josh Poimboeuf, Andrea Arcangeli, David Woodhouse, Tim Chen,
	Andi Kleen, Dave Hansen, Casey Schaufler, Asit Mallick,
	Arjan van de Ven, Jon Masters, Waiman Long, Dave Stewart,
	Kees Cook

On Tue, 4 Dec 2018, Greg Kroah-Hartman wrote:
> --- a/arch/x86/kernel/process_32.c
> +++ b/arch/x86/kernel/process_32.c
> @@ -234,7 +234,6 @@ __switch_to(struct task_struct *prev_p,
>  	struct fpu *prev_fpu = &prev->fpu;
>  	struct fpu *next_fpu = &next->fpu;
>  	int cpu = smp_processor_id();
> -	struct tss_struct *tss = &per_cpu(cpu_tss_rw, cpu);
>  
>  	/* never put a printk in __switch_to... printk() calls wake_up*() indirectly */
>  
> @@ -266,12 +265,7 @@ __switch_to(struct task_struct *prev_p,
>  	if (get_kernel_rpl() && unlikely(prev->iopl != next->iopl))
>  		set_iopl_mask(next->iopl);
>  
> -	/*
> -	 * Now maybe handle debug registers and/or IO bitmaps
> -	 */
> -	if (unlikely(task_thread_info(prev_p)->flags & _TIF_WORK_CTXSW_PREV ||
> -		     task_thread_info(next_p)->flags & _TIF_WORK_CTXSW_NEXT))
> -		__switch_to_xtra(prev_p, next_p, tss);
> +	switch_to_extra(prev_p, next_p);

This is missing the hunk below.

Thanks,

	tglx

8<------------------

diff --git a/arch/x86/kernel/process_32.c b/arch/x86/kernel/process_32.c
index 67cecc9a2b6f..c2df91eab573 100644
--- a/arch/x86/kernel/process_32.c
+++ b/arch/x86/kernel/process_32.c
@@ -59,6 +59,8 @@
 #include <asm/intel_rdt_sched.h>
 #include <asm/proto.h>
 
+#include "process.h"
+
 void __show_regs(struct pt_regs *regs, int all)
 {
 	unsigned long cr0 = 0L, cr2 = 0L, cr3 = 0L, cr4 = 0L;

^ permalink raw reply related	[flat|nested] 166+ messages in thread

* Re: [PATCH 4.14 018/146] libceph: implement CEPHX_V2 calculation mode
  2018-12-04 10:48 ` [PATCH 4.14 018/146] libceph: implement CEPHX_V2 calculation mode Greg Kroah-Hartman
@ 2018-12-04 12:06   ` Ilya Dryomov
  2018-12-04 13:41     ` Greg KH
  0 siblings, 1 reply; 166+ messages in thread
From: Ilya Dryomov @ 2018-12-04 12:06 UTC (permalink / raw)
  To: Greg KH; +Cc: linux-kernel, stable, Sage Weil, ben.hutchings, sashal

On Tue, Dec 4, 2018 at 12:01 PM Greg Kroah-Hartman
<gregkh@linuxfoundation.org> wrote:
>
> 4.14-stable review patch.  If anyone has any objections, please let me know.
>
> ------------------
>
> commit cc255c76c70f7a87d97939621eae04b600d9f4a1 upstream.
>
> Derive the signature from the entire buffer (both AES cipher blocks)
> instead of using just the first half of the first block, leaving out
> data_crc entirely.
>
> This addresses CVE-2018-1129.
>
> Link: http://tracker.ceph.com/issues/24837
> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
> Reviewed-by: Sage Weil <sage@redhat.com>
> Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
> Signed-off-by: Sasha Levin <sashal@kernel.org>
> ---
>  include/linux/ceph/ceph_features.h |  7 +--
>  net/ceph/auth_x.c                  | 73 +++++++++++++++++++++++-------
>  2 files changed, 60 insertions(+), 20 deletions(-)
>
> diff --git a/include/linux/ceph/ceph_features.h b/include/linux/ceph/ceph_features.h
> index 59042d5ac520..70f42eef813b 100644
> --- a/include/linux/ceph/ceph_features.h
> +++ b/include/linux/ceph/ceph_features.h
> @@ -165,9 +165,9 @@ DEFINE_CEPH_FEATURE(58, 1, FS_FILE_LAYOUT_V2) // overlap
>  DEFINE_CEPH_FEATURE(59, 1, FS_BTIME)
>  DEFINE_CEPH_FEATURE(59, 1, FS_CHANGE_ATTR) // overlap
>  DEFINE_CEPH_FEATURE(59, 1, MSG_ADDR2) // overlap
> -DEFINE_CEPH_FEATURE(60, 1, BLKIN_TRACING)  // *do not share this bit*
> +DEFINE_CEPH_FEATURE(60, 1, OSD_RECOVERY_DELETES) // *do not share this bit*
> +DEFINE_CEPH_FEATURE(61, 1, CEPHX_V2)             // *do not share this bit*
>
> -DEFINE_CEPH_FEATURE(61, 1, RESERVED2)          // unused, but slow down!
>  DEFINE_CEPH_FEATURE(62, 1, RESERVED)           // do not use; used as a sentinal
>  DEFINE_CEPH_FEATURE_DEPRECATED(63, 1, RESERVED_BROKEN, LUMINOUS) // client-facing
>
> @@ -209,7 +209,8 @@ DEFINE_CEPH_FEATURE_DEPRECATED(63, 1, RESERVED_BROKEN, LUMINOUS) // client-facin
>          CEPH_FEATURE_SERVER_JEWEL |            \
>          CEPH_FEATURE_MON_STATEFUL_SUB |        \
>          CEPH_FEATURE_CRUSH_TUNABLES5 |         \
> -        CEPH_FEATURE_NEW_OSDOPREPLY_ENCODING)
> +        CEPH_FEATURE_NEW_OSDOPREPLY_ENCODING | \
> +        CEPH_FEATURE_CEPHX_V2)
>
>  #define CEPH_FEATURES_REQUIRED_DEFAULT   \
>         (CEPH_FEATURE_NOSRCADDR |        \
> diff --git a/net/ceph/auth_x.c b/net/ceph/auth_x.c
> index ce28bb07d8fd..10eb759bbcb4 100644
> --- a/net/ceph/auth_x.c
> +++ b/net/ceph/auth_x.c
> @@ -9,6 +9,7 @@
>
>  #include <linux/ceph/decode.h>
>  #include <linux/ceph/auth.h>
> +#include <linux/ceph/ceph_features.h>
>  #include <linux/ceph/libceph.h>
>  #include <linux/ceph/messenger.h>
>
> @@ -803,26 +804,64 @@ static int calc_signature(struct ceph_x_authorizer *au, struct ceph_msg *msg,
>                           __le64 *psig)
>  {
>         void *enc_buf = au->enc_buf;
> -       struct {
> -               __le32 len;
> -               __le32 header_crc;
> -               __le32 front_crc;
> -               __le32 middle_crc;
> -               __le32 data_crc;
> -       } __packed *sigblock = enc_buf + ceph_x_encrypt_offset();
>         int ret;
>
> -       sigblock->len = cpu_to_le32(4*sizeof(u32));
> -       sigblock->header_crc = msg->hdr.crc;
> -       sigblock->front_crc = msg->footer.front_crc;
> -       sigblock->middle_crc = msg->footer.middle_crc;
> -       sigblock->data_crc =  msg->footer.data_crc;
> -       ret = ceph_x_encrypt(&au->session_key, enc_buf, CEPHX_AU_ENC_BUF_LEN,
> -                            sizeof(*sigblock));
> -       if (ret < 0)
> -               return ret;
> +       if (!CEPH_HAVE_FEATURE(msg->con->peer_features, CEPHX_V2)) {
> +               struct {
> +                       __le32 len;
> +                       __le32 header_crc;
> +                       __le32 front_crc;
> +                       __le32 middle_crc;
> +                       __le32 data_crc;
> +               } __packed *sigblock = enc_buf + ceph_x_encrypt_offset();
> +
> +               sigblock->len = cpu_to_le32(4*sizeof(u32));
> +               sigblock->header_crc = msg->hdr.crc;
> +               sigblock->front_crc = msg->footer.front_crc;
> +               sigblock->middle_crc = msg->footer.middle_crc;
> +               sigblock->data_crc =  msg->footer.data_crc;
> +
> +               ret = ceph_x_encrypt(&au->session_key, enc_buf,
> +                                    CEPHX_AU_ENC_BUF_LEN, sizeof(*sigblock));
> +               if (ret < 0)
> +                       return ret;
> +
> +               *psig = *(__le64 *)(enc_buf + sizeof(u32));
> +       } else {
> +               struct {
> +                       __le32 header_crc;
> +                       __le32 front_crc;
> +                       __le32 front_len;
> +                       __le32 middle_crc;
> +                       __le32 middle_len;
> +                       __le32 data_crc;
> +                       __le32 data_len;
> +                       __le32 seq_lower_word;
> +               } __packed *sigblock = enc_buf;
> +               struct {
> +                       __le64 a, b, c, d;
> +               } __packed *penc = enc_buf;
> +               int ciphertext_len;
> +
> +               sigblock->header_crc = msg->hdr.crc;
> +               sigblock->front_crc = msg->footer.front_crc;
> +               sigblock->front_len = msg->hdr.front_len;
> +               sigblock->middle_crc = msg->footer.middle_crc;
> +               sigblock->middle_len = msg->hdr.middle_len;
> +               sigblock->data_crc =  msg->footer.data_crc;
> +               sigblock->data_len = msg->hdr.data_len;
> +               sigblock->seq_lower_word = *(__le32 *)&msg->hdr.seq;
> +
> +               /* no leading len, no ceph_x_encrypt_header */
> +               ret = ceph_crypt(&au->session_key, true, enc_buf,
> +                                CEPHX_AU_ENC_BUF_LEN, sizeof(*sigblock),
> +                                &ciphertext_len);
> +               if (ret)
> +                       return ret;
> +
> +               *psig = penc->a ^ penc->b ^ penc->c ^ penc->d;
> +       }
>
> -       *psig = *(__le64 *)(enc_buf + sizeof(u32));
>         return 0;
>  }

Hi Greg,

I thought this series (patches 13 - 18) was dropped from the 4.14 queue.
If it wasn't, you also need to pick up the following:

f1d10e046379 libceph: weaken sizeof check in ceph_x_verify_authorizer_reply()
130f52f2b203 libceph: check authorizer reply/challenge length before reading

See our discussion with Sasha:

  https://www.spinics.net/lists/stable/msg272462.html

Thanks,

                Ilya

^ permalink raw reply	[flat|nested] 166+ messages in thread

* Re: [PATCH 4.14 098/146] x86/process: Consolidate and simplify switch_to_xtra() code
  2018-12-04 11:14   ` Thomas Gleixner
@ 2018-12-04 13:39     ` Greg Kroah-Hartman
  0 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-04 13:39 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: linux-kernel, stable, Ingo Molnar, Peter Zijlstra,
	Andy Lutomirski, Linus Torvalds, Jiri Kosina, Tom Lendacky,
	Josh Poimboeuf, Andrea Arcangeli, David Woodhouse, Tim Chen,
	Andi Kleen, Dave Hansen, Casey Schaufler, Asit Mallick,
	Arjan van de Ven, Jon Masters, Waiman Long, Dave Stewart,
	Kees Cook

On Tue, Dec 04, 2018 at 12:14:13PM +0100, Thomas Gleixner wrote:
> On Tue, 4 Dec 2018, Greg Kroah-Hartman wrote:
> > --- a/arch/x86/kernel/process_32.c
> > +++ b/arch/x86/kernel/process_32.c
> > @@ -234,7 +234,6 @@ __switch_to(struct task_struct *prev_p,
> >  	struct fpu *prev_fpu = &prev->fpu;
> >  	struct fpu *next_fpu = &next->fpu;
> >  	int cpu = smp_processor_id();
> > -	struct tss_struct *tss = &per_cpu(cpu_tss_rw, cpu);
> >  
> >  	/* never put a printk in __switch_to... printk() calls wake_up*() indirectly */
> >  
> > @@ -266,12 +265,7 @@ __switch_to(struct task_struct *prev_p,
> >  	if (get_kernel_rpl() && unlikely(prev->iopl != next->iopl))
> >  		set_iopl_mask(next->iopl);
> >  
> > -	/*
> > -	 * Now maybe handle debug registers and/or IO bitmaps
> > -	 */
> > -	if (unlikely(task_thread_info(prev_p)->flags & _TIF_WORK_CTXSW_PREV ||
> > -		     task_thread_info(next_p)->flags & _TIF_WORK_CTXSW_NEXT))
> > -		__switch_to_xtra(prev_p, next_p, tss);
> > +	switch_to_extra(prev_p, next_p);
> 
> This is missing the hunk below.
> 
> Thanks,
> 
> 	tglx
> 
> 8<------------------
> 
> diff --git a/arch/x86/kernel/process_32.c b/arch/x86/kernel/process_32.c
> index 67cecc9a2b6f..c2df91eab573 100644
> --- a/arch/x86/kernel/process_32.c
> +++ b/arch/x86/kernel/process_32.c
> @@ -59,6 +59,8 @@
>  #include <asm/intel_rdt_sched.h>
>  #include <asm/proto.h>
>  
> +#include "process.h"
> +
>  void __show_regs(struct pt_regs *regs, int all)
>  {
>  	unsigned long cr0 = 0L, cr2 = 0L, cr3 = 0L, cr4 = 0L;

Thanks, now updated.

greg k-h

^ permalink raw reply	[flat|nested] 166+ messages in thread

* Re: [PATCH 4.14 018/146] libceph: implement CEPHX_V2 calculation mode
  2018-12-04 12:06   ` Ilya Dryomov
@ 2018-12-04 13:41     ` Greg KH
  0 siblings, 0 replies; 166+ messages in thread
From: Greg KH @ 2018-12-04 13:41 UTC (permalink / raw)
  To: Ilya Dryomov; +Cc: linux-kernel, stable, Sage Weil, ben.hutchings, sashal

On Tue, Dec 04, 2018 at 01:06:40PM +0100, Ilya Dryomov wrote:
> On Tue, Dec 4, 2018 at 12:01 PM Greg Kroah-Hartman
> <gregkh@linuxfoundation.org> wrote:
> >
> > 4.14-stable review patch.  If anyone has any objections, please let me know.
> >
> > ------------------
> >
> > commit cc255c76c70f7a87d97939621eae04b600d9f4a1 upstream.
> >
> > Derive the signature from the entire buffer (both AES cipher blocks)
> > instead of using just the first half of the first block, leaving out
> > data_crc entirely.
> >
> > This addresses CVE-2018-1129.
> >
> > Link: http://tracker.ceph.com/issues/24837
> > Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
> > Reviewed-by: Sage Weil <sage@redhat.com>
> > Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
> > Signed-off-by: Sasha Levin <sashal@kernel.org>
> > ---
> >  include/linux/ceph/ceph_features.h |  7 +--
> >  net/ceph/auth_x.c                  | 73 +++++++++++++++++++++++-------
> >  2 files changed, 60 insertions(+), 20 deletions(-)
> >
> > diff --git a/include/linux/ceph/ceph_features.h b/include/linux/ceph/ceph_features.h
> > index 59042d5ac520..70f42eef813b 100644
> > --- a/include/linux/ceph/ceph_features.h
> > +++ b/include/linux/ceph/ceph_features.h
> > @@ -165,9 +165,9 @@ DEFINE_CEPH_FEATURE(58, 1, FS_FILE_LAYOUT_V2) // overlap
> >  DEFINE_CEPH_FEATURE(59, 1, FS_BTIME)
> >  DEFINE_CEPH_FEATURE(59, 1, FS_CHANGE_ATTR) // overlap
> >  DEFINE_CEPH_FEATURE(59, 1, MSG_ADDR2) // overlap
> > -DEFINE_CEPH_FEATURE(60, 1, BLKIN_TRACING)  // *do not share this bit*
> > +DEFINE_CEPH_FEATURE(60, 1, OSD_RECOVERY_DELETES) // *do not share this bit*
> > +DEFINE_CEPH_FEATURE(61, 1, CEPHX_V2)             // *do not share this bit*
> >
> > -DEFINE_CEPH_FEATURE(61, 1, RESERVED2)          // unused, but slow down!
> >  DEFINE_CEPH_FEATURE(62, 1, RESERVED)           // do not use; used as a sentinal
> >  DEFINE_CEPH_FEATURE_DEPRECATED(63, 1, RESERVED_BROKEN, LUMINOUS) // client-facing
> >
> > @@ -209,7 +209,8 @@ DEFINE_CEPH_FEATURE_DEPRECATED(63, 1, RESERVED_BROKEN, LUMINOUS) // client-facin
> >          CEPH_FEATURE_SERVER_JEWEL |            \
> >          CEPH_FEATURE_MON_STATEFUL_SUB |        \
> >          CEPH_FEATURE_CRUSH_TUNABLES5 |         \
> > -        CEPH_FEATURE_NEW_OSDOPREPLY_ENCODING)
> > +        CEPH_FEATURE_NEW_OSDOPREPLY_ENCODING | \
> > +        CEPH_FEATURE_CEPHX_V2)
> >
> >  #define CEPH_FEATURES_REQUIRED_DEFAULT   \
> >         (CEPH_FEATURE_NOSRCADDR |        \
> > diff --git a/net/ceph/auth_x.c b/net/ceph/auth_x.c
> > index ce28bb07d8fd..10eb759bbcb4 100644
> > --- a/net/ceph/auth_x.c
> > +++ b/net/ceph/auth_x.c
> > @@ -9,6 +9,7 @@
> >
> >  #include <linux/ceph/decode.h>
> >  #include <linux/ceph/auth.h>
> > +#include <linux/ceph/ceph_features.h>
> >  #include <linux/ceph/libceph.h>
> >  #include <linux/ceph/messenger.h>
> >
> > @@ -803,26 +804,64 @@ static int calc_signature(struct ceph_x_authorizer *au, struct ceph_msg *msg,
> >                           __le64 *psig)
> >  {
> >         void *enc_buf = au->enc_buf;
> > -       struct {
> > -               __le32 len;
> > -               __le32 header_crc;
> > -               __le32 front_crc;
> > -               __le32 middle_crc;
> > -               __le32 data_crc;
> > -       } __packed *sigblock = enc_buf + ceph_x_encrypt_offset();
> >         int ret;
> >
> > -       sigblock->len = cpu_to_le32(4*sizeof(u32));
> > -       sigblock->header_crc = msg->hdr.crc;
> > -       sigblock->front_crc = msg->footer.front_crc;
> > -       sigblock->middle_crc = msg->footer.middle_crc;
> > -       sigblock->data_crc =  msg->footer.data_crc;
> > -       ret = ceph_x_encrypt(&au->session_key, enc_buf, CEPHX_AU_ENC_BUF_LEN,
> > -                            sizeof(*sigblock));
> > -       if (ret < 0)
> > -               return ret;
> > +       if (!CEPH_HAVE_FEATURE(msg->con->peer_features, CEPHX_V2)) {
> > +               struct {
> > +                       __le32 len;
> > +                       __le32 header_crc;
> > +                       __le32 front_crc;
> > +                       __le32 middle_crc;
> > +                       __le32 data_crc;
> > +               } __packed *sigblock = enc_buf + ceph_x_encrypt_offset();
> > +
> > +               sigblock->len = cpu_to_le32(4*sizeof(u32));
> > +               sigblock->header_crc = msg->hdr.crc;
> > +               sigblock->front_crc = msg->footer.front_crc;
> > +               sigblock->middle_crc = msg->footer.middle_crc;
> > +               sigblock->data_crc =  msg->footer.data_crc;
> > +
> > +               ret = ceph_x_encrypt(&au->session_key, enc_buf,
> > +                                    CEPHX_AU_ENC_BUF_LEN, sizeof(*sigblock));
> > +               if (ret < 0)
> > +                       return ret;
> > +
> > +               *psig = *(__le64 *)(enc_buf + sizeof(u32));
> > +       } else {
> > +               struct {
> > +                       __le32 header_crc;
> > +                       __le32 front_crc;
> > +                       __le32 front_len;
> > +                       __le32 middle_crc;
> > +                       __le32 middle_len;
> > +                       __le32 data_crc;
> > +                       __le32 data_len;
> > +                       __le32 seq_lower_word;
> > +               } __packed *sigblock = enc_buf;
> > +               struct {
> > +                       __le64 a, b, c, d;
> > +               } __packed *penc = enc_buf;
> > +               int ciphertext_len;
> > +
> > +               sigblock->header_crc = msg->hdr.crc;
> > +               sigblock->front_crc = msg->footer.front_crc;
> > +               sigblock->front_len = msg->hdr.front_len;
> > +               sigblock->middle_crc = msg->footer.middle_crc;
> > +               sigblock->middle_len = msg->hdr.middle_len;
> > +               sigblock->data_crc =  msg->footer.data_crc;
> > +               sigblock->data_len = msg->hdr.data_len;
> > +               sigblock->seq_lower_word = *(__le32 *)&msg->hdr.seq;
> > +
> > +               /* no leading len, no ceph_x_encrypt_header */
> > +               ret = ceph_crypt(&au->session_key, true, enc_buf,
> > +                                CEPHX_AU_ENC_BUF_LEN, sizeof(*sigblock),
> > +                                &ciphertext_len);
> > +               if (ret)
> > +                       return ret;
> > +
> > +               *psig = penc->a ^ penc->b ^ penc->c ^ penc->d;
> > +       }
> >
> > -       *psig = *(__le64 *)(enc_buf + sizeof(u32));
> >         return 0;
> >  }
> 
> Hi Greg,
> 
> I thought this series (patches 13 - 18) was dropped from the 4.14 queue.
> If it wasn't, you also need to pick up the following:
> 
> f1d10e046379 libceph: weaken sizeof check in ceph_x_verify_authorizer_reply()
> 130f52f2b203 libceph: check authorizer reply/challenge length before reading
> 
> See our discussion with Sasha:
> 
>   https://www.spinics.net/lists/stable/msg272462.html

Ah, missed that, sorry.  I've queued these patches up now, thanks!

greg k-h

^ permalink raw reply	[flat|nested] 166+ messages in thread

* Re: [PATCH 4.14 000/146] 4.14.86-stable review
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (145 preceding siblings ...)
  2018-12-04 10:50 ` [PATCH 4.14 146/146] binder: fix race that allows malicious free of live buffer Greg Kroah-Hartman
@ 2018-12-04 17:32 ` kernelci.org bot
  2018-12-04 21:42 ` Guenter Roeck
                   ` (3 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: kernelci.org bot @ 2018-12-04 17:32 UTC (permalink / raw)
  To: Greg Kroah-Hartman, linux-kernel
  Cc: Greg Kroah-Hartman, torvalds, akpm, linux, shuah, patches,
	ben.hutchings, lkft-triage, stable

stable-rc/linux-4.14.y boot: 120 boots: 0 failed, 118 passed with 2 offline (v4.14.85-147-g2bfd08691add)

Full Boot Summary: https://kernelci.org/boot/all/job/stable-rc/branch/linux-4.14.y/kernel/v4.14.85-147-g2bfd08691add/
Full Build Summary: https://kernelci.org/build/stable-rc/branch/linux-4.14.y/kernel/v4.14.85-147-g2bfd08691add/

Tree: stable-rc
Branch: linux-4.14.y
Git Describe: v4.14.85-147-g2bfd08691add
Git Commit: 2bfd08691add5955f6d58006360c4a78edce64d0
Git URL: http://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
Tested: 59 unique boards, 23 SoC families, 14 builds out of 197

Offline Platforms:

arm:

    multi_v7_defconfig:
        stih410-b2120: 1 offline lab

arm64:

    defconfig:
        meson-gxl-s905x-p212: 1 offline lab

---
For more info write to <info@kernelci.org>

^ permalink raw reply	[flat|nested] 166+ messages in thread

* Re: [PATCH 4.14 054/146] f2fs: fix to do sanity check with block address in main area
  2018-12-04 10:49 ` [PATCH 4.14 054/146] f2fs: fix to do sanity check with block address in main area Greg Kroah-Hartman
@ 2018-12-04 20:27   ` Sudip Mukherjee
  2018-12-05  6:59     ` Greg Kroah-Hartman
  0 siblings, 1 reply; 166+ messages in thread
From: Sudip Mukherjee @ 2018-12-04 20:27 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: linux-kernel, Stable, yuchao0, jaegeuk, Ben Hutchings, sashal

Hi Greg,

On Tue, Dec 4, 2018 at 11:05 AM Greg Kroah-Hartman
<gregkh@linuxfoundation.org> wrote:
>
> 4.14-stable review patch.  If anyone has any objections, please let me know.
>
> ------------------
>
> commit c9b60788fc760d136211853f10ce73dc152d1f4a upstream.

There is another upstream commit which fixes this.
89d13c38501d ("f2fs: fix missing up_read")

-- 
Regards
Sudip

^ permalink raw reply	[flat|nested] 166+ messages in thread

* Re: [PATCH 4.14 000/146] 4.14.86-stable review
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (146 preceding siblings ...)
  2018-12-04 17:32 ` [PATCH 4.14 000/146] 4.14.86-stable review kernelci.org bot
@ 2018-12-04 21:42 ` Guenter Roeck
  2018-12-05  5:18 ` Naresh Kamboju
                   ` (2 subsequent siblings)
  150 siblings, 0 replies; 166+ messages in thread
From: Guenter Roeck @ 2018-12-04 21:42 UTC (permalink / raw)
  To: Greg Kroah-Hartman, linux-kernel
  Cc: torvalds, akpm, shuah, patches, ben.hutchings, lkft-triage, stable

On 12/4/18 2:48 AM, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 4.14.86 release.
> There are 146 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
> 
> Responses should be made by Thu Dec  6 10:36:52 UTC 2018.
> Anything received after that time might be too late.
> 

Build results:
	total: 175 pass: 175 fail: 0
Qemu test results:
	total: 322 pass: 322 fail: 0

Details are available at https://kerneltests.org/builders/.

Guenter

^ permalink raw reply	[flat|nested] 166+ messages in thread

* Re: [PATCH 4.14 000/146] 4.14.86-stable review
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (147 preceding siblings ...)
  2018-12-04 21:42 ` Guenter Roeck
@ 2018-12-05  5:18 ` Naresh Kamboju
  2018-12-05  7:41   ` Greg Kroah-Hartman
  2018-12-05  9:31 ` Jon Hunter
  2018-12-05 23:53 ` shuah
  150 siblings, 1 reply; 166+ messages in thread
From: Naresh Kamboju @ 2018-12-05  5:18 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: open list, Shuah Khan, patches, lkft-triage, Ben Hutchings,
	linux- stable, Andrew Morton, Linus Torvalds, Guenter Roeck

On Tue, 4 Dec 2018 at 16:34, Greg Kroah-Hartman
<gregkh@linuxfoundation.org> wrote:
>
> This is the start of the stable review cycle for the 4.14.86 release.
> There are 146 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Thu Dec  6 10:36:52 UTC 2018.
> Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
>         https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.14.86-rc1.gz
> or in the git tree and branch at:
>         git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.14.y
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h

Results from Linaro’s test farm.
No regressions on arm64, arm, x86_64, and i386.

Summary
------------------------------------------------------------------------

kernel: 4.14.86-rc2
git repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
git branch: linux-4.14.y
git commit: a5c883b384be3a583f2875dc44c9ec3941803121
git describe: v4.14.85-149-ga5c883b384be
Test details: https://qa-reports.linaro.org/lkft/linux-stable-rc-4.14-oe/build/v4.14.85-149-ga5c883b384be

No regressions (compared to build v4.14.85)

No fixes (compared to build v4.14.85)

Ran 21468 total tests in the following environments and test suites.

Environments
--------------
- dragonboard-410c - arm64
- hi6220-hikey - arm64
- i386
- juno-r2 - arm64
- qemu_arm
- qemu_arm64
- qemu_i386
- qemu_x86_64
- x15 - arm
- x86_64

Test Suites
-----------
* boot
* install-android-platform-tools-r2600
* kselftest
* libhugetlbfs
* ltp-cap_bounds-tests
* ltp-containers-tests
* ltp-cve-tests
* ltp-fcntl-locktests-tests
* ltp-filecaps-tests
* ltp-fs_bind-tests
* ltp-fs_perms_simple-tests
* ltp-fsx-tests
* ltp-hugetlb-tests
* ltp-io-tests
* ltp-ipc-tests
* ltp-math-tests
* ltp-nptl-tests
* ltp-pty-tests
* ltp-sched-tests
* ltp-securebits-tests
* ltp-syscalls-tests
* ltp-timers-tests
* ltp-fs-tests
* ltp-open-posix-tests
* kselftest-vsyscall-mode-native
* kselftest-vsyscall-mode-none

-- 
Linaro LKFT
https://lkft.linaro.org

^ permalink raw reply	[flat|nested] 166+ messages in thread

* Re: [PATCH 4.14 054/146] f2fs: fix to do sanity check with block address in main area
  2018-12-04 20:27   ` Sudip Mukherjee
@ 2018-12-05  6:59     ` Greg Kroah-Hartman
  0 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-05  6:59 UTC (permalink / raw)
  To: Sudip Mukherjee
  Cc: linux-kernel, Stable, yuchao0, jaegeuk, Ben Hutchings, sashal

On Tue, Dec 04, 2018 at 08:27:32PM +0000, Sudip Mukherjee wrote:
> Hi Greg,
> 
> On Tue, Dec 4, 2018 at 11:05 AM Greg Kroah-Hartman
> <gregkh@linuxfoundation.org> wrote:
> >
> > 4.14-stable review patch.  If anyone has any objections, please let me know.
> >
> > ------------------
> >
> > commit c9b60788fc760d136211853f10ce73dc152d1f4a upstream.
> 
> There is another upstream commit which fixes this.
> 89d13c38501d ("f2fs: fix missing up_read")

Thanks for pointing this out, now queued up.

greg k-h

^ permalink raw reply	[flat|nested] 166+ messages in thread

* Re: [PATCH 4.14 000/146] 4.14.86-stable review
  2018-12-05  5:18 ` Naresh Kamboju
@ 2018-12-05  7:41   ` Greg Kroah-Hartman
  0 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-05  7:41 UTC (permalink / raw)
  To: Naresh Kamboju
  Cc: open list, Shuah Khan, patches, lkft-triage, Ben Hutchings,
	linux- stable, Andrew Morton, Linus Torvalds, Guenter Roeck

On Wed, Dec 05, 2018 at 10:48:59AM +0530, Naresh Kamboju wrote:
> On Tue, 4 Dec 2018 at 16:34, Greg Kroah-Hartman
> <gregkh@linuxfoundation.org> wrote:
> >
> > This is the start of the stable review cycle for the 4.14.86 release.
> > There are 146 patches in this series, all will be posted as a response
> > to this one.  If anyone has any issues with these being applied, please
> > let me know.
> >
> > Responses should be made by Thu Dec  6 10:36:52 UTC 2018.
> > Anything received after that time might be too late.
> >
> > The whole patch series can be found in one patch at:
> >         https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.14.86-rc1.gz
> > or in the git tree and branch at:
> >         git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.14.y
> > and the diffstat can be found below.
> >
> > thanks,
> >
> > greg k-h
> 
> Results from Linaro’s test farm.
> No regressions on arm64, arm, x86_64, and i386.

Thanks for testing two of these and letting me know.

greg k-h

^ permalink raw reply	[flat|nested] 166+ messages in thread

* Re: [PATCH 4.14 000/146] 4.14.86-stable review
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (148 preceding siblings ...)
  2018-12-05  5:18 ` Naresh Kamboju
@ 2018-12-05  9:31 ` Jon Hunter
  2018-12-05  9:45   ` Greg Kroah-Hartman
  2018-12-05 23:53 ` shuah
  150 siblings, 1 reply; 166+ messages in thread
From: Jon Hunter @ 2018-12-05  9:31 UTC (permalink / raw)
  To: Greg Kroah-Hartman, linux-kernel
  Cc: torvalds, akpm, linux, shuah, patches, ben.hutchings,
	lkft-triage, stable, linux-tegra


On 04/12/2018 10:48, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 4.14.86 release.
> There are 146 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
> 
> Responses should be made by Thu Dec  6 10:36:52 UTC 2018.
> Anything received after that time might be too late.
> 
> The whole patch series can be found in one patch at:
> 	https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.14.86-rc1.gz
> or in the git tree and branch at:
> 	git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.14.y
> and the diffstat can be found below.
> 
> thanks,
> 
> greg k-h

All tests are passing for Tegra ...

Test results for stable-v4.14:
    8 builds:	8 pass, 0 fail
    16 boots:	16 pass, 0 fail
    14 tests:	14 pass, 0 fail

Linux version:	4.14.86-rc2-ga5c883b
Boards tested:	tegra124-jetson-tk1, tegra20-ventana,
                tegra210-p2371-2180, tegra30-cardhu-a04

Cheers
Jon

-- 
nvpublic

^ permalink raw reply	[flat|nested] 166+ messages in thread

* Re: [PATCH 4.14 000/146] 4.14.86-stable review
  2018-12-05  9:31 ` Jon Hunter
@ 2018-12-05  9:45   ` Greg Kroah-Hartman
  0 siblings, 0 replies; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-05  9:45 UTC (permalink / raw)
  To: Jon Hunter
  Cc: linux-kernel, torvalds, akpm, linux, shuah, patches,
	ben.hutchings, lkft-triage, stable, linux-tegra

On Wed, Dec 05, 2018 at 09:31:45AM +0000, Jon Hunter wrote:
> 
> On 04/12/2018 10:48, Greg Kroah-Hartman wrote:
> > This is the start of the stable review cycle for the 4.14.86 release.
> > There are 146 patches in this series, all will be posted as a response
> > to this one.  If anyone has any issues with these being applied, please
> > let me know.
> > 
> > Responses should be made by Thu Dec  6 10:36:52 UTC 2018.
> > Anything received after that time might be too late.
> > 
> > The whole patch series can be found in one patch at:
> > 	https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.14.86-rc1.gz
> > or in the git tree and branch at:
> > 	git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.14.y
> > and the diffstat can be found below.
> > 
> > thanks,
> > 
> > greg k-h
> 
> All tests are passing for Tegra ...
> 
> Test results for stable-v4.14:
>     8 builds:	8 pass, 0 fail
>     16 boots:	16 pass, 0 fail
>     14 tests:	14 pass, 0 fail
> 
> Linux version:	4.14.86-rc2-ga5c883b
> Boards tested:	tegra124-jetson-tk1, tegra20-ventana,
>                 tegra210-p2371-2180, tegra30-cardhu-a04
> 

Great, thanks for testing these and letting me know.

greg k-h

^ permalink raw reply	[flat|nested] 166+ messages in thread

* Re: [PATCH 4.14 121/146] x86/fpu: Disable bottom halves while loading  FPU registers
  2018-12-04 10:50 ` [PATCH 4.14 121/146] x86/fpu: Disable bottom halves while loading FPU registers Greg Kroah-Hartman
@ 2018-12-05 16:26   ` Jari Ruusu
  2018-12-05 19:00     ` Borislav Petkov
  0 siblings, 1 reply; 166+ messages in thread
From: Jari Ruusu @ 2018-12-05 16:26 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: linux-kernel, stable, Sebastian Andrzej Siewior, Borislav Petkov,
	Ingo Molnar, Thomas Gleixner, Andy Lutomirski, Dave Hansen,
	H. Peter Anvin, Jason A. Donenfeld, kvm ML, Paolo Bonzini,
	 Radim Kr?má?,
	Rik van Riel, x86-ml

Greg Kroah-Hartman wrote:
> commit 68239654acafe6aad5a3c1dc7237e60accfebc03 upstream.
> 
> The sequence
> 
>   fpu->initialized = 1;         /* step A */
>   preempt_disable();            /* step B */
>   fpu__restore(fpu);
>   preempt_enable();
> 
> in __fpu__restore_sig() is racy in regard to a context switch.

That same race appears to be present in older kernel branches also.
The context is sligthly different, so the patch for 4.14 does not
apply cleanly to older kernels. For 4.9 branch, this edit works:

    s/fpu->initialized/fpu->fpstate_active/

--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -342,10 +342,10 @@ static int __fpu__restore_sig(void __use
 			sanitize_restored_xstate(tsk, &env, xfeatures, fx_only);
 		}
 
+		local_bh_disable();
 		fpu->fpstate_active = 1;
-		preempt_disable();
 		fpu__restore(fpu);
-		preempt_enable();
+		local_bh_enable();
 
 		return err;
 	} else {

^ permalink raw reply	[flat|nested] 166+ messages in thread

* Re: [PATCH 4.14 121/146] x86/fpu: Disable bottom halves while loading FPU registers
  2018-12-05 16:26   ` Jari Ruusu
@ 2018-12-05 19:00     ` Borislav Petkov
  2018-12-06 10:54       ` Greg Kroah-Hartman
  0 siblings, 1 reply; 166+ messages in thread
From: Borislav Petkov @ 2018-12-05 19:00 UTC (permalink / raw)
  To: Jari Ruusu
  Cc: Greg Kroah-Hartman, linux-kernel, stable,
	Sebastian Andrzej Siewior, Ingo Molnar, Thomas Gleixner,
	Andy Lutomirski, Dave Hansen, H. Peter Anvin, Jason A. Donenfeld,
	kvm ML, Paolo Bonzini, Radim Kr?má?,
	Rik van Riel, x86-ml

On Wed, Dec 05, 2018 at 06:26:24PM +0200, Jari Ruusu wrote:
> That same race appears to be present in older kernel branches also.
> The context is sligthly different, so the patch for 4.14 does not
> apply cleanly to older kernels. For 4.9 branch, this edit works:

You could take the upstream one, amend it with your change, test it and
send it to Greg - I believe he'll take the backport gladly.

:-)

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

^ permalink raw reply	[flat|nested] 166+ messages in thread

* Re: [PATCH 4.14 000/146] 4.14.86-stable review
  2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
                   ` (149 preceding siblings ...)
  2018-12-05  9:31 ` Jon Hunter
@ 2018-12-05 23:53 ` shuah
  150 siblings, 0 replies; 166+ messages in thread
From: shuah @ 2018-12-05 23:53 UTC (permalink / raw)
  To: Greg Kroah-Hartman, linux-kernel
  Cc: torvalds, akpm, linux, patches, ben.hutchings, lkft-triage,
	stable, shuah

On 12/4/18 3:48 AM, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 4.14.86 release.
> There are 146 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
> 
> Responses should be made by Thu Dec  6 10:36:52 UTC 2018.
> Anything received after that time might be too late.
> 
> The whole patch series can be found in one patch at:
> 	https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.14.86-rc1.gz
> or in the git tree and branch at:
> 	git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.14.y
> and the diffstat can be found below.
> 
> thanks,
> 
> greg k-h
> 

Compiled and booted on my test system. No dmesg regressions.

thanks,
-- Shuah


^ permalink raw reply	[flat|nested] 166+ messages in thread

* Re: [PATCH 4.14 121/146] x86/fpu: Disable bottom halves while loading FPU registers
  2018-12-05 19:00     ` Borislav Petkov
@ 2018-12-06 10:54       ` Greg Kroah-Hartman
  2018-12-21 16:23         ` [PATCH v4.9 STABLE] " Sebastian Andrzej Siewior
  0 siblings, 1 reply; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-06 10:54 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Jari Ruusu, linux-kernel, stable, Sebastian Andrzej Siewior,
	Ingo Molnar, Thomas Gleixner, Andy Lutomirski, Dave Hansen,
	H. Peter Anvin, Jason A. Donenfeld, kvm ML, Paolo Bonzini,
	Radim Kr?má?,
	Rik van Riel, x86-ml

On Wed, Dec 05, 2018 at 08:00:20PM +0100, Borislav Petkov wrote:
> On Wed, Dec 05, 2018 at 06:26:24PM +0200, Jari Ruusu wrote:
> > That same race appears to be present in older kernel branches also.
> > The context is sligthly different, so the patch for 4.14 does not
> > apply cleanly to older kernels. For 4.9 branch, this edit works:
> 
> You could take the upstream one, amend it with your change, test it and
> send it to Greg - I believe he'll take the backport gladly.
> 
> :-)

Yes, that's the easiest way for me to accept such a patch, otherwise it
gets put on the end of the very-long-queue...

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 166+ messages in thread

* [PATCH v4.9 STABLE] x86/fpu: Disable bottom halves while loading FPU registers
  2018-12-06 10:54       ` Greg Kroah-Hartman
@ 2018-12-21 16:23         ` Sebastian Andrzej Siewior
  2018-12-21 16:29           ` Greg Kroah-Hartman
  0 siblings, 1 reply; 166+ messages in thread
From: Sebastian Andrzej Siewior @ 2018-12-21 16:23 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Borislav Petkov, Jari Ruusu, linux-kernel, stable, Ingo Molnar,
	Thomas Gleixner, Andy Lutomirski, Dave Hansen, H. Peter Anvin,
	Jason A. Donenfeld, kvm ML, Paolo Bonzini, Radim Kr?má?,
	Rik van Riel, x86-ml

The sequence

  fpu->initialized = 1;		/* step A */
  preempt_disable();		/* step B */
  fpu__restore(fpu);
  preempt_enable();

in __fpu__restore_sig() is racy in regard to a context switch.

For 32bit frames, __fpu__restore_sig() prepares the FPU state within
fpu->state. To ensure that a context switch (switch_fpu_prepare() in
particular) does not modify fpu->state it uses fpu__drop() which sets
fpu->initialized to 0.

After fpu->initialized is cleared, the CPU's FPU state is not saved
to fpu->state during a context switch. The new state is loaded via
fpu__restore(). It gets loaded into fpu->state from userland and
ensured it is sane. fpu->initialized is then set to 1 in order to avoid
fpu__initialize() doing anything (overwrite the new state) which is part
of fpu__restore().

A context switch between step A and B above would save CPU's current FPU
registers to fpu->state and overwrite the newly prepared state. This
looks like a tiny race window but the Kernel Test Robot reported this
back in 2016 while we had lazy FPU support. Borislav Petkov made the
link between that report and another patch that has been posted. Since
the removal of the lazy FPU support, this race goes unnoticed because
the warning has been removed.

Disable bottom halves around the restore sequence to avoid the race. BH
need to be disabled because BH is allowed to run (even with preemption
disabled) and might invoke kernel_fpu_begin() by doing IPsec.

 [ bp: massage commit message a bit. ]

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>
Cc: kvm ML <kvm@vger.kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: stable@vger.kernel.org
Cc: x86-ml <x86@kernel.org>
Link: http://lkml.kernel.org/r/20181120102635.ddv3fvavxajjlfqk@linutronix.de
Link: https://lkml.kernel.org/r/20160226074940.GA28911@pd.tnic
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
 arch/x86/kernel/fpu/signal.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
index ae52ef05d0981..769831d9fd114 100644
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -342,10 +342,10 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
 			sanitize_restored_xstate(tsk, &env, xfeatures, fx_only);
 		}
 
+		local_bh_disable();
 		fpu->fpstate_active = 1;
-		preempt_disable();
 		fpu__restore(fpu);
-		preempt_enable();
+		local_bh_enable();
 
 		return err;
 	} else {
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 166+ messages in thread

* Re: [PATCH v4.9 STABLE] x86/fpu: Disable bottom halves while loading FPU registers
  2018-12-21 16:23         ` [PATCH v4.9 STABLE] " Sebastian Andrzej Siewior
@ 2018-12-21 16:29           ` Greg Kroah-Hartman
  2018-12-21 16:38             ` Sebastian Andrzej Siewior
  0 siblings, 1 reply; 166+ messages in thread
From: Greg Kroah-Hartman @ 2018-12-21 16:29 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: Borislav Petkov, Jari Ruusu, linux-kernel, stable, Ingo Molnar,
	Thomas Gleixner, Andy Lutomirski, Dave Hansen, H. Peter Anvin,
	Jason A. Donenfeld, kvm ML, Paolo Bonzini, Radim Kr?má?,
	Rik van Riel, x86-ml

On Fri, Dec 21, 2018 at 05:23:38PM +0100, Sebastian Andrzej Siewior wrote:
> The sequence
> 
>   fpu->initialized = 1;		/* step A */
>   preempt_disable();		/* step B */
>   fpu__restore(fpu);
>   preempt_enable();
> 
> in __fpu__restore_sig() is racy in regard to a context switch.
> 
> For 32bit frames, __fpu__restore_sig() prepares the FPU state within
> fpu->state. To ensure that a context switch (switch_fpu_prepare() in
> particular) does not modify fpu->state it uses fpu__drop() which sets
> fpu->initialized to 0.
> 
> After fpu->initialized is cleared, the CPU's FPU state is not saved
> to fpu->state during a context switch. The new state is loaded via
> fpu__restore(). It gets loaded into fpu->state from userland and
> ensured it is sane. fpu->initialized is then set to 1 in order to avoid
> fpu__initialize() doing anything (overwrite the new state) which is part
> of fpu__restore().
> 
> A context switch between step A and B above would save CPU's current FPU
> registers to fpu->state and overwrite the newly prepared state. This
> looks like a tiny race window but the Kernel Test Robot reported this
> back in 2016 while we had lazy FPU support. Borislav Petkov made the
> link between that report and another patch that has been posted. Since
> the removal of the lazy FPU support, this race goes unnoticed because
> the warning has been removed.
> 
> Disable bottom halves around the restore sequence to avoid the race. BH
> need to be disabled because BH is allowed to run (even with preemption
> disabled) and might invoke kernel_fpu_begin() by doing IPsec.
> 
>  [ bp: massage commit message a bit. ]
> 
> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> Signed-off-by: Borislav Petkov <bp@suse.de>
> Acked-by: Ingo Molnar <mingo@kernel.org>
> Acked-by: Thomas Gleixner <tglx@linutronix.de>
> Cc: Andy Lutomirski <luto@kernel.org>
> Cc: Dave Hansen <dave.hansen@linux.intel.com>
> Cc: "H. Peter Anvin" <hpa@zytor.com>
> Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>
> Cc: kvm ML <kvm@vger.kernel.org>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Radim Krčmář <rkrcmar@redhat.com>
> Cc: Rik van Riel <riel@surriel.com>
> Cc: stable@vger.kernel.org
> Cc: x86-ml <x86@kernel.org>
> Link: http://lkml.kernel.org/r/20181120102635.ddv3fvavxajjlfqk@linutronix.de
> Link: https://lkml.kernel.org/r/20160226074940.GA28911@pd.tnic
> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> ---
>  arch/x86/kernel/fpu/signal.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)

What is the git commit id of this patch upstream?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 166+ messages in thread

* Re: [PATCH v4.9 STABLE] x86/fpu: Disable bottom halves while loading FPU registers
  2018-12-21 16:29           ` Greg Kroah-Hartman
@ 2018-12-21 16:38             ` Sebastian Andrzej Siewior
  0 siblings, 0 replies; 166+ messages in thread
From: Sebastian Andrzej Siewior @ 2018-12-21 16:38 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Borislav Petkov, Jari Ruusu, linux-kernel, stable, Ingo Molnar,
	Thomas Gleixner, Andy Lutomirski, Dave Hansen, H. Peter Anvin,
	Jason A. Donenfeld, kvm ML, Paolo Bonzini, Radim Kr?má?,
	Rik van Riel, x86-ml

On 2018-12-21 17:29:05 [+0100], Greg Kroah-Hartman wrote:
> What is the git commit id of this patch upstream?

commit 68239654acafe6aad5a3c1dc7237e60accfebc03 upstream.

I'm sorry. I cherry picked the original commit, resolved the conflict
and forgot about the original commit id…

> thanks,
> 
> greg k-h

Sebastian

^ permalink raw reply	[flat|nested] 166+ messages in thread

end of thread, other threads:[~2018-12-21 16:39 UTC | newest]

Thread overview: 166+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-12-04 10:48 [PATCH 4.14 000/146] 4.14.86-stable review Greg Kroah-Hartman
2018-12-04 10:48 ` [PATCH 4.14 001/146] mm/huge_memory: rename freeze_page() to unmap_page() Greg Kroah-Hartman
2018-12-04 10:48 ` [PATCH 4.14 002/146] mm/huge_memory.c: reorder operations in __split_huge_page_tail() Greg Kroah-Hartman
2018-12-04 10:48 ` [PATCH 4.14 003/146] mm/huge_memory: splitting set mapping+index before unfreeze Greg Kroah-Hartman
2018-12-04 10:48 ` [PATCH 4.14 004/146] mm/huge_memory: fix lockdep complaint on 32-bit i_size_read() Greg Kroah-Hartman
2018-12-04 10:48 ` [PATCH 4.14 005/146] mm/khugepaged: collapse_shmem() stop if punched or truncated Greg Kroah-Hartman
2018-12-04 10:48 ` [PATCH 4.14 006/146] mm/khugepaged: fix crashes due to misaccounted holes Greg Kroah-Hartman
2018-12-04 10:48 ` [PATCH 4.14 007/146] mm/khugepaged: collapse_shmem() remember to clear holes Greg Kroah-Hartman
2018-12-04 10:48 ` [PATCH 4.14 008/146] mm/khugepaged: minor reorderings in collapse_shmem() Greg Kroah-Hartman
2018-12-04 10:48 ` [PATCH 4.14 009/146] mm/khugepaged: collapse_shmem() without freezing new_page Greg Kroah-Hartman
2018-12-04 10:48 ` [PATCH 4.14 010/146] mm/khugepaged: collapse_shmem() do not crash on Compound Greg Kroah-Hartman
2018-12-04 10:48 ` [PATCH 4.14 011/146] media: em28xx: Fix use-after-free when disconnecting Greg Kroah-Hartman
2018-12-04 10:48 ` [PATCH 4.14 012/146] ubi: Initialize Fastmap checkmapping correctly Greg Kroah-Hartman
2018-12-04 10:48 ` [PATCH 4.14 013/146] libceph: store ceph_auth_handshake pointer in ceph_connection Greg Kroah-Hartman
2018-12-04 10:48 ` [PATCH 4.14 014/146] libceph: factor out __prepare_write_connect() Greg Kroah-Hartman
2018-12-04 10:48 ` [PATCH 4.14 015/146] libceph: factor out __ceph_x_decrypt() Greg Kroah-Hartman
2018-12-04 10:48 ` [PATCH 4.14 016/146] libceph: factor out encrypt_authorizer() Greg Kroah-Hartman
2018-12-04 10:48 ` [PATCH 4.14 017/146] libceph: add authorizer challenge Greg Kroah-Hartman
2018-12-04 10:48 ` [PATCH 4.14 018/146] libceph: implement CEPHX_V2 calculation mode Greg Kroah-Hartman
2018-12-04 12:06   ` Ilya Dryomov
2018-12-04 13:41     ` Greg KH
2018-12-04 10:48 ` [PATCH 4.14 019/146] bpf: Prevent memory disambiguation attack Greg Kroah-Hartman
2018-12-04 10:48 ` [PATCH 4.14 020/146] tls: Add function to update the TLS socket configuration Greg Kroah-Hartman
2018-12-04 10:48 ` [PATCH 4.14 021/146] tls: Fix TLS ulp context leak, when TLS_TX setsockopt is not used Greg Kroah-Hartman
2018-12-04 10:48 ` [PATCH 4.14 022/146] tls: Avoid copying crypto_info again after cipher_type check Greg Kroah-Hartman
2018-12-04 10:48 ` [PATCH 4.14 023/146] tls: dont override sk_write_space if tls_set_sw_offload fails Greg Kroah-Hartman
2018-12-04 10:48 ` [PATCH 4.14 024/146] tls: Use correct sk->sk_prot for IPV6 Greg Kroah-Hartman
2018-12-04 10:48 ` [PATCH 4.14 025/146] net/tls: Fixed return value when tls_complete_pending_work() fails Greg Kroah-Hartman
2018-12-04 10:48 ` [PATCH 4.14 026/146] wil6210: missing length check in wmi_set_ie Greg Kroah-Hartman
2018-12-04 10:48 ` [PATCH 4.14 027/146] btrfs: validate type when reading a chunk Greg Kroah-Hartman
2018-12-04 10:48 ` [PATCH 4.14 028/146] btrfs: Verify that every chunk has corresponding block group at mount time Greg Kroah-Hartman
2018-12-04 10:48 ` [PATCH 4.14 029/146] btrfs: Refactor check_leaf function for later expansion Greg Kroah-Hartman
2018-12-04 10:48 ` [PATCH 4.14 030/146] btrfs: Check if item pointer overlaps with the item itself Greg Kroah-Hartman
2018-12-04 10:48 ` [PATCH 4.14 031/146] btrfs: Add sanity check for EXTENT_DATA when reading out leaf Greg Kroah-Hartman
2018-12-04 10:48 ` [PATCH 4.14 032/146] btrfs: Add checker for EXTENT_CSUM Greg Kroah-Hartman
2018-12-04 10:48 ` [PATCH 4.14 033/146] btrfs: Move leaf and node validation checker to tree-checker.c Greg Kroah-Hartman
2018-12-04 10:48 ` [PATCH 4.14 034/146] btrfs: tree-checker: Enhance btrfs_check_node output Greg Kroah-Hartman
2018-12-04 10:48 ` [PATCH 4.14 035/146] btrfs: tree-checker: Fix false panic for sanity test Greg Kroah-Hartman
2018-12-04 10:48 ` [PATCH 4.14 036/146] btrfs: tree-checker: Add checker for dir item Greg Kroah-Hartman
2018-12-04 10:48 ` [PATCH 4.14 037/146] btrfs: tree-checker: use %zu format string for size_t Greg Kroah-Hartman
2018-12-04 10:48 ` [PATCH 4.14 038/146] btrfs: tree-check: reduce stack consumption in check_dir_item Greg Kroah-Hartman
2018-12-04 10:48 ` [PATCH 4.14 039/146] btrfs: tree-checker: Verify block_group_item Greg Kroah-Hartman
2018-12-04 10:48 ` [PATCH 4.14 040/146] btrfs: tree-checker: Detect invalid and empty essential trees Greg Kroah-Hartman
2018-12-04 10:48 ` [PATCH 4.14 041/146] btrfs: Check that each block group has corresponding chunk at mount time Greg Kroah-Hartman
2018-12-04 10:48 ` [PATCH 4.14 042/146] btrfs: tree-checker: Check level for leaves and nodes Greg Kroah-Hartman
2018-12-04 10:48 ` [PATCH 4.14 043/146] btrfs: tree-checker: Fix misleading group system information Greg Kroah-Hartman
2018-12-04 10:48 ` [PATCH 4.14 044/146] f2fs: check blkaddr more accuratly before issue a bio Greg Kroah-Hartman
2018-12-04 10:48 ` [PATCH 4.14 045/146] f2fs: sanity check on sit entry Greg Kroah-Hartman
2018-12-04 10:48 ` [PATCH 4.14 046/146] f2fs: enhance sanity_check_raw_super() to avoid potential overflow Greg Kroah-Hartman
2018-12-04 10:48 ` [PATCH 4.14 047/146] f2fs: clean up with is_valid_blkaddr() Greg Kroah-Hartman
2018-12-04 10:48 ` [PATCH 4.14 048/146] f2fs: introduce and spread verify_blkaddr Greg Kroah-Hartman
2018-12-04 10:48 ` [PATCH 4.14 049/146] f2fs: fix to do sanity check with secs_per_zone Greg Kroah-Hartman
2018-12-04 10:48 ` [PATCH 4.14 050/146] f2fs: Add sanity_check_inode() function Greg Kroah-Hartman
2018-12-04 10:48 ` [PATCH 4.14 051/146] f2fs: fix to do sanity check with extra_attr feature Greg Kroah-Hartman
2018-12-04 10:48 ` [PATCH 4.14 052/146] f2fs: fix to do sanity check with user_block_count Greg Kroah-Hartman
2018-12-04 10:48 ` [PATCH 4.14 053/146] f2fs: fix to do sanity check with node footer and iblocks Greg Kroah-Hartman
2018-12-04 10:49 ` [PATCH 4.14 054/146] f2fs: fix to do sanity check with block address in main area Greg Kroah-Hartman
2018-12-04 20:27   ` Sudip Mukherjee
2018-12-05  6:59     ` Greg Kroah-Hartman
2018-12-04 10:49 ` [PATCH 4.14 055/146] f2fs: fix to do sanity check with i_extra_isize Greg Kroah-Hartman
2018-12-04 10:49 ` [PATCH 4.14 056/146] f2fs: fix to do sanity check with cp_pack_start_sum Greg Kroah-Hartman
2018-12-04 10:49 ` [PATCH 4.14 057/146] xfs: dont fail when converting shortform attr to long form during ATTR_REPLACE Greg Kroah-Hartman
2018-12-04 10:49 ` [PATCH 4.14 058/146] Revert "wlcore: Add missing PM call for wlcore_cmd_wait_for_event_or_timeout()" Greg Kroah-Hartman
2018-12-04 10:49 ` [PATCH 4.14 059/146] net: skb_scrub_packet(): Scrub offload_fwd_mark Greg Kroah-Hartman
2018-12-04 10:49 ` [PATCH 4.14 060/146] net: thunderx: set xdp_prog to NULL if bpf_prog_add fails Greg Kroah-Hartman
2018-12-04 10:49 ` [PATCH 4.14 061/146] virtio-net: disable guest csum during XDP set Greg Kroah-Hartman
2018-12-04 10:49 ` [PATCH 4.14 062/146] virtio-net: fail XDP set if guest csum is negotiated Greg Kroah-Hartman
2018-12-04 10:49 ` [PATCH 4.14 063/146] net: thunderx: set tso_hdrs pointer to NULL in nicvf_free_snd_queue Greg Kroah-Hartman
2018-12-04 10:49 ` [PATCH 4.14 064/146] packet: copy user buffers before orphan or clone Greg Kroah-Hartman
2018-12-04 10:49 ` [PATCH 4.14 065/146] rapidio/rionet: do not free skb before reading its length Greg Kroah-Hartman
2018-12-04 10:49 ` [PATCH 4.14 066/146] s390/qeth: fix length check in SNMP processing Greg Kroah-Hartman
2018-12-04 10:49 ` [PATCH 4.14 067/146] usbnet: ipheth: fix potential recvmsg bug and recvmsg bug 2 Greg Kroah-Hartman
2018-12-04 10:49 ` [PATCH 4.14 068/146] sched/core: Fix cpu.max vs. cpuhotplug deadlock Greg Kroah-Hartman
2018-12-04 10:49 ` [PATCH 4.14 069/146] x86/bugs: Add AMDs variant of SSB_NO Greg Kroah-Hartman
2018-12-04 10:49 ` [PATCH 4.14 070/146] x86/bugs: Add AMDs SPEC_CTRL MSR usage Greg Kroah-Hartman
2018-12-04 10:49 ` [PATCH 4.14 071/146] x86/bugs: Switch the selection of mitigation from CPU vendor to CPU features Greg Kroah-Hartman
2018-12-04 10:49 ` [PATCH 4.14 072/146] x86/bugs: Update when to check for the LS_CFG SSBD mitigation Greg Kroah-Hartman
2018-12-04 10:49 ` [PATCH 4.14 073/146] x86/bugs: Fix the AMD SSBD usage of the SPEC_CTRL MSR Greg Kroah-Hartman
2018-12-04 10:49 ` [PATCH 4.14 074/146] x86/speculation: Enable cross-hyperthread spectre v2 STIBP mitigation Greg Kroah-Hartman
2018-12-04 10:49 ` [PATCH 4.14 075/146] x86/speculation: Apply IBPB more strictly to avoid cross-process data leak Greg Kroah-Hartman
2018-12-04 10:49 ` [PATCH 4.14 076/146] x86/speculation: Propagate information about RSB filling mitigation to sysfs Greg Kroah-Hartman
2018-12-04 10:49 ` [PATCH 4.14 077/146] x86/speculation: Add RETPOLINE_AMD support to the inline asm CALL_NOSPEC variant Greg Kroah-Hartman
2018-12-04 10:49 ` [PATCH 4.14 078/146] x86/retpoline: Make CONFIG_RETPOLINE depend on compiler support Greg Kroah-Hartman
2018-12-04 10:49 ` [PATCH 4.14 079/146] x86/retpoline: Remove minimal retpoline support Greg Kroah-Hartman
2018-12-04 10:49 ` [PATCH 4.14 080/146] x86/speculation: Update the TIF_SSBD comment Greg Kroah-Hartman
2018-12-04 10:49 ` [PATCH 4.14 081/146] x86/speculation: Clean up spectre_v2_parse_cmdline() Greg Kroah-Hartman
2018-12-04 10:49 ` [PATCH 4.14 082/146] x86/speculation: Remove unnecessary ret variable in cpu_show_common() Greg Kroah-Hartman
2018-12-04 10:49 ` [PATCH 4.14 083/146] x86/speculation: Move STIPB/IBPB string conditionals out of cpu_show_common() Greg Kroah-Hartman
2018-12-04 10:49 ` [PATCH 4.14 084/146] x86/speculation: Disable STIBP when enhanced IBRS is in use Greg Kroah-Hartman
2018-12-04 10:49 ` [PATCH 4.14 085/146] x86/speculation: Rename SSBD update functions Greg Kroah-Hartman
2018-12-04 10:49 ` [PATCH 4.14 086/146] x86/speculation: Reorganize speculation control MSRs update Greg Kroah-Hartman
2018-12-04 10:49 ` [PATCH 4.14 087/146] sched/smt: Make sched_smt_present track topology Greg Kroah-Hartman
2018-12-04 10:49 ` [PATCH 4.14 088/146] x86/Kconfig: Select SCHED_SMT if SMP enabled Greg Kroah-Hartman
2018-12-04 10:49 ` [PATCH 4.14 089/146] sched/smt: Expose sched_smt_present static key Greg Kroah-Hartman
2018-12-04 10:49 ` [PATCH 4.14 090/146] x86/speculation: Rework SMT state change Greg Kroah-Hartman
2018-12-04 10:49 ` [PATCH 4.14 091/146] x86/l1tf: Show actual SMT state Greg Kroah-Hartman
2018-12-04 10:49 ` [PATCH 4.14 092/146] x86/speculation: Reorder the spec_v2 code Greg Kroah-Hartman
2018-12-04 10:49 ` [PATCH 4.14 093/146] x86/speculation: Mark string arrays const correctly Greg Kroah-Hartman
2018-12-04 10:49 ` [PATCH 4.14 094/146] x86/speculataion: Mark command line parser data __initdata Greg Kroah-Hartman
2018-12-04 10:49 ` [PATCH 4.14 095/146] x86/speculation: Unify conditional spectre v2 print functions Greg Kroah-Hartman
2018-12-04 10:49 ` [PATCH 4.14 096/146] x86/speculation: Add command line control for indirect branch speculation Greg Kroah-Hartman
2018-12-04 10:49 ` [PATCH 4.14 097/146] x86/speculation: Prepare for per task indirect branch speculation control Greg Kroah-Hartman
2018-12-04 10:49 ` [PATCH 4.14 098/146] x86/process: Consolidate and simplify switch_to_xtra() code Greg Kroah-Hartman
2018-12-04 11:14   ` Thomas Gleixner
2018-12-04 13:39     ` Greg Kroah-Hartman
2018-12-04 10:49 ` [PATCH 4.14 099/146] x86/speculation: Avoid __switch_to_xtra() calls Greg Kroah-Hartman
2018-12-04 10:49 ` [PATCH 4.14 100/146] x86/speculation: Prepare for conditional IBPB in switch_mm() Greg Kroah-Hartman
2018-12-04 10:49 ` [PATCH 4.14 101/146] ptrace: Remove unused ptrace_may_access_sched() and MODE_IBRS Greg Kroah-Hartman
2018-12-04 10:49 ` [PATCH 4.14 102/146] x86/speculation: Split out TIF update Greg Kroah-Hartman
2018-12-04 10:49 ` [PATCH 4.14 103/146] x86/speculation: Prevent stale SPEC_CTRL msr content Greg Kroah-Hartman
2018-12-04 10:49 ` [PATCH 4.14 104/146] x86/speculation: Prepare arch_smt_update() for PRCTL mode Greg Kroah-Hartman
2018-12-04 10:49 ` [PATCH 4.14 105/146] x86/speculation: Add prctl() control for indirect branch speculation Greg Kroah-Hartman
2018-12-04 10:49 ` [PATCH 4.14 106/146] x86/speculation: Enable prctl mode for spectre_v2_user Greg Kroah-Hartman
2018-12-04 10:49 ` [PATCH 4.14 107/146] x86/speculation: Add seccomp Spectre v2 user space protection mode Greg Kroah-Hartman
2018-12-04 10:49 ` [PATCH 4.14 108/146] x86/speculation: Provide IBPB always command line options Greg Kroah-Hartman
2018-12-04 10:49 ` [PATCH 4.14 109/146] kvm: mmu: Fix race in emulated page table writes Greg Kroah-Hartman
2018-12-04 10:49 ` [PATCH 4.14 110/146] kvm: svm: Ensure an IBPB on all affected CPUs when freeing a vmcb Greg Kroah-Hartman
2018-12-04 10:49 ` [PATCH 4.14 111/146] KVM: x86: Fix kernel info-leak in KVM_HC_CLOCK_PAIRING hypercall Greg Kroah-Hartman
2018-12-04 10:49 ` [PATCH 4.14 112/146] KVM: X86: Fix scan ioapic use-before-initialization Greg Kroah-Hartman
2018-12-04 10:49 ` [PATCH 4.14 113/146] xtensa: enable coprocessors that are being flushed Greg Kroah-Hartman
2018-12-04 10:50 ` [PATCH 4.14 114/146] xtensa: fix coprocessor context offset definitions Greg Kroah-Hartman
2018-12-04 10:50 ` [PATCH 4.14 115/146] xtensa: fix coprocessor part of ptrace_{get,set}xregs Greg Kroah-Hartman
2018-12-04 10:50 ` [PATCH 4.14 116/146] Btrfs: ensure path name is null terminated at btrfs_control_ioctl Greg Kroah-Hartman
2018-12-04 10:50 ` [PATCH 4.14 117/146] btrfs: relocation: set trans to be NULL after ending transaction Greg Kroah-Hartman
2018-12-04 10:50 ` [PATCH 4.14 118/146] PCI: layerscape: Fix wrong invocation of outbound window disable accessor Greg Kroah-Hartman
2018-12-04 10:50 ` [PATCH 4.14 119/146] arm64: dts: rockchip: Fix PCIe reset polarity for rk3399-puma-haikou Greg Kroah-Hartman
2018-12-04 10:50 ` [PATCH 4.14 120/146] x86/MCE/AMD: Fix the thresholding machinery initialization order Greg Kroah-Hartman
2018-12-04 10:50 ` [PATCH 4.14 121/146] x86/fpu: Disable bottom halves while loading FPU registers Greg Kroah-Hartman
2018-12-05 16:26   ` Jari Ruusu
2018-12-05 19:00     ` Borislav Petkov
2018-12-06 10:54       ` Greg Kroah-Hartman
2018-12-21 16:23         ` [PATCH v4.9 STABLE] " Sebastian Andrzej Siewior
2018-12-21 16:29           ` Greg Kroah-Hartman
2018-12-21 16:38             ` Sebastian Andrzej Siewior
2018-12-04 10:50 ` [PATCH 4.14 122/146] perf/x86/intel: Move branch tracing setup to the Intel-specific source file Greg Kroah-Hartman
2018-12-04 10:50 ` [PATCH 4.14 123/146] perf/x86/intel: Add generic branch tracing check to intel_pmu_has_bts() Greg Kroah-Hartman
2018-12-04 10:50 ` [PATCH 4.14 124/146] fs: fix lost error code in dio_complete Greg Kroah-Hartman
2018-12-04 10:50 ` [PATCH 4.14 125/146] ALSA: wss: Fix invalid snd_free_pages() at error path Greg Kroah-Hartman
2018-12-04 10:50 ` [PATCH 4.14 126/146] ALSA: ac97: Fix incorrect bit shift at AC97-SPSA control write Greg Kroah-Hartman
2018-12-04 10:50 ` [PATCH 4.14 127/146] ALSA: control: Fix race between adding and removing a user element Greg Kroah-Hartman
2018-12-04 10:50 ` [PATCH 4.14 128/146] ALSA: sparc: Fix invalid snd_free_pages() at error path Greg Kroah-Hartman
2018-12-04 10:50 ` [PATCH 4.14 129/146] ALSA: hda/realtek - Support ALC300 Greg Kroah-Hartman
2018-12-04 10:50 ` [PATCH 4.14 130/146] ALSA: hda/realtek - fix headset mic detection for MSI MS-B171 Greg Kroah-Hartman
2018-12-04 10:50 ` [PATCH 4.14 131/146] ext2: fix potential use after free Greg Kroah-Hartman
2018-12-04 10:50 ` [PATCH 4.14 132/146] ARM: dts: rockchip: Remove @0 from the veyron memory node Greg Kroah-Hartman
2018-12-04 10:50 ` [PATCH 4.14 133/146] dmaengine: at_hdmac: fix memory leak in at_dma_xlate() Greg Kroah-Hartman
2018-12-04 10:50 ` [PATCH 4.14 134/146] dmaengine: at_hdmac: fix module unloading Greg Kroah-Hartman
2018-12-04 10:50 ` [PATCH 4.14 135/146] btrfs: release metadata before running delayed refs Greg Kroah-Hartman
2018-12-04 10:50 ` [PATCH 4.14 136/146] staging: vchiq_arm: fix compat VCHIQ_IOC_AWAIT_COMPLETION Greg Kroah-Hartman
2018-12-04 10:50 ` [PATCH 4.14 137/146] staging: rtl8723bs: Add missing return for cfg80211_rtw_get_station Greg Kroah-Hartman
2018-12-04 10:50 ` [PATCH 4.14 138/146] USB: usb-storage: Add new IDs to ums-realtek Greg Kroah-Hartman
2018-12-04 10:50 ` [PATCH 4.14 139/146] usb: core: quirks: add RESET_RESUME quirk for Cherry G230 Stream series Greg Kroah-Hartman
2018-12-04 10:50 ` [PATCH 4.14 140/146] Revert "usb: dwc3: gadget: skip Set/Clear Halt when invalid" Greg Kroah-Hartman
2018-12-04 10:50 ` [PATCH 4.14 141/146] iio:st_magn: Fix enable device after trigger Greg Kroah-Hartman
2018-12-04 10:50 ` [PATCH 4.14 142/146] lib/test_kmod.c: fix rmmod double free Greg Kroah-Hartman
2018-12-04 10:50 ` [PATCH 4.14 143/146] mm: use swp_offset as key in shmem_replace_page() Greg Kroah-Hartman
2018-12-04 10:50 ` [PATCH 4.14 144/146] Drivers: hv: vmbus: check the creation_status in vmbus_establish_gpadl() Greg Kroah-Hartman
2018-12-04 10:50 ` [PATCH 4.14 145/146] misc: mic/scif: fix copy-paste error in scif_create_remote_lookup Greg Kroah-Hartman
2018-12-04 10:50 ` [PATCH 4.14 146/146] binder: fix race that allows malicious free of live buffer Greg Kroah-Hartman
2018-12-04 17:32 ` [PATCH 4.14 000/146] 4.14.86-stable review kernelci.org bot
2018-12-04 21:42 ` Guenter Roeck
2018-12-05  5:18 ` Naresh Kamboju
2018-12-05  7:41   ` Greg Kroah-Hartman
2018-12-05  9:31 ` Jon Hunter
2018-12-05  9:45   ` Greg Kroah-Hartman
2018-12-05 23:53 ` shuah

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).