* [PATCH 4.14 000/126] 4.14.71-stable review
@ 2018-09-17 22:40 Greg Kroah-Hartman
  2018-09-17 22:40 ` [PATCH 4.14 001/126] i2c: xiic: Make the start and the byte count write atomic Greg Kroah-Hartman
                   ` (128 more replies)
  0 siblings, 129 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:40 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, torvalds, akpm, linux, shuah, patches,
	ben.hutchings, lkft-triage, stable

This is the start of the stable review cycle for the 4.14.71 release.
There are 126 patches in this series; all will be posted as a response
to this one.  If anyone has any issues with these being applied, please
let me know.

Responses should be made by Wed Sep 19 21:16:12 UTC 2018.
Anything received after that time might be too late.

The whole patch series can be found in one patch at:
	https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.14.71-rc1.gz
or in the git tree and branch at:
	git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.14.y
and the diffstat can be found below.

thanks,

greg k-h

-------------
Pseudo-Shortlog of commits:

Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    Linux 4.14.71-rc1

Linus Torvalds <torvalds@linux-foundation.org>
    mm: get rid of vmacache_flush_all() entirely

Ian Kent <raven@themaw.net>
    autofs: fix autofs_sbi() does not check super block type

Jason Wang <jasowang@redhat.com>
    tuntap: fix use after free during release

Jason Wang <jasowang@redhat.com>
    tun: fix use after free for ptr_ring

Wei Yongjun <weiyongjun1@huawei.com>
    mtd: ubi: wl: Fix error return code in ubi_wl_init()

Taehee Yoo <ap420073@gmail.com>
    ip: frags: fix crash in ip_do_fragment()

Peter Oskolkov <posk@google.com>
    ip: process in-order fragments efficiently

Peter Oskolkov <posk@google.com>
    ip: add helpers to process in-order fragments faster.

Dan Carpenter <dan.carpenter@oracle.com>
    ipv4: frags: precedence bug in ip_expire()

Eric Dumazet <edumazet@google.com>
    net: sk_buff rbnode reorg

Eric Dumazet <edumazet@google.com>
    net: add rb_to_skb() and other rb tree helpers

Eric Dumazet <edumazet@google.com>
    net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends

Florian Westphal <fw@strlen.de>
    ipv6: defrag: drop non-last frags smaller than min mtu

Peter Oskolkov <posk@google.com>
    net: modify skb_rbtree_purge to return the truesize of all purged skbs.

Eric Dumazet <edumazet@google.com>
    net: speed up skb_rbtree_purge()

Peter Oskolkov <posk@google.com>
    ip: discard IPv4 datagrams with overlapping segments.

Eric Dumazet <edumazet@google.com>
    inet: frags: fix ip6frag_low_thresh boundary

Eric Dumazet <edumazet@google.com>
    inet: frags: get rid of ipfrag_skb_cb/FRAG_CB

Eric Dumazet <edumazet@google.com>
    inet: frags: reorganize struct netns_frags

Eric Dumazet <edumazet@google.com>
    rhashtable: reorganize struct rhashtable layout

Eric Dumazet <edumazet@google.com>
    ipv6: frags: rewrite ip6_expire_frag_queue()

Eric Dumazet <edumazet@google.com>
    inet: frags: do not clone skb in ip_expire()

Eric Dumazet <edumazet@google.com>
    inet: frags: break the 2GB limit for frags storage

Eric Dumazet <edumazet@google.com>
    inet: frags: remove inet_frag_maybe_warn_overflow()

Eric Dumazet <edumazet@google.com>
    inet: frags: get rif of inet_frag_evicting()

Eric Dumazet <edumazet@google.com>
    inet: frags: remove some helpers

Eric Dumazet <edumazet@google.com>
    inet: frags: use rhashtables for reassembly units

Eric Dumazet <edumazet@google.com>
    rhashtable: add schedule points

Eric Dumazet <edumazet@google.com>
    ipv6: export ip6 fragments sysctl to unprivileged users

Eric Dumazet <edumazet@google.com>
    inet: frags: refactor lowpan_net_frag_init()

Eric Dumazet <edumazet@google.com>
    inet: frags: refactor ipv6_frag_init()

Kees Cook <keescook@chromium.org>
    inet: frags: Convert timers to use timer_setup()

Eric Dumazet <edumazet@google.com>
    inet: frags: refactor ipfrag_init()

Eric Dumazet <edumazet@google.com>
    inet: frags: add a pointer to struct netns_frags

Eric Dumazet <edumazet@google.com>
    inet: frags: change inet_frags_init_net() return value

Jani Nikula <jani.nikula@intel.com>
    drm/i915: set DP Main Stream Attribute for color range on DDI platforms

Parav Pandit <parav@mellanox.com>
    RDMA/cma: Do not ignore net namespace for unbound cm_id

Paul Burton <paul.burton@mips.com>
    MIPS: WARN_ON invalid DMA cache maintenance, not BUG_ON

Trond Myklebust <trond.myklebust@hammerspace.com>
    NFSv4.1: Fix a potential layoutget/layoutrecall deadlock

Chao Yu <yuchao0@huawei.com>
    f2fs: fix to do sanity check with {sit,nat}_ver_bitmap_bytesize

Zumeng Chen <zumeng.chen@gmail.com>
    mfd: ti_am335x_tscadc: Fix struct clk memory leak

Geert Uytterhoeven <geert+renesas@glider.be>
    iommu/ipmmu-vmsa: Fix allocation in atomic context

Dan Carpenter <dan.carpenter@oracle.com>
    f2fs: Fix uninitialized return in f2fs_ioc_shutdown()

Chao Yu <yuchao0@huawei.com>
    f2fs: fix to wait on page writeback before updating page

Katsuhiro Suzuki <suzuki.katsuhiro@socionext.com>
    media: helene: fix xtal frequency setting at power on

Mauricio Faria de Oliveira <mfo@canonical.com>
    partitions/aix: fix usage of uninitialized lv_info and lvname structures

Mauricio Faria de Oliveira <mfo@canonical.com>
    partitions/aix: append null character to print data from disk

Sylwester Nawrocki <s.nawrocki@samsung.com>
    media: s5p-mfc: Fix buffer look up in s5p_mfc_handle_frame_{new, copy_time} functions

Nick Dyer <nick.dyer@itdev.co.uk>
    Input: atmel_mxt_ts - only use first T9 instance

John Pittman <jpittman@redhat.com>
    dm cache: only allow a single io_mode cache feature to be requested

Petr Machata <petrm@mellanox.com>
    net: dcb: For wild-card lookups, use priority -1, not 0

Nicholas Mc Guire <hofrat@osadl.org>
    MIPS: generic: fix missing of_node_put()

Nicholas Mc Guire <hofrat@osadl.org>
    MIPS: Octeon: add missing of_node_put()

Chao Yu <yuchao0@huawei.com>
    f2fs: fix to do sanity check with reserved blkaddr of inline inode

Peter Rosin <peda@axentia.se>
    tpm/tpm_i2c_infineon: switch to i2c_lock_bus(..., I2C_LOCK_SEGMENT)

Linus Walleij <linus.walleij@linaro.org>
    tpm_tis_spi: Pass the SPI IRQ down to the driver

Chao Yu <yuchao0@huawei.com>
    f2fs: fix to skip GC if type in SSA and SIT is inconsistent

Jinbum Park <jinb.park7@gmail.com>
    pktcdvd: Fix possible Spectre-v1 for pkt_devs

Chao Yu <yuchao0@huawei.com>
    f2fs: try grabbing node page lock aggressively in sync scenario

Yelena Krivosheev <yelena@marvell.com>
    net: mvneta: fix mtu change on port without link

Daniel Kurtz <djkurtz@chromium.org>
    pinctrl/amd: only handle irq if it is pending and unmasked

Anton Vasilyev <vasilyev@ispras.ru>
    gpio: ml-ioh: Fix buffer underwrite on probe error path

Dan Carpenter <dan.carpenter@oracle.com>
    pinctrl: imx: off by one in imx_pinconf_group_dbg_show()

Joerg Roedel <jroedel@suse.de>
    x86/mm: Remove in_nmi() warning from vmalloc_fault()

Marcel Holtmann <marcel@holtmann.org>
    Bluetooth: hidp: Fix handling of strncpy for hid->name information

Surabhi Vishnoi <svishnoi@codeaurora.org>
    ath10k: disable bundle mgmt tx completion event support

Huaisheng Ye <yehs1@lenovo.com>
    tools/testing/nvdimm: kaddr and pfn can be NULL to ->direct_access()

Anton Vasilyev <vasilyev@ispras.ru>
    scsi: 3ware: fix return 0 on the error path of probe

Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
    ata: libahci: Correct setting of DEVSLP register

Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
    ata: libahci: Allow reconfigure of DEVSLP register

Paul Burton <paul.burton@mips.com>
    MIPS: Fix ISA virt/bus conversion for non-zero PHYS_OFFSET

Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
    rpmsg: core: add support to power domains for devices

Loic Poulain <loic.poulain@linaro.org>
    wlcore: Set rx_status boottime_ns field on rx

Sven Eckelmann <sven.eckelmann@openmesh.com>
    ath10k: prevent active scans on potential unusable channels

Felix Fietkau <nbd@nbd.name>
    ath9k_hw: fix channel maximum power level test

Felix Fietkau <nbd@nbd.name>
    ath9k: report tx status on EOSP

Finn Thain <fthain@telegraphics.com.au>
    macintosh/via-pmu: Add missing mmio accessors

Kan Liang <kan.liang@linux.intel.com>
    perf evlist: Fix error out while applying initial delay and LBR

Jiri Olsa <jolsa@kernel.org>
    perf c2c report: Fix crash for empty browser

Olga Kornievskaia <kolga@netapp.com>
    NFSv4.0 fix client reference leak in callback

Christophe Leroy <christophe.leroy@c-s.fr>
    perf tools: Allow overriding MAX_NR_CPUS at compile time

Randy Dunlap <rdunlap@infradead.org>
    f2fs: fix defined but not used build warnings

Yunlong Song <yunlong.song@huawei.com>
    f2fs: do not set free of current section

Chao Yu <yuchao0@huawei.com>
    f2fs: fix to active page in lru list for read path

Anton Vasilyev <vasilyev@ispras.ru>
    tty: rocket: Fix possible buffer overwrite on register_PCI

Michael Kelley <mikelley@microsoft.com>
    Drivers: hv: vmbus: Cleanup synic memory free path

Anton Vasilyev <vasilyev@ispras.ru>
    firmware: vpd: Fix section enabled flag on vpd_section_destroy

Dan Carpenter <dan.carpenter@oracle.com>
    uio: potential double frees if __uio_register_device() fails

Anton Vasilyev <vasilyev@ispras.ru>
    misc: ti-st: Fix memory leak in the error path of probe()

Philipp Zabel <p.zabel@pengutronix.de>
    gpu: ipu-v3: default to id 0 on missing OF alias

Todor Tomov <todor.tomov@linaro.org>
    media: camss: csid: Configure data type and decode format properly

Gaurav Kohli <gkohli@codeaurora.org>
    timers: Clear timer_base::must_forward_clk with timer_base::lock held

BingJing Chang <bingjingc@synology.com>
    md/raid5: fix data corruption of replacements after originals dropped

Mike Christie <mchristi@redhat.com>
    scsi: target: fix __transport_register_session locking

Ming Lei <ming.lei@redhat.com>
    blk-mq: fix updating tags depth

Arun Parameswaran <arun.parameswaran@broadcom.com>
    net: phy: Fix the register offsets in Broadcom iProc mdio mux driver

Anton Vasilyev <vasilyev@ispras.ru>
    media: dw2102: Fix memleak on sequence of probes

Anton Vasilyev <vasilyev@ispras.ru>
    media: davinci: vpif_display: Mix memory leak on probe error path

Roman Gushchin <guro@fb.com>
    selftests/bpf: fix a typo in map in map test

Reza Arbab <arbab@linux.ibm.com>
    powerpc/powernv: Fix concurrency issue with npu->mmio_atsd_usage

Dmitry Osipenko <digetx@gmail.com>
    gpio: tegra: Move driver registration to subsys_init level

Johan Hedberg <johan.hedberg@intel.com>
    Bluetooth: h5: Fix missing dependency on BT_HCIUART_SERDEV

Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
    i2c: aspeed: Add an explicit type casting for *get_clk_reg_val

Florian Fainelli <f.fainelli@gmail.com>
    ethtool: Remove trailing semicolon for static inline

Dan Carpenter <dan.carpenter@oracle.com>
    misc: mic: SCIF Fix scif_get_new_port() error handling

Alexey Brodkin <abrodkin@synopsys.com>
    ARC: [plat-axs*]: Enable SWAP

Tomas Winkler <tomas.winkler@intel.com>
    tpm: separate cmd_ready/go_idle from runtime_pm

Arnd Bergmann <arnd@arndb.de>
    crypto: aes-generic - fix aes-generic regression on powerpc

Gustavo A. R. Silva <gustavo@embeddedor.com>
    switchtec: Fix Spectre v1 vulnerability

Filippo Sironi <sironi@amazon.de>
    x86/microcode: Update the new microcode revision unconditionally

Prarit Bhargava <prarit@redhat.com>
    x86/microcode: Make sure boot_cpu_data.microcode is up-to-date

Thomas Gleixner <tglx@linutronix.de>
    cpu/hotplug: Prevent state corruption on error rollback

Neeraj Upadhyay <neeraju@codeaurora.org>
    cpu/hotplug: Adjust misplaced smb() in cpuhp_thread_fun()

Takashi Iwai <tiwai@suse.de>
    ALSA: hda - Fix cancel_work_sync() stall from jackpoll work

Sean Christopherson <sean.j.christopherson@intel.com>
    KVM: VMX: Do not allow reexecute_instruction() when skipping MMIO instr

Pierre Morel <pmorel@linux.ibm.com>
    KVM: s390: vsie: copy wrapping keys to right place

Filipe Manana <fdmanana@suse.com>
    Btrfs: fix data corruption when deduplicating between different files

Steve French <stfrench@microsoft.com>
    smb3: check for and properly advertise directory lease support

Steve French <stfrench@microsoft.com>
    SMB3: Backup intent flag missing for directory opens with backupuid mounts

Paul Burton <paul.burton@mips.com>
    MIPS: VDSO: Match data page cache colouring when D$ aliases

Minchan Kim <minchan@kernel.org>
    android: binder: fix the race mmap and alloc_new_buf_locked

Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
    block: bfq: swap puts in bfqg_and_blkg_put

Jens Axboe <axboe@kernel.dk>
    nbd: don't allow invalid blocksize settings

James Smart <jsmart2021@gmail.com>
    scsi: lpfc: Correct MDS diag and nvmet configuration

Felipe Balbi <felipe.balbi@linux.intel.com>
    i2c: i801: fix DNV's SMBCTRL register offset

Shubhrajyoti Datta <shubhrajyoti.datta@xilinx.com>
    i2c: xiic: Make the start and the byte count write atomic


-------------

Diffstat:

 Documentation/networking/ip-sysctl.txt             |  13 +-
 Makefile                                           |   4 +-
 arch/arc/configs/axs101_defconfig                  |   1 -
 arch/arc/configs/axs103_defconfig                  |   1 -
 arch/arc/configs/axs103_smp_defconfig              |   1 -
 arch/mips/cavium-octeon/octeon-platform.c          |   2 +
 arch/mips/generic/init.c                           |   1 +
 arch/mips/include/asm/io.h                         |   8 +-
 arch/mips/kernel/vdso.c                            |  20 +
 arch/mips/mm/c-r4k.c                               |   6 +-
 arch/powerpc/platforms/powernv/npu-dma.c           |   5 +-
 arch/s390/kvm/vsie.c                               |   3 +-
 arch/x86/kernel/cpu/microcode/amd.c                |  24 +-
 arch/x86/kernel/cpu/microcode/intel.c              |  17 +-
 arch/x86/kvm/vmx.c                                 |   4 +-
 arch/x86/mm/fault.c                                |   2 -
 block/bfq-cgroup.c                                 |   4 +-
 block/blk-mq-tag.c                                 |   8 +-
 block/partitions/aix.c                             |  13 +-
 crypto/Makefile                                    |   2 +-
 drivers/android/binder_alloc.c                     |  42 +-
 drivers/ata/libahci.c                              |  20 +-
 drivers/block/nbd.c                                |   3 +
 drivers/block/pktcdvd.c                            |   4 +-
 drivers/bluetooth/Kconfig                          |   1 +
 drivers/char/tpm/tpm-interface.c                   |  50 +-
 drivers/char/tpm/tpm.h                             |  12 +-
 drivers/char/tpm/tpm2-space.c                      |  16 +-
 drivers/char/tpm/tpm_crb.c                         | 101 +---
 drivers/char/tpm/tpm_i2c_infineon.c                |   8 +-
 drivers/char/tpm/tpm_tis_spi.c                     |   9 +-
 drivers/firmware/google/vpd.c                      |   5 +-
 drivers/gpio/gpio-ml-ioh.c                         |   3 +-
 drivers/gpio/gpio-tegra.c                          |   2 +-
 drivers/gpu/drm/i915/i915_reg.h                    |   1 +
 drivers/gpu/drm/i915/intel_ddi.c                   |   4 +
 drivers/gpu/ipu-v3/ipu-common.c                    |   2 +
 drivers/hv/hv.c                                    |  14 +-
 drivers/i2c/busses/i2c-aspeed.c                    |   2 +-
 drivers/i2c/busses/i2c-i801.c                      |   7 +-
 drivers/i2c/busses/i2c-xiic.c                      |   4 +
 drivers/infiniband/core/cma.c                      |  13 +-
 drivers/input/touchscreen/atmel_mxt_ts.c           |   7 +-
 drivers/iommu/ipmmu-vmsa.c                         |   9 +-
 drivers/macintosh/via-pmu.c                        |   9 +-
 drivers/md/dm-cache-target.c                       |  19 +-
 drivers/md/raid5.c                                 |   6 +
 drivers/media/dvb-frontends/helene.c               |   5 +-
 drivers/media/platform/davinci/vpif_display.c      |  24 +-
 .../media/platform/qcom/camss-8x16/camss-csid.c    |  16 +-
 drivers/media/platform/s5p-mfc/s5p_mfc.c           |  23 +-
 drivers/media/usb/dvb-usb/dw2102.c                 |  19 +-
 drivers/mfd/ti_am335x_tscadc.c                     |   3 +-
 drivers/misc/mic/scif/scif_api.c                   |  20 +-
 drivers/misc/ti-st/st_kim.c                        |   4 +-
 drivers/mtd/ubi/wl.c                               |   8 +-
 drivers/net/ethernet/marvell/mvneta.c              |   1 -
 drivers/net/phy/mdio-mux-bcm-iproc.c               |  20 +-
 drivers/net/tun.c                                  |  21 +-
 drivers/net/wireless/ath/ath10k/mac.c              |   7 +
 drivers/net/wireless/ath/ath10k/wmi-tlv.c          |   5 +
 drivers/net/wireless/ath/ath10k/wmi-tlv.h          |   5 +
 drivers/net/wireless/ath/ath9k/hw.c                |   7 +-
 drivers/net/wireless/ath/ath9k/xmit.c              |   3 +-
 drivers/net/wireless/ti/wlcore/rx.c                |   8 +-
 drivers/pci/switch/switchtec.c                     |   4 +
 drivers/pinctrl/freescale/pinctrl-imx.c            |   2 +-
 drivers/pinctrl/pinctrl-amd.c                      |   3 +-
 drivers/rpmsg/rpmsg_core.c                         |   7 +
 drivers/scsi/3w-9xxx.c                             |   6 +-
 drivers/scsi/3w-sas.c                              |   3 +
 drivers/scsi/3w-xxxx.c                             |   2 +
 drivers/scsi/lpfc/lpfc.h                           |   2 +-
 drivers/target/target_core_transport.c             |   5 +-
 drivers/tty/rocket.c                               |   2 +-
 drivers/uio/uio.c                                  |   3 +-
 fs/autofs4/autofs_i.h                              |   4 +-
 fs/autofs4/inode.c                                 |   1 -
 fs/btrfs/ioctl.c                                   |  19 +
 fs/cifs/inode.c                                    |   2 +
 fs/cifs/smb2ops.c                                  |  35 +-
 fs/cifs/smb2pdu.c                                  |   3 +
 fs/f2fs/f2fs.h                                     |   7 +-
 fs/f2fs/file.c                                     |   2 +-
 fs/f2fs/gc.c                                       |   8 +-
 fs/f2fs/inline.c                                   |  22 +
 fs/f2fs/node.c                                     |   4 +-
 fs/f2fs/segment.h                                  |   3 +
 fs/f2fs/super.c                                    |  21 +-
 fs/f2fs/sysfs.c                                    |  10 +-
 fs/nfs/callback_proc.c                             |   4 +-
 fs/nfs/callback_xdr.c                              |  11 +-
 include/linux/mm_types.h                           |   2 +-
 include/linux/mm_types_task.h                      |   2 +-
 include/linux/rhashtable.h                         |   8 +-
 include/linux/skbuff.h                             |  50 +-
 include/linux/tpm.h                                |   2 +
 include/linux/vm_event_item.h                      |   1 -
 include/linux/vmacache.h                           |   5 -
 include/net/inet_frag.h                            | 135 +++--
 include/net/ip.h                                   |   1 -
 include/net/ipv6.h                                 |  26 +-
 include/uapi/linux/ethtool.h                       |   4 +-
 include/uapi/linux/snmp.h                          |   1 +
 kernel/cpu.c                                       |  11 +-
 kernel/time/timer.c                                |  29 +-
 lib/rhashtable.c                                   |   2 +
 mm/debug.c                                         |   4 +-
 mm/vmacache.c                                      |  38 --
 net/bluetooth/hidp/core.c                          |   2 +-
 net/core/skbuff.c                                  |  31 +-
 net/dcb/dcbnl.c                                    |  11 +-
 net/ieee802154/6lowpan/6lowpan_i.h                 |  26 +-
 net/ieee802154/6lowpan/reassembly.c                | 153 +++---
 net/ipv4/inet_fragment.c                           | 378 +++-----------
 net/ipv4/ip_fragment.c                             | 578 ++++++++++++---------
 net/ipv4/proc.c                                    |   7 +-
 net/ipv4/tcp_fastopen.c                            |   8 +-
 net/ipv4/tcp_input.c                               |  33 +-
 net/ipv6/netfilter/nf_conntrack_reasm.c            | 105 ++--
 net/ipv6/proc.c                                    |   5 +-
 net/ipv6/reassembly.c                              | 217 ++++----
 net/sched/sch_netem.c                              |  14 +-
 sound/pci/hda/hda_codec.c                          |   3 +-
 tools/perf/builtin-c2c.c                           |   3 +
 tools/perf/perf.h                                  |   2 +
 tools/perf/util/evsel.c                            |  14 +
 tools/testing/nvdimm/pmem-dax.c                    |  12 +-
 tools/testing/selftests/bpf/test_verifier.c        |   6 +-
 129 files changed, 1473 insertions(+), 1362 deletions(-)




* [PATCH 4.14 001/126] i2c: xiic: Make the start and the byte count write atomic
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
@ 2018-09-17 22:40 ` Greg Kroah-Hartman
  2018-09-17 22:40 ` [PATCH 4.14 002/126] i2c: i801: fix DNVs SMBCTRL register offset Greg Kroah-Hartman
                   ` (127 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:40 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Shubhrajyoti Datta, Michal Simek,
	Wolfram Sang, stable

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Shubhrajyoti Datta <shubhrajyoti.datta@xilinx.com>

commit ae7304c3ea28a3ba47a7a8312c76c654ef24967e upstream.

Disable interrupts while configuring the transfer, then re-enable them.

The programming sequence is:
1. start and slave address
2. byte count and stop

On some customer platforms there were many interrupts between steps 1
and 2. If step 2 is not executed soon after the slave address (around
7 clock cycles), the transaction is NACKed.

To fix this case, make the two writes atomic.
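
The timing argument can be modelled in plain userspace C. This is only
a toy sketch, not driver code: struct bus_model, interrupt_arrives()
and run_transfer() are hypothetical names, the cycle accounting is an
assumption, and the window of 7 cycles is taken from the "around 7
clock cycles" figure in the text.

```c
#include <stdbool.h>

/* "around 7 clock cycles" per the commit text; the exact value is an assumption. */
#define NACK_WINDOW_CYCLES 7

/* Hypothetical model of the bus state between the two register writes. */
struct bus_model {
	bool irqs_enabled;
	unsigned int cycles_since_addr;
};

/* An interrupt only delays the caller while interrupts are enabled. */
static void interrupt_arrives(struct bus_model *b, unsigned int cost_cycles)
{
	if (b->irqs_enabled)
		b->cycles_since_addr += cost_cycles;
}

/*
 * Returns true if the modelled transfer is acknowledged.
 * disable_irqs stands in for wrapping both writes in
 * local_irq_save()/local_irq_restore().
 */
bool run_transfer(bool disable_irqs, unsigned int irq_cost_cycles)
{
	struct bus_model b = { .irqs_enabled = true };
	bool acked;

	if (disable_irqs)
		b.irqs_enabled = false;

	/* Step 1: start and slave address. */
	b.cycles_since_addr = 0;

	/* An interrupt fires between the two steps. */
	interrupt_arrives(&b, irq_cost_cycles);

	/* Step 2: byte count and stop; too late means a NACK. */
	acked = b.cycles_since_addr <= NACK_WINDOW_CYCLES;

	b.irqs_enabled = true;
	return acked;
}
```

In this model a long interrupt breaks the transfer only when the two
steps are not protected, which is the case the patch closes.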

Signed-off-by: Shubhrajyoti Datta <shubhrajyoti.datta@xilinx.com>
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
[wsa: added a newline for better readability]
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Cc: stable@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/i2c/busses/i2c-xiic.c |    4 ++++
 1 file changed, 4 insertions(+)

--- a/drivers/i2c/busses/i2c-xiic.c
+++ b/drivers/i2c/busses/i2c-xiic.c
@@ -538,6 +538,7 @@ static void xiic_start_recv(struct xiic_
 {
 	u8 rx_watermark;
 	struct i2c_msg *msg = i2c->rx_msg = i2c->tx_msg;
+	unsigned long flags;
 
 	/* Clear and enable Rx full interrupt. */
 	xiic_irq_clr_en(i2c, XIIC_INTR_RX_FULL_MASK | XIIC_INTR_TX_ERROR_MASK);
@@ -553,6 +554,7 @@ static void xiic_start_recv(struct xiic_
 		rx_watermark = IIC_RX_FIFO_DEPTH;
 	xiic_setreg8(i2c, XIIC_RFD_REG_OFFSET, rx_watermark - 1);
 
+	local_irq_save(flags);
 	if (!(msg->flags & I2C_M_NOSTART))
 		/* write the address */
 		xiic_setreg16(i2c, XIIC_DTR_REG_OFFSET,
@@ -563,6 +565,8 @@ static void xiic_start_recv(struct xiic_
 
 	xiic_setreg16(i2c, XIIC_DTR_REG_OFFSET,
 		msg->len | ((i2c->nmsgs == 1) ? XIIC_TX_DYN_STOP_MASK : 0));
+	local_irq_restore(flags);
+
 	if (i2c->nmsgs == 1)
 		/* very last, enable bus not busy as well */
 		xiic_irq_clr_en(i2c, XIIC_INTR_BNB_MASK);




* [PATCH 4.14 002/126] i2c: i801: fix DNVs SMBCTRL register offset
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
  2018-09-17 22:40 ` [PATCH 4.14 001/126] i2c: xiic: Make the start and the byte count write atomic Greg Kroah-Hartman
@ 2018-09-17 22:40 ` Greg Kroah-Hartman
  2018-09-17 22:40 ` [PATCH 4.14 003/126] scsi: lpfc: Correct MDS diag and nvmet configuration Greg Kroah-Hartman
                   ` (126 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:40 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Felipe Balbi, Jean Delvare, Wolfram Sang

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Felipe Balbi <felipe.balbi@linux.intel.com>

commit 851a15114895c5bce163a6f2d57e0aa4658a1be4 upstream.

DNV's iTCO is slightly different, with SMBCTRL sitting at a different
offset compared to all other devices. Let's fix that so that we can
properly use the iTCO watchdog.

Fixes: 84d7f2ebd70d ("i2c: i801: Add support for Intel DNV")
Cc: <stable@vger.kernel.org> # v4.4+
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Reviewed-by: Jean Delvare <jdelvare@suse.de>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/i2c/busses/i2c-i801.c |    7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

--- a/drivers/i2c/busses/i2c-i801.c
+++ b/drivers/i2c/busses/i2c-i801.c
@@ -138,6 +138,7 @@
 
 #define SBREG_BAR		0x10
 #define SBREG_SMBCTRL		0xc6000c
+#define SBREG_SMBCTRL_DNV	0xcf000c
 
 /* Host status bits for SMBPCISTS */
 #define SMBPCISTS_INTS		BIT(3)
@@ -1395,7 +1396,11 @@ static void i801_add_tco(struct i801_pri
 	spin_unlock(&p2sb_spinlock);
 
 	res = &tco_res[ICH_RES_MEM_OFF];
-	res->start = (resource_size_t)base64_addr + SBREG_SMBCTRL;
+	if (pci_dev->device == PCI_DEVICE_ID_INTEL_DNV_SMBUS)
+		res->start = (resource_size_t)base64_addr + SBREG_SMBCTRL_DNV;
+	else
+		res->start = (resource_size_t)base64_addr + SBREG_SMBCTRL;
+
 	res->end = res->start + 3;
 	res->flags = IORESOURCE_MEM;
 




* [PATCH 4.14 003/126] scsi: lpfc: Correct MDS diag and nvmet configuration
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
  2018-09-17 22:40 ` [PATCH 4.14 001/126] i2c: xiic: Make the start and the byte count write atomic Greg Kroah-Hartman
  2018-09-17 22:40 ` [PATCH 4.14 002/126] i2c: i801: fix DNVs SMBCTRL register offset Greg Kroah-Hartman
@ 2018-09-17 22:40 ` Greg Kroah-Hartman
  2018-09-17 22:40 ` [PATCH 4.14 004/126] nbd: dont allow invalid blocksize settings Greg Kroah-Hartman
                   ` (125 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:40 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Dick Kennedy, James Smart,
	Ewan D. Milne, Martin K. Petersen

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: James Smart <jsmart2021@gmail.com>

commit 53e13ee087a80e8d4fc95436318436e5c2c1f8c2 upstream.

A recent change added some MDS processing in the lpfc_drain_txq routine
that relies on the fcp_wq being allocated. For nvmet operation the fcp_wq
is not allocated because it can only be an nvme-target.  When the original
MDS support was added, LS_MDS_LOOPBACK was defined incorrectly as 0x16; it
should have been 0x10 (a decimal value was used for a hex setting). This
incorrect value allowed MDS_LOOPBACK to be set simultaneously with
LS_NPIV_FAB_SUPPORTED, causing the driver to crash when it accesses the
non-existent fcp_wq.

Correct the bad value setting for LS_MDS_LOOPBACK.
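
The overlap can be checked with a few lines of standalone C. This is a
sketch, not driver code: the flag values are copied from the lpfc.h
hunk of this patch, while the _OLD/_NEW suffixes, IS_SINGLE_BIT() and
flag_appears_set() are hypothetical names used only for illustration.
How the driver actually tests the flag is not shown in this mail, so
the mask test here is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Flag values as they appear in the lpfc.h hunk of this patch. */
#define LS_NPIV_FAB_SUPPORTED 0x2
#define LS_IGNORE_ERATT       0x4
#define LS_MDS_LINK_DOWN      0x8
#define LS_MDS_LOOPBACK_OLD   0x16	/* hypothetical name: value before the fix */
#define LS_MDS_LOOPBACK_NEW   0x10	/* hypothetical name: value after the fix */

/* A flag can be tested on its own only if it occupies exactly one bit. */
#define IS_SINGLE_BIT(v) ((v) != 0 && (((v) & ((v) - 1)) == 0))

/* 0x10 is bit 4 alone; 0x16 is bits 4, 2 and 1 together. */
static_assert(IS_SINGLE_BIT(LS_MDS_LOOPBACK_NEW), "fixed value is one bit");
static_assert(!IS_SINGLE_BIT(LS_MDS_LOOPBACK_OLD), "0x16 spans several bits");

/* True when a mask test for probe succeeds on a word holding only set. */
bool flag_appears_set(uint32_t set, uint32_t probe)
{
	return (set & probe) != 0;
}
```

With the old value, a word that holds only LS_NPIV_FAB_SUPPORTED (0x2)
or only LS_IGNORE_ERATT (0x4) also passes a mask test for the loopback
flag, because 0x16 shares bits with both.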

Fixes: ae9e28f36a6c ("lpfc: Add MDS Diagnostic support.")
Cc: <stable@vger.kernel.org> # v4.12+
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Tested-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/scsi/lpfc/lpfc.h |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/scsi/lpfc/lpfc.h
+++ b/drivers/scsi/lpfc/lpfc.h
@@ -676,7 +676,7 @@ struct lpfc_hba {
 #define LS_NPIV_FAB_SUPPORTED 0x2	/* Fabric supports NPIV */
 #define LS_IGNORE_ERATT       0x4	/* intr handler should ignore ERATT */
 #define LS_MDS_LINK_DOWN      0x8	/* MDS Diagnostics Link Down */
-#define LS_MDS_LOOPBACK      0x16	/* MDS Diagnostics Link Up (Loopback) */
+#define LS_MDS_LOOPBACK      0x10	/* MDS Diagnostics Link Up (Loopback) */
 
 	uint32_t hba_flag;	/* hba generic flags */
 #define HBA_ERATT_HANDLED	0x1 /* This flag is set when eratt handled */




* [PATCH 4.14 004/126] nbd: dont allow invalid blocksize settings
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (2 preceding siblings ...)
  2018-09-17 22:40 ` [PATCH 4.14 003/126] scsi: lpfc: Correct MDS diag and nvmet configuration Greg Kroah-Hartman
@ 2018-09-17 22:40 ` Greg Kroah-Hartman
  2018-09-17 22:40 ` [PATCH 4.14 005/126] block: bfq: swap puts in bfqg_and_blkg_put Greg Kroah-Hartman
                   ` (124 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:40 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, syzbot, Josef Bacik, Jens Axboe

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Jens Axboe <axboe@kernel.dk>

commit bc811f05d77f47059c197a98b6ad242eb03999cb upstream.

syzbot reports a divide-by-zero off the NBD_SET_BLKSIZE ioctl.
We need proper validation of the input here: not just whether it's
zero, but also whether the value is a power of 2 and in a valid
range. Add that.
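
The three conditions can be written as a standalone predicate. This is
a userspace sketch of the check, not the kernel code: nbd_blksize_ok()
and is_power_of_2_sketch() are hypothetical names, and SKETCH_PAGE_SIZE
assumes 4 KiB pages, whereas the kernel's PAGE_SIZE depends on the
architecture and configuration.

```c
#include <stdbool.h>

/* Assumption: 4 KiB pages. The kernel's PAGE_SIZE is arch-dependent. */
#define SKETCH_PAGE_SIZE 4096UL

/* Nonzero with exactly one bit set. */
static bool is_power_of_2_sketch(unsigned long n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/*
 * Accept only block sizes that are a power of 2 within
 * [512, SKETCH_PAGE_SIZE]. Zero fails the power-of-2 test,
 * so it can never reach a division.
 */
bool nbd_blksize_ok(unsigned long arg)
{
	return is_power_of_2_sketch(arg) &&
	       arg >= 512 && arg <= SKETCH_PAGE_SIZE;
}
```

Under the 4 KiB assumption the accepted values are 512, 1024, 2048 and
4096; everything else would get -EINVAL in the patched ioctl.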

Cc: stable@vger.kernel.org
Reported-by: syzbot <syzbot+25dbecbec1e62c6b0dd4@syzkaller.appspotmail.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/block/nbd.c |    3 +++
 1 file changed, 3 insertions(+)

--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -1228,6 +1228,9 @@ static int __nbd_ioctl(struct block_devi
 	case NBD_SET_SOCK:
 		return nbd_add_socket(nbd, arg, false);
 	case NBD_SET_BLKSIZE:
+		if (!arg || !is_power_of_2(arg) || arg < 512 ||
+		    arg > PAGE_SIZE)
+			return -EINVAL;
 		nbd_size_set(nbd, arg,
 			     div_s64(config->bytesize, arg));
 		return 0;




* [PATCH 4.14 005/126] block: bfq: swap puts in bfqg_and_blkg_put
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (3 preceding siblings ...)
  2018-09-17 22:40 ` [PATCH 4.14 004/126] nbd: dont allow invalid blocksize settings Greg Kroah-Hartman
@ 2018-09-17 22:40 ` Greg Kroah-Hartman
  2018-09-17 22:40 ` [PATCH 4.14 006/126] android: binder: fix the race mmap and alloc_new_buf_locked Greg Kroah-Hartman
                   ` (123 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:40 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Paolo Valente, Konstantin Khlebnikov,
	Jens Axboe

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>

commit d5274b3cd6a814ccb2f56d81ee87cbbf51bd4cf7 upstream.

Fix a trivial use-after-free: this could be the last reference to bfqg.
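
Why the order matters can be shown with a small refcount sketch in
userspace C. The types and functions here (struct group_sketch, struct
blkg_sketch, group_put(), blkg_put(), group_and_blkg_put()) are
hypothetical stand-ins, not the bfq code; the only thing taken from the
patch is that the second object is looked up through the first.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the blkg side. */
struct blkg_sketch {
	int refs;
};

/* Hypothetical stand-in for bfqg: it holds the pointer to its blkg. */
struct group_sketch {
	int refs;
	struct blkg_sketch *blkg;
};

/* Drop one reference; free the group when it was the last one. */
void group_put(struct group_sketch *g)
{
	if (--g->refs == 0)
		free(g);
}

/* Drop one reference on the blkg (freeing is left out of the sketch). */
void blkg_put(struct blkg_sketch *b)
{
	b->refs--;
}

/* Fixed order: read g->blkg while g is still guaranteed to be alive. */
void group_and_blkg_put(struct group_sketch *g)
{
	blkg_put(g->blkg);	/* dereferences g, so it must come first */
	group_put(g);		/* may free g; g must not be touched after this */
}
```

With the two calls swapped, g->blkg would be read after group_put()
may already have freed g.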

Fixes: 8f9bebc33dd7 ("block, bfq: access and cache blkg data only when safe")
Acked-by: Paolo Valente <paolo.valente@linaro.org>
Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 block/bfq-cgroup.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/block/bfq-cgroup.c
+++ b/block/bfq-cgroup.c
@@ -224,9 +224,9 @@ static void bfqg_and_blkg_get(struct bfq
 
 void bfqg_and_blkg_put(struct bfq_group *bfqg)
 {
-	bfqg_put(bfqg);
-
 	blkg_put(bfqg_to_blkg(bfqg));
+
+	bfqg_put(bfqg);
 }
 
 void bfqg_stats_update_io_add(struct bfq_group *bfqg, struct bfq_queue *bfqq,




* [PATCH 4.14 006/126] android: binder: fix the race mmap and alloc_new_buf_locked
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (4 preceding siblings ...)
  2018-09-17 22:40 ` [PATCH 4.14 005/126] block: bfq: swap puts in bfqg_and_blkg_put Greg Kroah-Hartman
@ 2018-09-17 22:40 ` Greg Kroah-Hartman
  2018-09-17 22:40 ` [PATCH 4.14 007/126] MIPS: VDSO: Match data page cache colouring when D$ aliases Greg Kroah-Hartman
                   ` (122 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:40 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Todd Kjos, Minchan Kim, Martijn Coenen

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Minchan Kim <minchan@kernel.org>

commit da1b9564e85b1d7baf66cbfabcab27e183a1db63 upstream.

RaceFuzzer reported the crash below because no lock closes the race
between binder_mmap and binder_alloc_new_buf_locked.
To close the race, let's use memory barriers so that anyone who sees a
non-NULL alloc->vma is guaranteed to also see a non-NULL alloc->vma_vm_mm.
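
The publish/consume pairing can be sketched in portable C11. Note the
substitution: this uses release/acquire atomics rather than the kernel's
smp_wmb()/smp_rmb(), and the ex_ types are hypothetical stand-ins for
binder_alloc, vm_area_struct and mm_struct.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins; not the kernel structures. */
struct ex_mm { int id; };
struct ex_vma { struct ex_mm *vm_mm; };

struct ex_alloc {
	struct ex_mm *vma_vm_mm;
	_Atomic(struct ex_vma *) vma;
};

/* Writer: fill in vma_vm_mm first, then publish vma. */
static void ex_alloc_set_vma(struct ex_alloc *alloc, struct ex_vma *vma)
{
	if (vma)
		alloc->vma_vm_mm = vma->vm_mm;
	/* Release: the vma_vm_mm store above becomes visible before vma. */
	atomic_store_explicit(&alloc->vma, vma, memory_order_release);
}

/* Reader: a non-NULL result implies vma_vm_mm is already visible. */
static struct ex_vma *ex_alloc_get_vma(struct ex_alloc *alloc)
{
	/* Acquire: pairs with the release store in ex_alloc_set_vma(). */
	return atomic_load_explicit(&alloc->vma, memory_order_acquire);
}
```

A reader that gets a non-NULL pointer from ex_alloc_get_vma() can then use
alloc->vma_vm_mm without observing the stale NULL seen in the report.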

(I intentionally didn't add a stable mark because standard Android
userspace libraries that interact with binder (libbinder & libhwbinder)
prevent the mmap/ioctl race. - from Todd)

"
Thread interleaving:
CPU0 (binder_alloc_mmap_handler)              CPU1 (binder_alloc_new_buf_locked)
=====                                         =====
// drivers/android/binder_alloc.c
// #L718 (v4.18-rc3)
alloc->vma = vma;
                                              // drivers/android/binder_alloc.c
                                              // #L346 (v4.18-rc3)
                                              if (alloc->vma == NULL) {
                                                  ...
                                                  // alloc->vma is not NULL at this point
                                                  return ERR_PTR(-ESRCH);
                                              }
                                              ...
                                              // #L438
                                              binder_update_page_range(alloc, 0,
                                                      (void *)PAGE_ALIGN((uintptr_t)buffer->data),
                                                      end_page_addr);

                                              // In binder_update_page_range() #L218
                                              // But still alloc->vma_vm_mm is NULL here
                                              if (need_mm && mmget_not_zero(alloc->vma_vm_mm))
alloc->vma_vm_mm = vma->vm_mm;

Crash Log:
==================================================================
BUG: KASAN: null-ptr-deref in __atomic_add_unless include/asm-generic/atomic-instrumented.h:89 [inline]
BUG: KASAN: null-ptr-deref in atomic_add_unless include/linux/atomic.h:533 [inline]
BUG: KASAN: null-ptr-deref in mmget_not_zero include/linux/sched/mm.h:75 [inline]
BUG: KASAN: null-ptr-deref in binder_update_page_range+0xece/0x18e0 drivers/android/binder_alloc.c:218
Write of size 4 at addr 0000000000000058 by task syz-executor0/11184

CPU: 1 PID: 11184 Comm: syz-executor0 Not tainted 4.18.0-rc3 #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.8.2-0-g33fbe13 by qemu-project.org 04/01/2014
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x16e/0x22c lib/dump_stack.c:113
 kasan_report_error mm/kasan/report.c:352 [inline]
 kasan_report+0x163/0x380 mm/kasan/report.c:412
 check_memory_region_inline mm/kasan/kasan.c:260 [inline]
 check_memory_region+0x140/0x1a0 mm/kasan/kasan.c:267
 kasan_check_write+0x14/0x20 mm/kasan/kasan.c:278
 __atomic_add_unless include/asm-generic/atomic-instrumented.h:89 [inline]
 atomic_add_unless include/linux/atomic.h:533 [inline]
 mmget_not_zero include/linux/sched/mm.h:75 [inline]
 binder_update_page_range+0xece/0x18e0 drivers/android/binder_alloc.c:218
 binder_alloc_new_buf_locked drivers/android/binder_alloc.c:443 [inline]
 binder_alloc_new_buf+0x467/0xc30 drivers/android/binder_alloc.c:513
 binder_transaction+0x125b/0x4fb0 drivers/android/binder.c:2957
 binder_thread_write+0xc08/0x2770 drivers/android/binder.c:3528
 binder_ioctl_write_read.isra.39+0x24f/0x8e0 drivers/android/binder.c:4456
 binder_ioctl+0xa86/0xf34 drivers/android/binder.c:4596
 vfs_ioctl fs/ioctl.c:46 [inline]
 do_vfs_ioctl+0x154/0xd40 fs/ioctl.c:686
 ksys_ioctl+0x94/0xb0 fs/ioctl.c:701
 __do_sys_ioctl fs/ioctl.c:708 [inline]
 __se_sys_ioctl fs/ioctl.c:706 [inline]
 __x64_sys_ioctl+0x43/0x50 fs/ioctl.c:706
 do_syscall_64+0x167/0x4b0 arch/x86/entry/common.c:290
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
"

Signed-off-by: Todd Kjos <tkjos@google.com>
Signed-off-by: Minchan Kim <minchan@kernel.org>
Reviewed-by: Martijn Coenen <maco@android.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/android/binder_alloc.c |   42 +++++++++++++++++++++++++++++++++--------
 1 file changed, 34 insertions(+), 8 deletions(-)

--- a/drivers/android/binder_alloc.c
+++ b/drivers/android/binder_alloc.c
@@ -324,6 +324,34 @@ err_no_vma:
 	return vma ? -ENOMEM : -ESRCH;
 }
 
+static inline void binder_alloc_set_vma(struct binder_alloc *alloc,
+		struct vm_area_struct *vma)
+{
+	if (vma)
+		alloc->vma_vm_mm = vma->vm_mm;
+	/*
+	 * If we see alloc->vma is not NULL, buffer data structures set up
+	 * completely. Look at smp_rmb side binder_alloc_get_vma.
+	 * We also want to guarantee new alloc->vma_vm_mm is always visible
+	 * if alloc->vma is set.
+	 */
+	smp_wmb();
+	alloc->vma = vma;
+}
+
+static inline struct vm_area_struct *binder_alloc_get_vma(
+		struct binder_alloc *alloc)
+{
+	struct vm_area_struct *vma = NULL;
+
+	if (alloc->vma) {
+		/* Look at description in binder_alloc_set_vma */
+		smp_rmb();
+		vma = alloc->vma;
+	}
+	return vma;
+}
+
 struct binder_buffer *binder_alloc_new_buf_locked(struct binder_alloc *alloc,
 						  size_t data_size,
 						  size_t offsets_size,
@@ -339,7 +367,7 @@ struct binder_buffer *binder_alloc_new_b
 	size_t size, data_offsets_size;
 	int ret;
 
-	if (alloc->vma == NULL) {
+	if (!binder_alloc_get_vma(alloc)) {
 		pr_err("%d: binder_alloc_buf, no vma\n",
 		       alloc->pid);
 		return ERR_PTR(-ESRCH);
@@ -712,9 +740,7 @@ int binder_alloc_mmap_handler(struct bin
 	buffer->free = 1;
 	binder_insert_free_buffer(alloc, buffer);
 	alloc->free_async_space = alloc->buffer_size / 2;
-	barrier();
-	alloc->vma = vma;
-	alloc->vma_vm_mm = vma->vm_mm;
+	binder_alloc_set_vma(alloc, vma);
 	mmgrab(alloc->vma_vm_mm);
 
 	return 0;
@@ -741,10 +767,10 @@ void binder_alloc_deferred_release(struc
 	int buffers, page_count;
 	struct binder_buffer *buffer;
 
-	BUG_ON(alloc->vma);
-
 	buffers = 0;
 	mutex_lock(&alloc->mutex);
+	BUG_ON(alloc->vma);
+
 	while ((n = rb_first(&alloc->allocated_buffers))) {
 		buffer = rb_entry(n, struct binder_buffer, rb_node);
 
@@ -886,7 +912,7 @@ int binder_alloc_get_allocated_count(str
  */
 void binder_alloc_vma_close(struct binder_alloc *alloc)
 {
-	WRITE_ONCE(alloc->vma, NULL);
+	binder_alloc_set_vma(alloc, NULL);
 }
 
 /**
@@ -921,7 +947,7 @@ enum lru_status binder_alloc_free_page(s
 
 	index = page - alloc->pages;
 	page_addr = (uintptr_t)alloc->buffer + index * PAGE_SIZE;
-	vma = alloc->vma;
+	vma = binder_alloc_get_vma(alloc);
 	if (vma) {
 		if (!mmget_not_zero(alloc->vma_vm_mm))
 			goto err_mmget;




* [PATCH 4.14 007/126] MIPS: VDSO: Match data page cache colouring when D$ aliases
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (5 preceding siblings ...)
  2018-09-17 22:40 ` [PATCH 4.14 006/126] android: binder: fix the race mmap and alloc_new_buf_locked Greg Kroah-Hartman
@ 2018-09-17 22:40 ` Greg Kroah-Hartman
  2018-09-17 22:40 ` [PATCH 4.14 008/126] SMB3: Backup intent flag missing for directory opens with backupuid mounts Greg Kroah-Hartman
                   ` (121 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:40 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Paul Burton, Hauke Mehrtens,
	Rene Nielsen, Alexandre Belloni, James Hogan, linux-mips

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Paul Burton <paul.burton@mips.com>

commit 0f02cfbc3d9e413d450d8d0fd660077c23f67eff upstream.

When a system suffers from dcache aliasing a user program may observe
stale VDSO data from an aliased cache line. Notably this can break the
expectation that clock_gettime(CLOCK_MONOTONIC, ...) is, as its name
suggests, monotonic.

In order to ensure that users observe updates to the VDSO data page as
intended, align the user mappings of the VDSO data page such that their
cache colouring matches that of the virtual address range which the
kernel will use to update the data page - typically its unmapped address
within kseg0.

This ensures that we don't introduce aliasing cache lines for the VDSO
data page, and therefore that userland will observe updates without
requiring cache invalidation.
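
The colour-matching arithmetic can be checked in isolation with a small
userspace sketch. The ex_ helper names and all the addresses in the usage
note are made up for illustration; only the two arithmetic steps follow the
patch.

```c
#include <assert.h>

/* Round x up to the next multiple of (mask + 1); mask must be 2^n - 1. */
static unsigned long ex_align_mask(unsigned long x, unsigned long mask)
{
	return (x + mask) & ~mask;
}

/*
 * Return a base such that (base + gic_size), the user address of the data
 * page, has the same cache colour as kernel_data_addr.
 */
static unsigned long ex_colour_match_base(unsigned long base,
					  unsigned long kernel_data_addr,
					  unsigned long gic_size,
					  unsigned long align_mask)
{
	assert((align_mask & (align_mask + 1)) == 0);
	base = ex_align_mask(base, align_mask);
	base += (kernel_data_addr - gic_size) & align_mask;
	return base;
}
```

For example, with a hypothetical align mask of 0x7fff, a gic_size of 0x1000,
a kernel data address of 0x80a13000 and an unmapped base of 0x77f01000, the
helper returns 0x77f0a000, so the data page lands at 0x77f0b000 and shares
colour 0x3000 with the kernel address.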

Signed-off-by: Paul Burton <paul.burton@mips.com>
Reported-by: Hauke Mehrtens <hauke@hauke-m.de>
Reported-by: Rene Nielsen <rene.nielsen@microsemi.com>
Reported-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Fixes: ebb5e78cc634 ("MIPS: Initial implementation of a VDSO")
Patchwork: https://patchwork.linux-mips.org/patch/20344/
Tested-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Tested-by: Hauke Mehrtens <hauke@hauke-m.de>
Cc: James Hogan <jhogan@kernel.org>
Cc: linux-mips@linux-mips.org
Cc: stable@vger.kernel.org # v4.4+
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/mips/kernel/vdso.c |   20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

--- a/arch/mips/kernel/vdso.c
+++ b/arch/mips/kernel/vdso.c
@@ -13,6 +13,7 @@
 #include <linux/err.h>
 #include <linux/init.h>
 #include <linux/ioport.h>
+#include <linux/kernel.h>
 #include <linux/mm.h>
 #include <linux/sched.h>
 #include <linux/slab.h>
@@ -20,6 +21,7 @@
 
 #include <asm/abi.h>
 #include <asm/mips-cps.h>
+#include <asm/page.h>
 #include <asm/vdso.h>
 
 /* Kernel-provided data used by the VDSO. */
@@ -128,12 +130,30 @@ int arch_setup_additional_pages(struct l
 	vvar_size = gic_size + PAGE_SIZE;
 	size = vvar_size + image->size;
 
+	/*
+	 * Find a region that's large enough for us to perform the
+	 * colour-matching alignment below.
+	 */
+	if (cpu_has_dc_aliases)
+		size += shm_align_mask + 1;
+
 	base = get_unmapped_area(NULL, 0, size, 0, 0);
 	if (IS_ERR_VALUE(base)) {
 		ret = base;
 		goto out;
 	}
 
+	/*
+	 * If we suffer from dcache aliasing, ensure that the VDSO data page
+	 * mapping is coloured the same as the kernel's mapping of that memory.
+	 * This ensures that when the kernel updates the VDSO data userland
+	 * will observe it without requiring cache invalidations.
+	 */
+	if (cpu_has_dc_aliases) {
+		base = __ALIGN_MASK(base, shm_align_mask);
+		base += ((unsigned long)&vdso_data - gic_size) & shm_align_mask;
+	}
+
 	data_addr = base + gic_size;
 	vdso_addr = data_addr + PAGE_SIZE;
 




* [PATCH 4.14 008/126] SMB3: Backup intent flag missing for directory opens with backupuid mounts
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (6 preceding siblings ...)
  2018-09-17 22:40 ` [PATCH 4.14 007/126] MIPS: VDSO: Match data page cache colouring when D$ aliases Greg Kroah-Hartman
@ 2018-09-17 22:40 ` Greg Kroah-Hartman
  2018-09-17 22:40 ` [PATCH 4.14 009/126] smb3: check for and properly advertise directory lease support Greg Kroah-Hartman
                   ` (120 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:40 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Steve French, Pavel Shilovsky

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Steve French <stfrench@microsoft.com>

commit 5e19697b56a64004e2d0ff1bb952ea05493c088f upstream.

When "backup intent" is requested on the mount (e.g. backupuid or
backupgid mount options), the corresponding flag needs to be set
on opens of directories (and files) but was missing in some
places, causing access-denied errors when trying to enumerate
and back up servers.
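
The rule the patch applies at each open site can be written as one small
helper. This is a userspace sketch only: the ex_ names are hypothetical, the
boolean stands in for the result of backup_cred(cifs_sb), and the flag
values are assumed to match the cifs headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed to match CREATE_NOT_DIR and CREATE_OPEN_BACKUP_INTENT in cifs. */
#define EX_CREATE_NOT_DIR		0x00000040u
#define EX_CREATE_OPEN_BACKUP_INTENT	0x00004000u

/* Add the backup-intent bit whenever the mount asked for backup access. */
static uint32_t ex_create_options(uint32_t base_options, bool backup_cred)
{
	if (backup_cred)
		base_options |= EX_CREATE_OPEN_BACKUP_INTENT;
	return base_options;
}

/* Small usage example covering a file open and a plain open. */
static void ex_create_options_demo(void)
{
	assert(ex_create_options(EX_CREATE_NOT_DIR, true) ==
	       (EX_CREATE_NOT_DIR | EX_CREATE_OPEN_BACKUP_INTENT));
	assert(ex_create_options(0, false) == 0);
}
```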

Fixes kernel bugzilla #200953
https://bugzilla.kernel.org/show_bug.cgi?id=200953

Reported-and-tested-by: <whh@rubrik.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
CC: Stable <stable@vger.kernel.org>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 fs/cifs/inode.c   |    2 ++
 fs/cifs/smb2ops.c |   25 ++++++++++++++++++++-----
 2 files changed, 22 insertions(+), 5 deletions(-)

--- a/fs/cifs/inode.c
+++ b/fs/cifs/inode.c
@@ -467,6 +467,8 @@ cifs_sfu_type(struct cifs_fattr *fattr,
 	oparms.cifs_sb = cifs_sb;
 	oparms.desired_access = GENERIC_READ;
 	oparms.create_options = CREATE_NOT_DIR;
+	if (backup_cred(cifs_sb))
+		oparms.create_options |= CREATE_OPEN_BACKUP_INTENT;
 	oparms.disposition = FILE_OPEN;
 	oparms.path = path;
 	oparms.fid = &fid;
--- a/fs/cifs/smb2ops.c
+++ b/fs/cifs/smb2ops.c
@@ -385,7 +385,10 @@ smb2_is_path_accessible(const unsigned i
 	oparms.tcon = tcon;
 	oparms.desired_access = FILE_READ_ATTRIBUTES;
 	oparms.disposition = FILE_OPEN;
-	oparms.create_options = 0;
+	if (backup_cred(cifs_sb))
+		oparms.create_options = CREATE_OPEN_BACKUP_INTENT;
+	else
+		oparms.create_options = 0;
 	oparms.fid = &fid;
 	oparms.reconnect = false;
 
@@ -534,7 +537,10 @@ smb2_query_eas(const unsigned int xid, s
 	oparms.tcon = tcon;
 	oparms.desired_access = FILE_READ_EA;
 	oparms.disposition = FILE_OPEN;
-	oparms.create_options = 0;
+	if (backup_cred(cifs_sb))
+		oparms.create_options = CREATE_OPEN_BACKUP_INTENT;
+	else
+		oparms.create_options = 0;
 	oparms.fid = &fid;
 	oparms.reconnect = false;
 
@@ -613,7 +619,10 @@ smb2_set_ea(const unsigned int xid, stru
 	oparms.tcon = tcon;
 	oparms.desired_access = FILE_WRITE_EA;
 	oparms.disposition = FILE_OPEN;
-	oparms.create_options = 0;
+	if (backup_cred(cifs_sb))
+		oparms.create_options = CREATE_OPEN_BACKUP_INTENT;
+	else
+		oparms.create_options = 0;
 	oparms.fid = &fid;
 	oparms.reconnect = false;
 
@@ -1215,7 +1224,10 @@ smb2_query_dir_first(const unsigned int
 	oparms.tcon = tcon;
 	oparms.desired_access = FILE_READ_ATTRIBUTES | FILE_READ_DATA;
 	oparms.disposition = FILE_OPEN;
-	oparms.create_options = 0;
+	if (backup_cred(cifs_sb))
+		oparms.create_options = CREATE_OPEN_BACKUP_INTENT;
+	else
+		oparms.create_options = 0;
 	oparms.fid = fid;
 	oparms.reconnect = false;
 
@@ -1491,7 +1503,10 @@ smb2_query_symlink(const unsigned int xi
 	oparms.tcon = tcon;
 	oparms.desired_access = FILE_READ_ATTRIBUTES;
 	oparms.disposition = FILE_OPEN;
-	oparms.create_options = 0;
+	if (backup_cred(cifs_sb))
+		oparms.create_options = CREATE_OPEN_BACKUP_INTENT;
+	else
+		oparms.create_options = 0;
 	oparms.fid = &fid;
 	oparms.reconnect = false;
 




* [PATCH 4.14 009/126] smb3: check for and properly advertise directory lease support
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (7 preceding siblings ...)
  2018-09-17 22:40 ` [PATCH 4.14 008/126] SMB3: Backup intent flag missing for directory opens with backupuid mounts Greg Kroah-Hartman
@ 2018-09-17 22:40 ` Greg Kroah-Hartman
  2018-09-17 22:40 ` [PATCH 4.14 010/126] Btrfs: fix data corruption when deduplicating between different files Greg Kroah-Hartman
                   ` (119 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:40 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Steve French, Ronnie Sahlberg

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Steve French <stfrench@microsoft.com>

commit f801568332321e2b1e7a8bd26c3e4913a312a2ec upstream.

Although servers will typically ignore unsupported features,
we should advertise support for directory leases (as Windows
does, for example) in the negotiate protocol capabilities we
pass to the server, and should check for the server capability
(CAP_DIRECTORY_LEASING) before sending a lease request for an
open of a directory.  This will prevent us from accidentally
sending directory lease requests to an SMB2.1 or SMB2 server,
for example.
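
The resulting decision in SMB2_open can be modelled as a pure function.
This is a sketch under assumptions: the ex_ names are hypothetical, the
capability bits are taken from the MS-SMB2 capability list, and the
CREATE_NOT_FILE value is assumed to match the cifs headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Capability bits as listed in MS-SMB2 (assumed to match smb2pdu.h). */
#define EX_CAP_LEASING			0x00000002u
#define EX_CAP_DIRECTORY_LEASING	0x00000020u
/* Assumed value of CREATE_NOT_FILE: the open must target a directory. */
#define EX_CREATE_NOT_FILE		0x00000001u
#define EX_OPLOCK_LEVEL_NONE		0x00u

/* True when a lease context should be added to the create request. */
static bool ex_should_request_lease(uint32_t server_caps,
				    uint32_t create_options, uint8_t oplock)
{
	if (!(server_caps & EX_CAP_LEASING) || oplock == EX_OPLOCK_LEVEL_NONE)
		return false;	/* plain oplock level is sent instead */
	if (!(server_caps & EX_CAP_DIRECTORY_LEASING) &&
	    (create_options & EX_CREATE_NOT_FILE))
		return false;	/* directory open, no server lease support */
	return true;
}

/* Small usage example: a directory open against a leasing-only server. */
static void ex_lease_demo(void)
{
	assert(!ex_should_request_lease(EX_CAP_LEASING, EX_CREATE_NOT_FILE,
					0x09));
}
```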

Signed-off-by: Steve French <stfrench@microsoft.com>
CC: Stable <stable@vger.kernel.org>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 fs/cifs/smb2ops.c |   10 +++++-----
 fs/cifs/smb2pdu.c |    3 +++
 2 files changed, 8 insertions(+), 5 deletions(-)

--- a/fs/cifs/smb2ops.c
+++ b/fs/cifs/smb2ops.c
@@ -3215,7 +3215,7 @@ struct smb_version_values smb21_values =
 struct smb_version_values smb3any_values = {
 	.version_string = SMB3ANY_VERSION_STRING,
 	.protocol_id = SMB302_PROT_ID, /* doesn't matter, send protocol array */
-	.req_capabilities = SMB2_GLOBAL_CAP_DFS | SMB2_GLOBAL_CAP_LEASING | SMB2_GLOBAL_CAP_LARGE_MTU | SMB2_GLOBAL_CAP_PERSISTENT_HANDLES | SMB2_GLOBAL_CAP_ENCRYPTION,
+	.req_capabilities = SMB2_GLOBAL_CAP_DFS | SMB2_GLOBAL_CAP_LEASING | SMB2_GLOBAL_CAP_LARGE_MTU | SMB2_GLOBAL_CAP_PERSISTENT_HANDLES | SMB2_GLOBAL_CAP_ENCRYPTION | SMB2_GLOBAL_CAP_DIRECTORY_LEASING,
 	.large_lock_type = 0,
 	.exclusive_lock_type = SMB2_LOCKFLAG_EXCLUSIVE_LOCK,
 	.shared_lock_type = SMB2_LOCKFLAG_SHARED_LOCK,
@@ -3235,7 +3235,7 @@ struct smb_version_values smb3any_values
 struct smb_version_values smbdefault_values = {
 	.version_string = SMBDEFAULT_VERSION_STRING,
 	.protocol_id = SMB302_PROT_ID, /* doesn't matter, send protocol array */
-	.req_capabilities = SMB2_GLOBAL_CAP_DFS | SMB2_GLOBAL_CAP_LEASING | SMB2_GLOBAL_CAP_LARGE_MTU | SMB2_GLOBAL_CAP_PERSISTENT_HANDLES | SMB2_GLOBAL_CAP_ENCRYPTION,
+	.req_capabilities = SMB2_GLOBAL_CAP_DFS | SMB2_GLOBAL_CAP_LEASING | SMB2_GLOBAL_CAP_LARGE_MTU | SMB2_GLOBAL_CAP_PERSISTENT_HANDLES | SMB2_GLOBAL_CAP_ENCRYPTION | SMB2_GLOBAL_CAP_DIRECTORY_LEASING,
 	.large_lock_type = 0,
 	.exclusive_lock_type = SMB2_LOCKFLAG_EXCLUSIVE_LOCK,
 	.shared_lock_type = SMB2_LOCKFLAG_SHARED_LOCK,
@@ -3255,7 +3255,7 @@ struct smb_version_values smbdefault_val
 struct smb_version_values smb30_values = {
 	.version_string = SMB30_VERSION_STRING,
 	.protocol_id = SMB30_PROT_ID,
-	.req_capabilities = SMB2_GLOBAL_CAP_DFS | SMB2_GLOBAL_CAP_LEASING | SMB2_GLOBAL_CAP_LARGE_MTU | SMB2_GLOBAL_CAP_PERSISTENT_HANDLES | SMB2_GLOBAL_CAP_ENCRYPTION,
+	.req_capabilities = SMB2_GLOBAL_CAP_DFS | SMB2_GLOBAL_CAP_LEASING | SMB2_GLOBAL_CAP_LARGE_MTU | SMB2_GLOBAL_CAP_PERSISTENT_HANDLES | SMB2_GLOBAL_CAP_ENCRYPTION | SMB2_GLOBAL_CAP_DIRECTORY_LEASING,
 	.large_lock_type = 0,
 	.exclusive_lock_type = SMB2_LOCKFLAG_EXCLUSIVE_LOCK,
 	.shared_lock_type = SMB2_LOCKFLAG_SHARED_LOCK,
@@ -3275,7 +3275,7 @@ struct smb_version_values smb30_values =
 struct smb_version_values smb302_values = {
 	.version_string = SMB302_VERSION_STRING,
 	.protocol_id = SMB302_PROT_ID,
-	.req_capabilities = SMB2_GLOBAL_CAP_DFS | SMB2_GLOBAL_CAP_LEASING | SMB2_GLOBAL_CAP_LARGE_MTU | SMB2_GLOBAL_CAP_PERSISTENT_HANDLES | SMB2_GLOBAL_CAP_ENCRYPTION,
+	.req_capabilities = SMB2_GLOBAL_CAP_DFS | SMB2_GLOBAL_CAP_LEASING | SMB2_GLOBAL_CAP_LARGE_MTU | SMB2_GLOBAL_CAP_PERSISTENT_HANDLES | SMB2_GLOBAL_CAP_ENCRYPTION | SMB2_GLOBAL_CAP_DIRECTORY_LEASING,
 	.large_lock_type = 0,
 	.exclusive_lock_type = SMB2_LOCKFLAG_EXCLUSIVE_LOCK,
 	.shared_lock_type = SMB2_LOCKFLAG_SHARED_LOCK,
@@ -3296,7 +3296,7 @@ struct smb_version_values smb302_values
 struct smb_version_values smb311_values = {
 	.version_string = SMB311_VERSION_STRING,
 	.protocol_id = SMB311_PROT_ID,
-	.req_capabilities = SMB2_GLOBAL_CAP_DFS | SMB2_GLOBAL_CAP_LEASING | SMB2_GLOBAL_CAP_LARGE_MTU | SMB2_GLOBAL_CAP_PERSISTENT_HANDLES | SMB2_GLOBAL_CAP_ENCRYPTION,
+	.req_capabilities = SMB2_GLOBAL_CAP_DFS | SMB2_GLOBAL_CAP_LEASING | SMB2_GLOBAL_CAP_LARGE_MTU | SMB2_GLOBAL_CAP_PERSISTENT_HANDLES | SMB2_GLOBAL_CAP_ENCRYPTION | SMB2_GLOBAL_CAP_DIRECTORY_LEASING,
 	.large_lock_type = 0,
 	.exclusive_lock_type = SMB2_LOCKFLAG_EXCLUSIVE_LOCK,
 	.shared_lock_type = SMB2_LOCKFLAG_SHARED_LOCK,
--- a/fs/cifs/smb2pdu.c
+++ b/fs/cifs/smb2pdu.c
@@ -1816,6 +1816,9 @@ SMB2_open(const unsigned int xid, struct
 	if (!(server->capabilities & SMB2_GLOBAL_CAP_LEASING) ||
 	    *oplock == SMB2_OPLOCK_LEVEL_NONE)
 		req->RequestedOplockLevel = *oplock;
+	else if (!(server->capabilities & SMB2_GLOBAL_CAP_DIRECTORY_LEASING) &&
+		  (oparms->create_options & CREATE_NOT_FILE))
+		req->RequestedOplockLevel = *oplock; /* no srv lease support */
 	else {
 		rc = add_lease_context(server, iov, &n_iov, oplock);
 		if (rc) {




* [PATCH 4.14 010/126] Btrfs: fix data corruption when deduplicating between different files
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (8 preceding siblings ...)
  2018-09-17 22:40 ` [PATCH 4.14 009/126] smb3: check for and properly advertise directory lease support Greg Kroah-Hartman
@ 2018-09-17 22:40 ` Greg Kroah-Hartman
  2018-09-17 22:40 ` [PATCH 4.14 011/126] KVM: s390: vsie: copy wrapping keys to right place Greg Kroah-Hartman
                   ` (118 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:40 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Filipe Manana, David Sterba

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Filipe Manana <fdmanana@suse.com>

commit de02b9f6bb65a6a1848f346f7a3617b7a9b930c0 upstream.

If we deduplicate extents between two different files we can end up
corrupting data if the source range ends at the size of the source file,
the source file's size is not aligned to the filesystem's block size
and the destination range does not go past the size of the destination
file.

Example:

  $ mkfs.btrfs -f /dev/sdb
  $ mount /dev/sdb /mnt

  $ xfs_io -f -c "pwrite -S 0x6b 0 2518890" /mnt/foo
  # The first byte with a value of 0xae starts at an offset (2518890)
  # which is not a multiple of the sector size.
  $ xfs_io -c "pwrite -S 0xae 2518890 102398" /mnt/foo

  # Confirm the file content is full of bytes with values 0x6b and 0xae.
  $ od -t x1 /mnt/foo
  0000000 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
  *
  11467540 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b ae ae ae ae ae ae
  11467560 ae ae ae ae ae ae ae ae ae ae ae ae ae ae ae ae
  *
  11777540 ae ae ae ae ae ae ae ae
  11777550

  # Create a second file with a length not aligned to the sector size,
  # whose bytes all have the value 0x6b, so that its extent(s) can be
  # deduplicated with the first file.
  $ xfs_io -f -c "pwrite -S 0x6b 0 557771" /mnt/bar

  # Now deduplicate the entire second file into a range of the first file
  # that also has all bytes with the value 0x6b. The destination range's
  # end offset must not be aligned to the sector size and must be less
  # than the offset of the first byte with the value 0xae (byte at offset
  # 2518890).
  $ xfs_io -c "dedupe /mnt/bar 0 1957888 557771" /mnt/foo

  # The bytes in the range starting at offset 2515659 (end of the
  # deduplication range) and ending at offset 2519040 (start offset
  # rounded up to the block size) must all have the value 0xae (and not
  # replaced with 0x00 values). In other words, we should have exactly
  # the same data we had before we asked for deduplication.
  $ od -t x1 /mnt/foo
  0000000 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
  *
  11467540 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b ae ae ae ae ae ae
  11467560 ae ae ae ae ae ae ae ae ae ae ae ae ae ae ae ae
  *
  11777540 ae ae ae ae ae ae ae ae
  11777550

  # Unmount the filesystem and mount it again. This guarantees any file
  # data in the page cache is dropped.
  $ umount /dev/sdb
  $ mount /dev/sdb /mnt

  $ od -t x1 /mnt/foo
  0000000 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
  *
  11461300 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 00 00 00 00 00
  11461320 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  *
  11470000 ae ae ae ae ae ae ae ae ae ae ae ae ae ae ae ae
  *
  11777540 ae ae ae ae ae ae ae ae
  11777550

  # The bytes in range 2515659 to 2519040 have a value of 0x00 and not a
  # value of 0xae, data corruption happened due to the deduplication
  # operation.

So fix this by rounding down, to the sector size, the length used for the
deduplication when the following conditions are met:

  1) Source file's range ends at its i_size;
  2) Source file's i_size is not aligned to the sector size;
  3) Destination range does not cross the i_size of the destination file.

Fixes: e1d227a42ea2 ("btrfs: Handle unaligned length in extent_same")
CC: stable@vger.kernel.org # 4.2+
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 fs/btrfs/ioctl.c |   19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -3158,6 +3158,25 @@ static int btrfs_extent_same(struct inod
 
 		same_lock_start = min_t(u64, loff, dst_loff);
 		same_lock_len = max_t(u64, loff, dst_loff) + len - same_lock_start;
+	} else {
+		/*
+		 * If the source and destination inodes are different, the
+		 * source's range end offset matches the source's i_size, that
+		 * i_size is not a multiple of the sector size, and the
+		 * destination range does not go past the destination's i_size,
+		 * we must round down the length to the nearest sector size
+		 * multiple. If we don't do this adjustment we end replacing
+		 * with zeroes the bytes in the range that starts at the
+		 * deduplication range's end offset and ends at the next sector
+		 * size multiple.
+		 */
+		if (loff + olen == i_size_read(src) &&
+		    dst_loff + len < i_size_read(dst)) {
+			const u64 sz = BTRFS_I(src)->root->fs_info->sectorsize;
+
+			len = round_down(i_size_read(src), sz) - loff;
+			olen = len;
+		}
 	}
 
 	/* don't make the dst file partly checksummed */




* [PATCH 4.14 011/126] KVM: s390: vsie: copy wrapping keys to right place
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (9 preceding siblings ...)
  2018-09-17 22:40 ` [PATCH 4.14 010/126] Btrfs: fix data corruption when deduplicating between different files Greg Kroah-Hartman
@ 2018-09-17 22:40 ` Greg Kroah-Hartman
  2018-09-17 22:41 ` [PATCH 4.14 012/126] KVM: VMX: Do not allow reexecute_instruction() when skipping MMIO instr Greg Kroah-Hartman
                   ` (117 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:40 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Pierre Morel, David Hildenbrand,
	Cornelia Huck, Janosch Frank, Christian Borntraeger

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Pierre Morel <pmorel@linux.ibm.com>

commit 204c97245612b6c255edf4e21e24d417c4a0c008 upstream.

Copy the key mask to the right offset inside the shadow CRYCB.
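
The offset issue can be shown with a userspace model of the copy. This is a
sketch under assumptions: struct ex_crycb is a hypothetical mirror of the
CRYCB layout (72 reserved bytes followed by 24 + 32 bytes of wrapping key
masks), and a plain memcpy() stands in for read_guest_real().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical mirror of the CRYCB layout; field sizes are assumptions. */
struct ex_crycb {
	uint8_t reserved00[72];
	uint8_t dea_wrapping_key_mask[24];
	uint8_t aes_wrapping_key_mask[32];
	uint8_t reserved80[128];
};

#define EX_WRAPPING_KEYS_OFFSET	72
#define EX_WRAPPING_KEYS_LEN	56	/* both masks, 24 + 32 bytes */

/* Copy only the wrapping key masks, to the same offset they came from. */
static void ex_copy_wrapping_keys(struct ex_crycb *shadow,
				  const uint8_t *guest_crycb)
{
	assert(offsetof(struct ex_crycb, dea_wrapping_key_mask) ==
	       EX_WRAPPING_KEYS_OFFSET);
	memcpy((uint8_t *)shadow + EX_WRAPPING_KEYS_OFFSET,
	       guest_crycb + EX_WRAPPING_KEYS_OFFSET, EX_WRAPPING_KEYS_LEN);
}
```

Using the start of the shadow structure as the destination, as the old code
did with &vsie_page->crycb, would place the 56 bytes at offset 0 instead of
offset 72.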

Fixes: bbeaa58b3 ("KVM: s390: vsie: support aes dea wrapping keys")
Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Janosch Frank <frankja@linux.ibm.com>
Cc: stable@vger.kernel.org # v4.8+
Message-Id: <1535019956-23539-2-git-send-email-pmorel@linux.ibm.com>
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/s390/kvm/vsie.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- a/arch/s390/kvm/vsie.c
+++ b/arch/s390/kvm/vsie.c
@@ -170,7 +170,8 @@ static int shadow_crycb(struct kvm_vcpu
 		return set_validity_icpt(scb_s, 0x0039U);
 
 	/* copy only the wrapping keys */
-	if (read_guest_real(vcpu, crycb_addr + 72, &vsie_page->crycb, 56))
+	if (read_guest_real(vcpu, crycb_addr + 72,
+			    vsie_page->crycb.dea_wrapping_key_mask, 56))
 		return set_validity_icpt(scb_s, 0x0035U);
 
 	scb_s->ecb3 |= ecb3_flags;




* [PATCH 4.14 012/126] KVM: VMX: Do not allow reexecute_instruction() when skipping MMIO instr
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (10 preceding siblings ...)
  2018-09-17 22:40 ` [PATCH 4.14 011/126] KVM: s390: vsie: copy wrapping keys to right place Greg Kroah-Hartman
@ 2018-09-17 22:41 ` Greg Kroah-Hartman
  2018-09-17 22:41 ` [PATCH 4.14 013/126] ALSA: hda - Fix cancel_work_sync() stall from jackpoll work Greg Kroah-Hartman
                   ` (116 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Vitaly Kuznetsov,
	Sean Christopherson, Radim Krčmář

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Sean Christopherson <sean.j.christopherson@intel.com>

commit c4409905cd6eb42cfd06126e9226b0150e05a715 upstream.

Re-execution after an emulation decode failure is only intended to
handle a case where two or more vCPUs race to write a shadowed page, i.e.
we should never re-execute an instruction as part of MMIO emulation.
As handle_ept_misconfig() is only used for MMIO emulation, it should
pass EMULTYPE_NO_REEXECUTE when using the emulator to skip an instr
in the fast-MMIO case where VM_EXIT_INSTRUCTION_LEN is invalid.

And because the cr2 value passed to x86_emulate_instruction() is only
destined for use when retrying or reexecuting, we can simply call
emulate_instruction().

Fixes: d391f1207067 ("x86/kvm/vmx: do not use vm-exit instruction length
                      for fast MMIO when running nested")
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/kvm/vmx.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -6965,8 +6965,8 @@ static int handle_ept_misconfig(struct k
 		if (!static_cpu_has(X86_FEATURE_HYPERVISOR))
 			return kvm_skip_emulated_instruction(vcpu);
 		else
-			return x86_emulate_instruction(vcpu, gpa, EMULTYPE_SKIP,
-						       NULL, 0) == EMULATE_DONE;
+			return emulate_instruction(vcpu, EMULTYPE_SKIP) ==
+								EMULATE_DONE;
 	}
 
 	ret = kvm_mmu_page_fault(vcpu, gpa, PFERR_RSVD_MASK, NULL, 0);



* [PATCH 4.14 013/126] ALSA: hda - Fix cancel_work_sync() stall from jackpoll work
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (11 preceding siblings ...)
  2018-09-17 22:41 ` [PATCH 4.14 012/126] KVM: VMX: Do not allow reexecute_instruction() when skipping MMIO instr Greg Kroah-Hartman
@ 2018-09-17 22:41 ` Greg Kroah-Hartman
  2018-09-17 22:41 ` [PATCH 4.14 014/126] cpu/hotplug: Adjust misplaced smb() in cpuhp_thread_fun() Greg Kroah-Hartman
                   ` (115 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:41 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Lukas Wunner, Takashi Iwai

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Takashi Iwai <tiwai@suse.de>

commit 16037643969e095509cd8446a3f8e406a6dc3a2c upstream.

On AMD/ATI controllers, the HD-audio controller driver allows a bus
reset during error recovery, and the reset procedure cancels any
pending jack polling work, as found in snd_hda_bus_reset_codecs().
This usually works fine, but it becomes a problem when the reset is
triggered from the jack poll work itself: calling cancel_work_sync()
from the work being processed waits endlessly for that work to finish.

As a workaround, this patch adds a check of current_work() and calls
cancel_delayed_work_sync() only when not running from the jackpoll_work
itself.

This doesn't fix the root cause of the reported error below, but at
least, it eases the unexpected stall of the whole system.
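
The self-wait can be modelled in user space. What follows is a minimal
sketch, not the driver code: struct work, current_work_ptr and the
*_model() functions are hypothetical stand-ins for the kernel work item,
current_work() and cancel_work_sync(). Instead of blocking, the model
returns -1 when a work item would wait for itself.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a kernel work item. */
struct work {
	int running;	/* set while the handler executes */
	int pending;	/* set while queued but not yet started */
};

/* Models current_work(): the item whose handler is executing, if any. */
static struct work *current_work_ptr;

/*
 * Models cancel_work_sync(): clears a pending item and waits for a
 * running one. Waiting for the item that is executing the caller can
 * never finish, so the model reports -1 instead of blocking.
 */
static int cancel_work_sync_model(struct work *w)
{
	w->pending = 0;
	if (w->running && current_work_ptr == w)
		return -1;	/* would stall forever */
	w->running = 0;		/* pretend we waited until it finished */
	return 0;
}

/* Reset path with the guard: skip the cancel when called from the work. */
static int bus_reset_model(struct work *jackpoll)
{
	if (current_work_ptr != jackpoll)
		return cancel_work_sync_model(jackpoll);
	return 0;
}

/* Runs the reset from inside the jack poll handler, guarded or not. */
static int reset_from_jackpoll(struct work *jackpoll, int guarded)
{
	int ret;

	jackpoll->running = 1;
	current_work_ptr = jackpoll;
	ret = guarded ? bus_reset_model(jackpoll)
		      : cancel_work_sync_model(jackpoll);
	current_work_ptr = NULL;
	jackpoll->running = 0;
	return ret;
}
```

Calling reset_from_jackpoll() with guarded set to 0 reports the stall,
while the guarded variant returns 0 without touching the running work.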

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=200937
Cc: <stable@vger.kernel.org>
Cc: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 sound/pci/hda/hda_codec.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- a/sound/pci/hda/hda_codec.c
+++ b/sound/pci/hda/hda_codec.c
@@ -3923,7 +3923,8 @@ void snd_hda_bus_reset_codecs(struct hda
 
 	list_for_each_codec(codec, bus) {
 		/* FIXME: maybe a better way needed for forced reset */
-		cancel_delayed_work_sync(&codec->jackpoll_work);
+		if (current_work() != &codec->jackpoll_work.work)
+			cancel_delayed_work_sync(&codec->jackpoll_work);
 #ifdef CONFIG_PM
 		if (hda_codec_is_power_on(codec)) {
 			hda_call_codec_suspend(codec);



* [PATCH 4.14 014/126] cpu/hotplug: Adjust misplaced smb() in cpuhp_thread_fun()
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (12 preceding siblings ...)
  2018-09-17 22:41 ` [PATCH 4.14 013/126] ALSA: hda - Fix cancel_work_sync() stall from jackpoll work Greg Kroah-Hartman
@ 2018-09-17 22:41 ` Greg Kroah-Hartman
  2018-09-17 22:41 ` [PATCH 4.14 015/126] cpu/hotplug: Prevent state corruption on error rollback Greg Kroah-Hartman
                   ` (114 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Neeraj Upadhyay, Thomas Gleixner,
	Peter Zijlstra (Intel),
	josh, peterz, jiangshanlai, dzickus, brendan.jackman, malat,
	mojha, sramana, linux-arm-msm

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Neeraj Upadhyay <neeraju@codeaurora.org>

commit f8b7530aa0a1def79c93101216b5b17cf408a70a upstream.

The smp_mb() in cpuhp_thread_fun() is misplaced. It needs to be after the
load of st->should_run to prevent reordering of the later load/stores
w.r.t. the load of st->should_run.
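
The ordering requirement can be sketched with C11 atomics in user space.
This is only an illustration under assumptions: struct hp_state,
publish() and consume() are hypothetical names, and
atomic_thread_fence(memory_order_seq_cst) stands in for smp_mb(). The
fence sits after the load of should_run, so the later read of payload
cannot be reordered before that load.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical model of the per-CPU hotplug state. */
struct hp_state {
	atomic_int should_run;	/* flag set by the control side */
	int payload;		/* "rest of the state" guarded by the flag */
};

/* Control side: write the state first, then publish the flag. */
static void publish(struct hp_state *st, int value)
{
	st->payload = value;
	atomic_store_explicit(&st->should_run, 1, memory_order_release);
}

/*
 * Thread side: load the flag, then fence, then read the state.
 * A fence placed before the flag load would not order the payload
 * read against that load. Returns -1 if the flag was not set.
 */
static int consume(struct hp_state *st)
{
	if (!atomic_load_explicit(&st->should_run, memory_order_relaxed))
		return -1;

	/* Full fence, standing in for smp_mb(). */
	atomic_thread_fence(memory_order_seq_cst);

	return st->payload;
}
```

A single-threaded call sequence of publish() followed by consume() only
shows the API shape; the ordering guarantee matters when the two run on
different CPUs.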

Fixes: 4dddfb5faa61 ("smp/hotplug: Rewrite AP state machine core")
Signed-off-by: Neeraj Upadhyay <neeraju@codeaurora.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infraded.org>
Cc: josh@joshtriplett.org
Cc: peterz@infradead.org
Cc: jiangshanlai@gmail.com
Cc: dzickus@redhat.com
Cc: brendan.jackman@arm.com
Cc: malat@debian.org
Cc: mojha@codeaurora.org
Cc: sramana@codeaurora.org
Cc: linux-arm-msm@vger.kernel.org
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/1536126727-11629-1-git-send-email-neeraju@codeaurora.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 kernel/cpu.c |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -612,15 +612,15 @@ static void cpuhp_thread_fun(unsigned in
 	bool bringup = st->bringup;
 	enum cpuhp_state state;
 
+	if (WARN_ON_ONCE(!st->should_run))
+		return;
+
 	/*
 	 * ACQUIRE for the cpuhp_should_run() load of ->should_run. Ensures
 	 * that if we see ->should_run we also see the rest of the state.
 	 */
 	smp_mb();
 
-	if (WARN_ON_ONCE(!st->should_run))
-		return;
-
 	cpuhp_lock_acquire(bringup);
 
 	if (st->single) {



* [PATCH 4.14 015/126] cpu/hotplug: Prevent state corruption on error rollback
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (13 preceding siblings ...)
  2018-09-17 22:41 ` [PATCH 4.14 014/126] cpu/hotplug: Adjust misplaced smb() in cpuhp_thread_fun() Greg Kroah-Hartman
@ 2018-09-17 22:41 ` Greg Kroah-Hartman
  2018-09-17 22:41 ` [PATCH 4.14 016/126] x86/microcode: Make sure boot_cpu_data.microcode is up-to-date Greg Kroah-Hartman
                   ` (113 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Neeraj Upadhyay, Thomas Gleixner,
	Geert Uytterhoeven, Sudeep Holla, josh, peterz, jiangshanlai,
	dzickus, brendan.jackman, malat, sramana, linux-arm-msm

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Thomas Gleixner <tglx@linutronix.de>

commit 69fa6eb7d6a64801ea261025cce9723d9442d773 upstream.

When a teardown callback fails, the CPU hotplug code brings the CPU back to
the previous state. The previous state becomes the new target state. The
rollback happens in undo_cpu_down() which increments the state
unconditionally even if the state is already the same as the target.

As a consequence the next CPU hotplug operation will start at the wrong
state. This is easy to observe when __cpu_disable() fails.

Prevent the unconditional undo by checking the state vs. target before
incrementing state and fix up the consequently wrong conditional in the
unplug code which handles the failure of the final CPU take down on the
control CPU side.
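
The off-by-one can be reproduced with a reduced model. This is a sketch
under assumptions: struct hp_model, undo_down() and down_callbacks() are
hypothetical simplifications of the hotplug state machine, callbacks are
replaced by a single fail_at state, and the "fixed" parameter selects
the guarded rollback.

```c
#include <assert.h>

/* Hypothetical, reduced model of the per-CPU hotplug state. */
struct hp_model {
	int state;
	int target;
};

/* Models the rollback: steps up first, then walks up to the target. */
static void undo_down(struct hp_model *st)
{
	for (st->state++; st->state < st->target; st->state++)
		;	/* the real code re-runs startup callbacks here */
}

/*
 * Models the teardown loop. The teardown callback of state fail_at is
 * assumed to fail. Returns -1 on failure, 0 on success.
 */
static int down_callbacks(struct hp_model *st, int target, int fail_at,
			  int fixed)
{
	int prev_state = st->state;

	for (; st->state > target; st->state--) {
		if (st->state == fail_at) {
			st->target = prev_state;
			/* Guard: only undo if we actually moved down. */
			if (!fixed || st->state < prev_state)
				undo_down(st);
			return -1;
		}
	}
	return 0;
}
```

When the very first teardown step fails, the unguarded variant leaves
the state one above where it started, while the guarded variant leaves
it unchanged.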

Fixes: 4dddfb5faa61 ("smp/hotplug: Rewrite AP state machine core")
Reported-by: Neeraj Upadhyay <neeraju@codeaurora.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Tested-by: Sudeep Holla <sudeep.holla@arm.com>
Tested-by: Neeraj Upadhyay <neeraju@codeaurora.org>
Cc: josh@joshtriplett.org
Cc: peterz@infradead.org
Cc: jiangshanlai@gmail.com
Cc: dzickus@redhat.com
Cc: brendan.jackman@arm.com
Cc: malat@debian.org
Cc: sramana@codeaurora.org
Cc: linux-arm-msm@vger.kernel.org
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/alpine.DEB.2.21.1809051419580.1416@nanos.tec.linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 kernel/cpu.c |    5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -932,7 +932,8 @@ static int cpuhp_down_callbacks(unsigned
 		ret = cpuhp_invoke_callback(cpu, st->state, false, NULL, NULL);
 		if (ret) {
 			st->target = prev_state;
-			undo_cpu_down(cpu, st);
+			if (st->state < prev_state)
+				undo_cpu_down(cpu, st);
 			break;
 		}
 	}
@@ -985,7 +986,7 @@ static int __ref _cpu_down(unsigned int
 	 * to do the further cleanups.
 	 */
 	ret = cpuhp_down_callbacks(cpu, st, target);
-	if (ret && st->state > CPUHP_TEARDOWN_CPU && st->state < prev_state) {
+	if (ret && st->state == CPUHP_TEARDOWN_CPU && st->state < prev_state) {
 		cpuhp_reset_state(st, prev_state);
 		__cpuhp_kick_ap(st);
 	}



* [PATCH 4.14 016/126] x86/microcode: Make sure boot_cpu_data.microcode is up-to-date
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (14 preceding siblings ...)
  2018-09-17 22:41 ` [PATCH 4.14 015/126] cpu/hotplug: Prevent state corruption on error rollback Greg Kroah-Hartman
@ 2018-09-17 22:41 ` Greg Kroah-Hartman
  2018-09-17 22:41 ` [PATCH 4.14 017/126] x86/microcode: Update the new microcode revision unconditionally Greg Kroah-Hartman
                   ` (112 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Prarit Bhargava, Borislav Petkov,
	Thomas Gleixner, Tony Luck, sironi

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Prarit Bhargava <prarit@redhat.com>

commit 370a132bb2227ff76278f98370e0e701d86ff752 upstream.

When preparing an MCE record for logging, boot_cpu_data.microcode is used
to read out the microcode revision on the box.

However, on systems where late microcode update has happened, the microcode
revision output in a MCE log record is wrong because
boot_cpu_data.microcode is not updated when the microcode gets updated.

But, the microcode revision saved in boot_cpu_data's microcode member
should be kept up-to-date, regardless, for consistency.

Make it so.

Fixes: fa94d0c6e0f3 ("x86/MCE: Save microcode revision in machine check records")
Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: sironi@amazon.de
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/20180731112739.32338-1-prarit@redhat.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/kernel/cpu/microcode/amd.c   |    4 ++++
 arch/x86/kernel/cpu/microcode/intel.c |    4 ++++
 2 files changed, 8 insertions(+)

--- a/arch/x86/kernel/cpu/microcode/amd.c
+++ b/arch/x86/kernel/cpu/microcode/amd.c
@@ -537,6 +537,10 @@ static enum ucode_state apply_microcode_
 	uci->cpu_sig.rev = mc_amd->hdr.patch_id;
 	c->microcode = mc_amd->hdr.patch_id;
 
+	/* Update boot_cpu_data's revision too, if we're on the BSP: */
+	if (c->cpu_index == boot_cpu_data.cpu_index)
+		boot_cpu_data.microcode = mc_amd->hdr.patch_id;
+
 	return UCODE_UPDATED;
 }
 
--- a/arch/x86/kernel/cpu/microcode/intel.c
+++ b/arch/x86/kernel/cpu/microcode/intel.c
@@ -851,6 +851,10 @@ static enum ucode_state apply_microcode_
 	uci->cpu_sig.rev = rev;
 	c->microcode = rev;
 
+	/* Update boot_cpu_data's revision too, if we're on the BSP: */
+	if (c->cpu_index == boot_cpu_data.cpu_index)
+		boot_cpu_data.microcode = rev;
+
 	return UCODE_UPDATED;
 }
 



* [PATCH 4.14 017/126] x86/microcode: Update the new microcode revision unconditionally
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (15 preceding siblings ...)
  2018-09-17 22:41 ` [PATCH 4.14 016/126] x86/microcode: Make sure boot_cpu_data.microcode is up-to-date Greg Kroah-Hartman
@ 2018-09-17 22:41 ` Greg Kroah-Hartman
  2018-09-17 22:41 ` [PATCH 4.14 018/126] switchtec: Fix Spectre v1 vulnerability Greg Kroah-Hartman
                   ` (111 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Filippo Sironi, Borislav Petkov,
	Thomas Gleixner, prarit

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Filippo Sironi <sironi@amazon.de>

commit 8da38ebaad23fe1b0c4a205438676f6356607cfc upstream.

Handle the case where microcode gets loaded on the BSP's hyperthread
sibling first and the boot_cpu_data's microcode revision doesn't get
updated because of early exit due to the siblings sharing a microcode
engine.

For that, simply write the updated revision on all CPUs unconditionally.
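
The scenario can be sketched in user space. The names below
(struct cpu_model, boot_cpu_model, apply_microcode_model()) are
hypothetical stand-ins, and hw_rev models the revision held by the
microcode engine that hyperthread siblings share. The point of the
sketch is the single exit path: the revision is recorded even when no
load is needed.

```c
#include <assert.h>

enum ucode_result { UCODE_RES_OK, UCODE_RES_UPDATED };

/* Hypothetical, reduced stand-ins for per-CPU info and boot_cpu_data. */
struct cpu_model {
	int cpu_index;
	unsigned int microcode;
};

static struct cpu_model boot_cpu_model = { 0, 0x10 };

/*
 * Every exit path records the revision, including the early exit
 * taken when the shared engine already holds the new revision.
 */
static enum ucode_result apply_microcode_model(struct cpu_model *c,
					       unsigned int *hw_rev,
					       unsigned int patch_rev)
{
	enum ucode_result ret;
	unsigned int rev = *hw_rev;

	if (rev >= patch_rev) {
		ret = UCODE_RES_OK;	/* engine already up to date */
		goto out;
	}

	*hw_rev = patch_rev;		/* pretend the load succeeded */
	rev = patch_rev;
	ret = UCODE_RES_UPDATED;

out:
	c->microcode = rev;
	if (c->cpu_index == boot_cpu_model.cpu_index)
		boot_cpu_model.microcode = rev;
	return ret;
}
```

If the sibling is updated first, the later call for the BSP takes the
early exit, yet boot_cpu_model.microcode still ends up at the new
revision.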

Signed-off-by: Filippo Sironi <sironi@amazon.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: prarit@redhat.com
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1533050970-14385-1-git-send-email-sironi@amazon.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/kernel/cpu/microcode/amd.c   |   22 +++++++++++++---------
 arch/x86/kernel/cpu/microcode/intel.c |   13 ++++++++-----
 2 files changed, 21 insertions(+), 14 deletions(-)

--- a/arch/x86/kernel/cpu/microcode/amd.c
+++ b/arch/x86/kernel/cpu/microcode/amd.c
@@ -504,6 +504,7 @@ static enum ucode_state apply_microcode_
 	struct microcode_amd *mc_amd;
 	struct ucode_cpu_info *uci;
 	struct ucode_patch *p;
+	enum ucode_state ret;
 	u32 rev, dummy;
 
 	BUG_ON(raw_smp_processor_id() != cpu);
@@ -521,9 +522,8 @@ static enum ucode_state apply_microcode_
 
 	/* need to apply patch? */
 	if (rev >= mc_amd->hdr.patch_id) {
-		c->microcode = rev;
-		uci->cpu_sig.rev = rev;
-		return UCODE_OK;
+		ret = UCODE_OK;
+		goto out;
 	}
 
 	if (__apply_microcode_amd(mc_amd)) {
@@ -531,17 +531,21 @@ static enum ucode_state apply_microcode_
 			cpu, mc_amd->hdr.patch_id);
 		return UCODE_ERROR;
 	}
-	pr_info("CPU%d: new patch_level=0x%08x\n", cpu,
-		mc_amd->hdr.patch_id);
 
-	uci->cpu_sig.rev = mc_amd->hdr.patch_id;
-	c->microcode = mc_amd->hdr.patch_id;
+	rev = mc_amd->hdr.patch_id;
+	ret = UCODE_UPDATED;
+
+	pr_info("CPU%d: new patch_level=0x%08x\n", cpu, rev);
+
+out:
+	uci->cpu_sig.rev = rev;
+	c->microcode	 = rev;
 
 	/* Update boot_cpu_data's revision too, if we're on the BSP: */
 	if (c->cpu_index == boot_cpu_data.cpu_index)
-		boot_cpu_data.microcode = mc_amd->hdr.patch_id;
+		boot_cpu_data.microcode = rev;
 
-	return UCODE_UPDATED;
+	return ret;
 }
 
 static int install_equiv_cpu_table(const u8 *buf)
--- a/arch/x86/kernel/cpu/microcode/intel.c
+++ b/arch/x86/kernel/cpu/microcode/intel.c
@@ -795,6 +795,7 @@ static enum ucode_state apply_microcode_
 	struct ucode_cpu_info *uci = ucode_cpu_info + cpu;
 	struct cpuinfo_x86 *c = &cpu_data(cpu);
 	struct microcode_intel *mc;
+	enum ucode_state ret;
 	static int prev_rev;
 	u32 rev;
 
@@ -817,9 +818,8 @@ static enum ucode_state apply_microcode_
 	 */
 	rev = intel_get_microcode_revision();
 	if (rev >= mc->hdr.rev) {
-		uci->cpu_sig.rev = rev;
-		c->microcode = rev;
-		return UCODE_OK;
+		ret = UCODE_OK;
+		goto out;
 	}
 
 	/*
@@ -848,14 +848,17 @@ static enum ucode_state apply_microcode_
 		prev_rev = rev;
 	}
 
+	ret = UCODE_UPDATED;
+
+out:
 	uci->cpu_sig.rev = rev;
-	c->microcode = rev;
+	c->microcode	 = rev;
 
 	/* Update boot_cpu_data's revision too, if we're on the BSP: */
 	if (c->cpu_index == boot_cpu_data.cpu_index)
 		boot_cpu_data.microcode = rev;
 
-	return UCODE_UPDATED;
+	return ret;
 }
 
 static enum ucode_state generic_load_microcode(int cpu, void *data, size_t size,



* [PATCH 4.14 018/126] switchtec: Fix Spectre v1 vulnerability
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (16 preceding siblings ...)
  2018-09-17 22:41 ` [PATCH 4.14 017/126] x86/microcode: Update the new microcode revision unconditionally Greg Kroah-Hartman
@ 2018-09-17 22:41 ` Greg Kroah-Hartman
  2018-09-17 22:41 ` [PATCH 4.14 019/126] crypto: aes-generic - fix aes-generic regression on powerpc Greg Kroah-Hartman
                   ` (110 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Gustavo A. R. Silva, Bjorn Helgaas,
	Logan Gunthorpe

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Gustavo A. R. Silva <gustavo@embeddedor.com>

commit 46feb6b495f7628a6dbf36c4e6d80faf378372d4 upstream.

p.port is indirectly controlled by user-space, hence leading to
a potential exploitation of the Spectre variant 1 vulnerability.

This issue was detected with the help of Smatch:

  drivers/pci/switch/switchtec.c:912 ioctl_port_to_pff() warn: potential spectre issue 'pcfg->dsp_pff_inst_id' [r]

Fix this by sanitizing p.port before using it to index
pcfg->dsp_pff_inst_id.

Notice that given that speculation windows are large, the policy is to kill
the speculation on the first load and not worry if it can be completed with
a dependent load/store [1].
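
The clamping idea can be sketched in user space. This is a model under
assumptions, not the kernel helper: index_nospec_model(), table and
lookup() are hypothetical names, the mask arithmetic follows the generic
form of array_index_nospec(), and it relies on arithmetic right shift of
negative values (provided by GCC and Clang, implementation-defined in
ISO C). The sketch clamps a zero-based index for simplicity.

```c
#include <assert.h>
#include <limits.h>

/*
 * Branch-free clamp: returns index when index < size, and 0
 * otherwise. Assumes size fits in a long.
 */
static unsigned long index_nospec_model(unsigned long index,
					unsigned long size)
{
	long mask = ~(long)(index | (size - 1UL - index)) >>
		    (sizeof(long) * CHAR_BIT - 1);

	return index & (unsigned long)mask;
}

#define TABLE_LEN 8UL

/* Hypothetical table indexed by a caller-controlled value. */
static const int table[TABLE_LEN] = { 10, 11, 12, 13, 14, 15, 16, 17 };

/* Bounds check first, then clamp, then load. Returns -1 if out of range. */
static int lookup(unsigned long idx)
{
	if (idx >= TABLE_LEN)
		return -1;
	idx = index_nospec_model(idx, TABLE_LEN);
	return table[idx];
}
```

The bounds check still decides the architectural result; the clamp only
ensures that a mispredicted branch cannot feed an out-of-range index to
the load.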

[1] https://marc.info/?l=linux-kernel&m=152449131114778&w=2

Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Logan Gunthorpe <logang@deltatee.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/pci/switch/switchtec.c |    4 ++++
 1 file changed, 4 insertions(+)

--- a/drivers/pci/switch/switchtec.c
+++ b/drivers/pci/switch/switchtec.c
@@ -24,6 +24,8 @@
 #include <linux/cdev.h>
 #include <linux/wait.h>
 
+#include <linux/nospec.h>
+
 MODULE_DESCRIPTION("Microsemi Switchtec(tm) PCIe Management Driver");
 MODULE_VERSION("0.1");
 MODULE_LICENSE("GPL");
@@ -1173,6 +1175,8 @@ static int ioctl_port_to_pff(struct swit
 	default:
 		if (p.port > ARRAY_SIZE(pcfg->dsp_pff_inst_id))
 			return -EINVAL;
+		p.port = array_index_nospec(p.port,
+					ARRAY_SIZE(pcfg->dsp_pff_inst_id) + 1);
 		p.pff = ioread32(&pcfg->dsp_pff_inst_id[p.port - 1]);
 		break;
 	}



* [PATCH 4.14 019/126] crypto: aes-generic - fix aes-generic regression on powerpc
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (17 preceding siblings ...)
  2018-09-17 22:41 ` [PATCH 4.14 018/126] switchtec: Fix Spectre v1 vulnerability Greg Kroah-Hartman
@ 2018-09-17 22:41 ` Greg Kroah-Hartman
  2018-09-17 22:41 ` [PATCH 4.14 020/126] tpm: separate cmd_ready/go_idle from runtime_pm Greg Kroah-Hartman
                   ` (109 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, kbuild test robot, Arnd Bergmann,
	Herbert Xu, Horia Geanta

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Arnd Bergmann <arnd@arndb.de>

commit 6e36719fbe90213fbba9f50093fa2d4d69b0e93c upstream.

My last bugfix added -Os on the command line, which unfortunately caused
a build regression on powerpc in some configurations.

I've done some more analysis of the original problem and found a slightly
different workaround that avoids this regression and also results in
better performance on gcc-7.0: -fcode-hoisting is an optimization step
that got added in gcc-7 and that for all gcc-7 versions causes worse
performance.

This disables -fcode-hoisting on all compilers that understand the option.
For gcc-7.1 and 7.2 I found the same performance as my previous patch
(using -Os), in gcc-7.0 it was even better. On gcc-8 I could see no
change in performance from this patch. In theory, code hoisting should
not be able to make things better for the AES cipher, so leaving it
disabled for gcc-8 only serves to simplify the Makefile change.

Reported-by: kbuild test robot <fengguang.wu@intel.com>
Link: https://www.mail-archive.com/linux-crypto@vger.kernel.org/msg30418.html
Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83356
Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83651
Fixes: 148b974deea9 ("crypto: aes-generic - build with -Os on gcc-7+")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Horia Geanta <horia.geanta@nxp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 crypto/Makefile |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/crypto/Makefile
+++ b/crypto/Makefile
@@ -98,7 +98,7 @@ obj-$(CONFIG_CRYPTO_TWOFISH_COMMON) += t
 obj-$(CONFIG_CRYPTO_SERPENT) += serpent_generic.o
 CFLAGS_serpent_generic.o := $(call cc-option,-fsched-pressure)  # https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79149
 obj-$(CONFIG_CRYPTO_AES) += aes_generic.o
-CFLAGS_aes_generic.o := $(call cc-ifversion, -ge, 0701, -Os) # https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83356
+CFLAGS_aes_generic.o := $(call cc-option,-fno-code-hoisting) # https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83356
 obj-$(CONFIG_CRYPTO_AES_TI) += aes_ti.o
 obj-$(CONFIG_CRYPTO_CAMELLIA) += camellia_generic.o
 obj-$(CONFIG_CRYPTO_CAST_COMMON) += cast_common.o



* [PATCH 4.14 020/126] tpm: separate cmd_ready/go_idle from runtime_pm
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (18 preceding siblings ...)
  2018-09-17 22:41 ` [PATCH 4.14 019/126] crypto: aes-generic - fix aes-generic regression on powerpc Greg Kroah-Hartman
@ 2018-09-17 22:41 ` Greg Kroah-Hartman
  2018-09-17 22:41 ` [PATCH 4.14 021/126] ARC: [plat-axs*]: Enable SWAP Greg Kroah-Hartman
                   ` (108 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:41 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Tomas Winkler, Jarkko Sakkinen

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Tomas Winkler <tomas.winkler@intel.com>

commit 627448e85c766587f6fdde1ea3886d6615081c77 upstream.

Fix tpm ptt initialization error:
tpm tpm0: A TPM error (378) occurred get tpm pcr allocation.

We cannot issue the go_idle and cmd_ready commands via runtime_pm
handles: with the introduction of localities they are no longer an
optional feature, whereas runtime PM may not be enabled.
Though cmd_ready/go_idle provides a power saving, it is also a part of
the TPM2 protocol and should be called explicitly.
This patch exposes cmd_ready/go_idle via tpm class ops and removes
runtime pm support as it is not used by any driver.

When calling from a nested context, always use both flags:
TPM_TRANSMIT_UNLOCKED and TPM_TRANSMIT_RAW. Both are needed to resolve
recursive calls to tpm_transmit() from tpm spaces and locality
requests. TPM_TRANSMIT_RAW should never be used standalone, as it will
fail on double locking, while TPM_TRANSMIT_UNLOCKED standalone should
be used from non-recursive locked contexts.

New wrappers, tpm_cmd_ready() and tpm_go_idle(), are added to
streamline the tpm_try_transmit code.

tpm_crb no longer needs own power saving functions and can drop using
tpm_pm_suspend/resume.
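
The shape of the tpm_cmd_ready()/tpm_go_idle() wrappers can be sketched
in user space. Everything named *_model, the XMIT_* flags and the
counting callbacks are hypothetical stand-ins; the sketch only shows the
two early returns (nested RAW call, driver without the callback) before
the driver op is invoked.

```c
#include <assert.h>
#include <stddef.h>

#define XMIT_UNLOCKED	(1u << 0)	/* models TPM_TRANSMIT_UNLOCKED */
#define XMIT_RAW	(1u << 1)	/* models TPM_TRANSMIT_RAW */

struct chip_model;

/* Reduced model of the class ops: both callbacks are optional. */
struct ops_model {
	int (*cmd_ready)(struct chip_model *chip);
	int (*go_idle)(struct chip_model *chip);
};

struct chip_model {
	const struct ops_model *ops;
	int ready_calls;
	int idle_calls;
};

/* Skip for nested (RAW) calls and for drivers without the callback. */
static int cmd_ready_model(struct chip_model *chip, unsigned int flags)
{
	if (flags & XMIT_RAW)
		return 0;
	if (!chip->ops->cmd_ready)
		return 0;
	return chip->ops->cmd_ready(chip);
}

static int go_idle_model(struct chip_model *chip, unsigned int flags)
{
	if (flags & XMIT_RAW)
		return 0;
	if (!chip->ops->go_idle)
		return 0;
	return chip->ops->go_idle(chip);
}

/* Example driver callbacks that only count invocations. */
static int count_ready(struct chip_model *chip)
{
	chip->ready_calls++;
	return 0;
}

static int count_idle(struct chip_model *chip)
{
	chip->idle_calls++;
	return 0;
}

static const struct ops_model counting_ops = { count_ready, count_idle };
static const struct ops_model empty_ops = { NULL, NULL };
```

A top-level transmit reaches the driver callbacks once each, while a
nested call carrying both flags leaves the counters untouched.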

This patch cannot really be separated from the locality fix.

Cc: stable@vger.kernel.org
Fixes: 888d867df441 (tpm: cmd_ready command can be issued only after granting locality)
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Tested-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/char/tpm/tpm-interface.c |   50 +++++++++++++++----
 drivers/char/tpm/tpm.h           |   12 +++-
 drivers/char/tpm/tpm2-space.c    |   16 +++---
 drivers/char/tpm/tpm_crb.c       |  101 ++++++++++-----------------------------
 include/linux/tpm.h              |    2 
 5 files changed, 90 insertions(+), 91 deletions(-)

--- a/drivers/char/tpm/tpm-interface.c
+++ b/drivers/char/tpm/tpm-interface.c
@@ -369,10 +369,13 @@ err_len:
 	return -EINVAL;
 }
 
-static int tpm_request_locality(struct tpm_chip *chip)
+static int tpm_request_locality(struct tpm_chip *chip, unsigned int flags)
 {
 	int rc;
 
+	if (flags & TPM_TRANSMIT_RAW)
+		return 0;
+
 	if (!chip->ops->request_locality)
 		return 0;
 
@@ -385,10 +388,13 @@ static int tpm_request_locality(struct t
 	return 0;
 }
 
-static void tpm_relinquish_locality(struct tpm_chip *chip)
+static void tpm_relinquish_locality(struct tpm_chip *chip, unsigned int flags)
 {
 	int rc;
 
+	if (flags & TPM_TRANSMIT_RAW)
+		return;
+
 	if (!chip->ops->relinquish_locality)
 		return;
 
@@ -399,6 +405,28 @@ static void tpm_relinquish_locality(stru
 	chip->locality = -1;
 }
 
+static int tpm_cmd_ready(struct tpm_chip *chip, unsigned int flags)
+{
+	if (flags & TPM_TRANSMIT_RAW)
+		return 0;
+
+	if (!chip->ops->cmd_ready)
+		return 0;
+
+	return chip->ops->cmd_ready(chip);
+}
+
+static int tpm_go_idle(struct tpm_chip *chip, unsigned int flags)
+{
+	if (flags & TPM_TRANSMIT_RAW)
+		return 0;
+
+	if (!chip->ops->go_idle)
+		return 0;
+
+	return chip->ops->go_idle(chip);
+}
+
 static ssize_t tpm_try_transmit(struct tpm_chip *chip,
 				struct tpm_space *space,
 				u8 *buf, size_t bufsiz,
@@ -449,14 +477,15 @@ static ssize_t tpm_try_transmit(struct t
 	/* Store the decision as chip->locality will be changed. */
 	need_locality = chip->locality == -1;
 
-	if (!(flags & TPM_TRANSMIT_RAW) && need_locality) {
-		rc = tpm_request_locality(chip);
+	if (need_locality) {
+		rc = tpm_request_locality(chip, flags);
 		if (rc < 0)
 			goto out_no_locality;
 	}
 
-	if (chip->dev.parent)
-		pm_runtime_get_sync(chip->dev.parent);
+	rc = tpm_cmd_ready(chip, flags);
+	if (rc)
+		goto out;
 
 	rc = tpm2_prepare_space(chip, space, ordinal, buf);
 	if (rc)
@@ -516,13 +545,16 @@ out_recv:
 	}
 
 	rc = tpm2_commit_space(chip, space, ordinal, buf, &len);
+	if (rc)
+		dev_err(&chip->dev, "tpm2_commit_space: error %d\n", rc);
 
 out:
-	if (chip->dev.parent)
-		pm_runtime_put_sync(chip->dev.parent);
+	rc = tpm_go_idle(chip, flags);
+	if (rc)
+		goto out;
 
 	if (need_locality)
-		tpm_relinquish_locality(chip);
+		tpm_relinquish_locality(chip, flags);
 
 out_no_locality:
 	if (chip->ops->clk_enable != NULL)
--- a/drivers/char/tpm/tpm.h
+++ b/drivers/char/tpm/tpm.h
@@ -511,9 +511,17 @@ extern const struct file_operations tpm_
 extern const struct file_operations tpmrm_fops;
 extern struct idr dev_nums_idr;
 
+/**
+ * enum tpm_transmit_flags
+ *
+ * @TPM_TRANSMIT_UNLOCKED: used to lock sequence of tpm_transmit calls.
+ * @TPM_TRANSMIT_RAW: prevent recursive calls into setup steps
+ *                    (go idle, locality,..). Always use with UNLOCKED
+ *                    as it will fail on double locking.
+ */
 enum tpm_transmit_flags {
-	TPM_TRANSMIT_UNLOCKED	= BIT(0),
-	TPM_TRANSMIT_RAW	= BIT(1),
+	TPM_TRANSMIT_UNLOCKED = BIT(0),
+	TPM_TRANSMIT_RAW      = BIT(1),
 };
 
 ssize_t tpm_transmit(struct tpm_chip *chip, struct tpm_space *space,
--- a/drivers/char/tpm/tpm2-space.c
+++ b/drivers/char/tpm/tpm2-space.c
@@ -39,7 +39,8 @@ static void tpm2_flush_sessions(struct t
 	for (i = 0; i < ARRAY_SIZE(space->session_tbl); i++) {
 		if (space->session_tbl[i])
 			tpm2_flush_context_cmd(chip, space->session_tbl[i],
-					       TPM_TRANSMIT_UNLOCKED);
+					       TPM_TRANSMIT_UNLOCKED |
+					       TPM_TRANSMIT_RAW);
 	}
 }
 
@@ -84,7 +85,7 @@ static int tpm2_load_context(struct tpm_
 	tpm_buf_append(&tbuf, &buf[*offset], body_size);
 
 	rc = tpm_transmit_cmd(chip, NULL, tbuf.data, PAGE_SIZE, 4,
-			      TPM_TRANSMIT_UNLOCKED, NULL);
+			      TPM_TRANSMIT_UNLOCKED | TPM_TRANSMIT_RAW, NULL);
 	if (rc < 0) {
 		dev_warn(&chip->dev, "%s: failed with a system error %d\n",
 			 __func__, rc);
@@ -133,7 +134,7 @@ static int tpm2_save_context(struct tpm_
 	tpm_buf_append_u32(&tbuf, handle);
 
 	rc = tpm_transmit_cmd(chip, NULL, tbuf.data, PAGE_SIZE, 0,
-			      TPM_TRANSMIT_UNLOCKED, NULL);
+			      TPM_TRANSMIT_UNLOCKED | TPM_TRANSMIT_RAW, NULL);
 	if (rc < 0) {
 		dev_warn(&chip->dev, "%s: failed with a system error %d\n",
 			 __func__, rc);
@@ -170,7 +171,8 @@ static void tpm2_flush_space(struct tpm_
 	for (i = 0; i < ARRAY_SIZE(space->context_tbl); i++)
 		if (space->context_tbl[i] && ~space->context_tbl[i])
 			tpm2_flush_context_cmd(chip, space->context_tbl[i],
-					       TPM_TRANSMIT_UNLOCKED);
+					       TPM_TRANSMIT_UNLOCKED |
+					       TPM_TRANSMIT_RAW);
 
 	tpm2_flush_sessions(chip, space);
 }
@@ -377,7 +379,8 @@ static int tpm2_map_response_header(stru
 
 	return 0;
 out_no_slots:
-	tpm2_flush_context_cmd(chip, phandle, TPM_TRANSMIT_UNLOCKED);
+	tpm2_flush_context_cmd(chip, phandle,
+			       TPM_TRANSMIT_UNLOCKED | TPM_TRANSMIT_RAW);
 	dev_warn(&chip->dev, "%s: out of slots for 0x%08X\n", __func__,
 		 phandle);
 	return -ENOMEM;
@@ -465,7 +468,8 @@ static int tpm2_save_space(struct tpm_ch
 			return rc;
 
 		tpm2_flush_context_cmd(chip, space->context_tbl[i],
-				       TPM_TRANSMIT_UNLOCKED);
+				       TPM_TRANSMIT_UNLOCKED |
+				       TPM_TRANSMIT_RAW);
 		space->context_tbl[i] = ~0;
 	}
 
--- a/drivers/char/tpm/tpm_crb.c
+++ b/drivers/char/tpm/tpm_crb.c
@@ -137,7 +137,7 @@ static bool crb_wait_for_reg_32(u32 __io
 }
 
 /**
- * crb_go_idle - request tpm crb device to go the idle state
+ * __crb_go_idle - request tpm crb device to go the idle state
  *
  * @dev:  crb device
  * @priv: crb private data
@@ -151,7 +151,7 @@ static bool crb_wait_for_reg_32(u32 __io
  *
  * Return: 0 always
  */
-static int crb_go_idle(struct device *dev, struct crb_priv *priv)
+static int __crb_go_idle(struct device *dev, struct crb_priv *priv)
 {
 	if ((priv->flags & CRB_FL_ACPI_START) ||
 	    (priv->flags & CRB_FL_CRB_SMC_START))
@@ -166,11 +166,20 @@ static int crb_go_idle(struct device *de
 		dev_warn(dev, "goIdle timed out\n");
 		return -ETIME;
 	}
+
 	return 0;
 }
 
+static int crb_go_idle(struct tpm_chip *chip)
+{
+	struct device *dev = &chip->dev;
+	struct crb_priv *priv = dev_get_drvdata(dev);
+
+	return __crb_go_idle(dev, priv);
+}
+
 /**
- * crb_cmd_ready - request tpm crb device to enter ready state
+ * __crb_cmd_ready - request tpm crb device to enter ready state
  *
  * @dev:  crb device
  * @priv: crb private data
@@ -183,7 +192,7 @@ static int crb_go_idle(struct device *de
  *
  * Return: 0 on success -ETIME on timeout;
  */
-static int crb_cmd_ready(struct device *dev, struct crb_priv *priv)
+static int __crb_cmd_ready(struct device *dev, struct crb_priv *priv)
 {
 	if ((priv->flags & CRB_FL_ACPI_START) ||
 	    (priv->flags & CRB_FL_CRB_SMC_START))
@@ -201,6 +210,14 @@ static int crb_cmd_ready(struct device *
 	return 0;
 }
 
+static int crb_cmd_ready(struct tpm_chip *chip)
+{
+	struct device *dev = &chip->dev;
+	struct crb_priv *priv = dev_get_drvdata(dev);
+
+	return __crb_cmd_ready(dev, priv);
+}
+
 static int __crb_request_locality(struct device *dev,
 				  struct crb_priv *priv, int loc)
 {
@@ -393,6 +410,8 @@ static const struct tpm_class_ops tpm_cr
 	.send = crb_send,
 	.cancel = crb_cancel,
 	.req_canceled = crb_req_canceled,
+	.go_idle  = crb_go_idle,
+	.cmd_ready = crb_cmd_ready,
 	.request_locality = crb_request_locality,
 	.relinquish_locality = crb_relinquish_locality,
 	.req_complete_mask = CRB_DRV_STS_COMPLETE,
@@ -508,7 +527,7 @@ static int crb_map_io(struct acpi_device
 	 * PTT HW bug w/a: wake up the device to access
 	 * possibly not retained registers.
 	 */
-	ret = crb_cmd_ready(dev, priv);
+	ret = __crb_cmd_ready(dev, priv);
 	if (ret)
 		return ret;
 
@@ -553,7 +572,7 @@ out:
 	if (!ret)
 		priv->cmd_size = cmd_size;
 
-	crb_go_idle(dev, priv);
+	__crb_go_idle(dev, priv);
 
 	__crb_relinquish_locality(dev, priv, 0);
 
@@ -624,32 +643,7 @@ static int crb_acpi_add(struct acpi_devi
 	chip->acpi_dev_handle = device->handle;
 	chip->flags = TPM_CHIP_FLAG_TPM2;
 
-	rc = __crb_request_locality(dev, priv, 0);
-	if (rc)
-		return rc;
-
-	rc  = crb_cmd_ready(dev, priv);
-	if (rc)
-		goto out;
-
-	pm_runtime_get_noresume(dev);
-	pm_runtime_set_active(dev);
-	pm_runtime_enable(dev);
-
-	rc = tpm_chip_register(chip);
-	if (rc) {
-		crb_go_idle(dev, priv);
-		pm_runtime_put_noidle(dev);
-		pm_runtime_disable(dev);
-		goto out;
-	}
-
-	pm_runtime_put_sync(dev);
-
-out:
-	__crb_relinquish_locality(dev, priv, 0);
-
-	return rc;
+	return tpm_chip_register(chip);
 }
 
 static int crb_acpi_remove(struct acpi_device *device)
@@ -659,52 +653,11 @@ static int crb_acpi_remove(struct acpi_d
 
 	tpm_chip_unregister(chip);
 
-	pm_runtime_disable(dev);
-
 	return 0;
 }
 
-static int __maybe_unused crb_pm_runtime_suspend(struct device *dev)
-{
-	struct tpm_chip *chip = dev_get_drvdata(dev);
-	struct crb_priv *priv = dev_get_drvdata(&chip->dev);
-
-	return crb_go_idle(dev, priv);
-}
-
-static int __maybe_unused crb_pm_runtime_resume(struct device *dev)
-{
-	struct tpm_chip *chip = dev_get_drvdata(dev);
-	struct crb_priv *priv = dev_get_drvdata(&chip->dev);
-
-	return crb_cmd_ready(dev, priv);
-}
-
-static int __maybe_unused crb_pm_suspend(struct device *dev)
-{
-	int ret;
-
-	ret = tpm_pm_suspend(dev);
-	if (ret)
-		return ret;
-
-	return crb_pm_runtime_suspend(dev);
-}
-
-static int __maybe_unused crb_pm_resume(struct device *dev)
-{
-	int ret;
-
-	ret = crb_pm_runtime_resume(dev);
-	if (ret)
-		return ret;
-
-	return tpm_pm_resume(dev);
-}
-
 static const struct dev_pm_ops crb_pm = {
-	SET_SYSTEM_SLEEP_PM_OPS(crb_pm_suspend, crb_pm_resume)
-	SET_RUNTIME_PM_OPS(crb_pm_runtime_suspend, crb_pm_runtime_resume, NULL)
+	SET_SYSTEM_SLEEP_PM_OPS(tpm_pm_suspend, tpm_pm_resume)
 };
 
 static const struct acpi_device_id crb_device_ids[] = {
--- a/include/linux/tpm.h
+++ b/include/linux/tpm.h
@@ -48,6 +48,8 @@ struct tpm_class_ops {
 	u8 (*status) (struct tpm_chip *chip);
 	bool (*update_timeouts)(struct tpm_chip *chip,
 				unsigned long *timeout_cap);
+	int (*go_idle)(struct tpm_chip *chip);
+	int (*cmd_ready)(struct tpm_chip *chip);
 	int (*request_locality)(struct tpm_chip *chip, int loc);
 	int (*relinquish_locality)(struct tpm_chip *chip, int loc);
 	void (*clk_enable)(struct tpm_chip *chip, bool value);




* [PATCH 4.14 021/126] ARC: [plat-axs*]: Enable SWAP
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (19 preceding siblings ...)
  2018-09-17 22:41 ` [PATCH 4.14 020/126] tpm: separate cmd_ready/go_idle from runtime_pm Greg Kroah-Hartman
@ 2018-09-17 22:41 ` Greg Kroah-Hartman
  2018-09-17 22:41 ` [PATCH 4.14 022/126] misc: mic: SCIF Fix scif_get_new_port() error handling Greg Kroah-Hartman
                   ` (107 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:41 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Alexey Brodkin, Vineet Gupta

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Alexey Brodkin <abrodkin@synopsys.com>

commit c83532fb0fe053d2e43e9387354cb1b52ba26427 upstream.

SWAP support on ARC was fixed earlier by
commit 6e3761145a9b ("ARC: Fix CONFIG_SWAP")
so now we may safely enable it on platforms that
have external media like USB and SD-card.

Note: it was already allowed for HSDK

Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Cc: stable@vger.kernel.org # 6e3761145a9b: ARC: Fix CONFIG_SWAP
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/arc/configs/axs101_defconfig     |    1 -
 arch/arc/configs/axs103_defconfig     |    1 -
 arch/arc/configs/axs103_smp_defconfig |    1 -
 3 files changed, 3 deletions(-)

--- a/arch/arc/configs/axs101_defconfig
+++ b/arch/arc/configs/axs101_defconfig
@@ -1,5 +1,4 @@
 CONFIG_DEFAULT_HOSTNAME="ARCLinux"
-# CONFIG_SWAP is not set
 CONFIG_SYSVIPC=y
 CONFIG_POSIX_MQUEUE=y
 # CONFIG_CROSS_MEMORY_ATTACH is not set
--- a/arch/arc/configs/axs103_defconfig
+++ b/arch/arc/configs/axs103_defconfig
@@ -1,5 +1,4 @@
 CONFIG_DEFAULT_HOSTNAME="ARCLinux"
-# CONFIG_SWAP is not set
 CONFIG_SYSVIPC=y
 CONFIG_POSIX_MQUEUE=y
 # CONFIG_CROSS_MEMORY_ATTACH is not set
--- a/arch/arc/configs/axs103_smp_defconfig
+++ b/arch/arc/configs/axs103_smp_defconfig
@@ -1,5 +1,4 @@
 CONFIG_DEFAULT_HOSTNAME="ARCLinux"
-# CONFIG_SWAP is not set
 CONFIG_SYSVIPC=y
 CONFIG_POSIX_MQUEUE=y
 # CONFIG_CROSS_MEMORY_ATTACH is not set




* [PATCH 4.14 022/126] misc: mic: SCIF Fix scif_get_new_port() error handling
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (20 preceding siblings ...)
  2018-09-17 22:41 ` [PATCH 4.14 021/126] ARC: [plat-axs*]: Enable SWAP Greg Kroah-Hartman
@ 2018-09-17 22:41 ` Greg Kroah-Hartman
  2018-09-17 22:41 ` [PATCH 4.14 023/126] ethtool: Remove trailing semicolon for static inline Greg Kroah-Hartman
                   ` (106 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:41 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Dan Carpenter, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Dan Carpenter <dan.carpenter@oracle.com>

[ Upstream commit a39284ae9d2ad09975c8ae33f1bd0f05fbfbf6ee ]

There are only 2 callers of scif_get_new_port() and both appear to get
the error handling wrong.  Both treat zero returns as error, but it
actually returns negative error codes and >= 0 on success.
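
To illustrate the bug class, here is a minimal user-space sketch; it is
not the SCIF code. get_new_port() is a hypothetical stand-in that
returns a negative errno or a port number >= 0, and the port value 1088
is made up. Storing the result straight into a 16-bit port variable
turns a negative errno into a large nonzero "port", so a zero check
never catches the failure.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for scif_get_new_port(): returns a negative
 * errno on failure, or a port number >= 0 on success. */
static int get_new_port(int fail)
{
	return fail ? -ENOSPC : 1088;
}

/* Old pattern: the int result is narrowed to 16 bits first, so a
 * negative errno becomes a nonzero value and the !pn test never fires. */
static int bind_old(int fail, uint16_t *port)
{
	uint16_t pn = (uint16_t)get_new_port(fail);

	if (!pn)
		return -ENOSPC;
	*port = pn;
	return 0;
}

/* Fixed pattern: keep the result in an int, test for < 0, and only
 * then narrow it to the port type. */
static int bind_new(int fail, uint16_t *port)
{
	int ret = get_new_port(fail);

	if (ret < 0)
		return ret;
	*port = (uint16_t)ret;
	return 0;
}
```

With a failing allocator, bind_old() reports success and hands back a
bogus port, while bind_new() propagates the error code unchanged.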

Fixes: e9089f43c9a7 ("misc: mic: SCIF open close bind and listen APIs")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/misc/mic/scif/scif_api.c |   20 +++++++++-----------
 1 file changed, 9 insertions(+), 11 deletions(-)

--- a/drivers/misc/mic/scif/scif_api.c
+++ b/drivers/misc/mic/scif/scif_api.c
@@ -370,11 +370,10 @@ int scif_bind(scif_epd_t epd, u16 pn)
 			goto scif_bind_exit;
 		}
 	} else {
-		pn = scif_get_new_port();
-		if (!pn) {
-			ret = -ENOSPC;
+		ret = scif_get_new_port();
+		if (ret < 0)
 			goto scif_bind_exit;
-		}
+		pn = ret;
 	}
 
 	ep->state = SCIFEP_BOUND;
@@ -648,13 +647,12 @@ int __scif_connect(scif_epd_t epd, struc
 			err = -EISCONN;
 		break;
 	case SCIFEP_UNBOUND:
-		ep->port.port = scif_get_new_port();
-		if (!ep->port.port) {
-			err = -ENOSPC;
-		} else {
-			ep->port.node = scif_info.nodeid;
-			ep->conn_async_state = ASYNC_CONN_IDLE;
-		}
+		err = scif_get_new_port();
+		if (err < 0)
+			break;
+		ep->port.port = err;
+		ep->port.node = scif_info.nodeid;
+		ep->conn_async_state = ASYNC_CONN_IDLE;
 		/* Fall through */
 	case SCIFEP_BOUND:
 		/*




* [PATCH 4.14 023/126] ethtool: Remove trailing semicolon for static inline
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (21 preceding siblings ...)
  2018-09-17 22:41 ` [PATCH 4.14 022/126] misc: mic: SCIF Fix scif_get_new_port() error handling Greg Kroah-Hartman
@ 2018-09-17 22:41 ` Greg Kroah-Hartman
  2018-09-17 22:41 ` [PATCH 4.14 024/126] i2c: aspeed: Add an explicit type casting for *get_clk_reg_val Greg Kroah-Hartman
                   ` (105 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Florian Fainelli, David S. Miller,
	Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Florian Fainelli <f.fainelli@gmail.com>

[ Upstream commit d89d41556141a527030a15233135ba622ba3350d ]

Android's header sanitization tool chokes on static inline functions having a
trailing semicolon, leading to an incorrectly parsed header file. While the
tool should obviously be fixed, also fix the header files for the two affected
functions: ethtool_get_flow_spec_ring() and ethtool_get_flow_spec_ring_vf().

Fixes: 8cf6f497de40 ("ethtool: Add helper routines to pass vf to rx_flow_spec")
Reported-by: Blair Prescott <blair.prescott@broadcom.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 include/uapi/linux/ethtool.h |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/include/uapi/linux/ethtool.h
+++ b/include/uapi/linux/ethtool.h
@@ -898,13 +898,13 @@ struct ethtool_rx_flow_spec {
 static inline __u64 ethtool_get_flow_spec_ring(__u64 ring_cookie)
 {
 	return ETHTOOL_RX_FLOW_SPEC_RING & ring_cookie;
-};
+}
 
 static inline __u64 ethtool_get_flow_spec_ring_vf(__u64 ring_cookie)
 {
 	return (ETHTOOL_RX_FLOW_SPEC_RING_VF & ring_cookie) >>
 				ETHTOOL_RX_FLOW_SPEC_RING_VF_OFF;
-};
+}
 
 /**
  * struct ethtool_rxnfc - command to get or set RX flow classification rules




* [PATCH 4.14 024/126] i2c: aspeed: Add an explicit type casting for *get_clk_reg_val
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (22 preceding siblings ...)
  2018-09-17 22:41 ` [PATCH 4.14 023/126] ethtool: Remove trailing semicolon for static inline Greg Kroah-Hartman
@ 2018-09-17 22:41 ` Greg Kroah-Hartman
  2018-09-17 22:41 ` [PATCH 4.14 025/126] Bluetooth: h5: Fix missing dependency on BT_HCIUART_SERDEV Greg Kroah-Hartman
                   ` (104 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Wolfram Sang, Jae Hyun Yoo,
	Brendan Higgins, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>

[ Upstream commit 5799c4b2f1dbc0166d9b1d94443deaafc6e7a070 ]

This commit fixes this sparse warning:
drivers/i2c/busses/i2c-aspeed.c:875:38: warning: incorrect type in assignment (different modifiers)
drivers/i2c/busses/i2c-aspeed.c:875:38:    expected unsigned int ( *get_clk_reg_val )( ... )
drivers/i2c/busses/i2c-aspeed.c:875:38:    got void const *const data

Reported-by: Wolfram Sang <wsa@the-dreams.de>
Signed-off-by: Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
Reviewed-by: Brendan Higgins <brendanhiggins@google.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/i2c/busses/i2c-aspeed.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/i2c/busses/i2c-aspeed.c
+++ b/drivers/i2c/busses/i2c-aspeed.c
@@ -859,7 +859,7 @@ static int aspeed_i2c_probe_bus(struct p
 	if (!match)
 		bus->get_clk_reg_val = aspeed_i2c_24xx_get_clk_reg_val;
 	else
-		bus->get_clk_reg_val = match->data;
+		bus->get_clk_reg_val = (u32 (*)(u32))match->data;
 
 	/* Initialize the I2C adapter */
 	spin_lock_init(&bus->lock);




* [PATCH 4.14 025/126] Bluetooth: h5: Fix missing dependency on BT_HCIUART_SERDEV
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (23 preceding siblings ...)
  2018-09-17 22:41 ` [PATCH 4.14 024/126] i2c: aspeed: Add an explicit type casting for *get_clk_reg_val Greg Kroah-Hartman
@ 2018-09-17 22:41 ` Greg Kroah-Hartman
  2018-09-17 22:41 ` [PATCH 4.14 026/126] gpio: tegra: Move driver registration to subsys_init level Greg Kroah-Hartman
                   ` (103 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Johan Hedberg, Marcel Holtmann, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Johan Hedberg <johan.hedberg@intel.com>

[ Upstream commit 6c3711ec64fd23a9abc8aaf59a9429569a6282df ]

This driver was recently updated to use serdev, so add the appropriate
dependency. Without this one can get compiler warnings like this if
CONFIG_SERIAL_DEV_BUS is not enabled:

  CC [M]  drivers/bluetooth/hci_h5.o
drivers/bluetooth/hci_h5.c:934:36: warning: ‘h5_serdev_driver’ defined but not used [-Wunused-variable]
 static struct serdev_device_driver h5_serdev_driver = {
                                    ^~~~~~~~~~~~~~~~

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/bluetooth/Kconfig |    1 +
 1 file changed, 1 insertion(+)

--- a/drivers/bluetooth/Kconfig
+++ b/drivers/bluetooth/Kconfig
@@ -146,6 +146,7 @@ config BT_HCIUART_LL
 config BT_HCIUART_3WIRE
 	bool "Three-wire UART (H5) protocol support"
 	depends on BT_HCIUART
+	depends on BT_HCIUART_SERDEV
 	help
 	  The HCI Three-wire UART Transport Layer makes it possible to
 	  user the Bluetooth HCI over a serial port interface. The HCI




* [PATCH 4.14 026/126] gpio: tegra: Move driver registration to subsys_init level
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (24 preceding siblings ...)
  2018-09-17 22:41 ` [PATCH 4.14 025/126] Bluetooth: h5: Fix missing dependency on BT_HCIUART_SERDEV Greg Kroah-Hartman
@ 2018-09-17 22:41 ` Greg Kroah-Hartman
  2018-09-17 22:41 ` [PATCH 4.14 027/126] powerpc/powernv: Fix concurrency issue with npu->mmio_atsd_usage Greg Kroah-Hartman
                   ` (102 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Dmitry Osipenko, Stefan Agner,
	Linus Walleij, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Dmitry Osipenko <digetx@gmail.com>

[ Upstream commit 40b25bce0adbe641a744d1291bc0e51fb7f3c3d8 ]

There is a bug in the driver core's handling of deferred probing that
causes the GPIO driver to suspend after its users. The bug appears when
the GPIO driver's probe is deferred, which happens once the GPIO driver
gains a dependency on the PINCTRL driver through a "gpio-ranges"
property in the device tree. The driver-core bug is old (more than 4
years now) and well known; unfortunately there is no easy fix for it.
The good news is that we can work around the deferred-probe issue by
changing the GPIO / PINCTRL driver registration order: PINCTRL driver
registration moves to the arch_init level and GPIO to the subsys_init
level.

Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Acked-by: Stefan Agner <stefan@agner.ch>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/gpio/gpio-tegra.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/gpio/gpio-tegra.c
+++ b/drivers/gpio/gpio-tegra.c
@@ -728,4 +728,4 @@ static int __init tegra_gpio_init(void)
 {
 	return platform_driver_register(&tegra_gpio_driver);
 }
-postcore_initcall(tegra_gpio_init);
+subsys_initcall(tegra_gpio_init);




* [PATCH 4.14 027/126] powerpc/powernv: Fix concurrency issue with npu->mmio_atsd_usage
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (25 preceding siblings ...)
  2018-09-17 22:41 ` [PATCH 4.14 026/126] gpio: tegra: Move driver registration to subsys_init level Greg Kroah-Hartman
@ 2018-09-17 22:41 ` Greg Kroah-Hartman
  2018-09-17 22:41 ` [PATCH 4.14 028/126] selftests/bpf: fix a typo in map in map test Greg Kroah-Hartman
                   ` (101 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Reza Arbab, Alistair Popple,
	Michael Ellerman, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Reza Arbab <arbab@linux.ibm.com>

[ Upstream commit 9eab9901b015f489199105c470de1ffc337cfabb ]

We've encountered a performance issue when multiple processors stress
{get,put}_mmio_atsd_reg(). These functions contend for
mmio_atsd_usage, an unsigned long used as a bitmask.

The accesses to mmio_atsd_usage are done using test_and_set_bit_lock()
and clear_bit_unlock(). As implemented, both of these will require
a (successful) stwcx to that same cache line.

What we end up with is thread A, attempting to unlock, being slowed by
other threads repeatedly attempting to lock. A's stwcx instructions
fail and retry because the memory reservation is lost every time a
different thread beats it to the punch.

There may be a long-term way to fix this at a larger scale, but for
now resolve the immediate problem by gating our call to
test_and_set_bit_lock() with one to test_bit(), which is obviously
implemented without using a store.
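
The gating described above is the classic "test, then test-and-set"
pattern. The following is a user-space sketch using C11 atomics, not
the kernel bitops; slot_busy(), slot_test_and_set(), get_slot() and
put_slot() are hypothetical names chosen for illustration. The plain
load lets contending threads skip busy slots without issuing the
read-modify-write that would steal the releasing thread's reservation.

```c
#include <limits.h>
#include <stdatomic.h>

#define SLOT_BITS ((int)(sizeof(unsigned long) * CHAR_BIT))

/* Plain load of bit i: no store is issued for a slot that is busy. */
static int slot_busy(atomic_ulong *usage, int i)
{
	unsigned long v = atomic_load_explicit(usage, memory_order_relaxed);

	return (int)((v >> i) & 1UL);
}

/* Atomic read-modify-write; returns the previous value of bit i. */
static int slot_test_and_set(atomic_ulong *usage, int i)
{
	unsigned long mask = 1UL << i;
	unsigned long old;

	old = atomic_fetch_or_explicit(usage, mask, memory_order_acquire);
	return (old & mask) != 0;
}

/* Claim the first free slot among the first 'count' bits, or return -1. */
static int get_slot(atomic_ulong *usage, int count)
{
	int i;

	for (i = 0; i < count && i < SLOT_BITS; i++) {
		if (!slot_busy(usage, i))
			if (!slot_test_and_set(usage, i))
				return i;
	}
	return -1;
}

/* Release slot i. */
static void put_slot(atomic_ulong *usage, int i)
{
	atomic_fetch_and_explicit(usage, ~(1UL << i), memory_order_release);
}
```

The second check is still required: another thread may claim the slot
between the plain load and the atomic operation.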

Fixes: 1ab66d1fbada ("powerpc/powernv: Introduce address translation services for Nvlink2")
Signed-off-by: Reza Arbab <arbab@linux.ibm.com>
Acked-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/powerpc/platforms/powernv/npu-dma.c |    5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

--- a/arch/powerpc/platforms/powernv/npu-dma.c
+++ b/arch/powerpc/platforms/powernv/npu-dma.c
@@ -427,8 +427,9 @@ static int get_mmio_atsd_reg(struct npu
 	int i;
 
 	for (i = 0; i < npu->mmio_atsd_count; i++) {
-		if (!test_and_set_bit_lock(i, &npu->mmio_atsd_usage))
-			return i;
+		if (!test_bit(i, &npu->mmio_atsd_usage))
+			if (!test_and_set_bit_lock(i, &npu->mmio_atsd_usage))
+				return i;
 	}
 
 	return -ENOSPC;




* [PATCH 4.14 028/126] selftests/bpf: fix a typo in map in map test
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (26 preceding siblings ...)
  2018-09-17 22:41 ` [PATCH 4.14 027/126] powerpc/powernv: Fix concurrency issue with npu->mmio_atsd_usage Greg Kroah-Hartman
@ 2018-09-17 22:41 ` Greg Kroah-Hartman
  2018-09-17 22:41 ` [PATCH 4.14 029/126] media: davinci: vpif_display: Fix memory leak on probe error path Greg Kroah-Hartman
                   ` (100 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Roman Gushchin, Martin KaFai Lau,
	Arthur Fabre, Daniel Borkmann, Alexei Starovoitov, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Roman Gushchin <guro@fb.com>

[ Upstream commit 0069fb854364da79fd99236ea620affc8e1152d5 ]

Commit fbeb1603bf4e ("bpf: verifier: MOV64 don't mark dst reg unbounded")
revealed a typo in commit fb30d4b71214 ("bpf: Add tests for map-in-map"):
BPF_MOV64_REG(BPF_REG_0, 0) was used instead of
BPF_MOV64_IMM(BPF_REG_0, 0).
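
Why the typo matters can be shown with a small stand-alone sketch. The
struct and macros below are simplified copies written for illustration
(the real definitions live in the kernel headers and use bitfields for
the register numbers), and exec_mov64() is a hypothetical one-opcode
interpreter. With a second argument of 0, the REG form names register
0 as the source, so it copies R0 onto itself instead of loading zero.

```c
#include <stdint.h>

/* Simplified eBPF instruction layout, for illustration only. */
struct insn {
	uint8_t code;
	uint8_t dst_reg;
	uint8_t src_reg;
	int16_t off;
	int32_t imm;
};

#define BPF_ALU64 0x07
#define BPF_MOV   0xb0
#define BPF_K     0x00	/* source operand is the immediate */
#define BPF_X     0x08	/* source operand is a register */
#define BPF_REG_0 0

#define MOV64_REG(DST, SRC) \
	((struct insn){ .code = BPF_ALU64 | BPF_MOV | BPF_X, \
			.dst_reg = (DST), .src_reg = (SRC) })
#define MOV64_IMM(DST, IMM) \
	((struct insn){ .code = BPF_ALU64 | BPF_MOV | BPF_K, \
			.dst_reg = (DST), .imm = (IMM) })

/* Execute a single 64-bit MOV against a small register file. */
static void exec_mov64(const struct insn *in, int64_t regs[11])
{
	if (in->code == (BPF_ALU64 | BPF_MOV | BPF_X))
		regs[in->dst_reg] = regs[in->src_reg];
	else if (in->code == (BPF_ALU64 | BPF_MOV | BPF_K))
		regs[in->dst_reg] = in->imm;
}
```

In the tests, R0 holds the return value of the map lookup at that
point, so the REG form leaves that value in R0 rather than returning 0.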

I've noticed the problem by running bpf kselftests.

Fixes: fb30d4b71214 ("bpf: Add tests for map-in-map")
Signed-off-by: Roman Gushchin <guro@fb.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Arthur Fabre <afabre@cloudflare.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 tools/testing/selftests/bpf/test_verifier.c |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

--- a/tools/testing/selftests/bpf/test_verifier.c
+++ b/tools/testing/selftests/bpf/test_verifier.c
@@ -5895,7 +5895,7 @@ static struct bpf_test tests[] = {
 			BPF_MOV64_REG(BPF_REG_1, BPF_REG_0),
 			BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0,
 				     BPF_FUNC_map_lookup_elem),
-			BPF_MOV64_REG(BPF_REG_0, 0),
+			BPF_MOV64_IMM(BPF_REG_0, 0),
 			BPF_EXIT_INSN(),
 		},
 		.fixup_map_in_map = { 3 },
@@ -5918,7 +5918,7 @@ static struct bpf_test tests[] = {
 			BPF_ALU64_IMM(BPF_ADD, BPF_REG_1, 8),
 			BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0,
 				     BPF_FUNC_map_lookup_elem),
-			BPF_MOV64_REG(BPF_REG_0, 0),
+			BPF_MOV64_IMM(BPF_REG_0, 0),
 			BPF_EXIT_INSN(),
 		},
 		.fixup_map_in_map = { 3 },
@@ -5941,7 +5941,7 @@ static struct bpf_test tests[] = {
 			BPF_MOV64_REG(BPF_REG_1, BPF_REG_0),
 			BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0,
 				     BPF_FUNC_map_lookup_elem),
-			BPF_MOV64_REG(BPF_REG_0, 0),
+			BPF_MOV64_IMM(BPF_REG_0, 0),
 			BPF_EXIT_INSN(),
 		},
 		.fixup_map_in_map = { 3 },




* [PATCH 4.14 029/126] media: davinci: vpif_display: Fix memory leak on probe error path
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (27 preceding siblings ...)
  2018-09-17 22:41 ` [PATCH 4.14 028/126] selftests/bpf: fix a typo in map in map test Greg Kroah-Hartman
@ 2018-09-17 22:41 ` Greg Kroah-Hartman
  2018-09-17 22:41 ` [PATCH 4.14 030/126] media: dw2102: Fix memleak on sequence of probes Greg Kroah-Hartman
                   ` (99 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Anton Vasilyev,
	Mauro Carvalho Chehab, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Anton Vasilyev <vasilyev@ispras.ru>

[ Upstream commit 61e641f36ed81ae473177c085f0bfd83ad3b55ed ]

If vpif_probe() fails in v4l2_device_register(), the memory allocated
by initialize_vpif() for the global vpif_obj.dev[i] is never released.

The patch adds deallocation of vpif_obj.dev[i] on the error path and
removes the duplicated check for platform_data presence.
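
The shape of the fix is the usual goto-based unwind. Below is a minimal
user-space sketch of that shape, not the driver code: init_objs(),
free_objs() and probe() are hypothetical stand-ins, and the error
values are arbitrary.

```c
#include <stdlib.h>

#define MAX_DEVICES 2

static void *devs[MAX_DEVICES];

/* Hypothetical counterpart of initialize_vpif(): allocate every slot,
 * undoing partial work if an allocation fails. */
static int init_objs(void)
{
	int i;

	for (i = 0; i < MAX_DEVICES; i++) {
		devs[i] = calloc(1, 64);
		if (!devs[i]) {
			while (i--) {
				free(devs[i]);
				devs[i] = NULL;
			}
			return -1;
		}
	}
	return 0;
}

/* Single helper shared by the error path and the remove path. */
static void free_objs(void)
{
	int i;

	for (i = 0; i < MAX_DEVICES; i++) {
		free(devs[i]);
		devs[i] = NULL;
	}
}

static int probe(int register_fails)
{
	int err = init_objs();

	if (err)
		return err;

	/* Stand-in for a later registration step that may fail. */
	err = register_fails ? -2 : 0;
	if (err)
		goto out_free;

	return 0;

out_free:
	free_objs();
	return err;
}
```

Returning directly from the failed registration step, as the old code
did, would skip free_objs() and leak every slot.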

Found by Linux Driver Verification project (linuxtesting.org).

Signed-off-by: Anton Vasilyev <vasilyev@ispras.ru>
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/media/platform/davinci/vpif_display.c |   24 ++++++++++++++++--------
 1 file changed, 16 insertions(+), 8 deletions(-)

--- a/drivers/media/platform/davinci/vpif_display.c
+++ b/drivers/media/platform/davinci/vpif_display.c
@@ -1114,6 +1114,14 @@ vpif_init_free_channel_objects:
 	return err;
 }
 
+static void free_vpif_objs(void)
+{
+	int i;
+
+	for (i = 0; i < VPIF_DISPLAY_MAX_DEVICES; i++)
+		kfree(vpif_obj.dev[i]);
+}
+
 static int vpif_async_bound(struct v4l2_async_notifier *notifier,
 			    struct v4l2_subdev *subdev,
 			    struct v4l2_async_subdev *asd)
@@ -1250,11 +1258,6 @@ static __init int vpif_probe(struct plat
 		return -EINVAL;
 	}
 
-	if (!pdev->dev.platform_data) {
-		dev_warn(&pdev->dev, "Missing platform data.  Giving up.\n");
-		return -EINVAL;
-	}
-
 	vpif_dev = &pdev->dev;
 	err = initialize_vpif();
 
@@ -1266,7 +1269,7 @@ static __init int vpif_probe(struct plat
 	err = v4l2_device_register(vpif_dev, &vpif_obj.v4l2_dev);
 	if (err) {
 		v4l2_err(vpif_dev->driver, "Error registering v4l2 device\n");
-		return err;
+		goto vpif_free;
 	}
 
 	while ((res = platform_get_resource(pdev, IORESOURCE_IRQ, res_idx))) {
@@ -1309,7 +1312,10 @@ static __init int vpif_probe(struct plat
 			if (vpif_obj.sd[i])
 				vpif_obj.sd[i]->grp_id = 1 << i;
 		}
-		vpif_probe_complete();
+		err = vpif_probe_complete();
+		if (err) {
+			goto probe_subdev_out;
+		}
 	} else {
 		vpif_obj.notifier.subdevs = vpif_obj.config->asd;
 		vpif_obj.notifier.num_subdevs = vpif_obj.config->asd_sizes[0];
@@ -1330,6 +1336,8 @@ probe_subdev_out:
 	kfree(vpif_obj.sd);
 vpif_unregister:
 	v4l2_device_unregister(&vpif_obj.v4l2_dev);
+vpif_free:
+	free_vpif_objs();
 
 	return err;
 }
@@ -1351,8 +1359,8 @@ static int vpif_remove(struct platform_d
 		ch = vpif_obj.dev[i];
 		/* Unregister video device */
 		video_unregister_device(&ch->video_dev);
-		kfree(vpif_obj.dev[i]);
 	}
+	free_vpif_objs();
 
 	return 0;
 }




* [PATCH 4.14 030/126] media: dw2102: Fix memleak on sequence of probes
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (28 preceding siblings ...)
  2018-09-17 22:41 ` [PATCH 4.14 029/126] media: davinci: vpif_display: Fix memory leak on probe error path Greg Kroah-Hartman
@ 2018-09-17 22:41 ` Greg Kroah-Hartman
  2018-09-17 22:41 ` [PATCH 4.14 031/126] net: phy: Fix the register offsets in Broadcom iProc mdio mux driver Greg Kroah-Hartman
                   ` (98 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Anton Vasilyev,
	Mauro Carvalho Chehab, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Anton Vasilyev <vasilyev@ispras.ru>

[ Upstream commit 299c7007e93645067e1d2743f4e50156de78c4ff ]

Each call to dw2102_probe() allocates memory with kmemdup() for the
structures p1100, s660, p7500 and s421, but never frees them.
dvb_usb_device_init() copies the corresponding structure into
dvb_usb_device->props, so the original structure is no longer used
after dvb_usb_device_init() returns.

The patch moves the structure pointers from global scope to local
scope and adds their deallocation.

Found by Linux Driver Verification project (linuxtesting.org).

Signed-off-by: Anton Vasilyev <vasilyev@ispras.ru>
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/media/usb/dvb-usb/dw2102.c |   19 ++++++++++++++-----
 1 file changed, 14 insertions(+), 5 deletions(-)

--- a/drivers/media/usb/dvb-usb/dw2102.c
+++ b/drivers/media/usb/dvb-usb/dw2102.c
@@ -2103,14 +2103,12 @@ static struct dvb_usb_device_properties
 	}
 };
 
-static struct dvb_usb_device_properties *p1100;
 static const struct dvb_usb_device_description d1100 = {
 	"Prof 1100 USB ",
 	{&dw2102_table[PROF_1100], NULL},
 	{NULL},
 };
 
-static struct dvb_usb_device_properties *s660;
 static const struct dvb_usb_device_description d660 = {
 	"TeVii S660 USB",
 	{&dw2102_table[TEVII_S660], NULL},
@@ -2129,14 +2127,12 @@ static const struct dvb_usb_device_descr
 	{NULL},
 };
 
-static struct dvb_usb_device_properties *p7500;
 static const struct dvb_usb_device_description d7500 = {
 	"Prof 7500 USB DVB-S2",
 	{&dw2102_table[PROF_7500], NULL},
 	{NULL},
 };
 
-static struct dvb_usb_device_properties *s421;
 static const struct dvb_usb_device_description d421 = {
 	"TeVii S421 PCI",
 	{&dw2102_table[TEVII_S421], NULL},
@@ -2336,6 +2332,11 @@ static int dw2102_probe(struct usb_inter
 		const struct usb_device_id *id)
 {
 	int retval = -ENOMEM;
+	struct dvb_usb_device_properties *p1100;
+	struct dvb_usb_device_properties *s660;
+	struct dvb_usb_device_properties *p7500;
+	struct dvb_usb_device_properties *s421;
+
 	p1100 = kmemdup(&s6x0_properties,
 			sizeof(struct dvb_usb_device_properties), GFP_KERNEL);
 	if (!p1100)
@@ -2404,8 +2405,16 @@ static int dw2102_probe(struct usb_inter
 	    0 == dvb_usb_device_init(intf, &t220_properties,
 			 THIS_MODULE, NULL, adapter_nr) ||
 	    0 == dvb_usb_device_init(intf, &tt_s2_4600_properties,
-			 THIS_MODULE, NULL, adapter_nr))
+			 THIS_MODULE, NULL, adapter_nr)) {
+
+		/* clean up copied properties */
+		kfree(s421);
+		kfree(p7500);
+		kfree(s660);
+		kfree(p1100);
+
 		return 0;
+	}
 
 	retval = -ENODEV;
 	kfree(s421);




* [PATCH 4.14 031/126] net: phy: Fix the register offsets in Broadcom iProc mdio mux driver
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (29 preceding siblings ...)
  2018-09-17 22:41 ` [PATCH 4.14 030/126] media: dw2102: Fix memleak on sequence of probes Greg Kroah-Hartman
@ 2018-09-17 22:41 ` Greg Kroah-Hartman
  2018-09-17 22:41 ` [PATCH 4.14 032/126] blk-mq: fix updating tags depth Greg Kroah-Hartman
                   ` (97 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Arun Parameswaran, Andrew Lunn,
	Florian Fainelli, David S. Miller, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Arun Parameswaran <arun.parameswaran@broadcom.com>

[ Upstream commit 77fefa93bfebe4df44f154f2aa5938e32630d0bf ]

Modify the register offsets in the Broadcom iProc mdio mux to start
from the top of the register address space.

Earlier, the base address pointed to the end of the block's register
space. The base address will now point to the start of the mdio's
address space. The offsets have been fixed to match this.
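
The backward-compatibility arithmetic can be checked in isolation. The
sketch below is stand-alone user-space code; struct mem_range and
fixup_base() are hypothetical names, and any concrete address fed to it
is an assumption for illustration. A base that carries an offset within
its 4 KiB page is rounded down, and the inclusive end is recomputed
from the new register space size.

```c
#include <stdint.h>

#define MDIO_REG_ADDR_SPACE_SIZE 0x250

struct mem_range {
	uint64_t start;
	uint64_t end;	/* inclusive, like a kernel resource */
};

/* Returns 1 if the range was adjusted, 0 if it was already aligned. */
static int fixup_base(struct mem_range *res)
{
	if (!(res->start & 0xfff))
		return 0;

	/* Drop the offset within the page, then size the block anew. */
	res->start &= ~(uint64_t)0xfff;
	res->end = res->start + MDIO_REG_ADDR_SPACE_SIZE - 1;
	return 1;
}
```

For an old-style base ending in 0x23c, the adjusted start plus the new
MDIO_PARAM_OFFSET of 0x23c addresses the same register that the old
base plus 0x00 did.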

Signed-off-by: Arun Parameswaran <arun.parameswaran@broadcom.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/net/phy/mdio-mux-bcm-iproc.c |   20 +++++++++++++++-----
 1 file changed, 15 insertions(+), 5 deletions(-)

--- a/drivers/net/phy/mdio-mux-bcm-iproc.c
+++ b/drivers/net/phy/mdio-mux-bcm-iproc.c
@@ -22,7 +22,7 @@
 #include <linux/mdio-mux.h>
 #include <linux/delay.h>
 
-#define MDIO_PARAM_OFFSET		0x00
+#define MDIO_PARAM_OFFSET		0x23c
 #define MDIO_PARAM_MIIM_CYCLE		29
 #define MDIO_PARAM_INTERNAL_SEL		25
 #define MDIO_PARAM_BUS_ID		22
@@ -30,20 +30,22 @@
 #define MDIO_PARAM_PHY_ID		16
 #define MDIO_PARAM_PHY_DATA		0
 
-#define MDIO_READ_OFFSET		0x04
+#define MDIO_READ_OFFSET		0x240
 #define MDIO_READ_DATA_MASK		0xffff
-#define MDIO_ADDR_OFFSET		0x08
+#define MDIO_ADDR_OFFSET		0x244
 
-#define MDIO_CTRL_OFFSET		0x0C
+#define MDIO_CTRL_OFFSET		0x248
 #define MDIO_CTRL_WRITE_OP		0x1
 #define MDIO_CTRL_READ_OP		0x2
 
-#define MDIO_STAT_OFFSET		0x10
+#define MDIO_STAT_OFFSET		0x24c
 #define MDIO_STAT_DONE			1
 
 #define BUS_MAX_ADDR			32
 #define EXT_BUS_START_ADDR		16
 
+#define MDIO_REG_ADDR_SPACE_SIZE	0x250
+
 struct iproc_mdiomux_desc {
 	void *mux_handle;
 	void __iomem *base;
@@ -169,6 +171,14 @@ static int mdio_mux_iproc_probe(struct p
 	md->dev = &pdev->dev;
 
 	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+	if (res->start & 0xfff) {
+		/* For backward compatibility in case the
+		 * base address is specified with an offset.
+		 */
+		dev_info(&pdev->dev, "fix base address in dt-blob\n");
+		res->start &= ~0xfff;
+		res->end = res->start + MDIO_REG_ADDR_SPACE_SIZE - 1;
+	}
 	md->base = devm_ioremap_resource(&pdev->dev, res);
 	if (IS_ERR(md->base)) {
 		dev_err(&pdev->dev, "failed to ioremap register\n");




* [PATCH 4.14 032/126] blk-mq: fix updating tags depth
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (30 preceding siblings ...)
  2018-09-17 22:41 ` [PATCH 4.14 031/126] net: phy: Fix the register offsets in Broadcom iProc mdio mux driver Greg Kroah-Hartman
@ 2018-09-17 22:41 ` Greg Kroah-Hartman
  2018-09-17 22:41 ` [PATCH 4.14 033/126] scsi: target: fix __transport_register_session locking Greg Kroah-Hartman
                   ` (96 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Ewan D. Milne, Christoph Hellwig,
	Bart Van Assche, Omar Sandoval, Ming Lei, Jens Axboe,
	Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Ming Lei <ming.lei@redhat.com>

[ Upstream commit 75d6e175fc511e95ae3eb8f708680133bc211ed3 ]

The 'nr' passed from userspace represents the total depth. Meanwhile,
inside 'struct blk_mq_tags', 'nr_tags' stores the total tag depth and
'nr_reserved_tags' stores the reserved part.

There are two issues in blk_mq_tag_update_depth() now:

1) when growing tags, we should have used the passed 'nr' and kept the
number of reserved tags unchanged.

2) the passed 'nr', rather than the depth of the normal part, should
have been used for checking against 'tags->nr_tags'.

This patch fixes both cases and avoids a kernel crash caused by
resizing the sbitmap queue to the wrong depth.
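
The depth arithmetic can be sketched in user space as follows. This is a simplified model under stated assumptions, not the kernel code: `struct tags_model` and `update_depth()` are hypothetical names, the growing case is reduced to updating a counter instead of allocating a new tag map, and the can-grow and upper-limit checks are left out.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical, simplified model of the fields the patch deals with. */
struct tags_model {
	unsigned int nr_tags;          /* total depth, reserved tags included */
	unsigned int nr_reserved_tags; /* reserved part, never resized */
	unsigned int bitmap_depth;     /* depth of the normal (non-reserved) bitmap */
};

/*
 * 'nr' is the total depth requested by userspace. It is compared against
 * the total 'nr_tags'; only the normal bitmap is resized, to the total
 * minus the unchanged reserved part.
 */
static int update_depth(struct tags_model *t, unsigned int nr)
{
	assert(t);

	if (nr <= t->nr_reserved_tags)
		return -EINVAL;

	/* Growing: the new map is sized by the total, reserved part kept. */
	if (nr > t->nr_tags)
		t->nr_tags = nr;

	t->bitmap_depth = nr - t->nr_reserved_tags;
	return 0;
}
```

The point of the model is that the reserved count is subtracted exactly once, at the moment the normal bitmap is resized, and never before the comparison against the total.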

Cc: "Ewan D. Milne" <emilne@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Omar Sandoval <osandov@fb.com>
Tested by: Marco Patalano <mpatalan@redhat.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 block/blk-mq-tag.c |    8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

--- a/block/blk-mq-tag.c
+++ b/block/blk-mq-tag.c
@@ -416,8 +416,6 @@ int blk_mq_tag_update_depth(struct blk_m
 	if (tdepth <= tags->nr_reserved_tags)
 		return -EINVAL;
 
-	tdepth -= tags->nr_reserved_tags;
-
 	/*
 	 * If we are allowed to grow beyond the original size, allocate
 	 * a new set of tags before freeing the old one.
@@ -437,7 +435,8 @@ int blk_mq_tag_update_depth(struct blk_m
 		if (tdepth > 16 * BLKDEV_MAX_RQ)
 			return -EINVAL;
 
-		new = blk_mq_alloc_rq_map(set, hctx->queue_num, tdepth, 0);
+		new = blk_mq_alloc_rq_map(set, hctx->queue_num, tdepth,
+				tags->nr_reserved_tags);
 		if (!new)
 			return -ENOMEM;
 		ret = blk_mq_alloc_rqs(set, new, hctx->queue_num, tdepth);
@@ -454,7 +453,8 @@ int blk_mq_tag_update_depth(struct blk_m
 		 * Don't need (or can't) update reserved tags here, they
 		 * remain static and should never need resizing.
 		 */
-		sbitmap_queue_resize(&tags->bitmap_tags, tdepth);
+		sbitmap_queue_resize(&tags->bitmap_tags,
+				tdepth - tags->nr_reserved_tags);
 	}
 
 	return 0;




* [PATCH 4.14 033/126] scsi: target: fix __transport_register_session locking
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (31 preceding siblings ...)
  2018-09-17 22:41 ` [PATCH 4.14 032/126] blk-mq: fix updating tags depth Greg Kroah-Hartman
@ 2018-09-17 22:41 ` Greg Kroah-Hartman
  2018-09-17 22:41 ` [PATCH 4.14 034/126] md/raid5: fix data corruption of replacements after originals dropped Greg Kroah-Hartman
                   ` (95 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Mike Christie, Bart Van Assche,
	Christoph Hellwig, Martin K. Petersen, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Mike Christie <mchristi@redhat.com>

[ Upstream commit 6a64f6e1591322beb8ce16e952a53582caf2a15c ]

When __transport_register_session is called from transport_register_session,
irqs will already have been disabled, so we do not want the unlock irq call
to enable them until the higher level has done the final
spin_unlock_irqrestore/spin_unlock_irq.

Make __transport_register_session use the save/restore calls instead.
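
To illustrate why the nested unlock matters, here is a toy user-space model in which a single flag stands in for the CPU's interrupt-enable state. All `model_*` and `inner_*` names are hypothetical; nothing here is the kernel's spinlock API, and no real locking takes place.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model: one flag standing in for the interrupt-enable state. */
static bool irqs_enabled = true;

/* Model of the _irq variants: disable and enable unconditionally. */
static void model_lock_irq(void)
{
	irqs_enabled = false;
}

static void model_unlock_irq(void)
{
	irqs_enabled = true;
}

/* Model of the _irqsave/_irqrestore variants: remember the prior state. */
static unsigned long model_lock_irqsave(void)
{
	unsigned long flags = irqs_enabled ? 1 : 0;

	irqs_enabled = false;
	return flags;
}

static void model_unlock_irqrestore(unsigned long flags)
{
	assert(flags == 0 || flags == 1);
	irqs_enabled = (flags == 1);
}

/*
 * Inner helpers, meant to be called with interrupts already disabled.
 * Each returns the interrupt state observed right after its unlock.
 */
static bool inner_with_irq(void)
{
	model_lock_irq();
	model_unlock_irq();
	return irqs_enabled;
}

static bool inner_with_irqsave(void)
{
	unsigned long flags = model_lock_irqsave();

	model_unlock_irqrestore(flags);
	return irqs_enabled;
}
```

In this model, calling `inner_with_irq()` while the caller still expects interrupts to be off leaves them on, whereas `inner_with_irqsave()` hands back whatever state it found.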

Signed-off-by: Mike Christie <mchristi@redhat.com>
Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/target/target_core_transport.c |    5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

--- a/drivers/target/target_core_transport.c
+++ b/drivers/target/target_core_transport.c
@@ -317,6 +317,7 @@ void __transport_register_session(
 {
 	const struct target_core_fabric_ops *tfo = se_tpg->se_tpg_tfo;
 	unsigned char buf[PR_REG_ISID_LEN];
+	unsigned long flags;
 
 	se_sess->se_tpg = se_tpg;
 	se_sess->fabric_sess_ptr = fabric_sess_ptr;
@@ -353,7 +354,7 @@ void __transport_register_session(
 			se_sess->sess_bin_isid = get_unaligned_be64(&buf[0]);
 		}
 
-		spin_lock_irq(&se_nacl->nacl_sess_lock);
+		spin_lock_irqsave(&se_nacl->nacl_sess_lock, flags);
 		/*
 		 * The se_nacl->nacl_sess pointer will be set to the
 		 * last active I_T Nexus for each struct se_node_acl.
@@ -362,7 +363,7 @@ void __transport_register_session(
 
 		list_add_tail(&se_sess->sess_acl_list,
 			      &se_nacl->acl_sess_list);
-		spin_unlock_irq(&se_nacl->nacl_sess_lock);
+		spin_unlock_irqrestore(&se_nacl->nacl_sess_lock, flags);
 	}
 	list_add_tail(&se_sess->sess_list, &se_tpg->tpg_sess_list);
 




* [PATCH 4.14 034/126] md/raid5: fix data corruption of replacements after originals dropped
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (32 preceding siblings ...)
  2018-09-17 22:41 ` [PATCH 4.14 033/126] scsi: target: fix __transport_register_session locking Greg Kroah-Hartman
@ 2018-09-17 22:41 ` Greg Kroah-Hartman
  2018-09-17 22:41 ` [PATCH 4.14 035/126] timers: Clear timer_base::must_forward_clk with timer_base::lock held Greg Kroah-Hartman
                   ` (94 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Alex Chen, Alex Wu,
	Chung-Chiang Cheng, BingJing Chang, Shaohua Li, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: BingJing Chang <bingjingc@synology.com>

[ Upstream commit d63e2fc804c46e50eee825c5d3a7228e07048b47 ]

During raid5 replacement, stripes can be marked with the R5_NeedReplace
flag. Data can then be read from the device being replaced and written
to the replacing spare without reading all the other devices. (This is
'replace' mode, s.replacing = 1.) If the device being replaced is
dropped, the replacement is interrupted and resumed in pure recovery
mode. However, stripes that already existed before the interruption can
no longer read from the dropped device. This triggers many WARN_ON
messages, and it results in data corruption because those stripes
write problematic data to the replacement device and update the
progress.

\# Erase disks (1MB + 2GB)
dd if=/dev/zero of=/dev/sda bs=1MB count=2049
dd if=/dev/zero of=/dev/sdb bs=1MB count=2049
dd if=/dev/zero of=/dev/sdc bs=1MB count=2049
dd if=/dev/zero of=/dev/sdd bs=1MB count=2049
mdadm -C /dev/md0 -amd -R -l5 -n3 -x0 /dev/sd[abc] -z 2097152
\# Ensure array stores non-zero data
dd if=/root/data_4GB.iso of=/dev/md0 bs=1MB
\# Start replacement
mdadm /dev/md0 -a /dev/sdd
mdadm /dev/md0 --replace /dev/sda

Then hot-unplug /dev/sda during recovery and wait for the recovery to finish:
echo check > /sys/block/md0/md/sync_action
cat /sys/block/md0/md/mismatch_cnt # it will be greater than 0.

Soon after /dev/sda is unplugged, many WARN_ON messages appear and the
replacement is interrupted. Once the recovery finishes, the data on the
array is corrupted.

This is simply an unhandled case of replacement. Since commit
<f94c0b6658c7> (md/raid5: fix interaction of 'replace' and 'recovery'.),
a NeedReplace device that is not UPTODATE is treated as an error, but
that commit only prints a WARN_ON and still marks these corrupted
stripes with R5_WantReplace (meaning they are ready for writes).

To fix this case, we can leverage the 'sync and replace' mode mentioned
in commit <9a3e1101b827> (md/raid5: detect and handle replacements
during recovery.) by adding logic to detect these stripes and use
'sync and replace' mode for them.

Reported-by: Alex Chen <alexchen@synology.com>
Reviewed-by: Alex Wu <alexwu@synology.com>
Reviewed-by: Chung-Chiang Cheng <cccheng@synology.com>
Signed-off-by: BingJing Chang <bingjingc@synology.com>
Signed-off-by: Shaohua Li <shli@fb.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/md/raid5.c |    6 ++++++
 1 file changed, 6 insertions(+)

--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -4516,6 +4516,12 @@ static void analyse_stripe(struct stripe
 			s->failed++;
 			if (rdev && !test_bit(Faulty, &rdev->flags))
 				do_recovery = 1;
+			else if (!rdev) {
+				rdev = rcu_dereference(
+				    conf->disks[i].replacement);
+				if (rdev && !test_bit(Faulty, &rdev->flags))
+					do_recovery = 1;
+			}
 		}
 
 		if (test_bit(R5_InJournal, &dev->flags))




* [PATCH 4.14 035/126] timers: Clear timer_base::must_forward_clk with timer_base::lock held
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (33 preceding siblings ...)
  2018-09-17 22:41 ` [PATCH 4.14 034/126] md/raid5: fix data corruption of replacements after originals dropped Greg Kroah-Hartman
@ 2018-09-17 22:41 ` Greg Kroah-Hartman
  2018-09-17 22:41 ` [PATCH 4.14 036/126] media: camss: csid: Configure data type and decode format properly Greg Kroah-Hartman
                   ` (93 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Gaurav Kohli, Thomas Gleixner,
	john.stultz, sboyd, linux-arm-msm, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Gaurav Kohli <gkohli@codeaurora.org>

[ Upstream commit 363e934d8811d799c88faffc5bfca782fd728334 ]

timer_base::must_forward_clk indicates that the base clock might be
stale due to a long idle sleep.

The forwarding of the base clock takes place in the timer softirq or when a
timer is enqueued to a base which is idle. If the enqueue of timer to an
idle base happens from a remote CPU, then the following race can happen:

  CPU0					CPU1
  run_timer_softirq			mod_timer

					base = lock_timer_base(timer);
  base->must_forward_clk = false
					if (base->must_forward_clk)
				       	    forward(base); -> skipped

					enqueue_timer(base, timer, idx);
					-> idx is calculated high due to
					   stale base
					unlock_timer_base(timer);
  base = lock_timer_base(timer);
  forward(base);

The root cause is that timer_base::must_forward_clk is cleared outside the
timer_base::lock held region, so the remote queuing CPU observes it as
cleared, but the base clock is still stale. This can cause large
granularity values for timers, i.e. the accuracy of the expiry time
suffers.

Prevent this by clearing the flag with timer_base::lock held, so that the
forwarding takes place before the cleared flag is observable by a remote
CPU.

Signed-off-by: Gaurav Kohli <gkohli@codeaurora.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: john.stultz@linaro.org
Cc: sboyd@kernel.org
Cc: linux-arm-msm@vger.kernel.org
Link: https://lkml.kernel.org/r/1533199863-22748-1-git-send-email-gkohli@codeaurora.org
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 kernel/time/timer.c |   29 ++++++++++++++++-------------
 1 file changed, 16 insertions(+), 13 deletions(-)

--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -1609,6 +1609,22 @@ static inline void __run_timers(struct t
 
 	raw_spin_lock_irq(&base->lock);
 
+	/*
+	 * timer_base::must_forward_clk must be cleared before running
+	 * timers so that any timer functions that call mod_timer() will
+	 * not try to forward the base. Idle tracking / clock forwarding
+	 * logic is only used with BASE_STD timers.
+	 *
+	 * The must_forward_clk flag is cleared unconditionally also for
+	 * the deferrable base. The deferrable base is not affected by idle
+	 * tracking and never forwarded, so clearing the flag is a NOOP.
+	 *
+	 * The fact that the deferrable base is never forwarded can cause
+	 * large variations in granularity for deferrable timers, but they
+	 * can be deferred for long periods due to idle anyway.
+	 */
+	base->must_forward_clk = false;
+
 	while (time_after_eq(jiffies, base->clk)) {
 
 		levels = collect_expired_timers(base, heads);
@@ -1628,19 +1644,6 @@ static __latent_entropy void run_timer_s
 {
 	struct timer_base *base = this_cpu_ptr(&timer_bases[BASE_STD]);
 
-	/*
-	 * must_forward_clk must be cleared before running timers so that any
-	 * timer functions that call mod_timer will not try to forward the
-	 * base. idle trcking / clock forwarding logic is only used with
-	 * BASE_STD timers.
-	 *
-	 * The deferrable base does not do idle tracking at all, so we do
-	 * not forward it. This can result in very large variations in
-	 * granularity for deferrable timers, but they can be deferred for
-	 * long periods due to idle.
-	 */
-	base->must_forward_clk = false;
-
 	__run_timers(base);
 	if (IS_ENABLED(CONFIG_NO_HZ_COMMON))
 		__run_timers(this_cpu_ptr(&timer_bases[BASE_DEF]));




* [PATCH 4.14 036/126] media: camss: csid: Configure data type and decode format properly
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (34 preceding siblings ...)
  2018-09-17 22:41 ` [PATCH 4.14 035/126] timers: Clear timer_base::must_forward_clk with timer_base::lock held Greg Kroah-Hartman
@ 2018-09-17 22:41 ` Greg Kroah-Hartman
  2018-09-17 22:41 ` [PATCH 4.14 037/126] gpu: ipu-v3: default to id 0 on missing OF alias Greg Kroah-Hartman
                   ` (92 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Todor Tomov, Hans Verkuil,
	Mauro Carvalho Chehab, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Todor Tomov <todor.tomov@linaro.org>

[ Upstream commit c628e78899ff8006b5f9d8206da54ed3bb994342 ]

The CSID decodes the input data stream. When the input comes from
the Test Generator the format of the stream is set on the source
media pad. When the input comes from the CSIPHY the format is the
one on the sink media pad. Use the proper format for each case.

Signed-off-by: Todor Tomov <todor.tomov@linaro.org>
Signed-off-by: Hans Verkuil <hansverk@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/media/platform/qcom/camss-8x16/camss-csid.c |   16 +++++++++++-----
 1 file changed, 11 insertions(+), 5 deletions(-)

--- a/drivers/media/platform/qcom/camss-8x16/camss-csid.c
+++ b/drivers/media/platform/qcom/camss-8x16/camss-csid.c
@@ -392,9 +392,6 @@ static int csid_set_stream(struct v4l2_s
 		    !media_entity_remote_pad(&csid->pads[MSM_CSID_PAD_SINK]))
 			return -ENOLINK;
 
-		dt = csid_get_fmt_entry(csid->fmt[MSM_CSID_PAD_SRC].code)->
-								data_type;
-
 		if (tg->enabled) {
 			/* Config Test Generator */
 			struct v4l2_mbus_framefmt *f =
@@ -416,6 +413,9 @@ static int csid_set_stream(struct v4l2_s
 			writel_relaxed(val, csid->base +
 				       CAMSS_CSID_TG_DT_n_CGG_0(0));
 
+			dt = csid_get_fmt_entry(
+				csid->fmt[MSM_CSID_PAD_SRC].code)->data_type;
+
 			/* 5:0 data type */
 			val = dt;
 			writel_relaxed(val, csid->base +
@@ -425,6 +425,9 @@ static int csid_set_stream(struct v4l2_s
 			val = tg->payload_mode;
 			writel_relaxed(val, csid->base +
 				       CAMSS_CSID_TG_DT_n_CGG_2(0));
+
+			df = csid_get_fmt_entry(
+				csid->fmt[MSM_CSID_PAD_SRC].code)->decode_format;
 		} else {
 			struct csid_phy_config *phy = &csid->phy;
 
@@ -439,13 +442,16 @@ static int csid_set_stream(struct v4l2_s
 
 			writel_relaxed(val,
 				       csid->base + CAMSS_CSID_CORE_CTRL_1);
+
+			dt = csid_get_fmt_entry(
+				csid->fmt[MSM_CSID_PAD_SINK].code)->data_type;
+			df = csid_get_fmt_entry(
+				csid->fmt[MSM_CSID_PAD_SINK].code)->decode_format;
 		}
 
 		/* Config LUT */
 
 		dt_shift = (cid % 4) * 8;
-		df = csid_get_fmt_entry(csid->fmt[MSM_CSID_PAD_SINK].code)->
-								decode_format;
 
 		val = readl_relaxed(csid->base + CAMSS_CSID_CID_LUT_VC_n(vc));
 		val &= ~(0xff << dt_shift);




* [PATCH 4.14 037/126] gpu: ipu-v3: default to id 0 on missing OF alias
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (35 preceding siblings ...)
  2018-09-17 22:41 ` [PATCH 4.14 036/126] media: camss: csid: Configure data type and decode format properly Greg Kroah-Hartman
@ 2018-09-17 22:41 ` Greg Kroah-Hartman
  2018-09-17 22:41 ` [PATCH 4.14 038/126] misc: ti-st: Fix memory leak in the error path of probe() Greg Kroah-Hartman
                   ` (91 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:41 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Philipp Zabel, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Philipp Zabel <p.zabel@pengutronix.de>

[ Upstream commit 2d87e6c1b99c402360fdfe19ce4f579ab2f96adf ]

This is better than storing -ENODEV in the id number. This fixes SoCs
with only one IPU that don't specify an IPU alias in the device tree.

Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/gpu/ipu-v3/ipu-common.c |    2 ++
 1 file changed, 2 insertions(+)

--- a/drivers/gpu/ipu-v3/ipu-common.c
+++ b/drivers/gpu/ipu-v3/ipu-common.c
@@ -1401,6 +1401,8 @@ static int ipu_probe(struct platform_dev
 		return -ENODEV;
 
 	ipu->id = of_alias_get_id(np, "ipu");
+	if (ipu->id < 0)
+		ipu->id = 0;
 
 	if (of_device_is_compatible(np, "fsl,imx6qp-ipu") &&
 	    IS_ENABLED(CONFIG_DRM)) {




* [PATCH 4.14 038/126] misc: ti-st: Fix memory leak in the error path of probe()
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (36 preceding siblings ...)
  2018-09-17 22:41 ` [PATCH 4.14 037/126] gpu: ipu-v3: default to id 0 on missing OF alias Greg Kroah-Hartman
@ 2018-09-17 22:41 ` Greg Kroah-Hartman
  2018-09-17 22:41 ` [PATCH 4.14 039/126] uio: potential double frees if __uio_register_device() fails Greg Kroah-Hartman
                   ` (90 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:41 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Anton Vasilyev, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Anton Vasilyev <vasilyev@ispras.ru>

[ Upstream commit 81ae962d7f180c0092859440c82996cccb254976 ]

Free resources instead of returning the error code directly if
kim_probe() fails.

Found by Linux Driver Verification project (linuxtesting.org).

Signed-off-by: Anton Vasilyev <vasilyev@ispras.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/misc/ti-st/st_kim.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/drivers/misc/ti-st/st_kim.c
+++ b/drivers/misc/ti-st/st_kim.c
@@ -756,14 +756,14 @@ static int kim_probe(struct platform_dev
 	err = gpio_request(kim_gdata->nshutdown, "kim");
 	if (unlikely(err)) {
 		pr_err(" gpio %d request failed ", kim_gdata->nshutdown);
-		return err;
+		goto err_sysfs_group;
 	}
 
 	/* Configure nShutdown GPIO as output=0 */
 	err = gpio_direction_output(kim_gdata->nshutdown, 0);
 	if (unlikely(err)) {
 		pr_err(" unable to configure gpio %d", kim_gdata->nshutdown);
-		return err;
+		goto err_sysfs_group;
 	}
 	/* get reference of pdev for request_firmware
 	 */




* [PATCH 4.14 039/126] uio: potential double frees if __uio_register_device() fails
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (37 preceding siblings ...)
  2018-09-17 22:41 ` [PATCH 4.14 038/126] misc: ti-st: Fix memory leak in the error path of probe() Greg Kroah-Hartman
@ 2018-09-17 22:41 ` Greg Kroah-Hartman
  2018-09-17 22:41 ` [PATCH 4.14 040/126] firmware: vpd: Fix section enabled flag on vpd_section_destroy Greg Kroah-Hartman
                   ` (89 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:41 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Dan Carpenter, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Dan Carpenter <dan.carpenter@oracle.com>

[ Upstream commit f019f07ecf6a6b8bd6d7853bce70925d90af02d1 ]

The uio_unregister_device() function assumes that if "info->uio_dev" is
non-NULL that means "info" is fully allocated.  Setting info->uio_dev
has to be the last thing in the function.

In the current code, if request_threaded_irq() fails then we return with
info->uio_dev set to non-NULL but info is not fully allocated and it can
lead to double frees.
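
The ordering rule can be sketched with a small user-space model in which the pointer that the unregister path tests is published only after every fallible step has succeeded. The types and functions below (`struct dev_model`, `struct info_model`, `register_model()`, `unregister_model()`) are hypothetical and do not mirror the real uio structures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the device and its registration info. */
struct dev_model {
	int irq_registered;
};

struct info_model {
	struct dev_model *uio_dev; /* non-NULL means "fully registered" */
	int irq;
};

/*
 * Register idev for info. 'irq_request_fails' simulates a failing IRQ
 * request. info->uio_dev is assigned last, so every error return leaves
 * it NULL. Returns 0 on success, -1 on failure.
 */
static int register_model(struct info_model *info, struct dev_model *idev,
			  int irq_request_fails)
{
	assert(info && idev);

	if (info->irq) {
		if (irq_request_fails)
			return -1;
		idev->irq_registered = 1;
	}

	info->uio_dev = idev;
	return 0;
}

/* Returns 1 if a registered device was torn down, 0 if there was none. */
static int unregister_model(struct info_model *info)
{
	if (!info || !info->uio_dev)
		return 0;

	info->uio_dev->irq_registered = 0;
	info->uio_dev = NULL;
	return 1;
}
```

Because a failed registration never publishes the pointer, a later unregister call in this model finds nothing to tear down and cannot release the same resources twice.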

Fixes: beafc54c4e2f ("UIO: Add the User IO core code")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/uio/uio.c |    3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

--- a/drivers/uio/uio.c
+++ b/drivers/uio/uio.c
@@ -841,8 +841,6 @@ int __uio_register_device(struct module
 	if (ret)
 		goto err_uio_dev_add_attributes;
 
-	info->uio_dev = idev;
-
 	if (info->irq && (info->irq != UIO_IRQ_CUSTOM)) {
 		/*
 		 * Note that we deliberately don't use devm_request_irq
@@ -858,6 +856,7 @@ int __uio_register_device(struct module
 			goto err_request_irq;
 	}
 
+	info->uio_dev = idev;
 	return 0;
 
 err_request_irq:




* [PATCH 4.14 040/126] firmware: vpd: Fix section enabled flag on vpd_section_destroy
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (38 preceding siblings ...)
  2018-09-17 22:41 ` [PATCH 4.14 039/126] uio: potential double frees if __uio_register_device() fails Greg Kroah-Hartman
@ 2018-09-17 22:41 ` Greg Kroah-Hartman
  2018-09-17 22:41 ` [PATCH 4.14 041/126] Drivers: hv: vmbus: Cleanup synic memory free path Greg Kroah-Hartman
                   ` (88 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Anton Vasilyev, Guenter Roeck, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Anton Vasilyev <vasilyev@ispras.ru>

[ Upstream commit 45ca3f76de0507ecf143f770570af2942f263812 ]

The static structs ro_vpd and rw_vpd are initialized by
vpd_sections_init() in vpd_probe(), based on the header's ro and rw
sizes. In vpd_remove(), vpd_section_destroy() deinitializes a section
based on its enabled flag, which is set to true by vpd_sections_init().
This leads to vpd_section_destroy() being called on an already
destroyed section in a probe-release-probe-release sequence, if the
first probe initializes ro_vpd and the second probe does not.

The patch clears the enabled flag in vpd_section_destroy() and adds
cleanup on the error path of vpd_sections_init().
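
A minimal user-space sketch of the idea, assuming a section is just a flag plus one heap buffer: destroy clears the flag it tests, so a second destroy on the same static object does nothing. `struct section_model`, `section_init()` and `section_destroy()` are hypothetical names, not the driver's functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical model of a section that lives in static storage. */
struct section_model {
	bool enabled;
	char *raw_name;
};

/* Returns 0 on success, -1 if the allocation fails. */
static int section_init(struct section_model *sec)
{
	assert(sec);

	sec->raw_name = malloc(16);
	if (!sec->raw_name)
		return -1;

	sec->enabled = true;
	return 0;
}

/*
 * Tear the section down only if it is enabled, then clear the flag so
 * that a repeated call is harmless. Returns 1 if resources were
 * released, 0 if there was nothing to do.
 */
static int section_destroy(struct section_model *sec)
{
	assert(sec);

	if (!sec->enabled)
		return 0;

	free(sec->raw_name);
	sec->raw_name = NULL;
	sec->enabled = false;
	return 1;
}
```

Without the flag being cleared, the second call in a probe-release-probe-release sequence would free the same buffer again, which is the situation the patch description warns about.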

Found by Linux Driver Verification project (linuxtesting.org).

Signed-off-by: Anton Vasilyev <vasilyev@ispras.ru>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/firmware/google/vpd.c |    5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

--- a/drivers/firmware/google/vpd.c
+++ b/drivers/firmware/google/vpd.c
@@ -246,6 +246,7 @@ static int vpd_section_destroy(struct vp
 		sysfs_remove_bin_file(vpd_kobj, &sec->bin_attr);
 		kfree(sec->raw_name);
 		memunmap(sec->baseaddr);
+		sec->enabled = false;
 	}
 
 	return 0;
@@ -279,8 +280,10 @@ static int vpd_sections_init(phys_addr_t
 		ret = vpd_section_init("rw", &rw_vpd,
 				       physaddr + sizeof(struct vpd_cbmem) +
 				       header.ro_size, header.rw_size);
-		if (ret)
+		if (ret) {
+			vpd_section_destroy(&ro_vpd);
 			return ret;
+		}
 	}
 
 	return 0;




* [PATCH 4.14 041/126] Drivers: hv: vmbus: Cleanup synic memory free path
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (39 preceding siblings ...)
  2018-09-17 22:41 ` [PATCH 4.14 040/126] firmware: vpd: Fix section enabled flag on vpd_section_destroy Greg Kroah-Hartman
@ 2018-09-17 22:41 ` Greg Kroah-Hartman
  2018-09-17 22:41 ` [PATCH 4.14 042/126] tty: rocket: Fix possible buffer overwrite on register_PCI Greg Kroah-Hartman
                   ` (87 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Michael Kelley, Dan Carpenter,
	K. Y. Srinivasan, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Michael Kelley <mikelley@microsoft.com>

[ Upstream commit 572086325ce9a9e348b8748e830653f3959e88b6 ]

clk_evt memory is not being freed when the synic is shut down
or when there is an allocation error.  Add the appropriate
kfree() call, along with a comment to clarify how the memory
gets freed after an allocation error.  Make the free path
consistent by removing checks for NULL since kfree() and
free_page() already do the check.
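
The same cleanup style can be shown in user space, where the C standard likewise defines free(NULL) as a no-op. The sketch below is an analogy only: `struct cpu_ctx_model`, `ctx_alloc()` and `ctx_free()` are hypothetical, and malloc()/free() stand in for the kernel allocators.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical per-CPU context holding three separately allocated buffers. */
struct cpu_ctx_model {
	void *clk_evt;
	void *event_page;
	void *message_page;
};

/* Allocate a buffer, or simulate an allocation failure when 'fail' is set. */
static void *alloc_or_fail(int fail)
{
	return fail ? NULL : malloc(64);
}

/*
 * Allocate the buffers in order and stop at the first failure, leaving
 * the remaining pointers NULL. 'fail_at' is the index (0..2) of the
 * allocation that should fail, or -1 for none. Nothing is freed here;
 * the caller is expected to call ctx_free() afterwards.
 */
static int ctx_alloc(struct cpu_ctx_model *c, int fail_at)
{
	assert(c);
	c->clk_evt = c->event_page = c->message_page = NULL;

	c->clk_evt = alloc_or_fail(fail_at == 0);
	if (!c->clk_evt)
		return -1;
	c->event_page = alloc_or_fail(fail_at == 1);
	if (!c->event_page)
		return -1;
	c->message_page = alloc_or_fail(fail_at == 2);
	if (!c->message_page)
		return -1;
	return 0;
}

/* Free everything unconditionally; free(NULL) does nothing. */
static void ctx_free(struct cpu_ctx_model *c)
{
	assert(c);
	free(c->clk_evt);
	free(c->event_page);
	free(c->message_page);
	c->clk_evt = c->event_page = c->message_page = NULL;
}
```

A single free routine then serves both the normal shutdown path and the partial-allocation error path, with no per-pointer NULL checks.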

Signed-off-by: Michael Kelley <mikelley@microsoft.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/hv/hv.c |   14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

--- a/drivers/hv/hv.c
+++ b/drivers/hv/hv.c
@@ -196,6 +196,10 @@ int hv_synic_alloc(void)
 
 	return 0;
 err:
+	/*
+	 * Any memory allocations that succeeded will be freed when
+	 * the caller cleans up by calling hv_synic_free()
+	 */
 	return -ENOMEM;
 }
 
@@ -208,12 +212,10 @@ void hv_synic_free(void)
 		struct hv_per_cpu_context *hv_cpu
 			= per_cpu_ptr(hv_context.cpu_context, cpu);
 
-		if (hv_cpu->synic_event_page)
-			free_page((unsigned long)hv_cpu->synic_event_page);
-		if (hv_cpu->synic_message_page)
-			free_page((unsigned long)hv_cpu->synic_message_page);
-		if (hv_cpu->post_msg_page)
-			free_page((unsigned long)hv_cpu->post_msg_page);
+		kfree(hv_cpu->clk_evt);
+		free_page((unsigned long)hv_cpu->synic_event_page);
+		free_page((unsigned long)hv_cpu->synic_message_page);
+		free_page((unsigned long)hv_cpu->post_msg_page);
 	}
 
 	kfree(hv_context.hv_numa_map);




* [PATCH 4.14 042/126] tty: rocket: Fix possible buffer overwrite on register_PCI
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (40 preceding siblings ...)
  2018-09-17 22:41 ` [PATCH 4.14 041/126] Drivers: hv: vmbus: Cleanup synic memory free path Greg Kroah-Hartman
@ 2018-09-17 22:41 ` Greg Kroah-Hartman
  2018-09-17 22:41 ` [PATCH 4.14 043/126] f2fs: fix to active page in lru list for read path Greg Kroah-Hartman
                   ` (86 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:41 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Anton Vasilyev, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Anton Vasilyev <vasilyev@ispras.ru>

[ Upstream commit 0419056ec8fd01ddf5460d2dba0491aad22657dd ]

If the number of ISA and PCI boards exceeds NUM_BOARDS on the path
rp_init()->init_PCI()->register_PCI(), a buffer overwrite occurs
in register_PCI() when assigning rcktpt_io_addr[i].

The patch adds an upper-bound check on the index of the registered
board in register_PCI().
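
The bound check amounts to refusing any index that does not fit the fixed-size table. A user-space sketch, with a hypothetical `MODEL_NUM_BOARDS` (the value 8 is chosen for illustration only) and hypothetical `io_addr[]` and `register_board()` names:

```c
#include <assert.h>

/* Table size for this model; the value is an assumption for illustration. */
#define MODEL_NUM_BOARDS 8

static unsigned long io_addr[MODEL_NUM_BOARDS];

/*
 * Record the I/O address of board 'i'. Returns 1 if the board was
 * registered, 0 if the index does not fit in the table, in which case
 * nothing is written.
 */
static int register_board(int i, unsigned long addr)
{
	if (i < 0 || i >= MODEL_NUM_BOARDS)
		return 0;

	io_addr[i] = addr;
	return 1;
}
```

Returning 0 for an out-of-range index follows the convention visible in the diff, where register_PCI() returns 0 when it does not register a board.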

Found by Linux Driver Verification project (linuxtesting.org).

Signed-off-by: Anton Vasilyev <vasilyev@ispras.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/tty/rocket.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/tty/rocket.c
+++ b/drivers/tty/rocket.c
@@ -1894,7 +1894,7 @@ static __init int register_PCI(int i, st
 	ByteIO_t UPCIRingInd = 0;
 
 	if (!dev || !pci_match_id(rocket_pci_ids, dev) ||
-	    pci_enable_device(dev))
+	    pci_enable_device(dev) || i >= NUM_BOARDS)
 		return 0;
 
 	rcktpt_io_addr[i] = pci_resource_start(dev, 0);




* [PATCH 4.14 043/126] f2fs: fix to active page in lru list for read path
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (41 preceding siblings ...)
  2018-09-17 22:41 ` [PATCH 4.14 042/126] tty: rocket: Fix possible buffer overwrite on register_PCI Greg Kroah-Hartman
@ 2018-09-17 22:41 ` Greg Kroah-Hartman
  2018-09-17 22:41 ` [PATCH 4.14 044/126] f2fs: do not set free of current section Greg Kroah-Hartman
                   ` (85 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Xianrong Zhou, Chao Yu, Jaegeuk Kim,
	Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Chao Yu <yuchao0@huawei.com>

[ Upstream commit 82cf4f132e6d16dca6fc3bd955019246141bc645 ]

If CONFIG_F2FS_FAULT_INJECTION is on, both the read and write paths
call find_lock_page() to get the page, but the read path fails to pass
FGP_ACCESSED to the allocator to activate the page in the LRU list, so
the page may be reclaimed prematurely. Fix it.

Reported-by: Xianrong Zhou <zhouxianrong@huawei.com>
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 fs/f2fs/f2fs.h |    7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -1766,8 +1766,13 @@ static inline struct page *f2fs_grab_cac
 						pgoff_t index, bool for_write)
 {
 #ifdef CONFIG_F2FS_FAULT_INJECTION
-	struct page *page = find_lock_page(mapping, index);
+	struct page *page;
 
+	if (!for_write)
+		page = find_get_page_flags(mapping, index,
+						FGP_LOCK | FGP_ACCESSED);
+	else
+		page = find_lock_page(mapping, index);
 	if (page)
 		return page;
 



^ permalink raw reply	[flat|nested] 134+ messages in thread

* [PATCH 4.14 044/126] f2fs: do not set free of current section
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (42 preceding siblings ...)
  2018-09-17 22:41 ` [PATCH 4.14 043/126] f2fs: fix to active page in lru list for read path Greg Kroah-Hartman
@ 2018-09-17 22:41 ` Greg Kroah-Hartman
  2018-09-17 22:41 ` [PATCH 4.14 045/126] f2fs: fix defined but not used build warnings Greg Kroah-Hartman
                   ` (84 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Yunlong Song, Chao Yu, Jaegeuk Kim,
	Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Yunlong Song <yunlong.song@huawei.com>

[ Upstream commit 3611ce9911267cb93d364bd71ddea6821278d11f ]

Consider the case where sbi->segs_per_sec > 1, for example
section:segment = 5. If segment 1 has just been used, a new segment 2
is allocated, and the blocks of segment 1 are then invalidated, the
previous code uses __set_test_and_free() to clear the section's bit in
free_secmap and increment free_sections. This is not correct, since
the section is still a current section, so fix it.
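
The guard can be modelled on a single section. The following is a
heavily simplified sketch under stated assumptions, not the f2fs
implementation: struct demo_free_info, demo_set_test_and_free and
DEMO_SEGS_PER_SEC are hypothetical, plain booleans replace the bitmaps,
and the caller passes in what IS_CURSEC() would report.

```c
#include <stdbool.h>

/* Hypothetical: one section made of five segments. */
#define DEMO_SEGS_PER_SEC 5

struct demo_free_info {
	bool seg_in_use[DEMO_SEGS_PER_SEC]; /* stand-in for free_segmap */
	bool sec_in_use;                    /* stand-in for free_secmap */
	int free_segments;
	int free_sections;
};

void demo_set_test_and_free(struct demo_free_info *fi, int segno,
			    bool is_cur_sec)
{
	int i;

	if (!fi->seg_in_use[segno])
		return;

	fi->seg_in_use[segno] = false;
	fi->free_segments++;

	/* A section that is still being written to must not be freed. */
	if (is_cur_sec)
		return;

	/* Free the section only if none of its segments are in use. */
	for (i = 0; i < DEMO_SEGS_PER_SEC; i++)
		if (fi->seg_in_use[i])
			return;

	if (fi->sec_in_use) {
		fi->sec_in_use = false;
		fi->free_sections++;
	}
}
```

In this model the segment counter is always updated, and only the
section-level bookkeeping is skipped for a current section, which is
the same split the patch makes with its skip_free label.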

Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 fs/f2fs/segment.h |    3 +++
 1 file changed, 3 insertions(+)

--- a/fs/f2fs/segment.h
+++ b/fs/f2fs/segment.h
@@ -414,6 +414,8 @@ static inline void __set_test_and_free(s
 	if (test_and_clear_bit(segno, free_i->free_segmap)) {
 		free_i->free_segments++;
 
+		if (IS_CURSEC(sbi, secno))
+			goto skip_free;
 		next = find_next_bit(free_i->free_segmap,
 				start_segno + sbi->segs_per_sec, start_segno);
 		if (next >= start_segno + sbi->segs_per_sec) {
@@ -421,6 +423,7 @@ static inline void __set_test_and_free(s
 				free_i->free_sections++;
 		}
 	}
+skip_free:
 	spin_unlock(&free_i->segmap_lock);
 }
 



^ permalink raw reply	[flat|nested] 134+ messages in thread

* [PATCH 4.14 045/126] f2fs: fix defined but not used build warnings
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (43 preceding siblings ...)
  2018-09-17 22:41 ` [PATCH 4.14 044/126] f2fs: do not set free of current section Greg Kroah-Hartman
@ 2018-09-17 22:41 ` Greg Kroah-Hartman
  2018-09-17 22:41 ` [PATCH 4.14 046/126] perf tools: Allow overriding MAX_NR_CPUS at compile time Greg Kroah-Hartman
                   ` (83 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Randy Dunlap, Jaegeuk Kim, Chao Yu,
	linux-f2fs-devel, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Randy Dunlap <rdunlap@infradead.org>

[ Upstream commit cb15d1e43db0a6341c1e26ac6a2c74e61b74f1aa ]

Fix build warnings in f2fs when CONFIG_PROC_FS is not enabled
by marking the unused functions as __maybe_unused.

../fs/f2fs/sysfs.c:519:12: warning: 'segment_info_seq_show' defined but not used [-Wunused-function]
../fs/f2fs/sysfs.c:546:12: warning: 'segment_bits_seq_show' defined but not used [-Wunused-function]
../fs/f2fs/sysfs.c:570:12: warning: 'iostat_info_seq_show' defined but not used [-Wunused-function]
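
The effect of the annotation can be reproduced in user space. This is
a minimal sketch assuming a GCC- or Clang-compatible compiler:
DEMO_MAYBE_UNUSED is a hypothetical stand-in for the kernel's
__maybe_unused, and DEMO_PROC_FS is a hypothetical macro playing the
role of CONFIG_PROC_FS.

```c
/* Hypothetical stand-in for __maybe_unused (GCC/Clang attribute). */
#define DEMO_MAYBE_UNUSED __attribute__((unused))

/*
 * Only referenced when DEMO_PROC_FS is defined. Without the attribute,
 * building with -Wunused-function and DEMO_PROC_FS undefined would
 * warn that this function is defined but not used.
 */
static int DEMO_MAYBE_UNUSED demo_seq_show(int value)
{
	return value * 2;
}

int demo_register(void)
{
#ifdef DEMO_PROC_FS
	return demo_seq_show(21);
#else
	return 0;
#endif
}
```

The function stays in the source unconditionally, so no extra #ifdef
block is needed around its definition.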

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Jaegeuk Kim <jaegeuk@kernel.org>
Cc: Chao Yu <yuchao0@huawei.com>
Cc: linux-f2fs-devel@lists.sourceforge.net
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 fs/f2fs/sysfs.c |   10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

--- a/fs/f2fs/sysfs.c
+++ b/fs/f2fs/sysfs.c
@@ -9,6 +9,7 @@
  * it under the terms of the GNU General Public License version 2 as
  * published by the Free Software Foundation.
  */
+#include <linux/compiler.h>
 #include <linux/proc_fs.h>
 #include <linux/f2fs_fs.h>
 #include <linux/seq_file.h>
@@ -381,7 +382,8 @@ static struct kobject f2fs_feat = {
 	.kset	= &f2fs_kset,
 };
 
-static int segment_info_seq_show(struct seq_file *seq, void *offset)
+static int __maybe_unused segment_info_seq_show(struct seq_file *seq,
+						void *offset)
 {
 	struct super_block *sb = seq->private;
 	struct f2fs_sb_info *sbi = F2FS_SB(sb);
@@ -408,7 +410,8 @@ static int segment_info_seq_show(struct
 	return 0;
 }
 
-static int segment_bits_seq_show(struct seq_file *seq, void *offset)
+static int __maybe_unused segment_bits_seq_show(struct seq_file *seq,
+						void *offset)
 {
 	struct super_block *sb = seq->private;
 	struct f2fs_sb_info *sbi = F2FS_SB(sb);
@@ -432,7 +435,8 @@ static int segment_bits_seq_show(struct
 	return 0;
 }
 
-static int iostat_info_seq_show(struct seq_file *seq, void *offset)
+static int __maybe_unused iostat_info_seq_show(struct seq_file *seq,
+					       void *offset)
 {
 	struct super_block *sb = seq->private;
 	struct f2fs_sb_info *sbi = F2FS_SB(sb);



^ permalink raw reply	[flat|nested] 134+ messages in thread

* [PATCH 4.14 046/126] perf tools: Allow overriding MAX_NR_CPUS at compile time
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (44 preceding siblings ...)
  2018-09-17 22:41 ` [PATCH 4.14 045/126] f2fs: fix defined but not used build warnings Greg Kroah-Hartman
@ 2018-09-17 22:41 ` Greg Kroah-Hartman
  2018-09-17 22:41 ` [PATCH 4.14 047/126] NFSv4.0 fix client reference leak in callback Greg Kroah-Hartman
                   ` (82 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Christophe Leroy, Alexander Shishkin,
	Peter Zijlstra, linuxppc-dev, Arnaldo Carvalho de Melo,
	Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Christophe Leroy <christophe.leroy@c-s.fr>

[ Upstream commit 21b8732eb4479b579bda9ee38e62b2c312c2a0e5 ]

After a kernel update, the perf tool no longer runs on my 32MB RAM
powerpc board, but still runs on a 128MB RAM board:

  ~# strace perf
  execve("/usr/sbin/perf", ["perf"], [/* 12 vars */]) = -1 ENOMEM (Cannot allocate memory)
  --- SIGSEGV {si_signo=SIGSEGV, si_code=SI_KERNEL, si_addr=0} ---
  +++ killed by SIGSEGV +++
  Segmentation fault

objdump -x shows that the .bss section has a huge size of about 24 MB:

 27 .bss          016baca8  101cebb8  101cebb8  001cd988  2**3

The following objects in particular are quite large:

  10205f80 l     O .bss	00140000     runtime_cycles_stats
  10345f80 l     O .bss	00140000     runtime_stalled_cycles_front_stats
  10485f80 l     O .bss	00140000     runtime_stalled_cycles_back_stats
  105c5f80 l     O .bss	00140000     runtime_branches_stats
  10705f80 l     O .bss	00140000     runtime_cacherefs_stats
  10845f80 l     O .bss	00140000     runtime_l1_dcache_stats
  10985f80 l     O .bss	00140000     runtime_l1_icache_stats
  10ac5f80 l     O .bss	00140000     runtime_ll_cache_stats
  10c05f80 l     O .bss	00140000     runtime_itlb_cache_stats
  10d45f80 l     O .bss	00140000     runtime_dtlb_cache_stats
  10e85f80 l     O .bss	00140000     runtime_cycles_in_tx_stats
  10fc5f80 l     O .bss	00140000     runtime_transaction_stats
  11105f80 l     O .bss	00140000     runtime_elision_stats
  11245f80 l     O .bss	00140000     runtime_topdown_total_slots
  11385f80 l     O .bss	00140000     runtime_topdown_slots_retired
  114c5f80 l     O .bss	00140000     runtime_topdown_slots_issued
  11605f80 l     O .bss	00140000     runtime_topdown_fetch_bubbles
  11745f80 l     O .bss	00140000     runtime_topdown_recovery_bubbles

This is due to commit 4d255766d28b1 ("perf: Bump max number of cpus
to 1024"), because many tables are sized with MAX_NR_CPUS.

This patch makes it possible to redefine MAX_NR_CPUS via:

  $ make EXTRA_CFLAGS=-DMAX_NR_CPUS=1
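
Each object in the listing above is 0x140000 bytes, which is 1310720
bytes, or 1280 bytes per CPU slot at 1024 CPUs. The override pattern
can be sketched as follows. The names are hypothetical (DEMO_MAX_NR_CPUS,
struct demo_stats, demo_runtime_stats), and the struct layout is an
assumption for illustration, not perf's real statistics type.

```c
#include <stddef.h>

/* Keep the default only if the build did not supply its own value. */
#ifndef DEMO_MAX_NR_CPUS
#define DEMO_MAX_NR_CPUS 1024
#endif

/* Hypothetical per-CPU record; the fields are assumed. */
struct demo_stats {
	double n;
	double mean;
	double m2;
};

/* Lives in .bss, so its size scales with DEMO_MAX_NR_CPUS. */
static struct demo_stats demo_runtime_stats[DEMO_MAX_NR_CPUS];

size_t demo_table_bytes(void)
{
	return sizeof(demo_runtime_stats);
}
```

Compiling this sketch with -DDEMO_MAX_NR_CPUS=1 shrinks the table to a
single record without touching the source.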

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linuxppc-dev@lists.ozlabs.org
Link: http://lkml.kernel.org/r/20170922112043.8349468C57@po15668-vm-win7.idsi0.si.c-s.fr
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 tools/perf/perf.h |    2 ++
 1 file changed, 2 insertions(+)

--- a/tools/perf/perf.h
+++ b/tools/perf/perf.h
@@ -24,7 +24,9 @@ static inline unsigned long long rdclock
 	return ts.tv_sec * 1000000000ULL + ts.tv_nsec;
 }
 
+#ifndef MAX_NR_CPUS
 #define MAX_NR_CPUS			1024
+#endif
 
 extern const char *input_name;
 extern bool perf_host, perf_guest;



^ permalink raw reply	[flat|nested] 134+ messages in thread

* [PATCH 4.14 047/126] NFSv4.0 fix client reference leak in callback
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (45 preceding siblings ...)
  2018-09-17 22:41 ` [PATCH 4.14 046/126] perf tools: Allow overriding MAX_NR_CPUS at compile time Greg Kroah-Hartman
@ 2018-09-17 22:41 ` Greg Kroah-Hartman
  2018-09-17 22:41 ` [PATCH 4.14 048/126] perf c2c report: Fix crash for empty browser Greg Kroah-Hartman
                   ` (81 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Olga Kornievskaia, Anna Schumaker,
	Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Olga Kornievskaia <kolga@netapp.com>

[ Upstream commit 32cd3ee511f4e07ca25d71163b50e704808d22f4 ]

If there is an error while processing a callback message, it leads
to a reference leak on the client structure and eventually an unclean
superblock.
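
The leak pattern is the usual one: a lookup takes a reference, and an
early exit forgets to drop it. This is a minimal sketch with
hypothetical names (struct demo_client, demo_find_client,
demo_put_client, demo_handle_callback) and a plain integer counter, not
the NFS client code.

```c
#include <stddef.h>

struct demo_client {
	int refcount;
};

/* The lookup takes a reference on success, as a find helper would. */
static struct demo_client *demo_find_client(struct demo_client *known)
{
	if (known)
		known->refcount++;
	return known;
}

static void demo_put_client(struct demo_client *clp)
{
	clp->refcount--;
}

/* Every exit path after a successful lookup drops the reference. */
int demo_handle_callback(struct demo_client *known, int principal_ok,
			 int encode_ok)
{
	struct demo_client *clp = demo_find_client(known);

	if (!clp || !principal_ok) {
		if (clp)
			demo_put_client(clp);
		return -1;
	}

	if (!encode_ok) {
		demo_put_client(clp);
		return -2;
	}

	/* ... the operations would be processed here ... */
	demo_put_client(clp);
	return 0;
}
```

After any call, the counter is back at its starting value whether the
call succeeded or failed, which is the invariant the patch restores for
the two error paths.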

Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 fs/nfs/callback_xdr.c |   11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

--- a/fs/nfs/callback_xdr.c
+++ b/fs/nfs/callback_xdr.c
@@ -904,16 +904,21 @@ static __be32 nfs4_callback_compound(str
 
 	if (hdr_arg.minorversion == 0) {
 		cps.clp = nfs4_find_client_ident(SVC_NET(rqstp), hdr_arg.cb_ident);
-		if (!cps.clp || !check_gss_callback_principal(cps.clp, rqstp))
+		if (!cps.clp || !check_gss_callback_principal(cps.clp, rqstp)) {
+			if (cps.clp)
+				nfs_put_client(cps.clp);
 			goto out_invalidcred;
+		}
 	}
 
 	cps.minorversion = hdr_arg.minorversion;
 	hdr_res.taglen = hdr_arg.taglen;
 	hdr_res.tag = hdr_arg.tag;
-	if (encode_compound_hdr_res(&xdr_out, &hdr_res) != 0)
+	if (encode_compound_hdr_res(&xdr_out, &hdr_res) != 0) {
+		if (cps.clp)
+			nfs_put_client(cps.clp);
 		return rpc_system_err;
-
+	}
 	while (status == 0 && nops != hdr_arg.nops) {
 		status = process_op(nops, rqstp, &xdr_in,
 				    rqstp->rq_argp, &xdr_out, rqstp->rq_resp,



^ permalink raw reply	[flat|nested] 134+ messages in thread

* [PATCH 4.14 048/126] perf c2c report: Fix crash for empty browser
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (46 preceding siblings ...)
  2018-09-17 22:41 ` [PATCH 4.14 047/126] NFSv4.0 fix client reference leak in callback Greg Kroah-Hartman
@ 2018-09-17 22:41 ` Greg Kroah-Hartman
  2018-09-17 22:41 ` [PATCH 4.14 049/126] perf evlist: Fix error out while applying initial delay and LBR Greg Kroah-Hartman
                   ` (80 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, rodia, Jiri Olsa,
	Arnaldo Carvalho de Melo, Alexander Shishkin, David Ahern,
	Don Zickus, Joe Mario, Namhyung Kim, Peter Zijlstra, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Jiri Olsa <jolsa@kernel.org>

[ Upstream commit 73978332572ccf5e364c31e9a70ba953f8202b46 ]

'perf c2c' scans read/write accesses and tries to find false sharing
cases, so when the events it wants were not asked for or ended up not
taking place, we get no histograms.

So do not try to display entry details if there are none. Currently
this ends up in a crash:

  $ perf c2c report # then press 'd'
  perf: Segmentation fault
  $

Committer testing:

Before:

Record a perf.data file without events of interest to 'perf c2c report',
then call it and press 'd':

  # perf record sleep 1
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.001 MB perf.data (6 samples) ]
  # perf c2c report
  perf: Segmentation fault
  -------- backtrace --------
  perf[0x5b1d2a]
  /lib64/libc.so.6(+0x346df)[0x7fcb566e36df]
  perf[0x46fcae]
  perf[0x4a9f1e]
  perf[0x4aa220]
  perf(main+0x301)[0x42c561]
  /lib64/libc.so.6(__libc_start_main+0xe9)[0x7fcb566cff29]
  perf(_start+0x29)[0x42c999]
  #

After the patch the segfault no longer occurs. A follow-up patch
telling the user why nothing changes when 'd' is pressed would be good.

Reported-by: rodia@autistici.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Joe Mario <jmario@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Fixes: f1c5fd4d0bb9 ("perf c2c report: Add TUI cacheline browser")
Link: http://lkml.kernel.org/r/20180724062008.26126-1-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 tools/perf/builtin-c2c.c |    3 +++
 1 file changed, 3 insertions(+)

--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -2229,6 +2229,9 @@ static int perf_c2c__browse_cacheline(st
 	" s             Togle full lenght of symbol and source line columns \n"
 	" q             Return back to cacheline list \n";
 
+	if (!he)
+		return 0;
+
 	/* Display compact version first. */
 	c2c.symbol_full = false;
 



^ permalink raw reply	[flat|nested] 134+ messages in thread

* [PATCH 4.14 049/126] perf evlist: Fix error out while applying initial delay and LBR
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (47 preceding siblings ...)
  2018-09-17 22:41 ` [PATCH 4.14 048/126] perf c2c report: Fix crash for empty browser Greg Kroah-Hartman
@ 2018-09-17 22:41 ` Greg Kroah-Hartman
  2018-09-17 22:41 ` [PATCH 4.14 050/126] macintosh/via-pmu: Add missing mmio accessors Greg Kroah-Hartman
                   ` (79 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Sunil K Pandey, Kan Liang, Jiri Olsa,
	Arnaldo Carvalho de Melo, Andi Kleen, Namhyung Kim,
	Peter Zijlstra, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Kan Liang <kan.liang@linux.intel.com>

[ Upstream commit 95035c5e167ae6e740b1ddd30210ae0eaf39a5db ]

'perf record' will error out if both --delay and LBR are applied.

For example:

  # perf record -D 1000 -a -e cycles -j any -- sleep 2
  Error:
  dummy:HG: PMU Hardware doesn't support sampling/overflow-interrupts.
  Try 'perf stat'
  #

A dummy event is added implicitly for the initial delay, and it has the
same configuration as the real sampling events. Because the dummy event
is a software event, perf errors out when LBR is configured.

The dummy event will only be used to track PERF_RECORD_MMAP while perf
waits for the initial delay to enable the real events. The BRANCH_STACK
bit can be safely cleared for the dummy event.
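
Clearing the bit for the dummy event alone can be sketched with a plain
bitmask. All names here are hypothetical (struct demo_attr,
demo_is_dummy, demo_config and the DEMO_ constants), and the numeric
values are illustrative assumptions, not a claim about the perf ABI
headers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative values only. */
enum { DEMO_TYPE_HARDWARE = 0, DEMO_TYPE_SOFTWARE = 1 };
enum { DEMO_SW_DUMMY = 9 };
#define DEMO_SAMPLE_BRANCH_STACK (1u << 11)

struct demo_attr {
	uint32_t type;
	uint64_t config;
	uint64_t sample_type;
};

static bool demo_is_dummy(const struct demo_attr *attr)
{
	return attr->type == DEMO_TYPE_SOFTWARE &&
	       attr->config == DEMO_SW_DUMMY;
}

/* Drop branch-stack sampling only for the implicit dummy event. */
void demo_config(struct demo_attr *attr, unsigned int initial_delay)
{
	if (initial_delay && demo_is_dummy(attr))
		attr->sample_type &= ~(uint64_t)DEMO_SAMPLE_BRANCH_STACK;
}
```

Real sampling events keep their branch-stack bit, so the LBR request
still applies to them once the delay expires.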

After applying the patch:

  # perf record -D 1000 -a -e cycles -j any -- sleep 2
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 1.054 MB perf.data (828 samples) ]
  #

Reported-by: Sunil K Pandey <sunil.k.pandey@intel.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1531145722-16404-1-git-send-email-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 tools/perf/util/evsel.c |   14 ++++++++++++++
 1 file changed, 14 insertions(+)

--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -824,6 +824,12 @@ static void apply_config_terms(struct pe
 	}
 }
 
+static bool is_dummy_event(struct perf_evsel *evsel)
+{
+	return (evsel->attr.type == PERF_TYPE_SOFTWARE) &&
+	       (evsel->attr.config == PERF_COUNT_SW_DUMMY);
+}
+
 /*
  * The enable_on_exec/disabled value strategy:
  *
@@ -1054,6 +1060,14 @@ void perf_evsel__config(struct perf_evse
 		else
 			perf_evsel__reset_sample_bit(evsel, PERIOD);
 	}
+
+	/*
+	 * For initial_delay, a dummy event is added implicitly.
+	 * The software event will trigger -EOPNOTSUPP error out,
+	 * if BRANCH_STACK bit is set.
+	 */
+	if (opts->initial_delay && is_dummy_event(evsel))
+		perf_evsel__reset_sample_bit(evsel, BRANCH_STACK);
 }
 
 static int perf_evsel__alloc_fd(struct perf_evsel *evsel, int ncpus, int nthreads)



^ permalink raw reply	[flat|nested] 134+ messages in thread

* [PATCH 4.14 050/126] macintosh/via-pmu: Add missing mmio accessors
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (48 preceding siblings ...)
  2018-09-17 22:41 ` [PATCH 4.14 049/126] perf evlist: Fix error out while applying initial delay and LBR Greg Kroah-Hartman
@ 2018-09-17 22:41 ` Greg Kroah-Hartman
  2018-09-17 22:41 ` [PATCH 4.14 051/126] ath9k: report tx status on EOSP Greg Kroah-Hartman
                   ` (78 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Stan Johnson, Finn Thain,
	Geert Uytterhoeven, Michael Ellerman, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Finn Thain <fthain@telegraphics.com.au>

[ Upstream commit 576d5290d678a651b9f36050fc1717e0573aca13 ]

Add missing in_8() accessors to init_pmu() and pmu_sr_intr().

This fixes several sparse warnings:
drivers/macintosh/via-pmu.c:536:29: warning: dereference of noderef expression
drivers/macintosh/via-pmu.c:537:33: warning: dereference of noderef expression
drivers/macintosh/via-pmu.c:1455:17: warning: dereference of noderef expression
drivers/macintosh/via-pmu.c:1456:69: warning: dereference of noderef expression

Tested-by: Stan Johnson <userm57@yahoo.com>
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/macintosh/via-pmu.c |    9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

--- a/drivers/macintosh/via-pmu.c
+++ b/drivers/macintosh/via-pmu.c
@@ -532,8 +532,9 @@ init_pmu(void)
 	int timeout;
 	struct adb_request req;
 
-	out_8(&via[B], via[B] | TREQ);			/* negate TREQ */
-	out_8(&via[DIRB], (via[DIRB] | TREQ) & ~TACK);	/* TACK in, TREQ out */
+	/* Negate TREQ. Set TACK to input and TREQ to output. */
+	out_8(&via[B], in_8(&via[B]) | TREQ);
+	out_8(&via[DIRB], (in_8(&via[DIRB]) | TREQ) & ~TACK);
 
 	pmu_request(&req, NULL, 2, PMU_SET_INTR_MASK, pmu_intr_mask);
 	timeout =  100000;
@@ -1455,8 +1456,8 @@ pmu_sr_intr(void)
 	struct adb_request *req;
 	int bite = 0;
 
-	if (via[B] & TREQ) {
-		printk(KERN_ERR "PMU: spurious SR intr (%x)\n", via[B]);
+	if (in_8(&via[B]) & TREQ) {
+		printk(KERN_ERR "PMU: spurious SR intr (%x)\n", in_8(&via[B]));
 		out_8(&via[IFR], SR_INT);
 		return NULL;
 	}



^ permalink raw reply	[flat|nested] 134+ messages in thread

* [PATCH 4.14 051/126] ath9k: report tx status on EOSP
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (49 preceding siblings ...)
  2018-09-17 22:41 ` [PATCH 4.14 050/126] macintosh/via-pmu: Add missing mmio accessors Greg Kroah-Hartman
@ 2018-09-17 22:41 ` Greg Kroah-Hartman
  2018-09-17 22:41 ` [PATCH 4.14 052/126] ath9k_hw: fix channel maximum power level test Greg Kroah-Hartman
                   ` (77 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Felix Fietkau, Kalle Valo, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Felix Fietkau <nbd@nbd.name>

[ Upstream commit 36e14a787dd0b459760de3622e9709edb745a6af ]

Fixes missed indications of the end of a U-APSD service period to mac80211.

Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/net/wireless/ath/ath9k/xmit.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- a/drivers/net/wireless/ath/ath9k/xmit.c
+++ b/drivers/net/wireless/ath/ath9k/xmit.c
@@ -86,7 +86,8 @@ static void ath_tx_status(struct ieee802
 	struct ieee80211_tx_info *info = IEEE80211_SKB_CB(skb);
 	struct ieee80211_sta *sta = info->status.status_driver_data[0];
 
-	if (info->flags & IEEE80211_TX_CTL_REQ_TX_STATUS) {
+	if (info->flags & (IEEE80211_TX_CTL_REQ_TX_STATUS |
+			   IEEE80211_TX_STATUS_EOSP)) {
 		ieee80211_tx_status(hw, skb);
 		return;
 	}



^ permalink raw reply	[flat|nested] 134+ messages in thread

* [PATCH 4.14 052/126] ath9k_hw: fix channel maximum power level test
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (50 preceding siblings ...)
  2018-09-17 22:41 ` [PATCH 4.14 051/126] ath9k: report tx status on EOSP Greg Kroah-Hartman
@ 2018-09-17 22:41 ` Greg Kroah-Hartman
  2018-09-17 22:41 ` [PATCH 4.14 053/126] ath10k: prevent active scans on potential unusable channels Greg Kroah-Hartman
                   ` (76 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Felix Fietkau, Kalle Valo, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Felix Fietkau <nbd@nbd.name>

[ Upstream commit 461d8a6bb9879b0e619752d040292e67aa06f1d2 ]

The tx power applied by set_txpower is limited by the CTL (conformance
test limit) entries in the EEPROM. These can change based on the user
configured regulatory domain.
Depending on the EEPROM data this can cause the tx power to become too
limited, if the original regdomain CTLs impose lower limits than the CTLs
of the user configured regdomain.

To fix this issue, set the initial channel limits without any CTL
restrictions and only apply the CTL at run time when setting the channel
and the real tx power.

Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/net/wireless/ath/ath9k/hw.c |    7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

--- a/drivers/net/wireless/ath/ath9k/hw.c
+++ b/drivers/net/wireless/ath/ath9k/hw.c
@@ -2915,16 +2915,19 @@ void ath9k_hw_apply_txpower(struct ath_h
 	struct ath_regulatory *reg = ath9k_hw_regulatory(ah);
 	struct ieee80211_channel *channel;
 	int chan_pwr, new_pwr;
+	u16 ctl = NO_CTL;
 
 	if (!chan)
 		return;
 
+	if (!test)
+		ctl = ath9k_regd_get_ctl(reg, chan);
+
 	channel = chan->chan;
 	chan_pwr = min_t(int, channel->max_power * 2, MAX_RATE_POWER);
 	new_pwr = min_t(int, chan_pwr, reg->power_limit);
 
-	ah->eep_ops->set_txpower(ah, chan,
-				 ath9k_regd_get_ctl(reg, chan),
+	ah->eep_ops->set_txpower(ah, chan, ctl,
 				 get_antenna_gain(ah, chan), new_pwr, test);
 }
 



^ permalink raw reply	[flat|nested] 134+ messages in thread

* [PATCH 4.14 053/126] ath10k: prevent active scans on potential unusable channels
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (51 preceding siblings ...)
  2018-09-17 22:41 ` [PATCH 4.14 052/126] ath9k_hw: fix channel maximum power level test Greg Kroah-Hartman
@ 2018-09-17 22:41 ` Greg Kroah-Hartman
  2018-09-17 22:41 ` [PATCH 4.14 054/126] wlcore: Set rx_status boottime_ns field on rx Greg Kroah-Hartman
                   ` (75 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Sven Eckelmann, Kalle Valo, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Sven Eckelmann <sven.eckelmann@openmesh.com>

[ Upstream commit 3f259111583801013cb605bb4414aa529adccf1c ]

The QCA4019 hw1.0 firmware versions 10.4-3.2.1-00050 and 10.4-3.5.3-00053
(and most likely all others) seem to ignore the WMI_CHAN_FLAG_DFS flag
during the scan. This results in transmissions (probe requests) on
channels which are not "available" for transmission.

Since the firmware is closed source and nothing can be done from our side
to fix the problem in it, the driver has to work around this problem. The
WMI_CHAN_FLAG_PASSIVE seems to be interpreted by the firmware to not
scan actively on a channel unless an AP was detected on it. Simple probe
requests will then be transmitted by the STA on the channel.

ath10k must therefore also use this flag when it queues a radar channel for
scanning. This should reduce the chance of an active scan when the channel
might be "unusable" for transmissions.
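
The workaround amounts to folding the radar flag into the passive flag
before the channel list is handed to the firmware. This is a minimal
sketch with a hypothetical struct demo_chan and demo_update_channel,
not the ath10k channel structures.

```c
#include <stdbool.h>

struct demo_chan {
	bool no_ir;   /* regulatory "no initiating radiation" */
	bool radar;   /* channel requires radar detection (DFS) */
	bool passive; /* what the firmware is told */
};

void demo_update_channel(struct demo_chan *ch)
{
	ch->passive = ch->no_ir;

	/* Treat every radar channel as passive, whatever no_ir says. */
	ch->passive |= ch->radar;
}
```

A channel that is neither no-IR nor radar stays active, so ordinary
channels are scanned as before.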

Fixes: e8a50f8ba44b ("ath10k: introduce DFS implementation")
Signed-off-by: Sven Eckelmann <sven.eckelmann@openmesh.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/net/wireless/ath/ath10k/mac.c |    7 +++++++
 1 file changed, 7 insertions(+)

--- a/drivers/net/wireless/ath/ath10k/mac.c
+++ b/drivers/net/wireless/ath/ath10k/mac.c
@@ -3074,6 +3074,13 @@ static int ath10k_update_channel_list(st
 			passive = channel->flags & IEEE80211_CHAN_NO_IR;
 			ch->passive = passive;
 
+			/* the firmware is ignoring the "radar" flag of the
+			 * channel and is scanning actively using Probe Requests
+			 * on "Radar detection"/DFS channels which are not
+			 * marked as "available"
+			 */
+			ch->passive |= ch->chan_radar;
+
 			ch->freq = channel->center_freq;
 			ch->band_center_freq1 = channel->center_freq;
 			ch->min_power = 0;



^ permalink raw reply	[flat|nested] 134+ messages in thread

* [PATCH 4.14 054/126] wlcore: Set rx_status boottime_ns field on rx
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (52 preceding siblings ...)
  2018-09-17 22:41 ` [PATCH 4.14 053/126] ath10k: prevent active scans on potential unusable channels Greg Kroah-Hartman
@ 2018-09-17 22:41 ` Greg Kroah-Hartman
  2018-09-17 22:41 ` [PATCH 4.14 055/126] rpmsg: core: add support to power domains for devices Greg Kroah-Hartman
                   ` (74 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Loic Poulain, Kalle Valo, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Loic Poulain <loic.poulain@linaro.org>

[ Upstream commit 37a634f60fd6dfbda2c312657eec7ef0750546e7 ]

When receiving a beacon or probe response, we should update the
boottime_ns field which is the timestamp the frame was received at.
(cf mac80211.h)

This fixes a scanning issue with Android since it relies on this
timestamp to determine when the AP has been seen for the last time
(via the nl80211 BSS_LAST_SEEN_BOOTTIME parameter).
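
The rule "stamp only beacons and probe responses" can be sketched as
follows. The names are hypothetical (struct demo_rx_status,
demo_rx_status_fill), and the timestamp is passed in by the caller
rather than read from a kernel clock, so this is a model of the control
flow only.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

struct demo_rx_status {
	uint64_t boottime_ns;
	int band;
};

void demo_rx_status_fill(struct demo_rx_status *status, bool beacon,
			 bool probe_rsp, uint64_t now_ns)
{
	/* Start from a clean status, as the driver does with memset(). */
	memset(status, 0, sizeof(*status));

	/* Only frames that advertise a BSS carry the receive time. */
	if (beacon || probe_rsp)
		status->boottime_ns = now_ns;
}
```

Other frames leave the field at zero in this model, so a consumer can
tell that no receive time was recorded for them.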

Signed-off-by: Loic Poulain <loic.poulain@linaro.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/net/wireless/ti/wlcore/rx.c |    8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

--- a/drivers/net/wireless/ti/wlcore/rx.c
+++ b/drivers/net/wireless/ti/wlcore/rx.c
@@ -59,7 +59,7 @@ static u32 wlcore_rx_get_align_buf_size(
 static void wl1271_rx_status(struct wl1271 *wl,
 			     struct wl1271_rx_descriptor *desc,
 			     struct ieee80211_rx_status *status,
-			     u8 beacon)
+			     u8 beacon, u8 probe_rsp)
 {
 	memset(status, 0, sizeof(struct ieee80211_rx_status));
 
@@ -106,6 +106,9 @@ static void wl1271_rx_status(struct wl12
 		}
 	}
 
+	if (beacon || probe_rsp)
+		status->boottime_ns = ktime_get_boot_ns();
+
 	if (beacon)
 		wlcore_set_pending_regdomain_ch(wl, (u16)desc->channel,
 						status->band);
@@ -191,7 +194,8 @@ static int wl1271_rx_handle_data(struct
 	if (ieee80211_is_data_present(hdr->frame_control))
 		is_data = 1;
 
-	wl1271_rx_status(wl, desc, IEEE80211_SKB_RXCB(skb), beacon);
+	wl1271_rx_status(wl, desc, IEEE80211_SKB_RXCB(skb), beacon,
+			 ieee80211_is_probe_resp(hdr->frame_control));
 	wlcore_hw_set_rx_csum(wl, desc, skb);
 
 	seq_num = (le16_to_cpu(hdr->seq_ctrl) & IEEE80211_SCTL_SEQ) >> 4;



^ permalink raw reply	[flat|nested] 134+ messages in thread

* [PATCH 4.14 055/126] rpmsg: core: add support to power domains for devices
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (53 preceding siblings ...)
  2018-09-17 22:41 ` [PATCH 4.14 054/126] wlcore: Set rx_status boottime_ns field on rx Greg Kroah-Hartman
@ 2018-09-17 22:41 ` Greg Kroah-Hartman
  2018-09-17 22:41 ` [PATCH 4.14 056/126] MIPS: Fix ISA virt/bus conversion for non-zero PHYS_OFFSET Greg Kroah-Hartman
                   ` (73 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Srinivas Kandagatla, Bjorn Andersson,
	Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>

[ Upstream commit fe782affd0f440a4e60e2cc81b8f2eccb2923113 ]

Some rpmsg devices need to switch on power domains to communicate
with the remote processor. For example, on the Qualcomm DB820c platform
the LPASS power domain needs to be switched on for any kind of audio
service. This patch adds the missing power domain support to the rpmsg
core.

Without this patch, attempting to play audio via QDSP on DB820c would
reboot the system.

Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/rpmsg/rpmsg_core.c |    7 +++++++
 1 file changed, 7 insertions(+)

--- a/drivers/rpmsg/rpmsg_core.c
+++ b/drivers/rpmsg/rpmsg_core.c
@@ -23,6 +23,7 @@
 #include <linux/module.h>
 #include <linux/rpmsg.h>
 #include <linux/of_device.h>
+#include <linux/pm_domain.h>
 #include <linux/slab.h>
 
 #include "rpmsg_internal.h"
@@ -418,6 +419,10 @@ static int rpmsg_dev_probe(struct device
 	struct rpmsg_endpoint *ept = NULL;
 	int err;
 
+	err = dev_pm_domain_attach(dev, true);
+	if (err)
+		goto out;
+
 	if (rpdrv->callback) {
 		strncpy(chinfo.name, rpdev->id.name, RPMSG_NAME_SIZE);
 		chinfo.src = rpdev->src;
@@ -459,6 +464,8 @@ static int rpmsg_dev_remove(struct devic
 
 	rpdrv->remove(rpdev);
 
+	dev_pm_domain_detach(dev, true);
+
 	if (rpdev->ept)
 		rpmsg_destroy_ept(rpdev->ept);
 




* [PATCH 4.14 056/126] MIPS: Fix ISA virt/bus conversion for non-zero PHYS_OFFSET
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (54 preceding siblings ...)
  2018-09-17 22:41 ` [PATCH 4.14 055/126] rpmsg: core: add support to power domains for devices Greg Kroah-Hartman
@ 2018-09-17 22:41 ` Greg Kroah-Hartman
  2018-09-17 22:41 ` [PATCH 4.14 057/126] ata: libahci: Allow reconfigure of DEVSLP register Greg Kroah-Hartman
                   ` (72 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Paul Burton, James Hogan,
	Ralf Baechle, linux-mips, Vladimir Kondratiev, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Paul Burton <paul.burton@mips.com>

[ Upstream commit 0494d7ffdcebc6935410ea0719b24ab626675351 ]

isa_virt_to_bus() & isa_bus_to_virt() claim to treat ISA bus addresses
as being identical to physical addresses, but they fail to do so in the
presence of a non-zero PHYS_OFFSET.

Correct this by having them use virt_to_phys() & phys_to_virt(), which
consolidates the calculations to one place & ensures that ISA bus
addresses do indeed match physical addresses.

Signed-off-by: Paul Burton <paul.burton@mips.com>
Patchwork: https://patchwork.linux-mips.org/patch/20047/
Cc: James Hogan <jhogan@kernel.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: Vladimir Kondratiev <vladimir.kondratiev@intel.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/mips/include/asm/io.h |    8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)
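
A worked example of the arithmetic may help. The sketch below assumes
that virt_to_phys() computes "address - PAGE_OFFSET + PHYS_OFFSET", which
is my reading of the MIPS definition; the two offset values are made up
for illustration and all sketch_* / old_* names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical memory layout, chosen only for illustration. */
#define SKETCH_PAGE_OFFSET UINT64_C(0x80000000)
#define SKETCH_PHYS_OFFSET UINT64_C(0x20000000)

/* Old helper: subtracts PAGE_OFFSET and ignores PHYS_OFFSET. */
static uint64_t old_isa_virt_to_bus(uint64_t vaddr)
{
	return vaddr - SKETCH_PAGE_OFFSET;
}

/* Assumed virt_to_phys() arithmetic, including PHYS_OFFSET. */
static uint64_t sketch_virt_to_phys(uint64_t vaddr)
{
	return vaddr - SKETCH_PAGE_OFFSET + SKETCH_PHYS_OFFSET;
}

/* Inverse of the above. */
static uint64_t sketch_phys_to_virt(uint64_t paddr)
{
	return paddr + SKETCH_PAGE_OFFSET - SKETCH_PHYS_OFFSET;
}
```

With these values, the virtual address 0x80001000 maps to physical
0x20001000, while the old helper would report bus address 0x1000. The
two agree only when PHYS_OFFSET is zero.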

--- a/arch/mips/include/asm/io.h
+++ b/arch/mips/include/asm/io.h
@@ -141,14 +141,14 @@ static inline void * phys_to_virt(unsign
 /*
  * ISA I/O bus memory addresses are 1:1 with the physical address.
  */
-static inline unsigned long isa_virt_to_bus(volatile void * address)
+static inline unsigned long isa_virt_to_bus(volatile void *address)
 {
-	return (unsigned long)address - PAGE_OFFSET;
+	return virt_to_phys(address);
 }
 
-static inline void * isa_bus_to_virt(unsigned long address)
+static inline void *isa_bus_to_virt(unsigned long address)
 {
-	return (void *)(address + PAGE_OFFSET);
+	return phys_to_virt(address);
 }
 
 #define isa_page_to_bus page_to_phys




* [PATCH 4.14 057/126] ata: libahci: Allow reconfigure of DEVSLP register
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (55 preceding siblings ...)
  2018-09-17 22:41 ` [PATCH 4.14 056/126] MIPS: Fix ISA virt/bus conversion for non-zero PHYS_OFFSET Greg Kroah-Hartman
@ 2018-09-17 22:41 ` Greg Kroah-Hartman
  2018-09-17 22:41 ` [PATCH 4.14 058/126] ata: libahci: Correct setting " Greg Kroah-Hartman
                   ` (71 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Srinivas Pandruvada, Hans de Goede,
	Tejun Heo, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>

[ Upstream commit 11c291461b6ea8d1195a96d6bba6673a94aacebc ]

DEVSLP can be entered in two modes: OS initiated or hardware
autonomous.

In hardware autonomous mode, the BIOS configures the AHCI controller and
the device to enable DEVSLP, but its settings may not be ideal for all
cases. In that case the OS should be able to reconfigure the DEVSLP
register.

Currently, if DEVSLP is already enabled, the function simply returns, so
the register cannot be set again. On some systems the firmware sets a
high DITO by default, and it cannot be corrected here. With a default of
several seconds, the device is not able to transition to DEVSLP.

This change allows reconfiguration of the DEVSLP register if DITO is
different.

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/ata/libahci.c |   18 ++++++++++--------
 1 file changed, 10 insertions(+), 8 deletions(-)
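
The new early-return test can be sketched as a standalone predicate. The
bit positions below are what I recall from ahci.h and should be treated
as assumptions; the SK_* macros and both function names are hypothetical
and exist only for this illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed PxDEVSLP layout (from memory of ahci.h, not verified here). */
#define SK_DEVSLP_ADSE        (1u << 0)	/* aggressive DevSlp enable */
#define SK_DEVSLP_DITO_OFFSET 15
#define SK_DEVSLP_DM_OFFSET   25
#define SK_DEVSLP_DM_MASK     (0xfu << SK_DEVSLP_DM_OFFSET)

/* Target DITO: idle timeout divided by (multiplier + 1), capped at 10 bits. */
static uint32_t devslp_target_dito(uint32_t devslp, uint32_t idle_timeout)
{
	uint32_t dm = (devslp & SK_DEVSLP_DM_MASK) >> SK_DEVSLP_DM_OFFSET;
	uint32_t dito = idle_timeout / (dm + 1);

	return dito > 0x3ff ? 0x3ff : dito;
}

/* True unless DevSlp is already enabled with the DITO we would program. */
static bool devslp_needs_update(uint32_t devslp, uint32_t idle_timeout)
{
	uint32_t dito_conf = (devslp >> SK_DEVSLP_DITO_OFFSET) & 0x3ff;
	bool enabled = (devslp & SK_DEVSLP_ADSE) != 0;

	return !(enabled && dito_conf == devslp_target_dito(devslp, idle_timeout));
}
```

A register that firmware left enabled with DITO at its maximum therefore
still qualifies for an update when the requested timeout is lower.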

--- a/drivers/ata/libahci.c
+++ b/drivers/ata/libahci.c
@@ -2096,7 +2096,7 @@ static void ahci_set_aggressive_devslp(s
 	struct ahci_host_priv *hpriv = ap->host->private_data;
 	void __iomem *port_mmio = ahci_port_base(ap);
 	struct ata_device *dev = ap->link.device;
-	u32 devslp, dm, dito, mdat, deto;
+	u32 devslp, dm, dito, mdat, deto, dito_conf;
 	int rc;
 	unsigned int err_mask;
 
@@ -2120,8 +2120,15 @@ static void ahci_set_aggressive_devslp(s
 		return;
 	}
 
-	/* device sleep was already enabled */
-	if (devslp & PORT_DEVSLP_ADSE)
+	dm = (devslp & PORT_DEVSLP_DM_MASK) >> PORT_DEVSLP_DM_OFFSET;
+	dito = devslp_idle_timeout / (dm + 1);
+	if (dito > 0x3ff)
+		dito = 0x3ff;
+
+	dito_conf = (devslp >> PORT_DEVSLP_DITO_OFFSET) & 0x3FF;
+
+	/* device sleep was already enabled and same dito */
+	if ((devslp & PORT_DEVSLP_ADSE) && (dito_conf == dito))
 		return;
 
 	/* set DITO, MDAT, DETO and enable DevSlp, need to stop engine first */
@@ -2129,11 +2136,6 @@ static void ahci_set_aggressive_devslp(s
 	if (rc)
 		return;
 
-	dm = (devslp & PORT_DEVSLP_DM_MASK) >> PORT_DEVSLP_DM_OFFSET;
-	dito = devslp_idle_timeout / (dm + 1);
-	if (dito > 0x3ff)
-		dito = 0x3ff;
-
 	/* Use the nominal value 10 ms if the read MDAT is zero,
 	 * the nominal value of DETO is 20 ms.
 	 */




* [PATCH 4.14 058/126] ata: libahci: Correct setting of DEVSLP register
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (56 preceding siblings ...)
  2018-09-17 22:41 ` [PATCH 4.14 057/126] ata: libahci: Allow reconfigure of DEVSLP register Greg Kroah-Hartman
@ 2018-09-17 22:41 ` Greg Kroah-Hartman
  2018-09-17 22:41 ` [PATCH 4.14 059/126] scsi: 3ware: fix return 0 on the error path of probe Greg Kroah-Hartman
                   ` (70 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Srinivas Pandruvada,
	Rafael J. Wysocki, Hans de Goede, Tejun Heo, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>

[ Upstream commit 2dbb3ec29a6c069035857a2fc4c24e80e5dfe3cc ]

We have seen that on some platforms, SATA devices never show any DEVSLP
residency. This prevents power gating of the SATA IP, which prevents the
system from transitioning to low power mode on systems with SLP_S0, aka
modern standby systems. The PHY logic is off only in DEVSLP, not in
slumber.
Reference:
https://www.intel.com/content/dam/www/public/us/en/documents/datasheets/332995-skylake-i-o-platform-datasheet-volume-1.pdf
Section 28.7.6.1

Here the driver does a read-modify-write of the devslp register, but it
does not reset the bits whose values it modifies (DITO, MDAT and DETO).
So simply reset those bits before updating them to the new values.

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/ata/libahci.c |    2 ++
 1 file changed, 2 insertions(+)
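
To see why the clear step matters, here is a small userspace sketch of
the read-modify-write. The field offsets are assumed from ahci.h, the
mask is a hand-rolled equivalent of GENMASK(24, 2), and the SK_* names
and devslp_compose() are hypothetical.

```c
#include <stdint.h>

/* Assumed field offsets within the DEVSLP register. */
#define SK_DITO_OFFSET 15
#define SK_MDAT_OFFSET 10
#define SK_DETO_OFFSET 2
#define SK_ADSE        (1u << 0)

/* Bits 24..2 inclusive, i.e. the DITO, MDAT and DETO fields. */
#define SK_TIMING_MASK (((1u << 25) - 1u) & ~((1u << 2) - 1u))

/* Clear the timing fields first, then OR in the new values. */
static uint32_t devslp_compose(uint32_t devslp, uint32_t dito,
			       uint32_t mdat, uint32_t deto)
{
	devslp &= ~SK_TIMING_MASK;
	devslp |= (dito << SK_DITO_OFFSET) |
		  (mdat << SK_MDAT_OFFSET) |
		  (deto << SK_DETO_OFFSET) |
		  SK_ADSE;
	return devslp;
}
```

If the old register value had every timing bit set, OR-ing new values
without the clear would leave all of those bits set, so the requested
DITO, MDAT and DETO would never take effect.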

--- a/drivers/ata/libahci.c
+++ b/drivers/ata/libahci.c
@@ -2153,6 +2153,8 @@ static void ahci_set_aggressive_devslp(s
 		deto = 20;
 	}
 
+	/* Make dito, mdat, deto bits to 0s */
+	devslp &= ~GENMASK_ULL(24, 2);
 	devslp |= ((dito << PORT_DEVSLP_DITO_OFFSET) |
 		   (mdat << PORT_DEVSLP_MDAT_OFFSET) |
 		   (deto << PORT_DEVSLP_DETO_OFFSET) |




* [PATCH 4.14 059/126] scsi: 3ware: fix return 0 on the error path of probe
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (57 preceding siblings ...)
  2018-09-17 22:41 ` [PATCH 4.14 058/126] ata: libahci: Correct setting " Greg Kroah-Hartman
@ 2018-09-17 22:41 ` Greg Kroah-Hartman
  2018-09-17 22:41 ` [PATCH 4.14 060/126] tools/testing/nvdimm: kaddr and pfn can be NULL to ->direct_access() Greg Kroah-Hartman
                   ` (69 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Anton Vasilyev, Adam Radford,
	Martin K. Petersen, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Anton Vasilyev <vasilyev@ispras.ru>

[ Upstream commit 4dc98c1995482262e70e83ef029135247fafe0f2 ]

tw_probe() returns 0 if tw_initialize_device_extension(),
pci_resource_start() or tw_reset_sequence() fails, and releases resources.
twl_probe() returns 0 if twl_initialize_device_extension(), pci_iomap()
or twl_reset_sequence() fails.  twa_probe() returns 0 if
twa_initialize_device_extension(), ioremap() or twa_reset_sequence()
fails.

The patch adds retval initialization for these cases.

Found by Linux Driver Verification project (linuxtesting.org).

Signed-off-by: Anton Vasilyev <vasilyev@ispras.ru>
Acked-by: Adam Radford <aradford@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/scsi/3w-9xxx.c |    6 +++++-
 drivers/scsi/3w-sas.c  |    3 +++
 drivers/scsi/3w-xxxx.c |    2 ++
 3 files changed, 10 insertions(+), 1 deletion(-)
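
The bug class is easy to reproduce outside the kernel. In the sketch
below, sketch_probe() and its fail_init parameter are hypothetical; the
point is only that a goto-based error path returns whatever retval held
last, so each failure branch has to set it.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical probe; fail_init simulates a failing setup step. */
static int sketch_probe(int fail_init)
{
	int retval;
	char *ext = malloc(16);

	if (!ext)
		return -ENOMEM;

	/* An earlier step succeeded and left retval at 0. */
	retval = 0;

	if (fail_init) {
		/* Without this assignment the error path would return 0. */
		retval = -ENOMEM;
		goto out_free;
	}

	/* A real probe would keep the resource; the sketch just frees it. */
	free(ext);
	return 0;

out_free:
	free(ext);
	return retval;
}
```

Returning 0 from a failed probe makes the caller treat the device as
bound even though its resources have already been released.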

--- a/drivers/scsi/3w-9xxx.c
+++ b/drivers/scsi/3w-9xxx.c
@@ -2042,6 +2042,7 @@ static int twa_probe(struct pci_dev *pde
 
 	if (twa_initialize_device_extension(tw_dev)) {
 		TW_PRINTK(tw_dev->host, TW_DRIVER, 0x25, "Failed to initialize device extension");
+		retval = -ENOMEM;
 		goto out_free_device_extension;
 	}
 
@@ -2064,6 +2065,7 @@ static int twa_probe(struct pci_dev *pde
 	tw_dev->base_addr = ioremap(mem_addr, mem_len);
 	if (!tw_dev->base_addr) {
 		TW_PRINTK(tw_dev->host, TW_DRIVER, 0x35, "Failed to ioremap");
+		retval = -ENOMEM;
 		goto out_release_mem_region;
 	}
 
@@ -2071,8 +2073,10 @@ static int twa_probe(struct pci_dev *pde
 	TW_DISABLE_INTERRUPTS(tw_dev);
 
 	/* Initialize the card */
-	if (twa_reset_sequence(tw_dev, 0))
+	if (twa_reset_sequence(tw_dev, 0)) {
+		retval = -ENOMEM;
 		goto out_iounmap;
+	}
 
 	/* Set host specific parameters */
 	if ((pdev->device == PCI_DEVICE_ID_3WARE_9650SE) ||
--- a/drivers/scsi/3w-sas.c
+++ b/drivers/scsi/3w-sas.c
@@ -1597,6 +1597,7 @@ static int twl_probe(struct pci_dev *pde
 
 	if (twl_initialize_device_extension(tw_dev)) {
 		TW_PRINTK(tw_dev->host, TW_DRIVER, 0x1a, "Failed to initialize device extension");
+		retval = -ENOMEM;
 		goto out_free_device_extension;
 	}
 
@@ -1611,6 +1612,7 @@ static int twl_probe(struct pci_dev *pde
 	tw_dev->base_addr = pci_iomap(pdev, 1, 0);
 	if (!tw_dev->base_addr) {
 		TW_PRINTK(tw_dev->host, TW_DRIVER, 0x1c, "Failed to ioremap");
+		retval = -ENOMEM;
 		goto out_release_mem_region;
 	}
 
@@ -1620,6 +1622,7 @@ static int twl_probe(struct pci_dev *pde
 	/* Initialize the card */
 	if (twl_reset_sequence(tw_dev, 0)) {
 		TW_PRINTK(tw_dev->host, TW_DRIVER, 0x1d, "Controller reset failed during probe");
+		retval = -ENOMEM;
 		goto out_iounmap;
 	}
 
--- a/drivers/scsi/3w-xxxx.c
+++ b/drivers/scsi/3w-xxxx.c
@@ -2280,6 +2280,7 @@ static int tw_probe(struct pci_dev *pdev
 
 	if (tw_initialize_device_extension(tw_dev)) {
 		printk(KERN_WARNING "3w-xxxx: Failed to initialize device extension.");
+		retval = -ENOMEM;
 		goto out_free_device_extension;
 	}
 
@@ -2294,6 +2295,7 @@ static int tw_probe(struct pci_dev *pdev
 	tw_dev->base_addr = pci_resource_start(pdev, 0);
 	if (!tw_dev->base_addr) {
 		printk(KERN_WARNING "3w-xxxx: Failed to get io address.");
+		retval = -ENOMEM;
 		goto out_release_mem_region;
 	}
 




* [PATCH 4.14 060/126] tools/testing/nvdimm: kaddr and pfn can be NULL to ->direct_access()
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (58 preceding siblings ...)
  2018-09-17 22:41 ` [PATCH 4.14 059/126] scsi: 3ware: fix return 0 on the error path of probe Greg Kroah-Hartman
@ 2018-09-17 22:41 ` Greg Kroah-Hartman
  2018-09-17 22:41 ` [PATCH 4.14 061/126] ath10k: disable bundle mgmt tx completion event support Greg Kroah-Hartman
                   ` (68 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Dan Williams, Huaisheng Ye,
	Ross Zwisler, Dave Jiang, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Huaisheng Ye <yehs1@lenovo.com>

[ Upstream commit 45df5d3dc0c7289c1e67afe6d2ba806ad5174314 ]

The mock / test version of pmem_direct_access() needs to check whether
the pointers kaddr and pfn are NULL. If either one is NULL, the
corresponding value does not need to be calculated.

A NULL pointer means the caller has no need for kaddr or pfn, so this
patch allows callers to pass in NULL instead of having to pass in a
local pointer or variable that they then just throw away.

Suggested-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Huaisheng Ye <yehs1@lenovo.com>
Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 tools/testing/nvdimm/pmem-dax.c |   12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)
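
The optional-output convention can be sketched as follows. The function
sketch_direct_access(), its parameters and the 4096-byte page size are
all hypothetical and only illustrate the NULL checks; they do not model
the real pmem code.

```c
#include <stddef.h>

/*
 * Hypothetical lookup with two optional outputs. A caller that does not
 * need one of them passes NULL for it.
 */
static long sketch_direct_access(char *base, unsigned long base_pfn,
				 size_t offset, void **kaddr,
				 unsigned long *pfn)
{
	if (kaddr)
		*kaddr = base + offset;
	if (pfn)
		*pfn = base_pfn + offset / 4096;

	return 1;
}
```

A caller interested only in the return value can then write
sketch_direct_access(base, base_pfn, offset, NULL, NULL) without
declaring throwaway locals.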

--- a/tools/testing/nvdimm/pmem-dax.c
+++ b/tools/testing/nvdimm/pmem-dax.c
@@ -31,17 +31,21 @@ long __pmem_direct_access(struct pmem_de
 	if (get_nfit_res(pmem->phys_addr + offset)) {
 		struct page *page;
 
-		*kaddr = pmem->virt_addr + offset;
+		if (kaddr)
+			*kaddr = pmem->virt_addr + offset;
 		page = vmalloc_to_page(pmem->virt_addr + offset);
-		*pfn = page_to_pfn_t(page);
+		if (pfn)
+			*pfn = page_to_pfn_t(page);
 		pr_debug_ratelimited("%s: pmem: %p pgoff: %#lx pfn: %#lx\n",
 				__func__, pmem, pgoff, page_to_pfn(page));
 
 		return 1;
 	}
 
-	*kaddr = pmem->virt_addr + offset;
-	*pfn = phys_to_pfn_t(pmem->phys_addr + offset, pmem->pfn_flags);
+	if (kaddr)
+		*kaddr = pmem->virt_addr + offset;
+	if (pfn)
+		*pfn = phys_to_pfn_t(pmem->phys_addr + offset, pmem->pfn_flags);
 
 	/*
 	 * If badblocks are present, limit known good range to the




* [PATCH 4.14 061/126] ath10k: disable bundle mgmt tx completion event support
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (59 preceding siblings ...)
  2018-09-17 22:41 ` [PATCH 4.14 060/126] tools/testing/nvdimm: kaddr and pfn can be NULL to ->direct_access() Greg Kroah-Hartman
@ 2018-09-17 22:41 ` Greg Kroah-Hartman
  2018-09-17 22:41 ` [PATCH 4.14 062/126] Bluetooth: hidp: Fix handling of strncpy for hid->name information Greg Kroah-Hartman
                   ` (67 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Surabhi Vishnoi, Rakesh Pillai,
	Kalle Valo, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Surabhi Vishnoi <svishnoi@codeaurora.org>

[ Upstream commit 673bc519c55843c68c3aecff71a4101e79d28d2b ]

The tx completion of multiple mgmt frames can be bundled
in a single event and sent by the firmware to host, if this
capability is not disabled explicitly by the host. If the host
cannot handle the bundled mgmt tx completion, this capability
support needs to be disabled in the wmi init cmd, sent to the firmware.

Add the host capability indication flag in the wmi ready command,
to let firmware know the features supported by the host driver.
This field is ignored if it is not supported by firmware.

Set the host capability indication flag (i.e. host_capab) to zero to
disable support for bundled mgmt tx completion. This tells the firmware
to send a completion event for every mgmt tx completion, instead of
bundling them together and sending them in a single event.

Tested HW: WCN3990
Tested FW: WLAN.HL.2.0-01188-QCAHLSWMTPLZ-1

Signed-off-by: Surabhi Vishnoi <svishnoi@codeaurora.org>
Signed-off-by: Rakesh Pillai <pillair@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/net/wireless/ath/ath10k/wmi-tlv.c |    5 +++++
 drivers/net/wireless/ath/ath10k/wmi-tlv.h |    5 +++++
 2 files changed, 10 insertions(+)

--- a/drivers/net/wireless/ath/ath10k/wmi-tlv.c
+++ b/drivers/net/wireless/ath/ath10k/wmi-tlv.c
@@ -1451,6 +1451,11 @@ static struct sk_buff *ath10k_wmi_tlv_op
 	cfg->keep_alive_pattern_size = __cpu_to_le32(0);
 	cfg->max_tdls_concurrent_sleep_sta = __cpu_to_le32(1);
 	cfg->max_tdls_concurrent_buffer_sta = __cpu_to_le32(1);
+	cfg->wmi_send_separate = __cpu_to_le32(0);
+	cfg->num_ocb_vdevs = __cpu_to_le32(0);
+	cfg->num_ocb_channels = __cpu_to_le32(0);
+	cfg->num_ocb_schedules = __cpu_to_le32(0);
+	cfg->host_capab = __cpu_to_le32(0);
 
 	ath10k_wmi_put_host_mem_chunks(ar, chunks);
 
--- a/drivers/net/wireless/ath/ath10k/wmi-tlv.h
+++ b/drivers/net/wireless/ath/ath10k/wmi-tlv.h
@@ -1228,6 +1228,11 @@ struct wmi_tlv_resource_config {
 	__le32 keep_alive_pattern_size;
 	__le32 max_tdls_concurrent_sleep_sta;
 	__le32 max_tdls_concurrent_buffer_sta;
+	__le32 wmi_send_separate;
+	__le32 num_ocb_vdevs;
+	__le32 num_ocb_channels;
+	__le32 num_ocb_schedules;
+	__le32 host_capab;
 } __packed;
 
 struct wmi_tlv_init_cmd {




* [PATCH 4.14 062/126] Bluetooth: hidp: Fix handling of strncpy for hid->name information
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (60 preceding siblings ...)
  2018-09-17 22:41 ` [PATCH 4.14 061/126] ath10k: disable bundle mgmt tx completion event support Greg Kroah-Hartman
@ 2018-09-17 22:41 ` Greg Kroah-Hartman
  2018-09-17 22:41 ` [PATCH 4.14 063/126] x86/mm: Remove in_nmi() warning from vmalloc_fault() Greg Kroah-Hartman
                   ` (66 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Marcel Holtmann, Johan Hedberg, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Marcel Holtmann <marcel@holtmann.org>

[ Upstream commit b3cadaa485f0c20add1644a5c877b0765b285c0c ]

This fixes two issues with setting hid->name information.

  CC      net/bluetooth/hidp/core.o
In function ‘hidp_setup_hid’,
    inlined from ‘hidp_session_dev_init’ at net/bluetooth/hidp/core.c:815:9,
    inlined from ‘hidp_session_new’ at net/bluetooth/hidp/core.c:953:8,
    inlined from ‘hidp_connection_add’ at net/bluetooth/hidp/core.c:1366:8:
net/bluetooth/hidp/core.c:778:2: warning: ‘strncpy’ output may be truncated copying 127 bytes from a string of length 127 [-Wstringop-truncation]
  strncpy(hid->name, req->name, sizeof(req->name) - 1);
  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

  CC      net/bluetooth/hidp/core.o
net/bluetooth/hidp/core.c: In function ‘hidp_setup_hid’:
net/bluetooth/hidp/core.c:778:38: warning: argument to ‘sizeof’ in ‘strncpy’ call is the same expression as the source; did you mean to use the size of the destination? [-Wsizeof-pointer-memaccess]
  strncpy(hid->name, req->name, sizeof(req->name));
                                      ^

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/bluetooth/hidp/core.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
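
As a userspace illustration of what both warnings are about: the copy
length should come from the destination, and strncpy() does not add a
terminator when the source fills the buffer. The sketch below is a
deliberately more defensive variant than the one-line change in this
patch (it terminates explicitly); the struct names and the 128-byte
sizes are assumptions.

```c
#include <string.h>

/* Hypothetical stand-ins; both buffers are assumed to be 128 bytes. */
struct sketch_req { char name[128]; };
struct sketch_hid { char name[128]; };

/* Bound the copy by the destination and terminate explicitly. */
static void sketch_copy_name(struct sketch_hid *hid,
			     const struct sketch_req *req)
{
	strncpy(hid->name, req->name, sizeof(hid->name) - 1);
	hid->name[sizeof(hid->name) - 1] = '\0';
}
```

Even if req->name holds 128 non-NUL bytes, hid->name ends up as a valid
127-character string.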

--- a/net/bluetooth/hidp/core.c
+++ b/net/bluetooth/hidp/core.c
@@ -775,7 +775,7 @@ static int hidp_setup_hid(struct hidp_se
 	hid->version = req->version;
 	hid->country = req->country;
 
-	strncpy(hid->name, req->name, sizeof(req->name) - 1);
+	strncpy(hid->name, req->name, sizeof(hid->name));
 
 	snprintf(hid->phys, sizeof(hid->phys), "%pMR",
 		 &l2cap_pi(session->ctrl_sock->sk)->chan->src);




* [PATCH 4.14 063/126] x86/mm: Remove in_nmi() warning from vmalloc_fault()
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (61 preceding siblings ...)
  2018-09-17 22:41 ` [PATCH 4.14 062/126] Bluetooth: hidp: Fix handling of strncpy for hid->name information Greg Kroah-Hartman
@ 2018-09-17 22:41 ` Greg Kroah-Hartman
  2018-09-17 22:41 ` [PATCH 4.14 064/126] pinctrl: imx: off by one in imx_pinconf_group_dbg_show() Greg Kroah-Hartman
                   ` (65 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Joerg Roedel, Thomas Gleixner,
	David H. Gutteridge, H . Peter Anvin, linux-mm, Linus Torvalds,
	Andy Lutomirski, Dave Hansen, Josh Poimboeuf, Juergen Gross,
	Peter Zijlstra, Borislav Petkov, Jiri Kosina, Boris Ostrovsky,
	Brian Gerst, David Laight, Denys Vlasenko, Eduardo Valentin,
	Will Deacon, aliguori, daniel.gruss, hughd, keescook,
	Andrea Arcangeli, Waiman Long, Pavel Machek,
	Arnaldo Carvalho de Melo, Alexander Shishkin, Jiri Olsa,
	Namhyung Kim, joro, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Joerg Roedel <jroedel@suse.de>

[ Upstream commit 6863ea0cda8725072522cd78bda332d9a0b73150 ]

It is perfectly okay to take page-faults, especially on the
vmalloc area while executing an NMI handler. Remove the
warning.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: David H. Gutteridge <dhgutteridge@sympatico.ca>
Cc: "H . Peter Anvin" <hpa@zytor.com>
Cc: linux-mm@kvack.org
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: David Laight <David.Laight@aculab.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Eduardo Valentin <eduval@amazon.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: aliguori@amazon.com
Cc: daniel.gruss@iaik.tugraz.at
Cc: hughd@google.com
Cc: keescook@google.com
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Waiman Long <llong@redhat.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: joro@8bytes.org
Link: https://lkml.kernel.org/r/1532533683-5988-2-git-send-email-joro@8bytes.org
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/x86/mm/fault.c |    2 --
 1 file changed, 2 deletions(-)

--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -317,8 +317,6 @@ static noinline int vmalloc_fault(unsign
 	if (!(address >= VMALLOC_START && address < VMALLOC_END))
 		return -1;
 
-	WARN_ON_ONCE(in_nmi());
-
 	/*
 	 * Synchronize this task's top level page-table
 	 * with the 'reference' page table.




* [PATCH 4.14 064/126] pinctrl: imx: off by one in imx_pinconf_group_dbg_show()
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (62 preceding siblings ...)
  2018-09-17 22:41 ` [PATCH 4.14 063/126] x86/mm: Remove in_nmi() warning from vmalloc_fault() Greg Kroah-Hartman
@ 2018-09-17 22:41 ` Greg Kroah-Hartman
  2018-09-17 22:41 ` [PATCH 4.14 065/126] gpio: ml-ioh: Fix buffer underwrite on probe error path Greg Kroah-Hartman
                   ` (64 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Dong Aisheng, Dan Carpenter,
	Linus Walleij, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Dan Carpenter <dan.carpenter@oracle.com>

[ Upstream commit b4859f3edb47825f62d1b2efdd75fe7945996f49 ]

The > should really be >= here.  It's harmless because
pinctrl_generic_get_group() will return a NULL if group is invalid.

Fixes: ae75ff814538 ("pinctrl: pinctrl-imx: add imx pinctrl core driver")
Reported-by: Dong Aisheng <aisheng.dong@nxp.com>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/pinctrl/freescale/pinctrl-imx.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
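
For a table with num_groups entries the valid indices are
0..num_groups-1, so the guard has to reject index == num_groups as well.
A minimal sketch, with the hypothetical helper sketch_get_group()
standing in for the real lookup:

```c
#include <stddef.h>

/* Returns the group name, or NULL when the index is out of range. */
static const char *sketch_get_group(const char *const *groups,
				    unsigned int num_groups,
				    unsigned int group)
{
	/* ">" here would let group == num_groups through. */
	if (group >= num_groups)
		return NULL;

	return groups[group];
}
```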

--- a/drivers/pinctrl/freescale/pinctrl-imx.c
+++ b/drivers/pinctrl/freescale/pinctrl-imx.c
@@ -389,7 +389,7 @@ static void imx_pinconf_group_dbg_show(s
 	const char *name;
 	int i, ret;
 
-	if (group > pctldev->num_groups)
+	if (group >= pctldev->num_groups)
 		return;
 
 	seq_printf(s, "\n");




* [PATCH 4.14 065/126] gpio: ml-ioh: Fix buffer underwrite on probe error path
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (63 preceding siblings ...)
  2018-09-17 22:41 ` [PATCH 4.14 064/126] pinctrl: imx: off by one in imx_pinconf_group_dbg_show() Greg Kroah-Hartman
@ 2018-09-17 22:41 ` Greg Kroah-Hartman
  2018-09-17 22:41 ` [PATCH 4.14 066/126] pinctrl/amd: only handle irq if it is pending and unmasked Greg Kroah-Hartman
                   ` (63 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Anton Vasilyev, Linus Walleij, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Anton Vasilyev <vasilyev@ispras.ru>

[ Upstream commit 4bf4eed44bfe288f459496eaf38089502ef91a79 ]

If ioh_gpio_probe() fails on devm_irq_alloc_descs(), chip may point
to any element of the chip_save array, so the reverse iteration from
pointer chip may reach chip_save[-1] and gpiochip_remove() will operate
on the wrong memory.

The patch fixes the error path of ioh_gpio_probe() to walk the
chip_save array correctly.

Found by Linux Driver Verification project (linuxtesting.org).

Signed-off-by: Anton Vasilyev <vasilyev@ispras.ru>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/gpio/gpio-ml-ioh.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
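
The unwinding pattern can be sketched in plain C. Here struct
sketch_chip, sketch_probe_chips() and the fail_at parameter are
hypothetical; the error path restarts from the saved base pointer and
walks forward over exactly the i entries that were set up, instead of
stepping backwards from wherever the cursor happened to be.

```c
/* Hypothetical chip record; registered is 1 while the chip is live. */
struct sketch_chip {
	int registered;
};

/*
 * Register n chips; registration of index fail_at fails (-1 for none).
 * Returns 0 on success, or -1 after undoing all earlier registrations.
 */
static int sketch_probe_chips(struct sketch_chip *chips, int n, int fail_at)
{
	struct sketch_chip *chip = chips;
	int i;

	for (i = 0; i < n; i++, chip++) {
		if (i == fail_at)
			goto err_unwind;
		chip->registered = 1;
	}
	return 0;

err_unwind:
	/* Restart from the base so no access lands before chips[0]. */
	chip = chips;
	while (--i >= 0) {
		chip->registered = 0;
		chip++;
	}
	return -1;
}
```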

--- a/drivers/gpio/gpio-ml-ioh.c
+++ b/drivers/gpio/gpio-ml-ioh.c
@@ -497,9 +497,10 @@ static int ioh_gpio_probe(struct pci_dev
 	return 0;
 
 err_gpiochip_add:
+	chip = chip_save;
 	while (--i >= 0) {
-		chip--;
 		gpiochip_remove(&chip->gpio);
+		chip++;
 	}
 	kfree(chip_save);
 




* [PATCH 4.14 066/126] pinctrl/amd: only handle irq if it is pending and unmasked
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (64 preceding siblings ...)
  2018-09-17 22:41 ` [PATCH 4.14 065/126] gpio: ml-ioh: Fix buffer underwrite on probe error path Greg Kroah-Hartman
@ 2018-09-17 22:41 ` Greg Kroah-Hartman
  2018-09-17 22:41 ` [PATCH 4.14 067/126] net: mvneta: fix mtu change on port without link Greg Kroah-Hartman
                   ` (62 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Daniel Kurtz, Thomas Gleixner,
	Linus Walleij, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Daniel Kurtz <djkurtz@chromium.org>

[ Upstream commit 8bbed1eef001fdfc0ee9595f64cc4f769d265af4 ]

The AMD pinctrl driver demultiplexes GPIO interrupts and fires off their
individual handlers.

If one of these GPIO irqs is configured as a level interrupt, and its
downstream handler is a threaded ONESHOT interrupt, the GPIO interrupt
source is masked by handle_level_irq() until the eventual return of the
threaded irq handler.  During this time the level GPIO interrupt status
will still report as high until the actual gpio source is cleared - both
in the individual GPIO interrupt status bit (INTERRUPT_STS_OFF) and in
its corresponding "WAKE_INT_STATUS_REG" bit.

Thus, if another GPIO interrupt occurs during this time,
amd_gpio_irq_handler() will see that the (masked-and-not-yet-cleared)
level irq is still pending and incorrectly call its handler again.

To fix this, have amd_gpio_irq_handler() check both the interrupt status
and the mask before calling generic_handle_irq().
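
As a rough user-space illustration of the combined test, the sketch
below decides whether a pin's handler should run from a register value.
The bit positions and the names SKETCH_STS_BIT, SKETCH_MASK_BIT and
should_dispatch() are placeholders chosen for this example, not values
taken from the driver header. As in the patch, a set mask bit is taken
to mean the interrupt is unmasked.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit positions for this sketch only. */
#define SKETCH_STS_BIT	(UINT32_C(1) << 28)	/* interrupt is pending */
#define SKETCH_MASK_BIT	(UINT32_C(1) << 12)	/* set means unmasked   */

/* Dispatch only if the pin is both pending and unmasked. */
bool should_dispatch(uint32_t regval)
{
	if (!(regval & SKETCH_STS_BIT) || !(regval & SKETCH_MASK_BIT))
		return false;
	return true;
}
```

A level interrupt that is still asserted but currently masked (status
bit set, mask bit clear) is therefore skipped instead of being handled
a second time.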

Note: Is it possible that this bug was the source of the interrupt storm
on Ryzen when using chained interrupts before commit ba714a9c1dea85
("pinctrl/amd: Use regular interrupt instead of chained")?

Signed-off-by: Daniel Kurtz <djkurtz@chromium.org>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/pinctrl/pinctrl-amd.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- a/drivers/pinctrl/pinctrl-amd.c
+++ b/drivers/pinctrl/pinctrl-amd.c
@@ -530,7 +530,8 @@ static irqreturn_t amd_gpio_irq_handler(
 		/* Each status bit covers four pins */
 		for (i = 0; i < 4; i++) {
 			regval = readl(regs + i);
-			if (!(regval & PIN_IRQ_PENDING))
+			if (!(regval & PIN_IRQ_PENDING) ||
+			    !(regval & BIT(INTERRUPT_MASK_OFF)))
 				continue;
 			irq = irq_find_mapping(gc->irqdomain, irqnr + i);
 			generic_handle_irq(irq);




* [PATCH 4.14 067/126] net: mvneta: fix mtu change on port without link
From: Greg Kroah-Hartman @ 2018-09-17 22:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Yelena Krivosheev, Gregory CLEMENT,
	David S. Miller, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Yelena Krivosheev <yelena@marvell.com>

[ Upstream commit 8466baf788ec3e18836bd9c91ba0b1a07af25878 ]

It is incorrect to enable the TX/RX queues (done by mvneta_port_up())
for a port without link. Indeed, an MTU change on an interface without
link causes the TX queues to get stuck.

Fixes: c5aff18204da ("net: mvneta: driver for Marvell Armada 370/XP network unit")
Signed-off-by: Yelena Krivosheev <yelena@marvell.com>
[gregory.clement: adding Fixes tags and rewording commit log]
Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/net/ethernet/marvell/mvneta.c |    1 -
 1 file changed, 1 deletion(-)

--- a/drivers/net/ethernet/marvell/mvneta.c
+++ b/drivers/net/ethernet/marvell/mvneta.c
@@ -3195,7 +3195,6 @@ static int mvneta_change_mtu(struct net_
 
 	on_each_cpu(mvneta_percpu_enable, pp, true);
 	mvneta_start_dev(pp);
-	mvneta_port_up(pp);
 
 	netdev_update_features(dev);
 




* [PATCH 4.14 068/126] f2fs: try grabbing node page lock aggressively in sync scenario
From: Greg Kroah-Hartman @ 2018-09-17 22:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Yunlei He, Chao Yu, Jaegeuk Kim, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Chao Yu <yuchao0@huawei.com>

[ Upstream commit 4b270a8cc5047682f0a3f3f9af3b498408dbd2bc ]

In a synchronous scenario such as checkpoint(), we flush dirty node
pages to the device synchronously. Writing back a node page can easily
fail because trylock_page() fails, especially under intensive lock
contention, which can cause long checkpoint() latency. So let's use
lock_page() in the synchronous scenario to avoid this issue.

Signed-off-by: Yunlei He <heyunlei@huawei.com>
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 fs/f2fs/node.c |    4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

--- a/fs/f2fs/node.c
+++ b/fs/f2fs/node.c
@@ -1610,7 +1610,9 @@ next_step:
 						!is_cold_node(page)))
 				continue;
 lock_node:
-			if (!trylock_page(page))
+			if (wbc->sync_mode == WB_SYNC_ALL)
+				lock_page(page);
+			else if (!trylock_page(page))
 				continue;
 
 			if (unlikely(page->mapping != NODE_MAPPING(sbi))) {




* [PATCH 4.14 069/126] pktcdvd: Fix possible Spectre-v1 for pkt_devs
From: Greg Kroah-Hartman @ 2018-09-17 22:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Jinbum Park, Jens Axboe, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Jinbum Park <jinb.park7@gmail.com>

[ Upstream commit 55690c07b44a82cc3359ce0c233f4ba7d80ba145 ]

The user controls @dev_minor, which is used as an index into pkt_devs,
so it can be exploited via a Spectre-like attack (speculative execution).

This kind of attack leaks the address of pkt_devs [1], which lets an
attacker bypass security mechanisms such as KASLR.

So sanitize @dev_minor before using it to prevent the attack.
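
The idea behind index sanitizing can be modelled in portable C as a
branch-free mask. This is a sketch of the arithmetic only, assuming
size is no larger than LONG_MAX; sketch_index_mask() and
sketch_index_nospec() are hypothetical names, the formulation is not
the kernel's own implementation, and ordinary user-space code like this
gives no guarantee about what the CPU does speculatively.

```c
#include <limits.h>

/*
 * All ones when index < size, zero otherwise, computed without a
 * conditional branch. Assumes size <= LONG_MAX.
 */
unsigned long sketch_index_mask(unsigned long index, unsigned long size)
{
	/* The top bit of (index | (size - 1 - index)) is set iff index >= size. */
	unsigned long top = ~(index | (size - 1UL - index)) >>
			    (sizeof(unsigned long) * CHAR_BIT - 1);

	return 0UL - top;	/* 1 becomes all ones, 0 stays 0 */
}

/* Clamp an out-of-range index to 0 instead of letting it through. */
unsigned long sketch_index_nospec(unsigned long index, unsigned long size)
{
	return index & sketch_index_mask(index, size);
}
```

An in-range index passes through unchanged, while any index at or
beyond size collapses to 0, so even a mispredicted bounds check cannot
steer the load outside the array.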

[1] https://github.com/jinb-park/linux-exploit/tree/master/exploit-remaining-spectre-gadget/leak_pkt_devs.c

Signed-off-by: Jinbum Park <jinb.park7@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/block/pktcdvd.c |    4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

--- a/drivers/block/pktcdvd.c
+++ b/drivers/block/pktcdvd.c
@@ -67,7 +67,7 @@
 #include <scsi/scsi.h>
 #include <linux/debugfs.h>
 #include <linux/device.h>
-
+#include <linux/nospec.h>
 #include <linux/uaccess.h>
 
 #define DRIVER_NAME	"pktcdvd"
@@ -2231,6 +2231,8 @@ static struct pktcdvd_device *pkt_find_d
 {
 	if (dev_minor >= MAX_WRITERS)
 		return NULL;
+
+	dev_minor = array_index_nospec(dev_minor, MAX_WRITERS);
 	return pkt_devs[dev_minor];
 }
 




* [PATCH 4.14 070/126] f2fs: fix to skip GC if type in SSA and SIT is inconsistent
From: Greg Kroah-Hartman @ 2018-09-17 22:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Chao Yu, Jaegeuk Kim, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Chao Yu <yuchao0@huawei.com>

[ Upstream commit 10d255c3540239c7920f52d2eb223756e186af56 ]

If the segment type in SSA and SIT is inconsistent, we hit the BUG_ON
below during GC. To avoid this panic, just skip GC on such a
segment.
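
The change in policy, from crashing on an inconsistency to skipping the
segment and flagging the filesystem for fsck, can be sketched as
follows. Everything here is a simplified user-space model: struct
sketch_sb, its need_fsck field and sketch_collect() are hypothetical
stand-ins, and the two arrays merely play the role of the per-segment
types recorded in SIT and SSA.

```c
/* Hypothetical superblock state: just a "needs fsck" flag. */
struct sketch_sb {
	int need_fsck;
};

/*
 * Walk nsegs segments and "collect" each one whose two recorded types
 * agree. A mismatch sets need_fsck and skips the segment rather than
 * aborting. Returns the number of segments collected.
 */
int sketch_collect(struct sketch_sb *sb, const int *sit_type,
		   const int *ssa_type, int nsegs)
{
	int collected = 0;
	int i;

	for (i = 0; i < nsegs; i++) {
		if (sit_type[i] != ssa_type[i]) {
			sb->need_fsck = 1;
			continue;
		}
		collected++;
	}
	return collected;
}
```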

The bug is triggered with the image reported at the link below:

https://bugzilla.kernel.org/show_bug.cgi?id=200223

[  388.060262] ------------[ cut here ]------------
[  388.060268] kernel BUG at /home/y00370721/git/devf2fs/gc.c:989!
[  388.061172] invalid opcode: 0000 [#1] SMP
[  388.061773] Modules linked in: f2fs(O) bluetooth ecdh_generic xt_tcpudp iptable_filter ip_tables x_tables lp ttm drm_kms_helper drm intel_rapl sb_edac crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel fb_sys_fops ppdev aes_x86_64 syscopyarea crypto_simd sysfillrect parport_pc joydev sysimgblt glue_helper parport cryptd i2c_piix4 serio_raw mac_hid btrfs hid_generic usbhid hid raid6_pq psmouse pata_acpi floppy
[  388.064247] CPU: 7 PID: 4151 Comm: f2fs_gc-7:0 Tainted: G           O    4.13.0-rc1+ #26
[  388.065306] Hardware name: Xen HVM domU, BIOS 4.1.2_115-900.260_ 11/06/2015
[  388.066058] task: ffff880201583b80 task.stack: ffffc90004d7c000
[  388.069948] RIP: 0010:do_garbage_collect+0xcc8/0xcd0 [f2fs]
[  388.070766] RSP: 0018:ffffc90004d7fc68 EFLAGS: 00010202
[  388.071783] RAX: ffff8801ed227000 RBX: 0000000000000001 RCX: ffffea0007b489c0
[  388.072700] RDX: ffff880000000000 RSI: 0000000000000001 RDI: ffffea0007b489c0
[  388.073607] RBP: ffffc90004d7fd58 R08: 0000000000000003 R09: ffffea0007b489dc
[  388.074619] R10: 0000000000000000 R11: 0052782ab317138d R12: 0000000000000018
[  388.075625] R13: 0000000000000018 R14: ffff880211ceb000 R15: ffff880211ceb000
[  388.076687] FS:  0000000000000000(0000) GS:ffff880214fc0000(0000) knlGS:0000000000000000
[  388.083277] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  388.084536] CR2: 0000000000e18c60 CR3: 00000001ecf2e000 CR4: 00000000001406e0
[  388.085748] Call Trace:
[  388.086690]  ? find_next_bit+0xb/0x10
[  388.088091]  f2fs_gc+0x1a8/0x9d0 [f2fs]
[  388.088888]  ? lock_timer_base+0x7d/0xa0
[  388.090213]  ? try_to_del_timer_sync+0x44/0x60
[  388.091698]  gc_thread_func+0x342/0x4b0 [f2fs]
[  388.092892]  ? wait_woken+0x80/0x80
[  388.094098]  kthread+0x109/0x140
[  388.095010]  ? f2fs_gc+0x9d0/0x9d0 [f2fs]
[  388.096043]  ? kthread_park+0x60/0x60
[  388.097281]  ret_from_fork+0x25/0x30
[  388.098401] Code: ff ff 48 83 e8 01 48 89 44 24 58 e9 27 f8 ff ff 48 83 e8 01 e9 78 fc ff ff 48 8d 78 ff e9 17 fb ff ff 48 83 ef 01 e9 4d f4 ff ff <0f> 0b 66 0f 1f 44 00 00 0f 1f 44 00 00 55 48 89 e5 41 56 41 55
[  388.100864] RIP: do_garbage_collect+0xcc8/0xcd0 [f2fs] RSP: ffffc90004d7fc68
[  388.101810] ---[ end trace 81c73d6e6b7da61d ]---

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 fs/f2fs/gc.c |    8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -958,7 +958,13 @@ static int do_garbage_collect(struct f2f
 			goto next;
 
 		sum = page_address(sum_page);
-		f2fs_bug_on(sbi, type != GET_SUM_TYPE((&sum->footer)));
+		if (type != GET_SUM_TYPE((&sum->footer))) {
+			f2fs_msg(sbi->sb, KERN_ERR, "Inconsistent segment (%u) "
+				"type [%d, %d] in SSA and SIT",
+				segno, type, GET_SUM_TYPE((&sum->footer)));
+			set_sbi_flag(sbi, SBI_NEED_FSCK);
+			goto next;
+		}
 
 		/*
 		 * this is to avoid deadlock:




* [PATCH 4.14 071/126] tpm_tis_spi: Pass the SPI IRQ down to the driver
From: Greg Kroah-Hartman @ 2018-09-17 22:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Mark Brown, Linus Walleij,
	Jarkko Sakkinen, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Linus Walleij <linus.walleij@linaro.org>

[ Upstream commit 1a339b658d9dbe1471f67b78237cf8fa08bbbeb5 ]

An SPI TPM device managed directly on an embedded board using
the SPI bus and some GPIO or similar line as IRQ handler will
pass the IRQn from the TPM device associated with the SPI
device. This is already handled by the SPI core, so make sure
to pass this down to the TPM TIS core as well.

(The TPM core habit of using -1 to signal no IRQ is dubious
(as IRQ 0 is NO_IRQ) but I do not want to mess with that
semantic in this patch.)

Cc: Mark Brown <broonie@kernel.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Tested-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/char/tpm/tpm_tis_spi.c |    9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

--- a/drivers/char/tpm/tpm_tis_spi.c
+++ b/drivers/char/tpm/tpm_tis_spi.c
@@ -188,6 +188,7 @@ static const struct tpm_tis_phy_ops tpm_
 static int tpm_tis_spi_probe(struct spi_device *dev)
 {
 	struct tpm_tis_spi_phy *phy;
+	int irq;
 
 	phy = devm_kzalloc(&dev->dev, sizeof(struct tpm_tis_spi_phy),
 			   GFP_KERNEL);
@@ -200,7 +201,13 @@ static int tpm_tis_spi_probe(struct spi_
 	if (!phy->iobuf)
 		return -ENOMEM;
 
-	return tpm_tis_core_init(&dev->dev, &phy->priv, -1, &tpm_spi_phy_ops,
+	/* If the SPI device has an IRQ then use that */
+	if (dev->irq > 0)
+		irq = dev->irq;
+	else
+		irq = -1;
+
+	return tpm_tis_core_init(&dev->dev, &phy->priv, irq, &tpm_spi_phy_ops,
 				 NULL);
 }
 




* [PATCH 4.14 072/126] tpm/tpm_i2c_infineon: switch to i2c_lock_bus(..., I2C_LOCK_SEGMENT)
From: Greg Kroah-Hartman @ 2018-09-17 22:42 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Peter Rosin, Jarkko Sakkinen,
	Alexander Steffen, Wolfram Sang, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Peter Rosin <peda@axentia.se>

[ Upstream commit bb853aac2c478ce78116128263801189408ad2a8 ]

Locking the root adapter for __i2c_transfer will deadlock if the
device sits behind a mux-locked I2C mux. Switch to the finer-grained
i2c_lock_bus with the I2C_LOCK_SEGMENT flag. If the device does not
sit behind a mux-locked mux, the two locking variants are equivalent.

Signed-off-by: Peter Rosin <peda@axentia.se>
Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Tested-by: Alexander Steffen <Alexander.Steffen@infineon.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/char/tpm/tpm_i2c_infineon.c |    8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

--- a/drivers/char/tpm/tpm_i2c_infineon.c
+++ b/drivers/char/tpm/tpm_i2c_infineon.c
@@ -117,7 +117,7 @@ static int iic_tpm_read(u8 addr, u8 *buf
 	/* Lock the adapter for the duration of the whole sequence. */
 	if (!tpm_dev.client->adapter->algo->master_xfer)
 		return -EOPNOTSUPP;
-	i2c_lock_adapter(tpm_dev.client->adapter);
+	i2c_lock_bus(tpm_dev.client->adapter, I2C_LOCK_SEGMENT);
 
 	if (tpm_dev.chip_type == SLB9645) {
 		/* use a combined read for newer chips
@@ -192,7 +192,7 @@ static int iic_tpm_read(u8 addr, u8 *buf
 	}
 
 out:
-	i2c_unlock_adapter(tpm_dev.client->adapter);
+	i2c_unlock_bus(tpm_dev.client->adapter, I2C_LOCK_SEGMENT);
 	/* take care of 'guard time' */
 	usleep_range(SLEEP_DURATION_LOW, SLEEP_DURATION_HI);
 
@@ -224,7 +224,7 @@ static int iic_tpm_write_generic(u8 addr
 
 	if (!tpm_dev.client->adapter->algo->master_xfer)
 		return -EOPNOTSUPP;
-	i2c_lock_adapter(tpm_dev.client->adapter);
+	i2c_lock_bus(tpm_dev.client->adapter, I2C_LOCK_SEGMENT);
 
 	/* prepend the 'register address' to the buffer */
 	tpm_dev.buf[0] = addr;
@@ -243,7 +243,7 @@ static int iic_tpm_write_generic(u8 addr
 		usleep_range(sleep_low, sleep_hi);
 	}
 
-	i2c_unlock_adapter(tpm_dev.client->adapter);
+	i2c_unlock_bus(tpm_dev.client->adapter, I2C_LOCK_SEGMENT);
 	/* take care of 'guard time' */
 	usleep_range(SLEEP_DURATION_LOW, SLEEP_DURATION_HI);
 




* [PATCH 4.14 073/126] f2fs: fix to do sanity check with reserved blkaddr of inline inode
From: Greg Kroah-Hartman @ 2018-09-17 22:42 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Wen Xu, Chao Yu, Jaegeuk Kim, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Chao Yu <yuchao0@huawei.com>

[ Upstream commit 4dbe38dc386910c668c75ae616b99b823b59f3eb ]

As Wen Xu reported in bugzilla, after an image was injected with random
data by fuzzing, an inline inode could contain an invalid reserved
blkaddr. During inline conversion we then hit an illegal memory access
reported by KASAN. The root cause is that, when writing out the
converted inline page, we use the invalid reserved blkaddr to update the
sit bitmap, which results in accessing memory beyond the sit bitmap
boundary.

To fix this issue, do a sanity check on the reserved block address of
the inline inode to avoid the above condition.

https://bugzilla.kernel.org/show_bug.cgi?id=200179

[ 1428.846352] BUG: KASAN: use-after-free in update_sit_entry+0x80/0x7f0
[ 1428.846618] Read of size 4 at addr ffff880194483540 by task a.out/2741

[ 1428.846855] CPU: 0 PID: 2741 Comm: a.out Tainted: G        W         4.17.0+ #1
[ 1428.846858] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
[ 1428.846860] Call Trace:
[ 1428.846868]  dump_stack+0x71/0xab
[ 1428.846875]  print_address_description+0x6b/0x290
[ 1428.846881]  kasan_report+0x28e/0x390
[ 1428.846888]  ? update_sit_entry+0x80/0x7f0
[ 1428.846898]  update_sit_entry+0x80/0x7f0
[ 1428.846906]  f2fs_allocate_data_block+0x6db/0xc70
[ 1428.846914]  ? f2fs_get_node_info+0x14f/0x590
[ 1428.846920]  do_write_page+0xc8/0x150
[ 1428.846928]  f2fs_outplace_write_data+0xfe/0x210
[ 1428.846935]  ? f2fs_do_write_node_page+0x170/0x170
[ 1428.846941]  ? radix_tree_tag_clear+0xff/0x130
[ 1428.846946]  ? __mod_node_page_state+0x22/0xa0
[ 1428.846951]  ? inc_zone_page_state+0x54/0x100
[ 1428.846956]  ? __test_set_page_writeback+0x336/0x5d0
[ 1428.846964]  f2fs_convert_inline_page+0x407/0x6d0
[ 1428.846971]  ? f2fs_read_inline_data+0x3b0/0x3b0
[ 1428.846978]  ? __get_node_page+0x335/0x6b0
[ 1428.846987]  f2fs_convert_inline_inode+0x41b/0x500
[ 1428.846994]  ? f2fs_convert_inline_page+0x6d0/0x6d0
[ 1428.847000]  ? kasan_unpoison_shadow+0x31/0x40
[ 1428.847005]  ? kasan_kmalloc+0xa6/0xd0
[ 1428.847024]  f2fs_file_mmap+0x79/0xc0
[ 1428.847029]  mmap_region+0x58b/0x880
[ 1428.847037]  ? arch_get_unmapped_area+0x370/0x370
[ 1428.847042]  do_mmap+0x55b/0x7a0
[ 1428.847048]  vm_mmap_pgoff+0x16f/0x1c0
[ 1428.847055]  ? vma_is_stack_for_current+0x50/0x50
[ 1428.847062]  ? __fsnotify_update_child_dentry_flags.part.1+0x160/0x160
[ 1428.847068]  ? do_sys_open+0x206/0x2a0
[ 1428.847073]  ? __fget+0xb4/0x100
[ 1428.847079]  ksys_mmap_pgoff+0x278/0x360
[ 1428.847085]  ? find_mergeable_anon_vma+0x50/0x50
[ 1428.847091]  do_syscall_64+0x73/0x160
[ 1428.847098]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 1428.847102] RIP: 0033:0x7fb1430766ba
[ 1428.847103] Code: 89 f5 41 54 49 89 fc 55 53 74 35 49 63 e8 48 63 da 4d 89 f9 49 89 e8 4d 63 d6 48 89 da 4c 89 ee 4c 89 e7 b8 09 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 56 5b 5d 41 5c 41 5d 41 5e 41 5f c3 0f 1f 00
[ 1428.847162] RSP: 002b:00007ffc651d9388 EFLAGS: 00000246 ORIG_RAX: 0000000000000009
[ 1428.847167] RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007fb1430766ba
[ 1428.847170] RDX: 0000000000000001 RSI: 0000000000001000 RDI: 0000000000000000
[ 1428.847173] RBP: 0000000000000003 R08: 0000000000000003 R09: 0000000000000000
[ 1428.847176] R10: 0000000000008002 R11: 0000000000000246 R12: 0000000000000000
[ 1428.847179] R13: 0000000000001000 R14: 0000000000008002 R15: 0000000000000000

[ 1428.847252] Allocated by task 2683:
[ 1428.847372]  kasan_kmalloc+0xa6/0xd0
[ 1428.847380]  kmem_cache_alloc+0xc8/0x1e0
[ 1428.847385]  getname_flags+0x73/0x2b0
[ 1428.847390]  user_path_at_empty+0x1d/0x40
[ 1428.847395]  vfs_statx+0xc1/0x150
[ 1428.847401]  __do_sys_newlstat+0x7e/0xd0
[ 1428.847405]  do_syscall_64+0x73/0x160
[ 1428.847411]  entry_SYSCALL_64_after_hwframe+0x44/0xa9

[ 1428.847466] Freed by task 2683:
[ 1428.847566]  __kasan_slab_free+0x137/0x190
[ 1428.847571]  kmem_cache_free+0x85/0x1e0
[ 1428.847575]  filename_lookup+0x191/0x280
[ 1428.847580]  vfs_statx+0xc1/0x150
[ 1428.847585]  __do_sys_newlstat+0x7e/0xd0
[ 1428.847590]  do_syscall_64+0x73/0x160
[ 1428.847596]  entry_SYSCALL_64_after_hwframe+0x44/0xa9

[ 1428.847648] The buggy address belongs to the object at ffff880194483300
                which belongs to the cache names_cache of size 4096
[ 1428.847946] The buggy address is located 576 bytes inside of
                4096-byte region [ffff880194483300, ffff880194484300)
[ 1428.848234] The buggy address belongs to the page:
[ 1428.848366] page:ffffea0006512000 count:1 mapcount:0 mapping:ffff8801f3586380 index:0x0 compound_mapcount: 0
[ 1428.848606] flags: 0x17fff8000008100(slab|head)
[ 1428.848737] raw: 017fff8000008100 dead000000000100 dead000000000200 ffff8801f3586380
[ 1428.848931] raw: 0000000000000000 0000000000070007 00000001ffffffff 0000000000000000
[ 1428.849122] page dumped because: kasan: bad access detected

[ 1428.849305] Memory state around the buggy address:
[ 1428.849436]  ffff880194483400: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 1428.849620]  ffff880194483480: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 1428.849804] >ffff880194483500: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 1428.849985]                                            ^
[ 1428.850120]  ffff880194483580: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 1428.850303]  ffff880194483600: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 1428.850498] ==================================================================

Reported-by: Wen Xu <wen.xu@gatech.edu>
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 fs/f2fs/inline.c |   21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

--- a/fs/f2fs/inline.c
+++ b/fs/f2fs/inline.c
@@ -128,6 +128,16 @@ int f2fs_convert_inline_page(struct dnod
 	if (err)
 		return err;
 
+	if (unlikely(dn->data_blkaddr != NEW_ADDR)) {
+		f2fs_put_dnode(dn);
+		set_sbi_flag(fio.sbi, SBI_NEED_FSCK);
+		f2fs_msg(fio.sbi->sb, KERN_WARNING,
+			"%s: corrupted inline inode ino=%lx, i_addr[0]:0x%x, "
+			"run fsck to fix.",
+			__func__, dn->inode->i_ino, dn->data_blkaddr);
+		return -EINVAL;
+	}
+
 	f2fs_bug_on(F2FS_P_SB(page), PageWriteback(page));
 
 	read_inline_data(page, dn->inode_page);
@@ -365,6 +375,17 @@ static int f2fs_move_inline_dirents(stru
 	if (err)
 		goto out;
 
+	if (unlikely(dn.data_blkaddr != NEW_ADDR)) {
+		f2fs_put_dnode(&dn);
+		set_sbi_flag(F2FS_P_SB(page), SBI_NEED_FSCK);
+		f2fs_msg(F2FS_P_SB(page)->sb, KERN_WARNING,
+			"%s: corrupted inline inode ino=%lx, i_addr[0]:0x%x, "
+			"run fsck to fix.",
+			__func__, dir->i_ino, dn.data_blkaddr);
+		err = -EINVAL;
+		goto out;
+	}
+
 	f2fs_wait_on_page_writeback(page, DATA, true);
 	zero_user_segment(page, MAX_INLINE_DATA(dir), PAGE_SIZE);
 




* [PATCH 4.14 074/126] MIPS: Octeon: add missing of_node_put()
From: Greg Kroah-Hartman @ 2018-09-17 22:42 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Nicholas Mc Guire, Paul Burton,
	Ralf Baechle, James Hogan, linux-mips, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Nicholas Mc Guire <hofrat@osadl.org>

[ Upstream commit b1259519e618d479ede8a0db5474b3aff99f5056 ]

The call to of_find_node_by_name returns a node pointer with its
refcount incremented, so it must be explicitly decremented here after
the last use.

Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org>
Signed-off-by: Paul Burton <paul.burton@mips.com>
Patchwork: https://patchwork.linux-mips.org/patch/19558/
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: James Hogan <jhogan@kernel.org>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/mips/cavium-octeon/octeon-platform.c |    2 ++
 1 file changed, 2 insertions(+)

--- a/arch/mips/cavium-octeon/octeon-platform.c
+++ b/arch/mips/cavium-octeon/octeon-platform.c
@@ -322,6 +322,7 @@ static int __init octeon_ehci_device_ini
 		return 0;
 
 	pd = of_find_device_by_node(ehci_node);
+	of_node_put(ehci_node);
 	if (!pd)
 		return 0;
 
@@ -384,6 +385,7 @@ static int __init octeon_ohci_device_ini
 		return 0;
 
 	pd = of_find_device_by_node(ohci_node);
+	of_node_put(ohci_node);
 	if (!pd)
 		return 0;
 




* [PATCH 4.14 075/126] MIPS: generic: fix missing of_node_put()
From: Greg Kroah-Hartman @ 2018-09-17 22:42 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Nicholas Mc Guire, Paul Burton,
	Ralf Baechle, James Hogan, linux-mips, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Nicholas Mc Guire <hofrat@osadl.org>

[ Upstream commit 28ec2238f37e72a3a40a7eb46893e7651bcc40a6 ]

of_find_compatible_node() returns a device_node pointer with its
refcount incremented, which must be decremented explicitly.
As this code uses the result only to check for the presence of the
interrupt controller (!NULL) and does not otherwise use it, the
refcount can be decremented again immediately.
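
The get/put pairing for a presence-only check can be illustrated with a
toy reference count. This is a user-space model under stated
assumptions: struct sketch_node, sketch_find(), sketch_put() and
sketch_has_intc() are hypothetical names that imitate the shape of a
lookup returning a referenced node and a NULL-tolerant put; they are
not the device tree API.

```c
#include <stddef.h>

/* Toy node carrying nothing but a reference count. */
struct sketch_node {
	int refcount;
};

/* Imitates a lookup: returns the node with an extra reference, or NULL. */
struct sketch_node *sketch_find(struct sketch_node *candidate)
{
	if (candidate)
		candidate->refcount++;
	return candidate;
}

/* Drops one reference; accepts NULL so callers need no extra check. */
void sketch_put(struct sketch_node *node)
{
	if (node)
		node->refcount--;
}

/* Presence check only: the reference is released before returning. */
int sketch_has_intc(struct sketch_node *candidate)
{
	struct sketch_node *node = sketch_find(candidate);
	int present = (node != NULL);

	sketch_put(node);
	return present;
}
```

After sketch_has_intc() returns, the count is back where it started,
which is the balance the added put call restores.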

Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org>
Signed-off-by: Paul Burton <paul.burton@mips.com>
Patchwork: https://patchwork.linux-mips.org/patch/19820/
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: James Hogan <jhogan@kernel.org>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/mips/generic/init.c |    1 +
 1 file changed, 1 insertion(+)

--- a/arch/mips/generic/init.c
+++ b/arch/mips/generic/init.c
@@ -204,6 +204,7 @@ void __init arch_init_irq(void)
 					    "mti,cpu-interrupt-controller");
 	if (!cpu_has_veic && !intc_node)
 		mips_cpu_irq_init();
+	of_node_put(intc_node);
 
 	irqchip_init();
 }




* [PATCH 4.14 076/126] net: dcb: For wild-card lookups, use priority -1, not 0
From: Greg Kroah-Hartman @ 2018-09-17 22:42 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Petr Machata, Ido Schimmel,
	David S. Miller, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Petr Machata <petrm@mellanox.com>

[ Upstream commit 08193d1a893c802c4b807e4d522865061f4e9f4f ]

The function dcb_app_lookup walks the list of specified DCB APP entries,
looking for one that matches the given criteria: ifindex, selector,
protocol ID and optionally also priority. The "don't care" value for
priority is set to 0, because that priority was not allowed under the
CEE regime, which predates the IEEE standardization.

Under IEEE, 0 is a valid priority number. But because dcb_app_lookup
considers zero a wild card, attempts to add an APP entry with priority 0
fail when other entries exist for a given ifindex / selector / PID
triplet.

Fix by changing the wild-card value to -1.
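
The effect of the wild-card value can be reproduced with a small table
lookup. This is a simplified model, not the dcbnl code: struct
sketch_app and sketch_lookup() are hypothetical, the ifindex is left
out, and the extra wildcard parameter exists only so the old (0) and
new (-1) conventions can be compared side by side.

```c
#include <stddef.h>

/* Simplified APP entry: selector, protocol ID and priority. */
struct sketch_app {
	int selector;
	int protocol;
	int priority;
};

/*
 * Return the first entry matching selector and protocol. The priority
 * is compared as well unless prio equals the given wildcard value.
 */
const struct sketch_app *sketch_lookup(const struct sketch_app *tbl,
				       size_t n, int selector, int protocol,
				       int prio, int wildcard)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (tbl[i].selector == selector &&
		    tbl[i].protocol == protocol &&
		    (prio == wildcard || tbl[i].priority == prio))
			return &tbl[i];
	}
	return NULL;
}
```

With entries of priority 3 and priority 0 for the same selector and
protocol, a search for priority 0 returns the priority 3 entry when 0
is the wild card, and the exact priority 0 entry when -1 is the wild
card.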

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/dcb/dcbnl.c |   11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

--- a/net/dcb/dcbnl.c
+++ b/net/dcb/dcbnl.c
@@ -1765,7 +1765,7 @@ static struct dcb_app_type *dcb_app_look
 		if (itr->app.selector == app->selector &&
 		    itr->app.protocol == app->protocol &&
 		    itr->ifindex == ifindex &&
-		    (!prio || itr->app.priority == prio))
+		    ((prio == -1) || itr->app.priority == prio))
 			return itr;
 	}
 
@@ -1800,7 +1800,8 @@ u8 dcb_getapp(struct net_device *dev, st
 	u8 prio = 0;
 
 	spin_lock_bh(&dcb_lock);
-	if ((itr = dcb_app_lookup(app, dev->ifindex, 0)))
+	itr = dcb_app_lookup(app, dev->ifindex, -1);
+	if (itr)
 		prio = itr->app.priority;
 	spin_unlock_bh(&dcb_lock);
 
@@ -1828,7 +1829,8 @@ int dcb_setapp(struct net_device *dev, s
 
 	spin_lock_bh(&dcb_lock);
 	/* Search for existing match and replace */
-	if ((itr = dcb_app_lookup(new, dev->ifindex, 0))) {
+	itr = dcb_app_lookup(new, dev->ifindex, -1);
+	if (itr) {
 		if (new->priority)
 			itr->app.priority = new->priority;
 		else {
@@ -1861,7 +1863,8 @@ u8 dcb_ieee_getapp_mask(struct net_devic
 	u8 prio = 0;
 
 	spin_lock_bh(&dcb_lock);
-	if ((itr = dcb_app_lookup(app, dev->ifindex, 0)))
+	itr = dcb_app_lookup(app, dev->ifindex, -1);
+	if (itr)
 		prio |= 1 << itr->app.priority;
 	spin_unlock_bh(&dcb_lock);
 




* [PATCH 4.14 077/126] dm cache: only allow a single io_mode cache feature to be requested
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (75 preceding siblings ...)
  2018-09-17 22:42 ` [PATCH 4.14 076/126] net: dcb: For wild-card lookups, use priority -1, not 0 Greg Kroah-Hartman
@ 2018-09-17 22:42 ` Greg Kroah-Hartman
  2018-09-17 22:42 ` [PATCH 4.14 078/126] Input: atmel_mxt_ts - only use first T9 instance Greg Kroah-Hartman
                   ` (51 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:42 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, John Pittman, Mike Snitzer, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: John Pittman <jpittman@redhat.com>

[ Upstream commit af9313c32c0fa2a0ac3b113669273833d60cc9de ]

More than one io_mode feature can be requested when creating a dm cache
device (as is: last one wins).  The io_mode selections are incompatible
with one another, so we should force them to be selected exclusively.
Add a counter to check for more than one io_mode selection.
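
The counter idea can be tried outside the kernel with a small userspace
sketch. The parse_modes() helper and enum io_mode below are hypothetical
names for illustration only, and unlike the real parser this sketch
simply skips words that are not io_mode names.

```c
#include <errno.h>
#include <strings.h>

/* Hypothetical stand-ins for the cache io_mode values. */
enum io_mode { IO_WRITEBACK, IO_WRITETHROUGH, IO_PASSTHROUGH };

/*
 * Walk the feature words and count how many of them select an io_mode.
 * Returns 0 on success, or -EINVAL with *error set if more than one
 * io_mode was requested. Words that are not io_mode names are skipped
 * here (an assumption of this sketch).
 */
static int parse_modes(int argc, const char *const *argv,
		       enum io_mode *mode, const char **error)
{
	int mode_ctr = 0;

	for (int i = 0; i < argc; i++) {
		if (!strcasecmp(argv[i], "writeback")) {
			*mode = IO_WRITEBACK;
			mode_ctr++;
		} else if (!strcasecmp(argv[i], "writethrough")) {
			*mode = IO_WRITETHROUGH;
			mode_ctr++;
		} else if (!strcasecmp(argv[i], "passthrough")) {
			*mode = IO_PASSTHROUGH;
			mode_ctr++;
		}
	}

	if (mode_ctr > 1) {
		*error = "Duplicate cache io_mode features requested";
		return -EINVAL;
	}

	return 0;
}
```

Counting after the loop keeps the per-word branches simple and rejects any
combination of two or more modes, including the same mode given twice.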

Fixes: 629d0a8a1a10 ("dm cache metadata: add "metadata2" feature")
Signed-off-by: John Pittman <jpittman@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/md/dm-cache-target.c |   19 +++++++++++++++----
 1 file changed, 15 insertions(+), 4 deletions(-)

--- a/drivers/md/dm-cache-target.c
+++ b/drivers/md/dm-cache-target.c
@@ -2330,7 +2330,7 @@ static int parse_features(struct cache_a
 		{0, 2, "Invalid number of cache feature arguments"},
 	};
 
-	int r;
+	int r, mode_ctr = 0;
 	unsigned argc;
 	const char *arg;
 	struct cache_features *cf = &ca->features;
@@ -2344,14 +2344,20 @@ static int parse_features(struct cache_a
 	while (argc--) {
 		arg = dm_shift_arg(as);
 
-		if (!strcasecmp(arg, "writeback"))
+		if (!strcasecmp(arg, "writeback")) {
 			cf->io_mode = CM_IO_WRITEBACK;
+			mode_ctr++;
+		}
 
-		else if (!strcasecmp(arg, "writethrough"))
+		else if (!strcasecmp(arg, "writethrough")) {
 			cf->io_mode = CM_IO_WRITETHROUGH;
+			mode_ctr++;
+		}
 
-		else if (!strcasecmp(arg, "passthrough"))
+		else if (!strcasecmp(arg, "passthrough")) {
 			cf->io_mode = CM_IO_PASSTHROUGH;
+			mode_ctr++;
+		}
 
 		else if (!strcasecmp(arg, "metadata2"))
 			cf->metadata_version = 2;
@@ -2362,6 +2368,11 @@ static int parse_features(struct cache_a
 		}
 	}
 
+	if (mode_ctr > 1) {
+		*error = "Duplicate cache io_mode features requested";
+		return -EINVAL;
+	}
+
 	return 0;
 }
 




* [PATCH 4.14 078/126] Input: atmel_mxt_ts - only use first T9 instance
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (76 preceding siblings ...)
  2018-09-17 22:42 ` [PATCH 4.14 077/126] dm cache: only allow a single io_mode cache feature to be requested Greg Kroah-Hartman
@ 2018-09-17 22:42 ` Greg Kroah-Hartman
  2018-09-17 22:42 ` [PATCH 4.14 079/126] media: s5p-mfc: Fix buffer look up in s5p_mfc_handle_frame_{new, copy_time} functions Greg Kroah-Hartman
                   ` (50 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:42 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Nick Dyer, Benson Leung, Yufeng Shen,
	Dmitry Torokhov, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Nick Dyer <nick.dyer@itdev.co.uk>

[ Upstream commit 36f5d9ef26e52edff046b4b097855db89bf0cd4a ]

The driver only registers one input device, which uses the screen
parameters from the first T9 instance. The first T63 instance also uses
those parameters.

It is incorrect to send input reports from the second instances of these
objects if they are enabled: the input scaling will be wrong and the
positions will be mashed together.

This also causes problems on Android if the number of slots exceeds 32.

In the future, this could be handled by looking for enabled touch object
instances and creating an input device for each one.
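
For illustration, the report ID range of the first instance can be worked
out as in the sketch below. The struct report_range type and the
first_instance_range() helper are hypothetical names, and the sketch
assumes num_report_ids is at least 1.

```c
/* Hypothetical container for the values derived for one touch object. */
struct report_range {
	unsigned int min_id;
	unsigned int max_id;
	unsigned int num_touchids;
};

/*
 * Restrict the range to the first instance: it starts at min_id and
 * spans exactly num_report_ids IDs, regardless of how many instances
 * the object has. Assumes num_report_ids >= 1.
 */
static struct report_range first_instance_range(unsigned int min_id,
						unsigned int num_report_ids)
{
	struct report_range r = {
		.min_id = min_id,
		.max_id = min_id + num_report_ids - 1,
		.num_touchids = num_report_ids,
	};

	return r;
}
```

For example, an object starting at report ID 10 with 16 report IDs per
instance yields the range 10..25 and 16 touch IDs, even when a second
instance is present.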

Signed-off-by: Nick Dyer <nick.dyer@itdev.co.uk>
Acked-by: Benson Leung <bleung@chromium.org>
Acked-by: Yufeng Shen <miletus@chromium.org>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/input/touchscreen/atmel_mxt_ts.c |    7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

--- a/drivers/input/touchscreen/atmel_mxt_ts.c
+++ b/drivers/input/touchscreen/atmel_mxt_ts.c
@@ -1647,10 +1647,11 @@ static int mxt_parse_object_table(struct
 			break;
 		case MXT_TOUCH_MULTI_T9:
 			data->multitouch = MXT_TOUCH_MULTI_T9;
+			/* Only handle messages from first T9 instance */
 			data->T9_reportid_min = min_id;
-			data->T9_reportid_max = max_id;
-			data->num_touchids = object->num_report_ids
-						* mxt_obj_instances(object);
+			data->T9_reportid_max = min_id +
+						object->num_report_ids - 1;
+			data->num_touchids = object->num_report_ids;
 			break;
 		case MXT_SPT_MESSAGECOUNT_T44:
 			data->T44_address = object->start_address;




* [PATCH 4.14 079/126] media: s5p-mfc: Fix buffer look up in s5p_mfc_handle_frame_{new, copy_time} functions
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (77 preceding siblings ...)
  2018-09-17 22:42 ` [PATCH 4.14 078/126] Input: atmel_mxt_ts - only use first T9 instance Greg Kroah-Hartman
@ 2018-09-17 22:42 ` Greg Kroah-Hartman
  2018-09-17 22:42 ` [PATCH 4.14 080/126] partitions/aix: append null character to print data from disk Greg Kroah-Hartman
                   ` (49 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:42 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Sylwester Nawrocki,
	Mauro Carvalho Chehab, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Sylwester Nawrocki <s.nawrocki@samsung.com>

[ Upstream commit 4faeaf9c0f4581667ce5826f9c90c4fd463ef086 ]

Look up of buffers in the s5p_mfc_handle_frame_new and
s5p_mfc_handle_frame_copy_time functions does not work properly for DMA
addresses above 2 GiB. As a result, the flags and timestamp of returned
buffers are not set correctly, which breaks operation of GStreamer/OMX
plugins that rely on the CAPTURE buffer queue flags.

Due to the improper return type of the get_dec_y_adr and get_dspl_y_adr
callbacks and sign bit extension, these callbacks return incorrect
address values, e.g. 0xfffffffffefc0000 instead of 0x00000000fefc0000.
Then the statement:

"if (vb2_dma_contig_plane_dma_addr(&dst_buf->b->vb2_buf, 0) == dec_y_addr)"

is always false, which breaks looking up capture queue buffers.

To ensure proper matching by address, the u32 type is used for the DMA
addresses. This should work on all related SoCs, since the MFC DMA
address width is not larger than 32-bit.
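
The sign extension can be reproduced in userspace with the sketch below.
It assumes the callback returns a signed 32-bit int and that the result
is stored in a 64-bit unsigned type. The helper names are hypothetical,
and the conversion of a large unsigned value to int is
implementation-defined in C (it wraps on the usual compilers).

```c
#include <stdint.h>

/* Mimics a callback that reads a 32-bit register but returns int. */
static int get_y_adr_as_int(uint32_t reg)
{
	return (int)reg;
}

/*
 * Storing the int result in a 64-bit unsigned variable sign-extends
 * it, so addresses with bit 31 set gain 0xffffffff in the upper half.
 */
static uint64_t widen_buggy(uint32_t reg)
{
	return (uint64_t)get_y_adr_as_int(reg);
}

/* Truncating to 32 bits first drops the extended sign bits. */
static uint32_t narrow_fixed(uint32_t reg)
{
	return (uint32_t)get_y_adr_as_int(reg);
}
```

With reg = 0xfefc0000, widen_buggy() gives 0xfffffffffefc0000 while
narrow_fixed() gives 0xfefc0000, which is why comparing 32-bit values on
both sides makes the buffer match again.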

Changes done in this patch are minimal as there is a larger patch series
pending refactoring the whole driver.

Signed-off-by: Sylwester Nawrocki <s.nawrocki@samsung.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/media/platform/s5p-mfc/s5p_mfc.c |   23 ++++++++++++-----------
 1 file changed, 12 insertions(+), 11 deletions(-)

--- a/drivers/media/platform/s5p-mfc/s5p_mfc.c
+++ b/drivers/media/platform/s5p-mfc/s5p_mfc.c
@@ -254,24 +254,24 @@ static void s5p_mfc_handle_frame_all_ext
 static void s5p_mfc_handle_frame_copy_time(struct s5p_mfc_ctx *ctx)
 {
 	struct s5p_mfc_dev *dev = ctx->dev;
-	struct s5p_mfc_buf  *dst_buf, *src_buf;
-	size_t dec_y_addr;
+	struct s5p_mfc_buf *dst_buf, *src_buf;
+	u32 dec_y_addr;
 	unsigned int frame_type;
 
 	/* Make sure we actually have a new frame before continuing. */
 	frame_type = s5p_mfc_hw_call(dev->mfc_ops, get_dec_frame_type, dev);
 	if (frame_type == S5P_FIMV_DECODE_FRAME_SKIPPED)
 		return;
-	dec_y_addr = s5p_mfc_hw_call(dev->mfc_ops, get_dec_y_adr, dev);
+	dec_y_addr = (u32)s5p_mfc_hw_call(dev->mfc_ops, get_dec_y_adr, dev);
 
 	/* Copy timestamp / timecode from decoded src to dst and set
 	   appropriate flags. */
 	src_buf = list_entry(ctx->src_queue.next, struct s5p_mfc_buf, list);
 	list_for_each_entry(dst_buf, &ctx->dst_queue, list) {
-		if (vb2_dma_contig_plane_dma_addr(&dst_buf->b->vb2_buf, 0)
-				== dec_y_addr) {
-			dst_buf->b->timecode =
-						src_buf->b->timecode;
+		u32 addr = (u32)vb2_dma_contig_plane_dma_addr(&dst_buf->b->vb2_buf, 0);
+
+		if (addr == dec_y_addr) {
+			dst_buf->b->timecode = src_buf->b->timecode;
 			dst_buf->b->vb2_buf.timestamp =
 						src_buf->b->vb2_buf.timestamp;
 			dst_buf->b->flags &=
@@ -307,10 +307,10 @@ static void s5p_mfc_handle_frame_new(str
 {
 	struct s5p_mfc_dev *dev = ctx->dev;
 	struct s5p_mfc_buf  *dst_buf;
-	size_t dspl_y_addr;
+	u32 dspl_y_addr;
 	unsigned int frame_type;
 
-	dspl_y_addr = s5p_mfc_hw_call(dev->mfc_ops, get_dspl_y_adr, dev);
+	dspl_y_addr = (u32)s5p_mfc_hw_call(dev->mfc_ops, get_dspl_y_adr, dev);
 	if (IS_MFCV6_PLUS(dev))
 		frame_type = s5p_mfc_hw_call(dev->mfc_ops,
 			get_disp_frame_type, ctx);
@@ -329,9 +329,10 @@ static void s5p_mfc_handle_frame_new(str
 	/* The MFC returns address of the buffer, now we have to
 	 * check which videobuf does it correspond to */
 	list_for_each_entry(dst_buf, &ctx->dst_queue, list) {
+		u32 addr = (u32)vb2_dma_contig_plane_dma_addr(&dst_buf->b->vb2_buf, 0);
+
 		/* Check if this is the buffer we're looking for */
-		if (vb2_dma_contig_plane_dma_addr(&dst_buf->b->vb2_buf, 0)
-				== dspl_y_addr) {
+		if (addr == dspl_y_addr) {
 			list_del(&dst_buf->list);
 			ctx->dst_queue_cnt--;
 			dst_buf->b->sequence = ctx->sequence;




* [PATCH 4.14 080/126] partitions/aix: append null character to print data from disk
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (78 preceding siblings ...)
  2018-09-17 22:42 ` [PATCH 4.14 079/126] media: s5p-mfc: Fix buffer look up in s5p_mfc_handle_frame_{new, copy_time} functions Greg Kroah-Hartman
@ 2018-09-17 22:42 ` Greg Kroah-Hartman
  2018-09-17 22:42 ` [PATCH 4.14 081/126] partitions/aix: fix usage of uninitialized lv_info and lvname structures Greg Kroah-Hartman
                   ` (48 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:42 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Daniel J. Axtens,
	Mauricio Faria de Oliveira, Jens Axboe, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Mauricio Faria de Oliveira <mfo@canonical.com>

[ Upstream commit d43fdae7bac2def8c4314b5a49822cb7f08a45f1 ]

Even if properly initialized, the lvname array (i.e., strings)
is read from disk, and might contain corrupt data (e.g., lack
the null terminating character for strings).

So, make sure the partition name string used in pr_warn() has
the null terminating character.
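
A minimal userspace sketch of the same idea follows. NAME_LEN, struct
lvname and lvname_to_str() are hypothetical and do not reflect the on-disk
AIX layout. Unlike the kernel patch, the sketch passes an explicit
precision to snprintf(), because userspace "%s" without a precision
requires a terminated string.

```c
#include <stdio.h>

#define NAME_LEN 8	/* hypothetical field width for this sketch */

/* Fixed-width name field that may lack a terminating null character. */
struct lvname {
	char name[NAME_LEN];
};

/*
 * Copy the field into dst, which should hold at least NAME_LEN + 1
 * bytes. The "%.*s" precision stops the read at the end of the field,
 * and snprintf() always terminates dst when dst_len is non-zero.
 */
static void lvname_to_str(const struct lvname *n, char *dst, size_t dst_len)
{
	snprintf(dst, dst_len, "%.*s", (int)sizeof(n->name), n->name);
}
```

A caller would declare char tmp[sizeof(n.name) + 1] and print tmp instead
of the raw field, mirroring the temporary buffer used in the patch.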

Fixes: 6ceea22bbbc8 ("partitions: add aix lvm partition support files")
Suggested-by: Daniel J. Axtens <daniel.axtens@canonical.com>
Signed-off-by: Mauricio Faria de Oliveira <mfo@canonical.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 block/partitions/aix.c |    8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

--- a/block/partitions/aix.c
+++ b/block/partitions/aix.c
@@ -282,10 +282,14 @@ int aix_partition(struct parsed_partitio
 				next_lp_ix += 1;
 		}
 		for (i = 0; i < state->limit; i += 1)
-			if (lvip[i].pps_found && !lvip[i].lv_is_contiguous)
+			if (lvip[i].pps_found && !lvip[i].lv_is_contiguous) {
+				char tmp[sizeof(n[i].name) + 1]; // null char
+
+				snprintf(tmp, sizeof(tmp), "%s", n[i].name);
 				pr_warn("partition %s (%u pp's found) is "
 					"not contiguous\n",
-					n[i].name, lvip[i].pps_found);
+					tmp, lvip[i].pps_found);
+			}
 		kfree(pvd);
 	}
 	kfree(n);




* [PATCH 4.14 081/126] partitions/aix: fix usage of uninitialized lv_info and lvname structures
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (79 preceding siblings ...)
  2018-09-17 22:42 ` [PATCH 4.14 080/126] partitions/aix: append null character to print data from disk Greg Kroah-Hartman
@ 2018-09-17 22:42 ` Greg Kroah-Hartman
  2018-09-17 22:42 ` [PATCH 4.14 082/126] media: helene: fix xtal frequency setting at power on Greg Kroah-Hartman
                   ` (47 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:42 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Mauricio Faria de Oliveira,
	Jens Axboe, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Mauricio Faria de Oliveira <mfo@canonical.com>

[ Upstream commit 14cb2c8a6c5dae57ee3e2da10fa3db2b9087e39e ]

The if-block that sets a successful return value in aix_partition()
uses 'lvip[].pps_per_lv' and 'n[].name' potentially uninitialized.

For example, if 'numlvs' is zero or alloc_lvn() fails, neither is
initialized, but both are used anyway if alloc_pvd() succeeds after it.

So, make the alloc_pvd() call conditional on their initialization.

This has been hit when attaching an apparently corrupted/stressed
AIX LUN, misleading the kernel to pr_warn() invalid data and hang.

    [...] partition (null) (11 pp's found) is not contiguous
    [...] partition (null) (2 pp's found) is not contiguous
    [...] partition (null) (3 pp's found) is not contiguous
    [...] partition (null) (64 pp's found) is not contiguous

Fixes: 6ceea22bbbc8 ("partitions: add aix lvm partition support files")
Signed-off-by: Mauricio Faria de Oliveira <mfo@canonical.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 block/partitions/aix.c |    5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

--- a/block/partitions/aix.c
+++ b/block/partitions/aix.c
@@ -178,7 +178,7 @@ int aix_partition(struct parsed_partitio
 	u32 vgda_sector = 0;
 	u32 vgda_len = 0;
 	int numlvs = 0;
-	struct pvd *pvd;
+	struct pvd *pvd = NULL;
 	struct lv_info {
 		unsigned short pps_per_lv;
 		unsigned short pps_found;
@@ -232,10 +232,11 @@ int aix_partition(struct parsed_partitio
 				if (lvip[i].pps_per_lv)
 					foundlvs += 1;
 			}
+			/* pvd loops depend on n[].name and lvip[].pps_per_lv */
+			pvd = alloc_pvd(state, vgda_sector + 17);
 		}
 		put_dev_sector(sect);
 	}
-	pvd = alloc_pvd(state, vgda_sector + 17);
 	if (pvd) {
 		int numpps = be16_to_cpu(pvd->pp_count);
 		int psn_part1 = be32_to_cpu(pvd->psn_part1);




* [PATCH 4.14 082/126] media: helene: fix xtal frequency setting at power on
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (80 preceding siblings ...)
  2018-09-17 22:42 ` [PATCH 4.14 081/126] partitions/aix: fix usage of uninitialized lv_info and lvname structures Greg Kroah-Hartman
@ 2018-09-17 22:42 ` Greg Kroah-Hartman
  2018-09-17 22:42 ` [PATCH 4.14 083/126] f2fs: fix to wait on page writeback before updating page Greg Kroah-Hartman
                   ` (46 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:42 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Katsuhiro Suzuki, Abylay Ospan,
	Mauro Carvalho Chehab, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Katsuhiro Suzuki <suzuki.katsuhiro@socionext.com>

[ Upstream commit a00e5f074b3f3cd39d1ccdc53d4d805b014df3f3 ]

This patch fixes the crystal frequency setting when powering on this
device.

Signed-off-by: Katsuhiro Suzuki <suzuki.katsuhiro@socionext.com>
Acked-by: Abylay Ospan <aospan@netup.ru>
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/media/dvb-frontends/helene.c |    5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

--- a/drivers/media/dvb-frontends/helene.c
+++ b/drivers/media/dvb-frontends/helene.c
@@ -897,7 +897,10 @@ static int helene_x_pon(struct helene_pr
 	helene_write_regs(priv, 0x99, cdata, sizeof(cdata));
 
 	/* 0x81 - 0x94 */
-	data[0] = 0x18; /* xtal 24 MHz */
+	if (priv->xtal == SONY_HELENE_XTAL_16000)
+		data[0] = 0x10; /* xtal 16 MHz */
+	else
+		data[0] = 0x18; /* xtal 24 MHz */
 	data[1] = (uint8_t)(0x80 | (0x04 & 0x1F)); /* 4 x 25 = 100uA */
 	data[2] = (uint8_t)(0x80 | (0x26 & 0x7F)); /* 38 x 0.25 = 9.5pF */
 	data[3] = 0x80; /* REFOUT signal output 500mVpp */




* [PATCH 4.14 083/126] f2fs: fix to wait on page writeback before updating page
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (81 preceding siblings ...)
  2018-09-17 22:42 ` [PATCH 4.14 082/126] media: helene: fix xtal frequency setting at power on Greg Kroah-Hartman
@ 2018-09-17 22:42 ` Greg Kroah-Hartman
  2018-09-17 22:42 ` [PATCH 4.14 084/126] f2fs: Fix uninitialized return in f2fs_ioc_shutdown() Greg Kroah-Hartman
                   ` (45 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:42 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Chao Yu, Jaegeuk Kim, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Chao Yu <yuchao0@huawei.com>

[ Upstream commit 6aead1617b3adf2b7e2c56f0f13e4e0ee42ebb4a ]

In the error path of f2fs_move_rehashed_dirents, the inode page could be
under writeback, so we should wait for inode page writeback to finish
before updating it.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 fs/f2fs/inline.c |    1 +
 1 file changed, 1 insertion(+)

--- a/fs/f2fs/inline.c
+++ b/fs/f2fs/inline.c
@@ -502,6 +502,7 @@ static int f2fs_move_rehashed_dirents(st
 	return 0;
 recover:
 	lock_page(ipage);
+	f2fs_wait_on_page_writeback(ipage, NODE, true);
 	memcpy(inline_dentry, backup_dentry, MAX_INLINE_DATA(dir));
 	f2fs_i_depth_write(dir, 0);
 	f2fs_i_size_write(dir, MAX_INLINE_DATA(dir));




* [PATCH 4.14 084/126] f2fs: Fix uninitialized return in f2fs_ioc_shutdown()
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (82 preceding siblings ...)
  2018-09-17 22:42 ` [PATCH 4.14 083/126] f2fs: fix to wait on page writeback before updating page Greg Kroah-Hartman
@ 2018-09-17 22:42 ` Greg Kroah-Hartman
  2018-09-17 22:42 ` [PATCH 4.14 085/126] iommu/ipmmu-vmsa: Fix allocation in atomic context Greg Kroah-Hartman
                   ` (44 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:42 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Dan Carpenter, Chao Yu, Jaegeuk Kim,
	Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Dan Carpenter <dan.carpenter@oracle.com>

[ Upstream commit 2a96d8ad94ce57cb0072f7a660b1039720c47716 ]

"ret" can be uninitialized on the success path when "in ==
F2FS_GOING_DOWN_FULLSYNC".

Fixes: 60b2b4ee2bc0 ("f2fs: Fix deadlock in shutdown ioctl")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 fs/f2fs/file.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/fs/f2fs/file.c
+++ b/fs/f2fs/file.c
@@ -1803,7 +1803,7 @@ static int f2fs_ioc_shutdown(struct file
 	struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
 	struct super_block *sb = sbi->sb;
 	__u32 in;
-	int ret;
+	int ret = 0;
 
 	if (!capable(CAP_SYS_ADMIN))
 		return -EPERM;




* [PATCH 4.14 085/126] iommu/ipmmu-vmsa: Fix allocation in atomic context
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (83 preceding siblings ...)
  2018-09-17 22:42 ` [PATCH 4.14 084/126] f2fs: Fix uninitialized return in f2fs_ioc_shutdown() Greg Kroah-Hartman
@ 2018-09-17 22:42 ` Greg Kroah-Hartman
  2018-09-17 22:42 ` [PATCH 4.14 086/126] mfd: ti_am335x_tscadc: Fix struct clk memory leak Greg Kroah-Hartman
                   ` (43 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:42 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Geert Uytterhoeven, Laurent Pinchart,
	Joerg Roedel, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Geert Uytterhoeven <geert+renesas@glider.be>

[ Upstream commit 46583e8c48c5a094ba28060615b3a7c8c576690f ]

When attaching a device to an IOMMU group with
CONFIG_DEBUG_ATOMIC_SLEEP=y:

    BUG: sleeping function called from invalid context at mm/slab.h:421
    in_atomic(): 1, irqs_disabled(): 128, pid: 61, name: kworker/1:1
    ...
    Call trace:
     ...
     arm_lpae_alloc_pgtable+0x114/0x184
     arm_64_lpae_alloc_pgtable_s1+0x2c/0x128
     arm_32_lpae_alloc_pgtable_s1+0x40/0x6c
     alloc_io_pgtable_ops+0x60/0x88
     ipmmu_attach_device+0x140/0x334

ipmmu_attach_device() takes a spinlock, while arm_lpae_alloc_pgtable()
allocates memory using GFP_KERNEL.  Originally, the ipmmu-vmsa driver
had its own custom page table allocation implementation using
GFP_ATOMIC, hence the spinlock was fine.

Fix this by replacing the spinlock by a mutex, like the arm-smmu driver
does.

Fixes: f20ed39f53145e45 ("iommu/ipmmu-vmsa: Use the ARM LPAE page table allocator")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/iommu/ipmmu-vmsa.c |    9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

--- a/drivers/iommu/ipmmu-vmsa.c
+++ b/drivers/iommu/ipmmu-vmsa.c
@@ -54,7 +54,7 @@ struct ipmmu_vmsa_domain {
 	struct io_pgtable_ops *iop;
 
 	unsigned int context_id;
-	spinlock_t lock;			/* Protects mappings */
+	struct mutex mutex;			/* Protects mappings */
 };
 
 struct ipmmu_vmsa_iommu_priv {
@@ -523,7 +523,7 @@ static struct iommu_domain *__ipmmu_doma
 	if (!domain)
 		return NULL;
 
-	spin_lock_init(&domain->lock);
+	mutex_init(&domain->mutex);
 
 	return &domain->io_domain;
 }
@@ -548,7 +548,6 @@ static int ipmmu_attach_device(struct io
 	struct iommu_fwspec *fwspec = dev->iommu_fwspec;
 	struct ipmmu_vmsa_device *mmu = priv->mmu;
 	struct ipmmu_vmsa_domain *domain = to_vmsa_domain(io_domain);
-	unsigned long flags;
 	unsigned int i;
 	int ret = 0;
 
@@ -557,7 +556,7 @@ static int ipmmu_attach_device(struct io
 		return -ENXIO;
 	}
 
-	spin_lock_irqsave(&domain->lock, flags);
+	mutex_lock(&domain->mutex);
 
 	if (!domain->mmu) {
 		/* The domain hasn't been used yet, initialize it. */
@@ -574,7 +573,7 @@ static int ipmmu_attach_device(struct io
 	} else
 		dev_info(dev, "Reusing IPMMU context %u\n", domain->context_id);
 
-	spin_unlock_irqrestore(&domain->lock, flags);
+	mutex_unlock(&domain->mutex);
 
 	if (ret < 0)
 		return ret;




* [PATCH 4.14 086/126] mfd: ti_am335x_tscadc: Fix struct clk memory leak
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (84 preceding siblings ...)
  2018-09-17 22:42 ` [PATCH 4.14 085/126] iommu/ipmmu-vmsa: Fix allocation in atomic context Greg Kroah-Hartman
@ 2018-09-17 22:42 ` Greg Kroah-Hartman
  2018-09-17 22:42 ` [PATCH 4.14 087/126] f2fs: fix to do sanity check with {sit,nat}_ver_bitmap_bytesize Greg Kroah-Hartman
                   ` (42 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:42 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Zumeng Chen, Lee Jones, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Zumeng Chen <zumeng.chen@gmail.com>

[ Upstream commit c2b1509c77a99a0dcea0a9051ca743cb88385f50 ]

Use devm_clk_get() to let Linux manage struct clk memory to avoid the following
memory leakage report:

unreferenced object 0xdd75efc0 (size 64):
  comm "systemd-udevd", pid 186, jiffies 4294945126 (age 1195.750s)
  hex dump (first 32 bytes):
    61 64 63 5f 74 73 63 5f 66 63 6b 00 00 00 00 00  adc_tsc_fck.....
    00 00 00 00 92 03 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<c0a15260>] kmemleak_alloc+0x40/0x74
    [<c0287a10>] __kmalloc_track_caller+0x198/0x388
    [<c0255610>] kstrdup+0x40/0x5c
    [<c025565c>] kstrdup_const+0x30/0x3c
    [<c0636630>] __clk_create_clk+0x60/0xac
    [<c0630918>] clk_get_sys+0x74/0x144
    [<c0630cdc>] clk_get+0x5c/0x68
    [<bf0ac540>] ti_tscadc_probe+0x260/0x468 [ti_am335x_tscadc]
    [<c06f3c0c>] platform_drv_probe+0x60/0xac
    [<c06f1abc>] driver_probe_device+0x214/0x2dc
    [<c06f1c18>] __driver_attach+0x94/0xc0
    [<c06efe2c>] bus_for_each_dev+0x90/0xa0
    [<c06f1470>] driver_attach+0x28/0x30
    [<c06f1030>] bus_add_driver+0x184/0x1ec
    [<c06f2b74>] driver_register+0xb0/0xf0
    [<c06f3b4c>] __platform_driver_register+0x40/0x54

Signed-off-by: Zumeng Chen <zumeng.chen@gmail.com>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/mfd/ti_am335x_tscadc.c |    3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

--- a/drivers/mfd/ti_am335x_tscadc.c
+++ b/drivers/mfd/ti_am335x_tscadc.c
@@ -210,14 +210,13 @@ static	int ti_tscadc_probe(struct platfo
 	 * The TSC_ADC_SS controller design assumes the OCP clock is
 	 * at least 6x faster than the ADC clock.
 	 */
-	clk = clk_get(&pdev->dev, "adc_tsc_fck");
+	clk = devm_clk_get(&pdev->dev, "adc_tsc_fck");
 	if (IS_ERR(clk)) {
 		dev_err(&pdev->dev, "failed to get TSC fck\n");
 		err = PTR_ERR(clk);
 		goto err_disable_clk;
 	}
 	clock_rate = clk_get_rate(clk);
-	clk_put(clk);
 	tscadc->clk_div = clock_rate / ADC_CLK;
 
 	/* TSCADC_CLKDIV needs to be configured to the value minus 1 */




* [PATCH 4.14 087/126] f2fs: fix to do sanity check with {sit,nat}_ver_bitmap_bytesize
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (85 preceding siblings ...)
  2018-09-17 22:42 ` [PATCH 4.14 086/126] mfd: ti_am335x_tscadc: Fix struct clk memory leak Greg Kroah-Hartman
@ 2018-09-17 22:42 ` Greg Kroah-Hartman
  2018-09-17 22:42 ` [PATCH 4.14 088/126] NFSv4.1: Fix a potential layoutget/layoutrecall deadlock Greg Kroah-Hartman
                   ` (41 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:42 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Wen Xu, Chao Yu, Jaegeuk Kim, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Chao Yu <yuchao0@huawei.com>

[ Upstream commit c77ec61ca0a49544ca81881cc5d5529858f7e196 ]

This patch adds a sanity check on {sit,nat}_ver_bitmap_bytesize
during mount, in order to avoid accessing memory across the cache
boundary when the bitmap size is abnormal.

- Overview
buffer overrun in build_sit_info() when mounting a crafted f2fs image

- Reproduce

- Kernel message
[  548.580867] F2FS-fs (loop0): Invalid log blocks per segment (8201)

[  548.580877] F2FS-fs (loop0): Can't find valid F2FS filesystem in 1th superblock
[  548.584979] ==================================================================
[  548.586568] BUG: KASAN: use-after-free in kmemdup+0x36/0x50
[  548.587715] Read of size 64 at addr ffff8801e9c265ff by task mount/1295

[  548.589428] CPU: 1 PID: 1295 Comm: mount Not tainted 4.18.0-rc1+ #4
[  548.589432] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
[  548.589438] Call Trace:
[  548.589474]  dump_stack+0x7b/0xb5
[  548.589487]  print_address_description+0x70/0x290
[  548.589492]  kasan_report+0x291/0x390
[  548.589496]  ? kmemdup+0x36/0x50
[  548.589509]  check_memory_region+0x139/0x190
[  548.589514]  memcpy+0x23/0x50
[  548.589518]  kmemdup+0x36/0x50
[  548.589545]  f2fs_build_segment_manager+0x8fa/0x3410
[  548.589551]  ? __asan_loadN+0xf/0x20
[  548.589560]  ? f2fs_sanity_check_ckpt+0x1be/0x240
[  548.589566]  ? f2fs_flush_sit_entries+0x10c0/0x10c0
[  548.589587]  ? __put_user_ns+0x40/0x40
[  548.589604]  ? find_next_bit+0x57/0x90
[  548.589610]  f2fs_fill_super+0x194b/0x2b40
[  548.589617]  ? f2fs_commit_super+0x1b0/0x1b0
[  548.589637]  ? set_blocksize+0x90/0x140
[  548.589651]  mount_bdev+0x1c5/0x210
[  548.589655]  ? f2fs_commit_super+0x1b0/0x1b0
[  548.589667]  f2fs_mount+0x15/0x20
[  548.589672]  mount_fs+0x60/0x1a0
[  548.589683]  ? alloc_vfsmnt+0x309/0x360
[  548.589688]  vfs_kern_mount+0x6b/0x1a0
[  548.589699]  do_mount+0x34a/0x18c0
[  548.589710]  ? lockref_put_or_lock+0xcf/0x160
[  548.589716]  ? copy_mount_string+0x20/0x20
[  548.589728]  ? memcg_kmem_put_cache+0x1b/0xa0
[  548.589734]  ? kasan_check_write+0x14/0x20
[  548.589740]  ? _copy_from_user+0x6a/0x90
[  548.589744]  ? memdup_user+0x42/0x60
[  548.589750]  ksys_mount+0x83/0xd0
[  548.589755]  __x64_sys_mount+0x67/0x80
[  548.589781]  do_syscall_64+0x78/0x170
[  548.589797]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  548.589820] RIP: 0033:0x7f76fc331b9a
[  548.589821] Code: 48 8b 0d 01 c3 2b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d ce c2 2b 00 f7 d8 64 89 01 48
[  548.589880] RSP: 002b:00007ffd4f0a0e48 EFLAGS: 00000206 ORIG_RAX: 00000000000000a5
[  548.589890] RAX: ffffffffffffffda RBX: 000000000146c030 RCX: 00007f76fc331b9a
[  548.589892] RDX: 000000000146c210 RSI: 000000000146df30 RDI: 0000000001474ec0
[  548.589895] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000013
[  548.589897] R10: 00000000c0ed0000 R11: 0000000000000206 R12: 0000000001474ec0
[  548.589900] R13: 000000000146c210 R14: 0000000000000000 R15: 0000000000000003

[  548.590242] The buggy address belongs to the page:
[  548.591243] page:ffffea0007a70980 count:0 mapcount:0 mapping:0000000000000000 index:0x0
[  548.592886] flags: 0x2ffff0000000000()
[  548.593665] raw: 02ffff0000000000 dead000000000100 dead000000000200 0000000000000000
[  548.595258] raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
[  548.603713] page dumped because: kasan: bad access detected

[  548.605203] Memory state around the buggy address:
[  548.606198]  ffff8801e9c26480: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
[  548.607676]  ffff8801e9c26500: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
[  548.609157] >ffff8801e9c26580: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
[  548.610629]                                                                 ^
[  548.612088]  ffff8801e9c26600: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
[  548.613674]  ffff8801e9c26680: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
[  548.615141] ==================================================================
[  548.616613] Disabling lock debugging due to kernel taint
[  548.622871] WARNING: CPU: 1 PID: 1295 at mm/page_alloc.c:4065 __alloc_pages_slowpath+0xe4a/0x1420
[  548.622878] Modules linked in: snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hwdep snd_hda_core snd_pcm snd_timer snd mac_hid i2c_piix4 soundcore ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx raid1 raid0 multipath linear 8139too crct10dif_pclmul crc32_pclmul qxl drm_kms_helper syscopyarea aesni_intel sysfillrect sysimgblt fb_sys_fops ttm drm aes_x86_64 crypto_simd cryptd 8139cp glue_helper mii pata_acpi floppy
[  548.623217] CPU: 1 PID: 1295 Comm: mount Tainted: G    B             4.18.0-rc1+ #4
[  548.623219] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
[  548.623226] RIP: 0010:__alloc_pages_slowpath+0xe4a/0x1420
[  548.623227] Code: ff ff 01 89 85 c8 fe ff ff e9 91 fc ff ff 41 89 c5 e9 5c fc ff ff 0f 0b 89 f8 25 ff ff f7 ff 89 85 8c fe ff ff e9 d5 f2 ff ff <0f> 0b e9 65 f2 ff ff 65 8b 05 38 81 d2 47 f6 c4 01 74 1c 65 48 8b
[  548.623281] RSP: 0018:ffff8801f28c7678 EFLAGS: 00010246
[  548.623284] RAX: 0000000000000000 RBX: 00000000006040c0 RCX: ffffffffb82f73b7
[  548.623287] RDX: 1ffff1003e518eeb RSI: 000000000000000c RDI: 0000000000000000
[  548.623290] RBP: ffff8801f28c7880 R08: 0000000000000000 R09: ffffed0047fff2c5
[  548.623292] R10: 0000000000000001 R11: ffffed0047fff2c4 R12: ffff8801e88de040
[  548.623295] R13: 00000000006040c0 R14: 000000000000000c R15: ffff8801f28c7938
[  548.623299] FS:  00007f76fca51840(0000) GS:ffff8801f6f00000(0000) knlGS:0000000000000000
[  548.623302] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  548.623304] CR2: 00007f19b9171760 CR3: 00000001ed952000 CR4: 00000000000006e0
[  548.623317] Call Trace:
[  548.623325]  ? kasan_check_read+0x11/0x20
[  548.623330]  ? __zone_watermark_ok+0x92/0x240
[  548.623336]  ? get_page_from_freelist+0x1c3/0x1d90
[  548.623347]  ? _raw_spin_lock_irqsave+0x2a/0x60
[  548.623353]  ? warn_alloc+0x250/0x250
[  548.623358]  ? save_stack+0x46/0xd0
[  548.623361]  ? kasan_kmalloc+0xad/0xe0
[  548.623366]  ? __isolate_free_page+0x2a0/0x2a0
[  548.623370]  ? mount_fs+0x60/0x1a0
[  548.623374]  ? vfs_kern_mount+0x6b/0x1a0
[  548.623378]  ? do_mount+0x34a/0x18c0
[  548.623383]  ? ksys_mount+0x83/0xd0
[  548.623387]  ? __x64_sys_mount+0x67/0x80
[  548.623391]  ? do_syscall_64+0x78/0x170
[  548.623396]  ? entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  548.623401]  __alloc_pages_nodemask+0x3c5/0x400
[  548.623407]  ? __alloc_pages_slowpath+0x1420/0x1420
[  548.623412]  ? __mutex_lock_slowpath+0x20/0x20
[  548.623417]  ? kvmalloc_node+0x31/0x80
[  548.623424]  alloc_pages_current+0x75/0x110
[  548.623436]  kmalloc_order+0x24/0x60
[  548.623442]  kmalloc_order_trace+0x24/0xb0
[  548.623448]  __kmalloc_track_caller+0x207/0x220
[  548.623455]  ? f2fs_build_node_manager+0x399/0xbb0
[  548.623460]  kmemdup+0x20/0x50
[  548.623465]  f2fs_build_node_manager+0x399/0xbb0
[  548.623470]  f2fs_fill_super+0x195e/0x2b40
[  548.623477]  ? f2fs_commit_super+0x1b0/0x1b0
[  548.623481]  ? set_blocksize+0x90/0x140
[  548.623486]  mount_bdev+0x1c5/0x210
[  548.623489]  ? f2fs_commit_super+0x1b0/0x1b0
[  548.623495]  f2fs_mount+0x15/0x20
[  548.623498]  mount_fs+0x60/0x1a0
[  548.623503]  ? alloc_vfsmnt+0x309/0x360
[  548.623508]  vfs_kern_mount+0x6b/0x1a0
[  548.623513]  do_mount+0x34a/0x18c0
[  548.623518]  ? lockref_put_or_lock+0xcf/0x160
[  548.623523]  ? copy_mount_string+0x20/0x20
[  548.623528]  ? memcg_kmem_put_cache+0x1b/0xa0
[  548.623533]  ? kasan_check_write+0x14/0x20
[  548.623537]  ? _copy_from_user+0x6a/0x90
[  548.623542]  ? memdup_user+0x42/0x60
[  548.623547]  ksys_mount+0x83/0xd0
[  548.623552]  __x64_sys_mount+0x67/0x80
[  548.623557]  do_syscall_64+0x78/0x170
[  548.623562]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  548.623566] RIP: 0033:0x7f76fc331b9a
[  548.623567] Code: 48 8b 0d 01 c3 2b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d ce c2 2b 00 f7 d8 64 89 01 48
[  548.623632] RSP: 002b:00007ffd4f0a0e48 EFLAGS: 00000206 ORIG_RAX: 00000000000000a5
[  548.623636] RAX: ffffffffffffffda RBX: 000000000146c030 RCX: 00007f76fc331b9a
[  548.623639] RDX: 000000000146c210 RSI: 000000000146df30 RDI: 0000000001474ec0
[  548.623641] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000013
[  548.623643] R10: 00000000c0ed0000 R11: 0000000000000206 R12: 0000000001474ec0
[  548.623646] R13: 000000000146c210 R14: 0000000000000000 R15: 0000000000000003
[  548.623650] ---[ end trace 4ce02f25ff7d3df5 ]---
[  548.623656] F2FS-fs (loop0): Failed to initialize F2FS node manager
[  548.627936] F2FS-fs (loop0): Invalid log blocks per segment (8201)

[  548.627940] F2FS-fs (loop0): Can't find valid F2FS filesystem in 1th superblock
[  548.635835] F2FS-fs (loop0): Failed to initialize F2FS node manager

- Location
https://elixir.bootlin.com/linux/v4.18-rc1/source/fs/f2fs/segment.c#L3578

	sit_i->sit_bitmap = kmemdup(src_bitmap, bitmap_size, GFP_KERNEL);

Buffer overrun happens when doing memcpy. I suspect there are missing (or
inconsistent) checks on bitmap_size.
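
The check this patch adds can be sketched in isolation. The following is a
minimal userspace sketch, not the kernel code: the helper names
expected_bitmap_bytes() and bitmap_sizes_ok() are hypothetical, and it assumes
log_blocks_per_seg has already been validated (in particular that it is
smaller than 32), as the earlier superblock checks are expected to do.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Expected version-bitmap size in bytes for a SIT or NAT area, using the
 * same expression the patch adds to sanity_check_ckpt():
 * (segs / 2) segments, one bit per block, 8 bits per byte.
 * Assumption: log_blocks_per_seg < 32, validated by the caller.
 */
static uint32_t expected_bitmap_bytes(uint32_t segs, uint32_t log_blocks_per_seg)
{
	return ((segs / 2) << log_blocks_per_seg) / 8;
}

/* Returns true only if both on-disk sizes equal the computed sizes. */
static bool bitmap_sizes_ok(uint32_t sit_segs, uint32_t nat_segs,
			    uint32_t log_blocks_per_seg,
			    uint32_t sit_bitmap_size, uint32_t nat_bitmap_size)
{
	return sit_bitmap_size ==
		       expected_bitmap_bytes(sit_segs, log_blocks_per_seg) &&
	       nat_bitmap_size ==
		       expected_bitmap_bytes(nat_segs, log_blocks_per_seg);
}
```

With 4 SIT segments and log_blocks_per_seg of 9, the expected size is
((4 / 2) << 9) / 8 = 128 bytes; any other value read from the checkpoint
would be rejected before kmemdup() is asked to copy that many bytes.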

Reported by Wen Xu (wen.xu@gatech.edu) from SSLab, Gatech.

Reported-by: Wen Xu <wen.xu@gatech.edu>
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 fs/f2fs/super.c |   21 +++++++++++++++++++--
 1 file changed, 19 insertions(+), 2 deletions(-)

--- a/fs/f2fs/super.c
+++ b/fs/f2fs/super.c
@@ -1883,12 +1883,17 @@ int sanity_check_ckpt(struct f2fs_sb_inf
 	struct f2fs_checkpoint *ckpt = F2FS_CKPT(sbi);
 	unsigned int ovp_segments, reserved_segments;
 	unsigned int main_segs, blocks_per_seg;
+	unsigned int sit_segs, nat_segs;
+	unsigned int sit_bitmap_size, nat_bitmap_size;
+	unsigned int log_blocks_per_seg;
 	int i;
 
 	total = le32_to_cpu(raw_super->segment_count);
 	fsmeta = le32_to_cpu(raw_super->segment_count_ckpt);
-	fsmeta += le32_to_cpu(raw_super->segment_count_sit);
-	fsmeta += le32_to_cpu(raw_super->segment_count_nat);
+	sit_segs = le32_to_cpu(raw_super->segment_count_sit);
+	fsmeta += sit_segs;
+	nat_segs = le32_to_cpu(raw_super->segment_count_nat);
+	fsmeta += nat_segs;
 	fsmeta += le32_to_cpu(ckpt->rsvd_segment_count);
 	fsmeta += le32_to_cpu(raw_super->segment_count_ssa);
 
@@ -1919,6 +1924,18 @@ int sanity_check_ckpt(struct f2fs_sb_inf
 			return 1;
 	}
 
+	sit_bitmap_size = le32_to_cpu(ckpt->sit_ver_bitmap_bytesize);
+	nat_bitmap_size = le32_to_cpu(ckpt->nat_ver_bitmap_bytesize);
+	log_blocks_per_seg = le32_to_cpu(raw_super->log_blocks_per_seg);
+
+	if (sit_bitmap_size != ((sit_segs / 2) << log_blocks_per_seg) / 8 ||
+		nat_bitmap_size != ((nat_segs / 2) << log_blocks_per_seg) / 8) {
+		f2fs_msg(sbi->sb, KERN_ERR,
+			"Wrong bitmap size: sit: %u, nat:%u",
+			sit_bitmap_size, nat_bitmap_size);
+		return 1;
+	}
+
 	if (unlikely(f2fs_cp_error(sbi))) {
 		f2fs_msg(sbi->sb, KERN_ERR, "A bug case: need to run fsck");
 		return 1;




* [PATCH 4.14 088/126] NFSv4.1: Fix a potential layoutget/layoutrecall deadlock
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (86 preceding siblings ...)
  2018-09-17 22:42 ` [PATCH 4.14 087/126] f2fs: fix to do sanity check with {sit,nat}_ver_bitmap_bytesize Greg Kroah-Hartman
@ 2018-09-17 22:42 ` Greg Kroah-Hartman
  2018-09-17 22:42 ` [PATCH 4.14 089/126] MIPS: WARN_ON invalid DMA cache maintenance, not BUG_ON Greg Kroah-Hartman
                   ` (40 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:42 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Trond Myklebust, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Trond Myklebust <trond.myklebust@hammerspace.com>

[ Upstream commit bd3d16a887b0c19a2a20d35ffed499e3a3637feb ]

If the client is sending a layoutget, but the server issues a callback
to recall what it thinks may be an outstanding layout, then we may find
an uninitialised layout attached to the inode due to the layoutget.
In that case, it is appropriate to return NFS4ERR_NOMATCHING_LAYOUT
rather than NFS4ERR_DELAY, as the latter can end up deadlocking.
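
The behavioural change can be illustrated with a reduced sketch. Everything
here is a stand-in chosen for illustration: the enum values are not the
protocol's numeric codes, and struct layout_hdr is a hypothetical
simplification that keeps only the validity flag.

```c
#include <stdbool.h>

/* Stand-in status codes for this sketch; not the kernel's numeric values. */
enum cb_status {
	CB_OK,
	CB_ERR_DELAY,
	CB_ERR_NOMATCHING_LAYOUT,
};

/* Hypothetical, reduced layout header: only the validity flag matters. */
struct layout_hdr {
	bool valid;
};

/* Before the patch: an uninitialised layout asks the server to retry. */
static enum cb_status check_recall_before(const struct layout_hdr *lo)
{
	if (!lo->valid)
		return CB_ERR_DELAY;
	return CB_OK;
}

/* After the patch: an uninitialised layout reports "nothing to recall". */
static enum cb_status check_recall_after(const struct layout_hdr *lo)
{
	if (!lo->valid)
		return CB_ERR_NOMATCHING_LAYOUT;
	return CB_OK;
}
```

The point of the sketch is only that the same condition now yields a final
answer instead of a retry request, so the recall does not wait on the
layoutget that is itself still in flight.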

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 fs/nfs/callback_proc.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/fs/nfs/callback_proc.c
+++ b/fs/nfs/callback_proc.c
@@ -213,9 +213,9 @@ static u32 pnfs_check_callback_stateid(s
 {
 	u32 oldseq, newseq;
 
-	/* Is the stateid still not initialised? */
+	/* Is the stateid not initialised? */
 	if (!pnfs_layout_is_valid(lo))
-		return NFS4ERR_DELAY;
+		return NFS4ERR_NOMATCHING_LAYOUT;
 
 	/* Mismatched stateid? */
 	if (!nfs4_stateid_match_other(&lo->plh_stateid, new))




* [PATCH 4.14 089/126] MIPS: WARN_ON invalid DMA cache maintenance, not BUG_ON
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (87 preceding siblings ...)
  2018-09-17 22:42 ` [PATCH 4.14 088/126] NFSv4.1: Fix a potential layoutget/layoutrecall deadlock Greg Kroah-Hartman
@ 2018-09-17 22:42 ` Greg Kroah-Hartman
  2018-09-17 22:42 ` [PATCH 4.14 090/126] RDMA/cma: Do not ignore net namespace for unbound cm_id Greg Kroah-Hartman
                   ` (39 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:42 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Paul Burton, Florian Fainelli,
	Ralf Baechle, linux-mips, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Paul Burton <paul.burton@imgtec.com>

[ Upstream commit d4da0e97baea8768b3d66ccef3967bebd50dfc3b ]

If a driver causes DMA cache maintenance with a zero length then we
currently BUG and kill the kernel. As this is a scenario that we may
well be able to recover from, WARN and return in that case instead.
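
The "warn and bail out" pattern can be tried in userspace. This is a sketch
under stated assumptions: warn_on() is a hypothetical stand-in for the
kernel's WARN_ON() (it prints and returns the condition), and the counter
cache_ops_done only simulates the cache maintenance work.

```c
#include <stdbool.h>
#include <stdio.h>

static int cache_ops_done;	/* counts simulated cache maintenance calls */

/* Userspace stand-in for WARN_ON(): report the condition and return it. */
static bool warn_on(bool cond, const char *what)
{
	if (cond)
		fprintf(stderr, "WARNING: %s\n", what);
	return cond;
}

static void dma_cache_wback_inv_sketch(unsigned long addr, unsigned long size)
{
	(void)addr;

	/* Catch bad driver code, but keep the system running. */
	if (warn_on(size == 0, "zero-length DMA cache maintenance"))
		return;

	cache_ops_done++;	/* the real function would flush caches here */
}
```

A zero-length call now leaves a warning behind and does nothing else, while a
call with a non-zero size proceeds as before.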

Signed-off-by: Paul Burton <paul.burton@mips.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Patchwork: https://patchwork.linux-mips.org/patch/14623/
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/mips/mm/c-r4k.c |    6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

--- a/arch/mips/mm/c-r4k.c
+++ b/arch/mips/mm/c-r4k.c
@@ -835,7 +835,8 @@ static void r4k_flush_icache_user_range(
 static void r4k_dma_cache_wback_inv(unsigned long addr, unsigned long size)
 {
 	/* Catch bad driver code */
-	BUG_ON(size == 0);
+	if (WARN_ON(size == 0))
+		return;
 
 	preempt_disable();
 	if (cpu_has_inclusive_pcaches) {
@@ -871,7 +872,8 @@ static void r4k_dma_cache_wback_inv(unsi
 static void r4k_dma_cache_inv(unsigned long addr, unsigned long size)
 {
 	/* Catch bad driver code */
-	BUG_ON(size == 0);
+	if (WARN_ON(size == 0))
+		return;
 
 	preempt_disable();
 	if (cpu_has_inclusive_pcaches) {




* [PATCH 4.14 090/126] RDMA/cma: Do not ignore net namespace for unbound cm_id
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (88 preceding siblings ...)
  2018-09-17 22:42 ` [PATCH 4.14 089/126] MIPS: WARN_ON invalid DMA cache maintenance, not BUG_ON Greg Kroah-Hartman
@ 2018-09-17 22:42 ` Greg Kroah-Hartman
  2018-09-17 22:42 ` [PATCH 4.14 091/126] drm/i915: set DP Main Stream Attribute for color range on DDI platforms Greg Kroah-Hartman
                   ` (38 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:42 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Parav Pandit, Daniel Jurgens,
	Leon Romanovsky, Jason Gunthorpe, Sasha Levin

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Parav Pandit <parav@mellanox.com>

[ Upstream commit 643d213a9a034fa04f5575a40dfc8548e33ce04f ]

Currently, if the cm_id is not bound to any netdevice, then the net
namespace is ignored for that cm_id, which is incorrect.

Regardless of whether the cm_id is bound to a netdevice, the net
namespace must match. When a cm_id is bound to a netdevice, both the
net namespace and the netdevice must match.
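
The old and new matching rules can be compared side by side. This is a
sketch, not the kernel function: match_old() and match_new() are hypothetical
names, the namespace comparison is reduced to a boolean, and it assumes a
real netdevice never has ifindex 0.

```c
#include <stdbool.h>

/* Old rule: an unbound listener (bound_dev_if == 0) matched any namespace. */
static bool match_old(bool same_netns, int bound_dev_if, int ifindex)
{
	return !bound_dev_if || (same_netns && bound_dev_if == ifindex);
}

/*
 * New rule: namespaces must always match; a listener bound to a device
 * must also match that device. Assumes ifindex != 0 for real devices.
 */
static bool match_new(bool same_netns, int bound_dev_if, int ifindex)
{
	return same_netns &&
	       (!!bound_dev_if == (bound_dev_if == ifindex));
}
```

The only input on which the two rules disagree is an unbound listener in a
different namespace: match_old(false, 0, 3) accepts it, match_new(false, 0, 3)
does not.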

Fixes: 4c21b5bcef73 ("IB/cma: Add net_dev and private data checks to RDMA CM")
Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/infiniband/core/cma.c |   13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -1459,9 +1459,16 @@ static bool cma_match_net_dev(const stru
 		       (addr->src_addr.ss_family == AF_IB ||
 			cma_protocol_roce_dev_port(id->device, port_num));
 
-	return !addr->dev_addr.bound_dev_if ||
-	       (net_eq(dev_net(net_dev), addr->dev_addr.net) &&
-		addr->dev_addr.bound_dev_if == net_dev->ifindex);
+	/*
+	 * Net namespaces must match, and if the listner is listening
+	 * on a specific netdevice than netdevice must match as well.
+	 */
+	if (net_eq(dev_net(net_dev), addr->dev_addr.net) &&
+	    (!!addr->dev_addr.bound_dev_if ==
+	     (addr->dev_addr.bound_dev_if == net_dev->ifindex)))
+		return true;
+	else
+		return false;
 }
 
 static struct rdma_id_private *cma_find_listener(




* [PATCH 4.14 091/126] drm/i915: set DP Main Stream Attribute for color range on DDI platforms
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (89 preceding siblings ...)
  2018-09-17 22:42 ` [PATCH 4.14 090/126] RDMA/cma: Do not ignore net namespace for unbound cm_id Greg Kroah-Hartman
@ 2018-09-17 22:42 ` Greg Kroah-Hartman
  2018-09-17 22:42 ` [PATCH 4.14 092/126] inet: frags: change inet_frags_init_net() return value Greg Kroah-Hartman
                   ` (37 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:42 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Michał Kopeć, N. W.,
	Nicholas Stommel, Tom Yan, Paulo Zanoni, Rodrigo Vivi,
	Ville Syrjälä,
	Jani Nikula

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Jani Nikula <jani.nikula@intel.com>

commit 6209c285e7a5e68dbcdf8fd2456c6dd68433806b upstream.

Since Haswell we have no color range indication either in the pipe or
port registers for DP. Instead, there's a separate register for setting
the DP Main Stream Attributes (MSA) directly. The MSA register
definition makes no references to colorimetry, just a vague reference to
the DP spec. The connection to the color range was lost.

Apparently we've failed to set the proper MSA bit for limited, or CEA,
range ever since the first DDI platforms. We've started setting other
MSA parameters since commit dae847991a43 ("drm/i915: add
intel_ddi_set_pipe_settings").

Without the crucial bit of information, the DP sink has no way of
knowing the source is actually transmitting limited range RGB, leading
to "washed out" colors. With the colorimetry information, compliant
sinks should be able to handle the limited range properly. Native
(i.e. non-LSPCON) HDMI was not affected because we do pass the color
range via AVI infoframes.

Though not the root cause, the problem was made worse for DDI platforms
with commit 55bc60db5988 ("drm/i915: Add "Automatic" mode for the
"Broadcast RGB" property"), which selects limited range RGB
automatically based on the mode, as per the DP, HDMI and CEA specs.

After all these years, the fix boils down to flipping one bit.
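
As a rough illustration of that one bit, here is a standalone sketch of how
the MSA value is assembled. Only MSA_CEA_RANGE mirrors a value visible in this
patch (TRANS_MSA_CEA_RANGE, 1<<3); the bit positions used for MSA_SYNC_CLK and
MSA_8_BPC are assumptions made for the example, and build_msa() is a
hypothetical helper.

```c
#include <stdbool.h>
#include <stdint.h>

#define MSA_SYNC_CLK	(1u << 0)	/* assumed bit position, illustration only */
#define MSA_8_BPC	(1u << 5)	/* assumed, following the (n<<5) pattern */
#define MSA_CEA_RANGE	(1u << 3)	/* same bit as TRANS_MSA_CEA_RANGE */

/* Build the MSA word for an 8 bpc stream, optionally flagging CEA range. */
static uint32_t build_msa(bool limited_color_range)
{
	uint32_t temp = MSA_SYNC_CLK;

	if (limited_color_range)
		temp |= MSA_CEA_RANGE;

	temp |= MSA_8_BPC;	/* the driver derives this from pipe_bpp */
	return temp;
}
```

The two results differ in bit 3 only, which is the information a compliant
sink needs to interpret the stream as limited range.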

[Per testing reports, this fixes DP sinks, but not the LSPCON. My
 educated guess is that the LSPCON fails to turn the CEA range MSA into
 AVI infoframes for HDMI.]

Reported-by: Michał Kopeć <mkopec12@gmail.com>
Reported-by: N. W. <nw9165-3201@yahoo.com>
Reported-by: Nicholas Stommel <nicholas.stommel@gmail.com>
Reported-by: Tom Yan <tom.ty89@gmail.com>
Tested-by: Nicholas Stommel <nicholas.stommel@gmail.com>
References: https://bugs.freedesktop.org/show_bug.cgi?id=100023
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107476
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=94921
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: <stable@vger.kernel.org> # v3.9+
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180814060001.18224-1-jani.nikula@intel.com
(cherry picked from commit dc5977da99ea28094b8fa4e9bacbd29bedc41de5)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/gpu/drm/i915/i915_reg.h  |    1 +
 drivers/gpu/drm/i915/intel_ddi.c |    4 ++++
 2 files changed, 5 insertions(+)

--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -8462,6 +8462,7 @@ enum skl_power_gate {
 #define  TRANS_MSA_10_BPC		(2<<5)
 #define  TRANS_MSA_12_BPC		(3<<5)
 #define  TRANS_MSA_16_BPC		(4<<5)
+#define  TRANS_MSA_CEA_RANGE		(1<<3)
 
 /* LCPLL Control */
 #define LCPLL_CTL			_MMIO(0x130040)
--- a/drivers/gpu/drm/i915/intel_ddi.c
+++ b/drivers/gpu/drm/i915/intel_ddi.c
@@ -1396,6 +1396,10 @@ void intel_ddi_set_pipe_settings(const s
 		WARN_ON(transcoder_is_dsi(cpu_transcoder));
 
 		temp = TRANS_MSA_SYNC_CLK;
+
+		if (crtc_state->limited_color_range)
+			temp |= TRANS_MSA_CEA_RANGE;
+
 		switch (crtc_state->pipe_bpp) {
 		case 18:
 			temp |= TRANS_MSA_6_BPC;




* [PATCH 4.14 092/126] inet: frags: change inet_frags_init_net() return value
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (90 preceding siblings ...)
  2018-09-17 22:42 ` [PATCH 4.14 091/126] drm/i915: set DP Main Stream Attribute for color range on DDI platforms Greg Kroah-Hartman
@ 2018-09-17 22:42 ` Greg Kroah-Hartman
  2018-09-17 22:42 ` [PATCH 4.14 093/126] inet: frags: add a pointer to struct netns_frags Greg Kroah-Hartman
                   ` (36 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:42 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Eric Dumazet, David S. Miller

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Eric Dumazet <edumazet@google.com>

We will soon initialize one rhashtable per struct netns_frags
in inet_frags_init_net().

This patch changes the return value to eventually propagate an
error.
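
The resulting init/rollback shape in each per-netns init function can be
sketched as below. All names are hypothetical stand-ins (frags_init() for
inet_frags_init_net(), frags_exit() for inet_frags_exit_net(),
sysctl_register() for the per-protocol sysctl registration), and
sysctl_should_fail is a knob that exists only so the failure path can be
exercised.

```c
#include <errno.h>

struct frags_state {
	int initialized;
};

static int sysctl_should_fail;	/* test knob for this sketch only */

static int frags_init(struct frags_state *fs)
{
	fs->initialized = 1;
	return 0;	/* could become a negative errno once it allocates */
}

static void frags_exit(struct frags_state *fs)
{
	fs->initialized = 0;
}

static int sysctl_register(void)
{
	return sysctl_should_fail ? -ENOMEM : 0;
}

static int frags_init_net(struct frags_state *fs)
{
	int res;

	res = frags_init(fs);
	if (res < 0)
		return res;
	res = sysctl_register();
	if (res < 0)
		frags_exit(fs);	/* undo step one if step two fails */
	return res;
}
```

Today the first step cannot fail, so the early return is dead code; the
structure is there so that a later allocation failure can be propagated
without touching the callers again.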

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 787bea7748a76130566f881c2342a0be4127d182)
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 include/net/inet_frag.h                 |    3 ++-
 net/ieee802154/6lowpan/reassembly.c     |   11 ++++++++---
 net/ipv4/ip_fragment.c                  |   12 +++++++++---
 net/ipv6/netfilter/nf_conntrack_reasm.c |   12 +++++++++---
 net/ipv6/reassembly.c                   |   11 +++++++++--
 5 files changed, 37 insertions(+), 12 deletions(-)

--- a/include/net/inet_frag.h
+++ b/include/net/inet_frag.h
@@ -104,9 +104,10 @@ struct inet_frags {
 int inet_frags_init(struct inet_frags *);
 void inet_frags_fini(struct inet_frags *);
 
-static inline void inet_frags_init_net(struct netns_frags *nf)
+static inline int inet_frags_init_net(struct netns_frags *nf)
 {
 	atomic_set(&nf->mem, 0);
+	return 0;
 }
 void inet_frags_exit_net(struct netns_frags *nf, struct inet_frags *f);
 
--- a/net/ieee802154/6lowpan/reassembly.c
+++ b/net/ieee802154/6lowpan/reassembly.c
@@ -580,14 +580,19 @@ static int __net_init lowpan_frags_init_
 {
 	struct netns_ieee802154_lowpan *ieee802154_lowpan =
 		net_ieee802154_lowpan(net);
+	int res;
 
 	ieee802154_lowpan->frags.high_thresh = IPV6_FRAG_HIGH_THRESH;
 	ieee802154_lowpan->frags.low_thresh = IPV6_FRAG_LOW_THRESH;
 	ieee802154_lowpan->frags.timeout = IPV6_FRAG_TIMEOUT;
 
-	inet_frags_init_net(&ieee802154_lowpan->frags);
-
-	return lowpan_frags_ns_sysctl_register(net);
+	res = inet_frags_init_net(&ieee802154_lowpan->frags);
+	if (res < 0)
+		return res;
+	res = lowpan_frags_ns_sysctl_register(net);
+	if (res < 0)
+		inet_frags_exit_net(&ieee802154_lowpan->frags, &lowpan_frags);
+	return res;
 }
 
 static void __net_exit lowpan_frags_exit_net(struct net *net)
--- a/net/ipv4/ip_fragment.c
+++ b/net/ipv4/ip_fragment.c
@@ -850,6 +850,8 @@ static void __init ip4_frags_ctl_registe
 
 static int __net_init ipv4_frags_init_net(struct net *net)
 {
+	int res;
+
 	/* Fragment cache limits.
 	 *
 	 * The fragment memory accounting code, (tries to) account for
@@ -875,9 +877,13 @@ static int __net_init ipv4_frags_init_ne
 
 	net->ipv4.frags.max_dist = 64;
 
-	inet_frags_init_net(&net->ipv4.frags);
-
-	return ip4_frags_ns_ctl_register(net);
+	res = inet_frags_init_net(&net->ipv4.frags);
+	if (res < 0)
+		return res;
+	res = ip4_frags_ns_ctl_register(net);
+	if (res < 0)
+		inet_frags_exit_net(&net->ipv4.frags, &ip4_frags);
+	return res;
 }
 
 static void __net_exit ipv4_frags_exit_net(struct net *net)
--- a/net/ipv6/netfilter/nf_conntrack_reasm.c
+++ b/net/ipv6/netfilter/nf_conntrack_reasm.c
@@ -630,12 +630,18 @@ EXPORT_SYMBOL_GPL(nf_ct_frag6_gather);
 
 static int nf_ct_net_init(struct net *net)
 {
+	int res;
+
 	net->nf_frag.frags.high_thresh = IPV6_FRAG_HIGH_THRESH;
 	net->nf_frag.frags.low_thresh = IPV6_FRAG_LOW_THRESH;
 	net->nf_frag.frags.timeout = IPV6_FRAG_TIMEOUT;
-	inet_frags_init_net(&net->nf_frag.frags);
-
-	return nf_ct_frag6_sysctl_register(net);
+	res = inet_frags_init_net(&net->nf_frag.frags);
+	if (res < 0)
+		return res;
+	res = nf_ct_frag6_sysctl_register(net);
+	if (res < 0)
+		inet_frags_exit_net(&net->nf_frag.frags, &nf_frags);
+	return res;
 }
 
 static void nf_ct_net_exit(struct net *net)
--- a/net/ipv6/reassembly.c
+++ b/net/ipv6/reassembly.c
@@ -714,13 +714,20 @@ static void ip6_frags_sysctl_unregister(
 
 static int __net_init ipv6_frags_init_net(struct net *net)
 {
+	int res;
+
 	net->ipv6.frags.high_thresh = IPV6_FRAG_HIGH_THRESH;
 	net->ipv6.frags.low_thresh = IPV6_FRAG_LOW_THRESH;
 	net->ipv6.frags.timeout = IPV6_FRAG_TIMEOUT;
 
-	inet_frags_init_net(&net->ipv6.frags);
+	res = inet_frags_init_net(&net->ipv6.frags);
+	if (res < 0)
+		return res;
 
-	return ip6_frags_ns_sysctl_register(net);
+	res = ip6_frags_ns_sysctl_register(net);
+	if (res < 0)
+		inet_frags_exit_net(&net->ipv6.frags, &ip6_frags);
+	return res;
 }
 
 static void __net_exit ipv6_frags_exit_net(struct net *net)




* [PATCH 4.14 093/126] inet: frags: add a pointer to struct netns_frags
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (91 preceding siblings ...)
  2018-09-17 22:42 ` [PATCH 4.14 092/126] inet: frags: change inet_frags_init_net() return value Greg Kroah-Hartman
@ 2018-09-17 22:42 ` Greg Kroah-Hartman
  2018-09-17 22:42 ` [PATCH 4.14 094/126] inet: frags: refactor ipfrag_init() Greg Kroah-Hartman
                   ` (35 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:42 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Eric Dumazet, David S. Miller

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Eric Dumazet <edumazet@google.com>

In order to simplify the API, add a pointer to struct inet_frags in
struct netns_frags.
This will allow us to make things less complex.

These functions no longer have a struct inet_frags parameter:

inet_frag_destroy(struct inet_frag_queue *q  /*, struct inet_frags *f */)
inet_frag_put(struct inet_frag_queue *q /*, struct inet_frags *f */)
inet_frag_kill(struct inet_frag_queue *q /*, struct inet_frags *f */)
inet_frags_exit_net(struct netns_frags *nf /*, struct inet_frags *f */)
ip6_expire_frag_queue(struct net *net, struct frag_queue *fq)
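
The idea behind the signature change can be shown with a reduced sketch. The
struct and function names below are hypothetical stand-ins for the kernel
types, and the plain integer refcount is a simplification (the kernel uses
refcount_t); the sketch only shows that the callee can reach the ops table
through the queue's namespace pointer instead of taking it as an argument.

```c
#include <stddef.h>

struct frags_ops {			/* stands in for struct inet_frags */
	void (*destructor)(void *q);
};

struct ns_frags {			/* stands in for struct netns_frags */
	struct frags_ops *f;		/* the new back-pointer */
};

struct frag_queue_sketch {		/* stands in for struct inet_frag_queue */
	struct ns_frags *net;
	int refcnt;			/* not atomic; illustration only */
	int destroyed;
};

static void queue_destroy(struct frag_queue_sketch *q)
{
	struct frags_ops *f = q->net->f;	/* looked up, not passed in */

	if (f->destructor)
		f->destructor(q);
}

/* Drop one reference; destroy the queue when the last one goes away. */
static void queue_put(struct frag_queue_sketch *q)
{
	if (--q->refcnt == 0)
		queue_destroy(q);
}

/* Example destructor used to observe that destroy ran. */
static void mark_destroyed(void *q)
{
	((struct frag_queue_sketch *)q)->destroyed = 1;
}
```

Callers then only need the queue itself, which is what lets the patch drop
the extra argument from every call site.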

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 093ba72914b696521e4885756a68a3332782c8de)
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 include/net/inet_frag.h                 |   11 ++++++-----
 include/net/ipv6.h                      |    3 +--
 net/ieee802154/6lowpan/reassembly.c     |   13 +++++++------
 net/ipv4/inet_fragment.c                |   17 ++++++++++-------
 net/ipv4/ip_fragment.c                  |    9 +++++----
 net/ipv6/netfilter/nf_conntrack_reasm.c |   16 +++++++++-------
 net/ipv6/reassembly.c                   |   20 ++++++++++----------
 7 files changed, 48 insertions(+), 41 deletions(-)

--- a/include/net/inet_frag.h
+++ b/include/net/inet_frag.h
@@ -10,6 +10,7 @@ struct netns_frags {
 	int			high_thresh;
 	int			low_thresh;
 	int			max_dist;
+	struct inet_frags	*f;
 };
 
 /**
@@ -109,20 +110,20 @@ static inline int inet_frags_init_net(st
 	atomic_set(&nf->mem, 0);
 	return 0;
 }
-void inet_frags_exit_net(struct netns_frags *nf, struct inet_frags *f);
+void inet_frags_exit_net(struct netns_frags *nf);
 
-void inet_frag_kill(struct inet_frag_queue *q, struct inet_frags *f);
-void inet_frag_destroy(struct inet_frag_queue *q, struct inet_frags *f);
+void inet_frag_kill(struct inet_frag_queue *q);
+void inet_frag_destroy(struct inet_frag_queue *q);
 struct inet_frag_queue *inet_frag_find(struct netns_frags *nf,
 		struct inet_frags *f, void *key, unsigned int hash);
 
 void inet_frag_maybe_warn_overflow(struct inet_frag_queue *q,
 				   const char *prefix);
 
-static inline void inet_frag_put(struct inet_frag_queue *q, struct inet_frags *f)
+static inline void inet_frag_put(struct inet_frag_queue *q)
 {
 	if (refcount_dec_and_test(&q->refcnt))
-		inet_frag_destroy(q, f);
+		inet_frag_destroy(q);
 }
 
 static inline bool inet_frag_evicting(struct inet_frag_queue *q)
--- a/include/net/ipv6.h
+++ b/include/net/ipv6.h
@@ -560,8 +560,7 @@ struct frag_queue {
 	u8			ecn;
 };
 
-void ip6_expire_frag_queue(struct net *net, struct frag_queue *fq,
-			   struct inet_frags *frags);
+void ip6_expire_frag_queue(struct net *net, struct frag_queue *fq);
 
 static inline bool ipv6_addr_any(const struct in6_addr *a)
 {
--- a/net/ieee802154/6lowpan/reassembly.c
+++ b/net/ieee802154/6lowpan/reassembly.c
@@ -93,10 +93,10 @@ static void lowpan_frag_expire(unsigned
 	if (fq->q.flags & INET_FRAG_COMPLETE)
 		goto out;
 
-	inet_frag_kill(&fq->q, &lowpan_frags);
+	inet_frag_kill(&fq->q);
 out:
 	spin_unlock(&fq->q.lock);
-	inet_frag_put(&fq->q, &lowpan_frags);
+	inet_frag_put(&fq->q);
 }
 
 static inline struct lowpan_frag_queue *
@@ -229,7 +229,7 @@ static int lowpan_frag_reasm(struct lowp
 	struct sk_buff *fp, *head = fq->q.fragments;
 	int sum_truesize;
 
-	inet_frag_kill(&fq->q, &lowpan_frags);
+	inet_frag_kill(&fq->q);
 
 	/* Make the one we just received the head. */
 	if (prev) {
@@ -437,7 +437,7 @@ int lowpan_frag_rcv(struct sk_buff *skb,
 		ret = lowpan_frag_queue(fq, skb, frag_type);
 		spin_unlock(&fq->q.lock);
 
-		inet_frag_put(&fq->q, &lowpan_frags);
+		inet_frag_put(&fq->q);
 		return ret;
 	}
 
@@ -585,13 +585,14 @@ static int __net_init lowpan_frags_init_
 	ieee802154_lowpan->frags.high_thresh = IPV6_FRAG_HIGH_THRESH;
 	ieee802154_lowpan->frags.low_thresh = IPV6_FRAG_LOW_THRESH;
 	ieee802154_lowpan->frags.timeout = IPV6_FRAG_TIMEOUT;
+	ieee802154_lowpan->frags.f = &lowpan_frags;
 
 	res = inet_frags_init_net(&ieee802154_lowpan->frags);
 	if (res < 0)
 		return res;
 	res = lowpan_frags_ns_sysctl_register(net);
 	if (res < 0)
-		inet_frags_exit_net(&ieee802154_lowpan->frags, &lowpan_frags);
+		inet_frags_exit_net(&ieee802154_lowpan->frags);
 	return res;
 }
 
@@ -601,7 +602,7 @@ static void __net_exit lowpan_frags_exit
 		net_ieee802154_lowpan(net);
 
 	lowpan_frags_ns_sysctl_unregister(net);
-	inet_frags_exit_net(&ieee802154_lowpan->frags, &lowpan_frags);
+	inet_frags_exit_net(&ieee802154_lowpan->frags);
 }
 
 static struct pernet_operations lowpan_frags_ops = {
--- a/net/ipv4/inet_fragment.c
+++ b/net/ipv4/inet_fragment.c
@@ -219,8 +219,9 @@ void inet_frags_fini(struct inet_frags *
 }
 EXPORT_SYMBOL(inet_frags_fini);
 
-void inet_frags_exit_net(struct netns_frags *nf, struct inet_frags *f)
+void inet_frags_exit_net(struct netns_frags *nf)
 {
+	struct inet_frags *f = nf->f;
 	unsigned int seq;
 	int i;
 
@@ -264,33 +265,34 @@ __acquires(hb->chain_lock)
 	return hb;
 }
 
-static inline void fq_unlink(struct inet_frag_queue *fq, struct inet_frags *f)
+static inline void fq_unlink(struct inet_frag_queue *fq)
 {
 	struct inet_frag_bucket *hb;
 
-	hb = get_frag_bucket_locked(fq, f);
+	hb = get_frag_bucket_locked(fq, fq->net->f);
 	hlist_del(&fq->list);
 	fq->flags |= INET_FRAG_COMPLETE;
 	spin_unlock(&hb->chain_lock);
 }
 
-void inet_frag_kill(struct inet_frag_queue *fq, struct inet_frags *f)
+void inet_frag_kill(struct inet_frag_queue *fq)
 {
 	if (del_timer(&fq->timer))
 		refcount_dec(&fq->refcnt);
 
 	if (!(fq->flags & INET_FRAG_COMPLETE)) {
-		fq_unlink(fq, f);
+		fq_unlink(fq);
 		refcount_dec(&fq->refcnt);
 	}
 }
 EXPORT_SYMBOL(inet_frag_kill);
 
-void inet_frag_destroy(struct inet_frag_queue *q, struct inet_frags *f)
+void inet_frag_destroy(struct inet_frag_queue *q)
 {
 	struct sk_buff *fp;
 	struct netns_frags *nf;
 	unsigned int sum, sum_truesize = 0;
+	struct inet_frags *f;
 
 	WARN_ON(!(q->flags & INET_FRAG_COMPLETE));
 	WARN_ON(del_timer(&q->timer) != 0);
@@ -298,6 +300,7 @@ void inet_frag_destroy(struct inet_frag_
 	/* Release all fragment data. */
 	fp = q->fragments;
 	nf = q->net;
+	f = nf->f;
 	while (fp) {
 		struct sk_buff *xp = fp->next;
 
@@ -333,7 +336,7 @@ static struct inet_frag_queue *inet_frag
 			refcount_inc(&qp->refcnt);
 			spin_unlock(&hb->chain_lock);
 			qp_in->flags |= INET_FRAG_COMPLETE;
-			inet_frag_put(qp_in, f);
+			inet_frag_put(qp_in);
 			return qp;
 		}
 	}
--- a/net/ipv4/ip_fragment.c
+++ b/net/ipv4/ip_fragment.c
@@ -168,7 +168,7 @@ static void ip4_frag_free(struct inet_fr
 
 static void ipq_put(struct ipq *ipq)
 {
-	inet_frag_put(&ipq->q, &ip4_frags);
+	inet_frag_put(&ipq->q);
 }
 
 /* Kill ipq entry. It is not destroyed immediately,
@@ -176,7 +176,7 @@ static void ipq_put(struct ipq *ipq)
  */
 static void ipq_kill(struct ipq *ipq)
 {
-	inet_frag_kill(&ipq->q, &ip4_frags);
+	inet_frag_kill(&ipq->q);
 }
 
 static bool frag_expire_skip_icmp(u32 user)
@@ -876,20 +876,21 @@ static int __net_init ipv4_frags_init_ne
 	net->ipv4.frags.timeout = IP_FRAG_TIME;
 
 	net->ipv4.frags.max_dist = 64;
+	net->ipv4.frags.f = &ip4_frags;
 
 	res = inet_frags_init_net(&net->ipv4.frags);
 	if (res < 0)
 		return res;
 	res = ip4_frags_ns_ctl_register(net);
 	if (res < 0)
-		inet_frags_exit_net(&net->ipv4.frags, &ip4_frags);
+		inet_frags_exit_net(&net->ipv4.frags);
 	return res;
 }
 
 static void __net_exit ipv4_frags_exit_net(struct net *net)
 {
 	ip4_frags_ns_ctl_unregister(net);
-	inet_frags_exit_net(&net->ipv4.frags, &ip4_frags);
+	inet_frags_exit_net(&net->ipv4.frags);
 }
 
 static struct pernet_operations ip4_frags_ops = {
--- a/net/ipv6/netfilter/nf_conntrack_reasm.c
+++ b/net/ipv6/netfilter/nf_conntrack_reasm.c
@@ -177,7 +177,7 @@ static void nf_ct_frag6_expire(unsigned
 	fq = container_of((struct inet_frag_queue *)data, struct frag_queue, q);
 	net = container_of(fq->q.net, struct net, nf_frag.frags);
 
-	ip6_expire_frag_queue(net, fq, &nf_frags);
+	ip6_expire_frag_queue(net, fq);
 }
 
 /* Creation primitives. */
@@ -263,7 +263,7 @@ static int nf_ct_frag6_queue(struct frag
 			 * this case. -DaveM
 			 */
 			pr_debug("end of fragment not rounded to 8 bytes.\n");
-			inet_frag_kill(&fq->q, &nf_frags);
+			inet_frag_kill(&fq->q);
 			return -EPROTO;
 		}
 		if (end > fq->q.len) {
@@ -356,7 +356,7 @@ found:
 	return 0;
 
 discard_fq:
-	inet_frag_kill(&fq->q, &nf_frags);
+	inet_frag_kill(&fq->q);
 err:
 	return -EINVAL;
 }
@@ -378,7 +378,7 @@ nf_ct_frag6_reasm(struct frag_queue *fq,
 	int    payload_len;
 	u8 ecn;
 
-	inet_frag_kill(&fq->q, &nf_frags);
+	inet_frag_kill(&fq->q);
 
 	WARN_ON(head == NULL);
 	WARN_ON(NFCT_FRAG6_CB(head)->offset != 0);
@@ -623,7 +623,7 @@ int nf_ct_frag6_gather(struct net *net,
 
 out_unlock:
 	spin_unlock_bh(&fq->q.lock);
-	inet_frag_put(&fq->q, &nf_frags);
+	inet_frag_put(&fq->q);
 	return ret;
 }
 EXPORT_SYMBOL_GPL(nf_ct_frag6_gather);
@@ -635,19 +635,21 @@ static int nf_ct_net_init(struct net *ne
 	net->nf_frag.frags.high_thresh = IPV6_FRAG_HIGH_THRESH;
 	net->nf_frag.frags.low_thresh = IPV6_FRAG_LOW_THRESH;
 	net->nf_frag.frags.timeout = IPV6_FRAG_TIMEOUT;
+	net->nf_frag.frags.f = &nf_frags;
+
 	res = inet_frags_init_net(&net->nf_frag.frags);
 	if (res < 0)
 		return res;
 	res = nf_ct_frag6_sysctl_register(net);
 	if (res < 0)
-		inet_frags_exit_net(&net->nf_frag.frags, &nf_frags);
+		inet_frags_exit_net(&net->nf_frag.frags);
 	return res;
 }
 
 static void nf_ct_net_exit(struct net *net)
 {
 	nf_ct_frags6_sysctl_unregister(net);
-	inet_frags_exit_net(&net->nf_frag.frags, &nf_frags);
+	inet_frags_exit_net(&net->nf_frag.frags);
 }
 
 static struct pernet_operations nf_ct_net_ops = {
--- a/net/ipv6/reassembly.c
+++ b/net/ipv6/reassembly.c
@@ -128,8 +128,7 @@ void ip6_frag_init(struct inet_frag_queu
 }
 EXPORT_SYMBOL(ip6_frag_init);
 
-void ip6_expire_frag_queue(struct net *net, struct frag_queue *fq,
-			   struct inet_frags *frags)
+void ip6_expire_frag_queue(struct net *net, struct frag_queue *fq)
 {
 	struct net_device *dev = NULL;
 
@@ -138,7 +137,7 @@ void ip6_expire_frag_queue(struct net *n
 	if (fq->q.flags & INET_FRAG_COMPLETE)
 		goto out;
 
-	inet_frag_kill(&fq->q, frags);
+	inet_frag_kill(&fq->q);
 
 	rcu_read_lock();
 	dev = dev_get_by_index_rcu(net, fq->iif);
@@ -166,7 +165,7 @@ out_rcu_unlock:
 	rcu_read_unlock();
 out:
 	spin_unlock(&fq->q.lock);
-	inet_frag_put(&fq->q, frags);
+	inet_frag_put(&fq->q);
 }
 EXPORT_SYMBOL(ip6_expire_frag_queue);
 
@@ -178,7 +177,7 @@ static void ip6_frag_expire(unsigned lon
 	fq = container_of((struct inet_frag_queue *)data, struct frag_queue, q);
 	net = container_of(fq->q.net, struct net, ipv6.frags);
 
-	ip6_expire_frag_queue(net, fq, &ip6_frags);
+	ip6_expire_frag_queue(net, fq);
 }
 
 static struct frag_queue *
@@ -363,7 +362,7 @@ found:
 	return -1;
 
 discard_fq:
-	inet_frag_kill(&fq->q, &ip6_frags);
+	inet_frag_kill(&fq->q);
 err:
 	__IP6_INC_STATS(net, ip6_dst_idev(skb_dst(skb)),
 			IPSTATS_MIB_REASMFAILS);
@@ -390,7 +389,7 @@ static int ip6_frag_reasm(struct frag_qu
 	int sum_truesize;
 	u8 ecn;
 
-	inet_frag_kill(&fq->q, &ip6_frags);
+	inet_frag_kill(&fq->q);
 
 	ecn = ip_frag_ecn_table[fq->ecn];
 	if (unlikely(ecn == 0xff))
@@ -568,7 +567,7 @@ static int ipv6_frag_rcv(struct sk_buff
 		ret = ip6_frag_queue(fq, skb, fhdr, IP6CB(skb)->nhoff);
 
 		spin_unlock(&fq->q.lock);
-		inet_frag_put(&fq->q, &ip6_frags);
+		inet_frag_put(&fq->q);
 		return ret;
 	}
 
@@ -719,6 +718,7 @@ static int __net_init ipv6_frags_init_ne
 	net->ipv6.frags.high_thresh = IPV6_FRAG_HIGH_THRESH;
 	net->ipv6.frags.low_thresh = IPV6_FRAG_LOW_THRESH;
 	net->ipv6.frags.timeout = IPV6_FRAG_TIMEOUT;
+	net->ipv6.frags.f = &ip6_frags;
 
 	res = inet_frags_init_net(&net->ipv6.frags);
 	if (res < 0)
@@ -726,14 +726,14 @@ static int __net_init ipv6_frags_init_ne
 
 	res = ip6_frags_ns_sysctl_register(net);
 	if (res < 0)
-		inet_frags_exit_net(&net->ipv6.frags, &ip6_frags);
+		inet_frags_exit_net(&net->ipv6.frags);
 	return res;
 }
 
 static void __net_exit ipv6_frags_exit_net(struct net *net)
 {
 	ip6_frags_ns_sysctl_unregister(net);
-	inet_frags_exit_net(&net->ipv6.frags, &ip6_frags);
+	inet_frags_exit_net(&net->ipv6.frags);
 }
 
 static struct pernet_operations ip6_frags_ops = {




* [PATCH 4.14 094/126] inet: frags: refactor ipfrag_init()
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (92 preceding siblings ...)
  2018-09-17 22:42 ` [PATCH 4.14 093/126] inet: frags: add a pointer to struct netns_frags Greg Kroah-Hartman
@ 2018-09-17 22:42 ` Greg Kroah-Hartman
  2018-09-17 22:42 ` [PATCH 4.14 095/126] inet: frags: Convert timers to use timer_setup() Greg Kroah-Hartman
                   ` (34 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:42 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Eric Dumazet, David S. Miller

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Eric Dumazet <edumazet@google.com>

We need to call inet_frags_init() before register_pernet_subsys(),
as a prereq for following patch ("inet: frags: use rhashtables for reassembly units")

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 483a6e4fa055123142d8956866fe2aa9c98d546d)
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/ipv4/ip_fragment.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
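
Illustration only (not part of the patch): a minimal user-space sketch of
the ordering constraint. It assumes that the per-namespace init hook comes
to depend on state prepared by inet_frags_init(), as the follow-up
rhashtable patch makes it, and that registering the per-namespace operations
runs that hook right away for a namespace that already exists. Every fake_*
name is a hypothetical stand-in, not a kernel API.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct inet_frags: global, per-protocol state. */
struct fake_frags {
	bool cache_ready;	/* set by fake_frags_init() */
};

/* Hypothetical stand-in for struct pernet_operations. */
struct fake_pernet_ops {
	int (*init)(struct fake_frags *f);
};

static int fake_frags_init(struct fake_frags *f)
{
	f->cache_ready = true;
	return 0;
}

/* Per-namespace init: needs the global state to be set up already. */
static int fake_net_init(struct fake_frags *f)
{
	return f->cache_ready ? 0 : -1;
}

/* Assumption: like register_pernet_subsys(), this calls ->init() right away
 * for the namespace that already exists. */
static int fake_register_pernet(const struct fake_pernet_ops *ops,
				struct fake_frags *f)
{
	return ops->init(f);
}

/* Returns 0 on success; init_first selects the ordering used by this patch. */
int fake_ipfrag_init(bool init_first)
{
	struct fake_frags frags = { .cache_ready = false };
	const struct fake_pernet_ops ops = { .init = fake_net_init };
	int ret;

	if (init_first) {
		ret = fake_frags_init(&frags);
		if (ret)
			return ret;
		return fake_register_pernet(&ops, &frags);
	}

	/* Old ordering: the per-namespace hook runs before the global init. */
	ret = fake_register_pernet(&ops, &frags);
	fake_frags_init(&frags);
	return ret;
}
```

With init_first set the per-namespace hook sees initialised state; with the
old ordering it would run too early once it starts depending on that state.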

--- a/net/ipv4/ip_fragment.c
+++ b/net/ipv4/ip_fragment.c
@@ -900,8 +900,6 @@ static struct pernet_operations ip4_frag
 
 void __init ipfrag_init(void)
 {
-	ip4_frags_ctl_register();
-	register_pernet_subsys(&ip4_frags_ops);
 	ip4_frags.hashfn = ip4_hashfn;
 	ip4_frags.constructor = ip4_frag_init;
 	ip4_frags.destructor = ip4_frag_free;
@@ -911,4 +909,6 @@ void __init ipfrag_init(void)
 	ip4_frags.frags_cache_name = ip_frag_cache_name;
 	if (inet_frags_init(&ip4_frags))
 		panic("IP: failed to allocate ip4_frags cache\n");
+	ip4_frags_ctl_register();
+	register_pernet_subsys(&ip4_frags_ops);
 }




* [PATCH 4.14 095/126] inet: frags: Convert timers to use timer_setup()
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (93 preceding siblings ...)
  2018-09-17 22:42 ` [PATCH 4.14 094/126] inet: frags: refactor ipfrag_init() Greg Kroah-Hartman
@ 2018-09-17 22:42 ` Greg Kroah-Hartman
  2018-09-17 22:42 ` [PATCH 4.14 096/126] inet: frags: refactor ipv6_frag_init() Greg Kroah-Hartman
                   ` (33 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:42 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Alexander Aring, Stefan Schmidt,
	David S. Miller, Alexey Kuznetsov, Hideaki YOSHIFUJI,
	Pablo Neira Ayuso, Jozsef Kadlecsik, Florian Westphal,
	linux-wpan, netdev, netfilter-devel, coreteam, Kees Cook

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Kees Cook <keescook@chromium.org>

In preparation for unconditionally passing the struct timer_list pointer to
all timer callbacks, switch to using the new timer_setup() and from_timer()
to pass the timer pointer explicitly.

Cc: Alexander Aring <alex.aring@gmail.com>
Cc: Stefan Schmidt <stefan@osg.samsung.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: Pablo Neira Ayuso <pablo@netfilter.org>
Cc: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Cc: Florian Westphal <fw@strlen.de>
Cc: linux-wpan@vger.kernel.org
Cc: netdev@vger.kernel.org
Cc: netfilter-devel@vger.kernel.org
Cc: coreteam@netfilter.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Stefan Schmidt <stefan@osg.samsung.com> # for ieee802154
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 78802011fbe34331bdef6f2dfb1634011f0e4c32)
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 include/net/inet_frag.h                 |    2 +-
 net/ieee802154/6lowpan/reassembly.c     |    5 +++--
 net/ipv4/inet_fragment.c                |    4 ++--
 net/ipv4/ip_fragment.c                  |    5 +++--
 net/ipv6/netfilter/nf_conntrack_reasm.c |    5 +++--
 net/ipv6/reassembly.c                   |    5 +++--
 6 files changed, 15 insertions(+), 11 deletions(-)
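
Illustration only (not part of the patch): a minimal user-space sketch of
the pattern behind from_timer(), which is built on container_of(). The
callback receives a pointer to the timer and recovers the structure that
embeds it, instead of being handed that structure as an unsigned long. The
fake_* types and functions are hypothetical stand-ins for struct timer_list,
struct inet_frag_queue and timer_setup().

```c
#include <assert.h>
#include <stddef.h>

/* User-space copy of the kernel's container_of() idea: recover the address
 * of the enclosing structure from a pointer to one of its members. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical stand-ins for struct timer_list and struct inet_frag_queue. */
struct fake_timer {
	void (*function)(struct fake_timer *t);
};

struct fake_frag_queue {
	int id;
	int expired;
	struct fake_timer timer;
};

/* New-style callback: receives the timer itself, not an unsigned long. */
static void fake_frag_expire(struct fake_timer *t)
{
	struct fake_frag_queue *q =
		container_of(t, struct fake_frag_queue, timer);

	q->expired = 1;
}

/* Rough analogue of timer_setup(): only records the callback. */
static void fake_timer_setup(struct fake_timer *t,
			     void (*fn)(struct fake_timer *t))
{
	t->function = fn;
}

/* Sets up and "fires" the timer once; returns the queue's expired flag. */
int fake_fire_once(int id)
{
	struct fake_frag_queue q = { .id = id, .expired = 0 };

	fake_timer_setup(&q.timer, fake_frag_expire);
	q.timer.function(&q.timer);	/* what the timer core would do */
	return q.expired;
}
```

This is also why inet_evict_bucket() can call f->frag_expire(&fq->timer)
directly: the callback only needs the address of the embedded timer.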

--- a/include/net/inet_frag.h
+++ b/include/net/inet_frag.h
@@ -97,7 +97,7 @@ struct inet_frags {
 	void			(*constructor)(struct inet_frag_queue *q,
 					       const void *arg);
 	void			(*destructor)(struct inet_frag_queue *);
-	void			(*frag_expire)(unsigned long data);
+	void			(*frag_expire)(struct timer_list *t);
 	struct kmem_cache	*frags_cachep;
 	const char		*frags_cache_name;
 };
--- a/net/ieee802154/6lowpan/reassembly.c
+++ b/net/ieee802154/6lowpan/reassembly.c
@@ -80,12 +80,13 @@ static void lowpan_frag_init(struct inet
 	fq->daddr = *arg->dst;
 }
 
-static void lowpan_frag_expire(unsigned long data)
+static void lowpan_frag_expire(struct timer_list *t)
 {
+	struct inet_frag_queue *frag = from_timer(frag, t, timer);
 	struct frag_queue *fq;
 	struct net *net;
 
-	fq = container_of((struct inet_frag_queue *)data, struct frag_queue, q);
+	fq = container_of(frag, struct frag_queue, q);
 	net = container_of(fq->q.net, struct net, ieee802154_lowpan.frags);
 
 	spin_lock(&fq->q.lock);
--- a/net/ipv4/inet_fragment.c
+++ b/net/ipv4/inet_fragment.c
@@ -150,7 +150,7 @@ inet_evict_bucket(struct inet_frags *f,
 	spin_unlock(&hb->chain_lock);
 
 	hlist_for_each_entry_safe(fq, n, &expired, list_evictor)
-		f->frag_expire((unsigned long) fq);
+		f->frag_expire(&fq->timer);
 
 	return evicted;
 }
@@ -367,7 +367,7 @@ static struct inet_frag_queue *inet_frag
 	f->constructor(q, arg);
 	add_frag_mem_limit(nf, f->qsize);
 
-	setup_timer(&q->timer, f->frag_expire, (unsigned long)q);
+	timer_setup(&q->timer, f->frag_expire, 0);
 	spin_lock_init(&q->lock);
 	refcount_set(&q->refcnt, 1);
 
--- a/net/ipv4/ip_fragment.c
+++ b/net/ipv4/ip_fragment.c
@@ -191,12 +191,13 @@ static bool frag_expire_skip_icmp(u32 us
 /*
  * Oops, a fragment queue timed out.  Kill it and send an ICMP reply.
  */
-static void ip_expire(unsigned long arg)
+static void ip_expire(struct timer_list *t)
 {
+	struct inet_frag_queue *frag = from_timer(frag, t, timer);
 	struct ipq *qp;
 	struct net *net;
 
-	qp = container_of((struct inet_frag_queue *) arg, struct ipq, q);
+	qp = container_of(frag, struct ipq, q);
 	net = container_of(qp->q.net, struct net, ipv4.frags);
 
 	rcu_read_lock();
--- a/net/ipv6/netfilter/nf_conntrack_reasm.c
+++ b/net/ipv6/netfilter/nf_conntrack_reasm.c
@@ -169,12 +169,13 @@ static unsigned int nf_hashfn(const stru
 	return nf_hash_frag(nq->id, &nq->saddr, &nq->daddr);
 }
 
-static void nf_ct_frag6_expire(unsigned long data)
+static void nf_ct_frag6_expire(struct timer_list *t)
 {
+	struct inet_frag_queue *frag = from_timer(frag, t, timer);
 	struct frag_queue *fq;
 	struct net *net;
 
-	fq = container_of((struct inet_frag_queue *)data, struct frag_queue, q);
+	fq = container_of(frag, struct frag_queue, q);
 	net = container_of(fq->q.net, struct net, nf_frag.frags);
 
 	ip6_expire_frag_queue(net, fq);
--- a/net/ipv6/reassembly.c
+++ b/net/ipv6/reassembly.c
@@ -169,12 +169,13 @@ out:
 }
 EXPORT_SYMBOL(ip6_expire_frag_queue);
 
-static void ip6_frag_expire(unsigned long data)
+static void ip6_frag_expire(struct timer_list *t)
 {
+	struct inet_frag_queue *frag = from_timer(frag, t, timer);
 	struct frag_queue *fq;
 	struct net *net;
 
-	fq = container_of((struct inet_frag_queue *)data, struct frag_queue, q);
+	fq = container_of(frag, struct frag_queue, q);
 	net = container_of(fq->q.net, struct net, ipv6.frags);
 
 	ip6_expire_frag_queue(net, fq);




* [PATCH 4.14 096/126] inet: frags: refactor ipv6_frag_init()
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (94 preceding siblings ...)
  2018-09-17 22:42 ` [PATCH 4.14 095/126] inet: frags: Convert timers to use timer_setup() Greg Kroah-Hartman
@ 2018-09-17 22:42 ` Greg Kroah-Hartman
  2018-09-17 22:42 ` [PATCH 4.14 097/126] inet: frags: refactor lowpan_net_frag_init() Greg Kroah-Hartman
                   ` (32 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:42 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Eric Dumazet, David S. Miller

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Eric Dumazet <edumazet@google.com>

We want to call inet_frags_init() earlier.

This is a prereq to "inet: frags: use rhashtables for reassembly units"

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 5b975bab23615cd0fdf67af6c9298eb01c4b9f61)
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/ipv6/reassembly.c |   25 ++++++++++++++-----------
 1 file changed, 14 insertions(+), 11 deletions(-)
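
Illustration only (not part of the patch): a minimal user-space sketch of
the goto-based unwinding that the reordered function relies on. Steps are
set up in order, and a failure jumps to the label that undoes only the steps
already completed, in reverse order. The step names and fake_* helpers are
hypothetical.

```c
#include <assert.h>

/* Hypothetical record of which steps are currently set up (bit per step). */
enum { STEP_FRAGS = 1, STEP_PROTO = 2, STEP_SYSCTL = 4, STEP_PERNET = 8 };

static unsigned int fake_state;

/* Each fake step succeeds unless its bit is set in fail_mask. */
static int fake_setup(unsigned int step, unsigned int fail_mask)
{
	if (fail_mask & step)
		return -1;
	fake_state |= step;
	return 0;
}

static void fake_teardown(unsigned int step)
{
	fake_state &= ~step;
}

/* Mirrors the shape of the reordered ipv6_frag_init(): set up in order,
 * unwind in reverse order from the label matching the failed step. */
int fake_frag_init(unsigned int fail_mask)
{
	int ret;

	fake_state = 0;

	ret = fake_setup(STEP_FRAGS, fail_mask);
	if (ret)
		goto out;
	ret = fake_setup(STEP_PROTO, fail_mask);
	if (ret)
		goto err_protocol;
	ret = fake_setup(STEP_SYSCTL, fail_mask);
	if (ret)
		goto err_sysctl;
	ret = fake_setup(STEP_PERNET, fail_mask);
	if (ret)
		goto err_pernet;
out:
	return ret;

err_pernet:
	fake_teardown(STEP_SYSCTL);
err_sysctl:
	fake_teardown(STEP_PROTO);
err_protocol:
	fake_teardown(STEP_FRAGS);
	goto out;
}

/* Bitmask of the steps still set up after the last fake_frag_init() call. */
unsigned int fake_frag_state(void)
{
	return fake_state;
}
```

Whichever step fails, nothing stays set up afterwards, which is the property
the new err_protocol label preserves for inet_frags_init().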

--- a/net/ipv6/reassembly.c
+++ b/net/ipv6/reassembly.c
@@ -746,10 +746,21 @@ int __init ipv6_frag_init(void)
 {
 	int ret;
 
-	ret = inet6_add_protocol(&frag_protocol, IPPROTO_FRAGMENT);
+	ip6_frags.hashfn = ip6_hashfn;
+	ip6_frags.constructor = ip6_frag_init;
+	ip6_frags.destructor = NULL;
+	ip6_frags.qsize = sizeof(struct frag_queue);
+	ip6_frags.match = ip6_frag_match;
+	ip6_frags.frag_expire = ip6_frag_expire;
+	ip6_frags.frags_cache_name = ip6_frag_cache_name;
+	ret = inet_frags_init(&ip6_frags);
 	if (ret)
 		goto out;
 
+	ret = inet6_add_protocol(&frag_protocol, IPPROTO_FRAGMENT);
+	if (ret)
+		goto err_protocol;
+
 	ret = ip6_frags_sysctl_register();
 	if (ret)
 		goto err_sysctl;
@@ -758,16 +769,6 @@ int __init ipv6_frag_init(void)
 	if (ret)
 		goto err_pernet;
 
-	ip6_frags.hashfn = ip6_hashfn;
-	ip6_frags.constructor = ip6_frag_init;
-	ip6_frags.destructor = NULL;
-	ip6_frags.qsize = sizeof(struct frag_queue);
-	ip6_frags.match = ip6_frag_match;
-	ip6_frags.frag_expire = ip6_frag_expire;
-	ip6_frags.frags_cache_name = ip6_frag_cache_name;
-	ret = inet_frags_init(&ip6_frags);
-	if (ret)
-		goto err_pernet;
 out:
 	return ret;
 
@@ -775,6 +776,8 @@ err_pernet:
 	ip6_frags_sysctl_unregister();
 err_sysctl:
 	inet6_del_protocol(&frag_protocol, IPPROTO_FRAGMENT);
+err_protocol:
+	inet_frags_fini(&ip6_frags);
 	goto out;
 }
 




* [PATCH 4.14 097/126] inet: frags: refactor lowpan_net_frag_init()
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (95 preceding siblings ...)
  2018-09-17 22:42 ` [PATCH 4.14 096/126] inet: frags: refactor ipv6_frag_init() Greg Kroah-Hartman
@ 2018-09-17 22:42 ` Greg Kroah-Hartman
  2018-09-17 22:42 ` [PATCH 4.14 098/126] ipv6: export ip6 fragments sysctl to unprivileged users Greg Kroah-Hartman
                   ` (31 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:42 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Eric Dumazet, David S. Miller

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Eric Dumazet <edumazet@google.com>

We want to call inet_frags_init() earlier in lowpan_net_frag_init().
Similar to commit "inet: frags: refactor ipv6_frag_init()"

This is a prereq to "inet: frags: use rhashtables for reassembly units"

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 807f1844df4ac23594268fa9f41902d0549e92aa)
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/ieee802154/6lowpan/reassembly.c |   20 +++++++++++---------
 1 file changed, 11 insertions(+), 9 deletions(-)

--- a/net/ieee802154/6lowpan/reassembly.c
+++ b/net/ieee802154/6lowpan/reassembly.c
@@ -615,14 +615,6 @@ int __init lowpan_net_frag_init(void)
 {
 	int ret;
 
-	ret = lowpan_frags_sysctl_register();
-	if (ret)
-		return ret;
-
-	ret = register_pernet_subsys(&lowpan_frags_ops);
-	if (ret)
-		goto err_pernet;
-
 	lowpan_frags.hashfn = lowpan_hashfn;
 	lowpan_frags.constructor = lowpan_frag_init;
 	lowpan_frags.destructor = NULL;
@@ -632,11 +624,21 @@ int __init lowpan_net_frag_init(void)
 	lowpan_frags.frags_cache_name = lowpan_frags_cache_name;
 	ret = inet_frags_init(&lowpan_frags);
 	if (ret)
-		goto err_pernet;
+		goto out;
+
+	ret = lowpan_frags_sysctl_register();
+	if (ret)
+		goto err_sysctl;
 
+	ret = register_pernet_subsys(&lowpan_frags_ops);
+	if (ret)
+		goto err_pernet;
+out:
 	return ret;
 err_pernet:
 	lowpan_frags_sysctl_unregister();
+err_sysctl:
+	inet_frags_fini(&lowpan_frags);
 	return ret;
 }
 




* [PATCH 4.14 098/126] ipv6: export ip6 fragments sysctl to unprivileged users
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (96 preceding siblings ...)
  2018-09-17 22:42 ` [PATCH 4.14 097/126] inet: frags: refactor lowpan_net_frag_init() Greg Kroah-Hartman
@ 2018-09-17 22:42 ` Greg Kroah-Hartman
  2018-09-17 22:42 ` [PATCH 4.14 099/126] rhashtable: add schedule points Greg Kroah-Hartman
                   ` (30 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:42 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, netdev@vger.kernel.org,
	stable@vger.kernel.org, edumazet@google.com, Nikolay Borisov,
	Eric Dumazet, David S. Miller, Nikolay Borisov

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Eric Dumazet <edumazet@google.com>

IPv4 was changed in commit 52a773d645e9 ("net: Export ip fragment
sysctl to unprivileged users")

The only sysctl that is not per-netns, ip6frag_secret_interval, is not used.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Nikolay Borisov <kernel@kyup.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 18dcbe12fe9fca0ab825f7eff993060525ac2503)
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/ipv6/reassembly.c |    4 ----
 1 file changed, 4 deletions(-)
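
Illustration only (not part of the patch): a minimal user-space sketch of
how the removed check hid the entries. It assumes a table that ends at the
first entry whose procname is NULL, so clearing table[0].procname makes the
whole table look empty. struct fake_ctl_table and the fake_* functions are
hypothetical, much-reduced stand-ins for the kernel's sysctl table handling.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, much-reduced stand-in for struct ctl_table. */
struct fake_ctl_table {
	const char *procname;	/* NULL marks the end of the table */
};

/* Counts entries up to the NULL-procname sentinel. */
size_t fake_count_entries(const struct fake_ctl_table *table)
{
	size_t n = 0;

	while (table[n].procname)
		n++;
	return n;
}

/* Builds the three per-netns entries and, when hide_from_unprivileged is
 * non-zero, applies the check this patch removes; returns visible count. */
size_t fake_visible_entries(int hide_from_unprivileged)
{
	struct fake_ctl_table table[] = {
		{ "ip6frag_high_thresh" },
		{ "ip6frag_low_thresh" },
		{ "ip6frag_time" },
		{ NULL },
	};

	if (hide_from_unprivileged)
		table[0].procname = NULL;	/* table now looks empty */
	return fake_count_entries(table);
}
```

With the check removed, a namespace owned by an unprivileged user sees the
same three entries as the initial one.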

--- a/net/ipv6/reassembly.c
+++ b/net/ipv6/reassembly.c
@@ -649,10 +649,6 @@ static int __net_init ip6_frags_ns_sysct
 		table[1].data = &net->ipv6.frags.low_thresh;
 		table[1].extra2 = &net->ipv6.frags.high_thresh;
 		table[2].data = &net->ipv6.frags.timeout;
-
-		/* Don't export sysctls to unprivileged users */
-		if (net->user_ns != &init_user_ns)
-			table[0].procname = NULL;
 	}
 
 	hdr = register_net_sysctl(net, "net/ipv6", table);




* [PATCH 4.14 099/126] rhashtable: add schedule points
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (97 preceding siblings ...)
  2018-09-17 22:42 ` [PATCH 4.14 098/126] ipv6: export ip6 fragments sysctl to unprivileged users Greg Kroah-Hartman
@ 2018-09-17 22:42 ` Greg Kroah-Hartman
  2018-09-17 22:42 ` [PATCH 4.14 100/126] inet: frags: use rhashtables for reassembly units Greg Kroah-Hartman
                   ` (29 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:42 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Eric Dumazet, Herbert Xu, David S. Miller

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Eric Dumazet <edumazet@google.com>

Rehashing and destroying a large hash table takes a lot of time,
and happens in process context. It is safe to add cond_resched()
in rhashtable_rehash_table() and rhashtable_free_and_destroy().

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit ae6da1f503abb5a5081f9f6c4a6881de97830f3e)
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 lib/rhashtable.c |    2 ++
 1 file changed, 2 insertions(+)
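
Illustration only (not part of the patch): a rough user-space analogue of a
schedule point inside a long bucket walk. sched_yield() is used here as a
stand-in; unlike cond_resched() it yields unconditionally, so this only
shows where the schedule point sits, not how the kernel decides to
reschedule. fake_walk_buckets() is a hypothetical name.

```c
#include <assert.h>
#include <sched.h>
#include <stddef.h>

/* Walks every bucket of a table and offers the CPU to other tasks once per
 * bucket. Returns the number of non-zero buckets seen. */
size_t fake_walk_buckets(const int *buckets, size_t nbuckets)
{
	size_t i, used = 0;

	for (i = 0; i < nbuckets; i++) {
		sched_yield();		/* schedule point, once per bucket */
		if (buckets[i])
			used++;
	}
	return used;
}
```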

--- a/lib/rhashtable.c
+++ b/lib/rhashtable.c
@@ -364,6 +364,7 @@ static int rhashtable_rehash_table(struc
 		err = rhashtable_rehash_chain(ht, old_hash);
 		if (err)
 			return err;
+		cond_resched();
 	}
 
 	/* Publish the new table pointer. */
@@ -1073,6 +1074,7 @@ void rhashtable_free_and_destroy(struct
 		for (i = 0; i < tbl->size; i++) {
 			struct rhash_head *pos, *next;
 
+			cond_resched();
 			for (pos = rht_dereference(*rht_bucket(tbl, i), ht),
 			     next = !rht_is_a_nulls(pos) ?
 					rht_dereference(pos->next, ht) : NULL;




* [PATCH 4.14 100/126] inet: frags: use rhashtables for reassembly units
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (98 preceding siblings ...)
  2018-09-17 22:42 ` [PATCH 4.14 099/126] rhashtable: add schedule points Greg Kroah-Hartman
@ 2018-09-17 22:42 ` Greg Kroah-Hartman
  2018-09-17 22:42 ` [PATCH 4.14 101/126] inet: frags: remove some helpers Greg Kroah-Hartman
                   ` (28 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:42 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Eric Dumazet, Kirill Tkhai,
	Herbert Xu, Florian Westphal, Jesper Dangaard Brouer,
	Alexander Aring, Stefan Schmidt, David S. Miller

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Eric Dumazet <edumazet@google.com>

Some applications still rely on IP fragmentation, and to be fair the Linux
reassembly unit does not cope with any serious load.

It uses static hash tables of 1024 buckets, and up to 128 items per bucket (!!!)

A work queue is supposed to garbage collect items when the host is under
memory pressure, and to do a hash rebuild that changes the seed used in
hash computations.

This work queue blocks softirqs for up to 25 ms when doing a hash rebuild,
occurring every 5 seconds if host is under fire.

Then there is the problem of sharing this hash table for all netns.

It is time to switch to rhashtables, and allocate one of them per netns
to speed up netns dismantle, since this is a critical metric these days.

Lookup is now using RCU. A followup patch will even remove
the refcount hold/release left from prior implementation and save
a couple of atomic operations.

Before this patch, 16 cpus (16 RX queue NIC) could not handle more
than 1 Mpps frags DDOS.

After the patch, I reach 9 Mpps without any tuning, and can use up to 2GB
of storage for the fragments (exact number depends on frags being evicted
after timeout)

$ grep FRAG /proc/net/sockstat
FRAG: inuse 1966916 memory 2140004608

A followup patch will change the limits for 64bit arches.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Kirill Tkhai <ktkhai@virtuozzo.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Florian Westphal <fw@strlen.de>
Cc: Jesper Dangaard Brouer <brouer@redhat.com>
Cc: Alexander Aring <alex.aring@gmail.com>
Cc: Stefan Schmidt <stefan@osg.samsung.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 648700f76b03b7e8149d13cc2bdb3355035258a9)
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 Documentation/networking/ip-sysctl.txt  |    7 
 include/net/inet_frag.h                 |   81 +++----
 include/net/ipv6.h                      |   16 -
 net/ieee802154/6lowpan/6lowpan_i.h      |   26 --
 net/ieee802154/6lowpan/reassembly.c     |   91 +++-----
 net/ipv4/inet_fragment.c                |  346 ++++++--------------------------
 net/ipv4/ip_fragment.c                  |  112 ++++------
 net/ipv6/netfilter/nf_conntrack_reasm.c |   51 +---
 net/ipv6/reassembly.c                   |  110 ++++------
 9 files changed, 266 insertions(+), 574 deletions(-)
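
Illustration only (not part of the patch): a minimal user-space sketch of
the lookup model the patch moves to, namely one table per namespace, indexed
by a fixed-size compare key that is hashed and compared as raw bytes. The
struct mirrors the fields of frag_v4_compare_key; everything else (the
fake_* names, the fixed 64-slot table with linear probing, the FNV-1a hash)
is an assumption made for the sketch and is not how rhashtable itself works.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical user-space mirror of struct frag_v4_compare_key: fixed-width
 * fields with no padding, so the whole key can be hashed and compared as
 * raw bytes. */
struct fake_v4_key {
	uint32_t saddr;
	uint32_t daddr;
	uint32_t user;
	uint32_t vif;
	uint16_t id;
	uint16_t protocol;
};

#define FAKE_SLOTS 64	/* small fixed table; rhashtable resizes instead */

struct fake_queue {
	struct fake_v4_key key;
	int in_use;
};

/* One table per namespace, as the patch does with nf->rhashtable. */
struct fake_netns_frags {
	struct fake_queue slots[FAKE_SLOTS];
};

/* FNV-1a over the key bytes; an assumption for the sketch, the kernel
 * uses its own hash functions. */
static uint32_t fake_hash(const struct fake_v4_key *key)
{
	const unsigned char *p = (const unsigned char *)key;
	uint32_t h = 2166136261u;
	size_t i;

	for (i = 0; i < sizeof(*key); i++)
		h = (h ^ p[i]) * 16777619u;
	return h;
}

/* Finds the queue for key, creating it if absent; NULL when the table is
 * full. Linear probing stands in for rhashtable's bucket chains. */
struct fake_queue *fake_frag_find(struct fake_netns_frags *nf,
				  const struct fake_v4_key *key)
{
	uint32_t start = fake_hash(key) % FAKE_SLOTS;
	uint32_t i;

	for (i = 0; i < FAKE_SLOTS; i++) {
		struct fake_queue *q = &nf->slots[(start + i) % FAKE_SLOTS];

		if (!q->in_use) {
			q->key = *key;
			q->in_use = 1;
			return q;
		}
		if (!memcmp(&q->key, key, sizeof(*key)))
			return q;
	}
	return NULL;
}
```

Two lookups with the same key in the same namespace return the same queue,
while the same key in another namespace gets its own queue, which is what
removes the sharing of one global hash table between all netns.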

--- a/Documentation/networking/ip-sysctl.txt
+++ b/Documentation/networking/ip-sysctl.txt
@@ -134,13 +134,10 @@ min_adv_mss - INTEGER
 IP Fragmentation:
 
 ipfrag_high_thresh - INTEGER
-	Maximum memory used to reassemble IP fragments. When
-	ipfrag_high_thresh bytes of memory is allocated for this purpose,
-	the fragment handler will toss packets until ipfrag_low_thresh
-	is reached. This also serves as a maximum limit to namespaces
-	different from the initial one.
+	Maximum memory used to reassemble IP fragments.
 
 ipfrag_low_thresh - INTEGER
+	(Obsolete since linux-4.17)
 	Maximum memory used to reassemble IP fragments before the kernel
 	begins to remove incomplete fragment queues to free up resources.
 	The kernel still accepts new fragments for defragmentation.
--- a/include/net/inet_frag.h
+++ b/include/net/inet_frag.h
@@ -2,7 +2,11 @@
 #ifndef __NET_FRAG_H__
 #define __NET_FRAG_H__
 
+#include <linux/rhashtable.h>
+
 struct netns_frags {
+	struct rhashtable       rhashtable ____cacheline_aligned_in_smp;
+
 	/* Keep atomic mem on separate cachelines in structs that include it */
 	atomic_t		mem ____cacheline_aligned_in_smp;
 	/* sysctls */
@@ -26,12 +30,30 @@ enum {
 	INET_FRAG_COMPLETE	= BIT(2),
 };
 
+struct frag_v4_compare_key {
+	__be32		saddr;
+	__be32		daddr;
+	u32		user;
+	u32		vif;
+	__be16		id;
+	u16		protocol;
+};
+
+struct frag_v6_compare_key {
+	struct in6_addr	saddr;
+	struct in6_addr	daddr;
+	u32		user;
+	__be32		id;
+	u32		iif;
+};
+
 /**
  * struct inet_frag_queue - fragment queue
  *
- * @lock: spinlock protecting the queue
+ * @node: rhash node
+ * @key: keys identifying this frag.
  * @timer: queue expiration timer
- * @list: hash bucket list
+ * @lock: spinlock protecting this frag
  * @refcnt: reference count of the queue
  * @fragments: received fragments head
  * @fragments_tail: received fragments tail
@@ -41,12 +63,16 @@ enum {
  * @flags: fragment queue flags
  * @max_size: maximum received fragment size
  * @net: namespace that this frag belongs to
- * @list_evictor: list of queues to forcefully evict (e.g. due to low memory)
+ * @rcu: rcu head for freeing deferral
  */
 struct inet_frag_queue {
-	spinlock_t		lock;
+	struct rhash_head	node;
+	union {
+		struct frag_v4_compare_key v4;
+		struct frag_v6_compare_key v6;
+	} key;
 	struct timer_list	timer;
-	struct hlist_node	list;
+	spinlock_t		lock;
 	refcount_t		refcnt;
 	struct sk_buff		*fragments;
 	struct sk_buff		*fragments_tail;
@@ -55,51 +81,20 @@ struct inet_frag_queue {
 	int			meat;
 	__u8			flags;
 	u16			max_size;
-	struct netns_frags	*net;
-	struct hlist_node	list_evictor;
-};
-
-#define INETFRAGS_HASHSZ	1024
-
-/* averaged:
- * max_depth = default ipfrag_high_thresh / INETFRAGS_HASHSZ /
- *	       rounded up (SKB_TRUELEN(0) + sizeof(struct ipq or
- *	       struct frag_queue))
- */
-#define INETFRAGS_MAXDEPTH	128
-
-struct inet_frag_bucket {
-	struct hlist_head	chain;
-	spinlock_t		chain_lock;
+	struct netns_frags      *net;
+	struct rcu_head		rcu;
 };
 
 struct inet_frags {
-	struct inet_frag_bucket	hash[INETFRAGS_HASHSZ];
-
-	struct work_struct	frags_work;
-	unsigned int next_bucket;
-	unsigned long last_rebuild_jiffies;
-	bool rebuild;
-
-	/* The first call to hashfn is responsible to initialize
-	 * rnd. This is best done with net_get_random_once.
-	 *
-	 * rnd_seqlock is used to let hash insertion detect
-	 * when it needs to re-lookup the hash chain to use.
-	 */
-	u32			rnd;
-	seqlock_t		rnd_seqlock;
 	unsigned int		qsize;
 
-	unsigned int		(*hashfn)(const struct inet_frag_queue *);
-	bool			(*match)(const struct inet_frag_queue *q,
-					 const void *arg);
 	void			(*constructor)(struct inet_frag_queue *q,
 					       const void *arg);
 	void			(*destructor)(struct inet_frag_queue *);
 	void			(*frag_expire)(struct timer_list *t);
 	struct kmem_cache	*frags_cachep;
 	const char		*frags_cache_name;
+	struct rhashtable_params rhash_params;
 };
 
 int inet_frags_init(struct inet_frags *);
@@ -108,15 +103,13 @@ void inet_frags_fini(struct inet_frags *
 static inline int inet_frags_init_net(struct netns_frags *nf)
 {
 	atomic_set(&nf->mem, 0);
-	return 0;
+	return rhashtable_init(&nf->rhashtable, &nf->f->rhash_params);
 }
 void inet_frags_exit_net(struct netns_frags *nf);
 
 void inet_frag_kill(struct inet_frag_queue *q);
 void inet_frag_destroy(struct inet_frag_queue *q);
-struct inet_frag_queue *inet_frag_find(struct netns_frags *nf,
-		struct inet_frags *f, void *key, unsigned int hash);
-
+struct inet_frag_queue *inet_frag_find(struct netns_frags *nf, void *key);
 void inet_frag_maybe_warn_overflow(struct inet_frag_queue *q,
 				   const char *prefix);
 
@@ -128,7 +121,7 @@ static inline void inet_frag_put(struct
 
 static inline bool inet_frag_evicting(struct inet_frag_queue *q)
 {
-	return !hlist_unhashed(&q->list_evictor);
+	return false;
 }
 
 /* Memory Tracking Functions. */
--- a/include/net/ipv6.h
+++ b/include/net/ipv6.h
@@ -531,17 +531,8 @@ enum ip6_defrag_users {
 	__IP6_DEFRAG_CONNTRACK_BRIDGE_IN = IP6_DEFRAG_CONNTRACK_BRIDGE_IN + USHRT_MAX,
 };
 
-struct ip6_create_arg {
-	__be32 id;
-	u32 user;
-	const struct in6_addr *src;
-	const struct in6_addr *dst;
-	int iif;
-	u8 ecn;
-};
-
 void ip6_frag_init(struct inet_frag_queue *q, const void *a);
-bool ip6_frag_match(const struct inet_frag_queue *q, const void *a);
+extern const struct rhashtable_params ip6_rhash_params;
 
 /*
  *	Equivalent of ipv4 struct ip
@@ -549,11 +540,6 @@ bool ip6_frag_match(const struct inet_fr
 struct frag_queue {
 	struct inet_frag_queue	q;
 
-	__be32			id;		/* fragment id		*/
-	u32			user;
-	struct in6_addr		saddr;
-	struct in6_addr		daddr;
-
 	int			iif;
 	unsigned int		csum;
 	__u16			nhoffset;
--- a/net/ieee802154/6lowpan/6lowpan_i.h
+++ b/net/ieee802154/6lowpan/6lowpan_i.h
@@ -17,37 +17,19 @@ typedef unsigned __bitwise lowpan_rx_res
 #define LOWPAN_DISPATCH_FRAG1           0xc0
 #define LOWPAN_DISPATCH_FRAGN           0xe0
 
-struct lowpan_create_arg {
+struct frag_lowpan_compare_key {
 	u16 tag;
 	u16 d_size;
-	const struct ieee802154_addr *src;
-	const struct ieee802154_addr *dst;
+	const struct ieee802154_addr src;
+	const struct ieee802154_addr dst;
 };
 
-/* Equivalent of ipv4 struct ip
+/* Equivalent of ipv4 struct ipq
  */
 struct lowpan_frag_queue {
 	struct inet_frag_queue	q;
-
-	u16			tag;
-	u16			d_size;
-	struct ieee802154_addr	saddr;
-	struct ieee802154_addr	daddr;
 };
 
-static inline u32 ieee802154_addr_hash(const struct ieee802154_addr *a)
-{
-	switch (a->mode) {
-	case IEEE802154_ADDR_LONG:
-		return (((__force u64)a->extended_addr) >> 32) ^
-			(((__force u64)a->extended_addr) & 0xffffffff);
-	case IEEE802154_ADDR_SHORT:
-		return (__force u32)(a->short_addr + (a->pan_id << 16));
-	default:
-		return 0;
-	}
-}
-
 int lowpan_frag_rcv(struct sk_buff *skb, const u8 frag_type);
 void lowpan_net_frag_exit(void);
 int lowpan_net_frag_init(void);
--- a/net/ieee802154/6lowpan/reassembly.c
+++ b/net/ieee802154/6lowpan/reassembly.c
@@ -37,47 +37,15 @@ static struct inet_frags lowpan_frags;
 static int lowpan_frag_reasm(struct lowpan_frag_queue *fq,
 			     struct sk_buff *prev, struct net_device *ldev);
 
-static unsigned int lowpan_hash_frag(u16 tag, u16 d_size,
-				     const struct ieee802154_addr *saddr,
-				     const struct ieee802154_addr *daddr)
-{
-	net_get_random_once(&lowpan_frags.rnd, sizeof(lowpan_frags.rnd));
-	return jhash_3words(ieee802154_addr_hash(saddr),
-			    ieee802154_addr_hash(daddr),
-			    (__force u32)(tag + (d_size << 16)),
-			    lowpan_frags.rnd);
-}
-
-static unsigned int lowpan_hashfn(const struct inet_frag_queue *q)
-{
-	const struct lowpan_frag_queue *fq;
-
-	fq = container_of(q, struct lowpan_frag_queue, q);
-	return lowpan_hash_frag(fq->tag, fq->d_size, &fq->saddr, &fq->daddr);
-}
-
-static bool lowpan_frag_match(const struct inet_frag_queue *q, const void *a)
-{
-	const struct lowpan_frag_queue *fq;
-	const struct lowpan_create_arg *arg = a;
-
-	fq = container_of(q, struct lowpan_frag_queue, q);
-	return	fq->tag == arg->tag && fq->d_size == arg->d_size &&
-		ieee802154_addr_equal(&fq->saddr, arg->src) &&
-		ieee802154_addr_equal(&fq->daddr, arg->dst);
-}
-
 static void lowpan_frag_init(struct inet_frag_queue *q, const void *a)
 {
-	const struct lowpan_create_arg *arg = a;
+	const struct frag_lowpan_compare_key *key = a;
 	struct lowpan_frag_queue *fq;
 
 	fq = container_of(q, struct lowpan_frag_queue, q);
 
-	fq->tag = arg->tag;
-	fq->d_size = arg->d_size;
-	fq->saddr = *arg->src;
-	fq->daddr = *arg->dst;
+	BUILD_BUG_ON(sizeof(*key) > sizeof(q->key));
+	memcpy(&q->key, key, sizeof(*key));
 }
 
 static void lowpan_frag_expire(struct timer_list *t)
@@ -105,21 +73,17 @@ fq_find(struct net *net, const struct lo
 	const struct ieee802154_addr *src,
 	const struct ieee802154_addr *dst)
 {
-	struct inet_frag_queue *q;
-	struct lowpan_create_arg arg;
-	unsigned int hash;
 	struct netns_ieee802154_lowpan *ieee802154_lowpan =
 		net_ieee802154_lowpan(net);
+	struct frag_lowpan_compare_key key = {
+		.tag = cb->d_tag,
+		.d_size = cb->d_size,
+		.src = *src,
+		.dst = *dst,
+	};
+	struct inet_frag_queue *q;
 
-	arg.tag = cb->d_tag;
-	arg.d_size = cb->d_size;
-	arg.src = src;
-	arg.dst = dst;
-
-	hash = lowpan_hash_frag(cb->d_tag, cb->d_size, src, dst);
-
-	q = inet_frag_find(&ieee802154_lowpan->frags,
-			   &lowpan_frags, &arg, hash);
+	q = inet_frag_find(&ieee802154_lowpan->frags, &key);
 	if (IS_ERR_OR_NULL(q)) {
 		inet_frag_maybe_warn_overflow(q, pr_fmt());
 		return NULL;
@@ -611,17 +575,46 @@ static struct pernet_operations lowpan_f
 	.exit = lowpan_frags_exit_net,
 };
 
+static u32 lowpan_key_hashfn(const void *data, u32 len, u32 seed)
+{
+	return jhash2(data,
+		      sizeof(struct frag_lowpan_compare_key) / sizeof(u32), seed);
+}
+
+static u32 lowpan_obj_hashfn(const void *data, u32 len, u32 seed)
+{
+	const struct inet_frag_queue *fq = data;
+
+	return jhash2((const u32 *)&fq->key,
+		      sizeof(struct frag_lowpan_compare_key) / sizeof(u32), seed);
+}
+
+static int lowpan_obj_cmpfn(struct rhashtable_compare_arg *arg, const void *ptr)
+{
+	const struct frag_lowpan_compare_key *key = arg->key;
+	const struct inet_frag_queue *fq = ptr;
+
+	return !!memcmp(&fq->key, key, sizeof(*key));
+}
+
+static const struct rhashtable_params lowpan_rhash_params = {
+	.head_offset		= offsetof(struct inet_frag_queue, node),
+	.hashfn			= lowpan_key_hashfn,
+	.obj_hashfn		= lowpan_obj_hashfn,
+	.obj_cmpfn		= lowpan_obj_cmpfn,
+	.automatic_shrinking	= true,
+};
+
 int __init lowpan_net_frag_init(void)
 {
 	int ret;
 
-	lowpan_frags.hashfn = lowpan_hashfn;
 	lowpan_frags.constructor = lowpan_frag_init;
 	lowpan_frags.destructor = NULL;
 	lowpan_frags.qsize = sizeof(struct frag_queue);
-	lowpan_frags.match = lowpan_frag_match;
 	lowpan_frags.frag_expire = lowpan_frag_expire;
 	lowpan_frags.frags_cache_name = lowpan_frags_cache_name;
+	lowpan_frags.rhash_params = lowpan_rhash_params;
 	ret = inet_frags_init(&lowpan_frags);
 	if (ret)
 		goto out;
--- a/net/ipv4/inet_fragment.c
+++ b/net/ipv4/inet_fragment.c
@@ -25,12 +25,6 @@
 #include <net/inet_frag.h>
 #include <net/inet_ecn.h>
 
-#define INETFRAGS_EVICT_BUCKETS   128
-#define INETFRAGS_EVICT_MAX	  512
-
-/* don't rebuild inetfrag table with new secret more often than this */
-#define INETFRAGS_MIN_REBUILD_INTERVAL (5 * HZ)
-
 /* Given the OR values of all fragments, apply RFC 3168 5.3 requirements
  * Value : 0xff if frame should be dropped.
  *         0 or INET_ECN_CE value, to be ORed in to final iph->tos field
@@ -52,157 +46,8 @@ const u8 ip_frag_ecn_table[16] = {
 };
 EXPORT_SYMBOL(ip_frag_ecn_table);
 
-static unsigned int
-inet_frag_hashfn(const struct inet_frags *f, const struct inet_frag_queue *q)
-{
-	return f->hashfn(q) & (INETFRAGS_HASHSZ - 1);
-}
-
-static bool inet_frag_may_rebuild(struct inet_frags *f)
-{
-	return time_after(jiffies,
-	       f->last_rebuild_jiffies + INETFRAGS_MIN_REBUILD_INTERVAL);
-}
-
-static void inet_frag_secret_rebuild(struct inet_frags *f)
-{
-	int i;
-
-	write_seqlock_bh(&f->rnd_seqlock);
-
-	if (!inet_frag_may_rebuild(f))
-		goto out;
-
-	get_random_bytes(&f->rnd, sizeof(u32));
-
-	for (i = 0; i < INETFRAGS_HASHSZ; i++) {
-		struct inet_frag_bucket *hb;
-		struct inet_frag_queue *q;
-		struct hlist_node *n;
-
-		hb = &f->hash[i];
-		spin_lock(&hb->chain_lock);
-
-		hlist_for_each_entry_safe(q, n, &hb->chain, list) {
-			unsigned int hval = inet_frag_hashfn(f, q);
-
-			if (hval != i) {
-				struct inet_frag_bucket *hb_dest;
-
-				hlist_del(&q->list);
-
-				/* Relink to new hash chain. */
-				hb_dest = &f->hash[hval];
-
-				/* This is the only place where we take
-				 * another chain_lock while already holding
-				 * one.  As this will not run concurrently,
-				 * we cannot deadlock on hb_dest lock below, if its
-				 * already locked it will be released soon since
-				 * other caller cannot be waiting for hb lock
-				 * that we've taken above.
-				 */
-				spin_lock_nested(&hb_dest->chain_lock,
-						 SINGLE_DEPTH_NESTING);
-				hlist_add_head(&q->list, &hb_dest->chain);
-				spin_unlock(&hb_dest->chain_lock);
-			}
-		}
-		spin_unlock(&hb->chain_lock);
-	}
-
-	f->rebuild = false;
-	f->last_rebuild_jiffies = jiffies;
-out:
-	write_sequnlock_bh(&f->rnd_seqlock);
-}
-
-static bool inet_fragq_should_evict(const struct inet_frag_queue *q)
-{
-	if (!hlist_unhashed(&q->list_evictor))
-		return false;
-
-	return q->net->low_thresh == 0 ||
-	       frag_mem_limit(q->net) >= q->net->low_thresh;
-}
-
-static unsigned int
-inet_evict_bucket(struct inet_frags *f, struct inet_frag_bucket *hb)
-{
-	struct inet_frag_queue *fq;
-	struct hlist_node *n;
-	unsigned int evicted = 0;
-	HLIST_HEAD(expired);
-
-	spin_lock(&hb->chain_lock);
-
-	hlist_for_each_entry_safe(fq, n, &hb->chain, list) {
-		if (!inet_fragq_should_evict(fq))
-			continue;
-
-		if (!del_timer(&fq->timer))
-			continue;
-
-		hlist_add_head(&fq->list_evictor, &expired);
-		++evicted;
-	}
-
-	spin_unlock(&hb->chain_lock);
-
-	hlist_for_each_entry_safe(fq, n, &expired, list_evictor)
-		f->frag_expire(&fq->timer);
-
-	return evicted;
-}
-
-static void inet_frag_worker(struct work_struct *work)
-{
-	unsigned int budget = INETFRAGS_EVICT_BUCKETS;
-	unsigned int i, evicted = 0;
-	struct inet_frags *f;
-
-	f = container_of(work, struct inet_frags, frags_work);
-
-	BUILD_BUG_ON(INETFRAGS_EVICT_BUCKETS >= INETFRAGS_HASHSZ);
-
-	local_bh_disable();
-
-	for (i = ACCESS_ONCE(f->next_bucket); budget; --budget) {
-		evicted += inet_evict_bucket(f, &f->hash[i]);
-		i = (i + 1) & (INETFRAGS_HASHSZ - 1);
-		if (evicted > INETFRAGS_EVICT_MAX)
-			break;
-	}
-
-	f->next_bucket = i;
-
-	local_bh_enable();
-
-	if (f->rebuild && inet_frag_may_rebuild(f))
-		inet_frag_secret_rebuild(f);
-}
-
-static void inet_frag_schedule_worker(struct inet_frags *f)
-{
-	if (unlikely(!work_pending(&f->frags_work)))
-		schedule_work(&f->frags_work);
-}
-
 int inet_frags_init(struct inet_frags *f)
 {
-	int i;
-
-	INIT_WORK(&f->frags_work, inet_frag_worker);
-
-	for (i = 0; i < INETFRAGS_HASHSZ; i++) {
-		struct inet_frag_bucket *hb = &f->hash[i];
-
-		spin_lock_init(&hb->chain_lock);
-		INIT_HLIST_HEAD(&hb->chain);
-	}
-
-	seqlock_init(&f->rnd_seqlock);
-	f->last_rebuild_jiffies = 0;
 	f->frags_cachep = kmem_cache_create(f->frags_cache_name, f->qsize, 0, 0,
 					    NULL);
 	if (!f->frags_cachep)
@@ -214,66 +59,42 @@ EXPORT_SYMBOL(inet_frags_init);
 
 void inet_frags_fini(struct inet_frags *f)
 {
-	cancel_work_sync(&f->frags_work);
+	/* We must wait that all inet_frag_destroy_rcu() have completed. */
+	rcu_barrier();
+
 	kmem_cache_destroy(f->frags_cachep);
+	f->frags_cachep = NULL;
 }
 EXPORT_SYMBOL(inet_frags_fini);
 
-void inet_frags_exit_net(struct netns_frags *nf)
+static void inet_frags_free_cb(void *ptr, void *arg)
 {
-	struct inet_frags *f =nf->f;
-	unsigned int seq;
-	int i;
-
-	nf->low_thresh = 0;
-
-evict_again:
-	local_bh_disable();
-	seq = read_seqbegin(&f->rnd_seqlock);
-
-	for (i = 0; i < INETFRAGS_HASHSZ ; i++)
-		inet_evict_bucket(f, &f->hash[i]);
+	struct inet_frag_queue *fq = ptr;
 
-	local_bh_enable();
-	cond_resched();
-
-	if (read_seqretry(&f->rnd_seqlock, seq) ||
-	    sum_frag_mem_limit(nf))
-		goto evict_again;
-}
-EXPORT_SYMBOL(inet_frags_exit_net);
+	/* If we can not cancel the timer, it means this frag_queue
+	 * is already disappearing, we have nothing to do.
+	 * Otherwise, we own a refcount until the end of this function.
+	 */
+	if (!del_timer(&fq->timer))
+		return;
 
-static struct inet_frag_bucket *
-get_frag_bucket_locked(struct inet_frag_queue *fq, struct inet_frags *f)
-__acquires(hb->chain_lock)
-{
-	struct inet_frag_bucket *hb;
-	unsigned int seq, hash;
-
- restart:
-	seq = read_seqbegin(&f->rnd_seqlock);
-
-	hash = inet_frag_hashfn(f, fq);
-	hb = &f->hash[hash];
-
-	spin_lock(&hb->chain_lock);
-	if (read_seqretry(&f->rnd_seqlock, seq)) {
-		spin_unlock(&hb->chain_lock);
-		goto restart;
+	spin_lock_bh(&fq->lock);
+	if (!(fq->flags & INET_FRAG_COMPLETE)) {
+		fq->flags |= INET_FRAG_COMPLETE;
+		refcount_dec(&fq->refcnt);
 	}
+	spin_unlock_bh(&fq->lock);
 
-	return hb;
+	inet_frag_put(fq);
 }
 
-static inline void fq_unlink(struct inet_frag_queue *fq)
+void inet_frags_exit_net(struct netns_frags *nf)
 {
-	struct inet_frag_bucket *hb;
+	nf->low_thresh = 0; /* prevent creation of new frags */
 
-	hb = get_frag_bucket_locked(fq, fq->net->f);
-	hlist_del(&fq->list);
-	fq->flags |= INET_FRAG_COMPLETE;
-	spin_unlock(&hb->chain_lock);
+	rhashtable_free_and_destroy(&nf->rhashtable, inet_frags_free_cb, NULL);
 }
+EXPORT_SYMBOL(inet_frags_exit_net);
 
 void inet_frag_kill(struct inet_frag_queue *fq)
 {
@@ -281,12 +102,26 @@ void inet_frag_kill(struct inet_frag_que
 		refcount_dec(&fq->refcnt);
 
 	if (!(fq->flags & INET_FRAG_COMPLETE)) {
-		fq_unlink(fq);
+		struct netns_frags *nf = fq->net;
+
+		fq->flags |= INET_FRAG_COMPLETE;
+		rhashtable_remove_fast(&nf->rhashtable, &fq->node, nf->f->rhash_params);
 		refcount_dec(&fq->refcnt);
 	}
 }
 EXPORT_SYMBOL(inet_frag_kill);
 
+static void inet_frag_destroy_rcu(struct rcu_head *head)
+{
+	struct inet_frag_queue *q = container_of(head, struct inet_frag_queue,
+						 rcu);
+	struct inet_frags *f = q->net->f;
+
+	if (f->destructor)
+		f->destructor(q);
+	kmem_cache_free(f->frags_cachep, q);
+}
+
 void inet_frag_destroy(struct inet_frag_queue *q)
 {
 	struct sk_buff *fp;
@@ -310,55 +145,21 @@ void inet_frag_destroy(struct inet_frag_
 	}
 	sum = sum_truesize + f->qsize;
 
-	if (f->destructor)
-		f->destructor(q);
-	kmem_cache_free(f->frags_cachep, q);
+	call_rcu(&q->rcu, inet_frag_destroy_rcu);
 
 	sub_frag_mem_limit(nf, sum);
 }
 EXPORT_SYMBOL(inet_frag_destroy);
 
-static struct inet_frag_queue *inet_frag_intern(struct netns_frags *nf,
-						struct inet_frag_queue *qp_in,
-						struct inet_frags *f,
-						void *arg)
-{
-	struct inet_frag_bucket *hb = get_frag_bucket_locked(qp_in, f);
-	struct inet_frag_queue *qp;
-
-#ifdef CONFIG_SMP
-	/* With SMP race we have to recheck hash table, because
-	 * such entry could have been created on other cpu before
-	 * we acquired hash bucket lock.
-	 */
-	hlist_for_each_entry(qp, &hb->chain, list) {
-		if (qp->net == nf && f->match(qp, arg)) {
-			refcount_inc(&qp->refcnt);
-			spin_unlock(&hb->chain_lock);
-			qp_in->flags |= INET_FRAG_COMPLETE;
-			inet_frag_put(qp_in);
-			return qp;
-		}
-	}
-#endif
-	qp = qp_in;
-	if (!mod_timer(&qp->timer, jiffies + nf->timeout))
-		refcount_inc(&qp->refcnt);
-
-	refcount_inc(&qp->refcnt);
-	hlist_add_head(&qp->list, &hb->chain);
-
-	spin_unlock(&hb->chain_lock);
-
-	return qp;
-}
-
 static struct inet_frag_queue *inet_frag_alloc(struct netns_frags *nf,
 					       struct inet_frags *f,
 					       void *arg)
 {
 	struct inet_frag_queue *q;
 
+	if (!nf->high_thresh || frag_mem_limit(nf) > nf->high_thresh)
+		return NULL;
+
 	q = kmem_cache_zalloc(f->frags_cachep, GFP_ATOMIC);
 	if (!q)
 		return NULL;
@@ -369,64 +170,52 @@ static struct inet_frag_queue *inet_frag
 
 	timer_setup(&q->timer, f->frag_expire, 0);
 	spin_lock_init(&q->lock);
-	refcount_set(&q->refcnt, 1);
+	refcount_set(&q->refcnt, 3);
 
 	return q;
 }
 
 static struct inet_frag_queue *inet_frag_create(struct netns_frags *nf,
-						struct inet_frags *f,
 						void *arg)
 {
+	struct inet_frags *f = nf->f;
 	struct inet_frag_queue *q;
+	int err;
 
 	q = inet_frag_alloc(nf, f, arg);
 	if (!q)
 		return NULL;
 
-	return inet_frag_intern(nf, q, f, arg);
-}
-
-struct inet_frag_queue *inet_frag_find(struct netns_frags *nf,
-				       struct inet_frags *f, void *key,
-				       unsigned int hash)
-{
-	struct inet_frag_bucket *hb;
-	struct inet_frag_queue *q;
-	int depth = 0;
+	mod_timer(&q->timer, jiffies + nf->timeout);
 
-	if (!nf->high_thresh || frag_mem_limit(nf) > nf->high_thresh) {
-		inet_frag_schedule_worker(f);
+	err = rhashtable_insert_fast(&nf->rhashtable, &q->node,
+				     f->rhash_params);
+	if (err < 0) {
+		q->flags |= INET_FRAG_COMPLETE;
+		inet_frag_kill(q);
+		inet_frag_destroy(q);
 		return NULL;
 	}
+	return q;
+}
 
-	if (frag_mem_limit(nf) > nf->low_thresh)
-		inet_frag_schedule_worker(f);
-
-	hash &= (INETFRAGS_HASHSZ - 1);
-	hb = &f->hash[hash];
-
-	spin_lock(&hb->chain_lock);
-	hlist_for_each_entry(q, &hb->chain, list) {
-		if (q->net == nf && f->match(q, key)) {
-			refcount_inc(&q->refcnt);
-			spin_unlock(&hb->chain_lock);
-			return q;
-		}
-		depth++;
-	}
-	spin_unlock(&hb->chain_lock);
+/* TODO : call from rcu_read_lock() and no longer use refcount_inc_not_zero() */
+struct inet_frag_queue *inet_frag_find(struct netns_frags *nf, void *key)
+{
+	struct inet_frag_queue *fq;
 
-	if (depth <= INETFRAGS_MAXDEPTH)
-		return inet_frag_create(nf, f, key);
+	rcu_read_lock();
 
-	if (inet_frag_may_rebuild(f)) {
-		if (!f->rebuild)
-			f->rebuild = true;
-		inet_frag_schedule_worker(f);
+	fq = rhashtable_lookup(&nf->rhashtable, key, nf->f->rhash_params);
+	if (fq) {
+		if (!refcount_inc_not_zero(&fq->refcnt))
+			fq = NULL;
+		rcu_read_unlock();
+		return fq;
 	}
+	rcu_read_unlock();
 
-	return ERR_PTR(-ENOBUFS);
+	return inet_frag_create(nf, key);
 }
 EXPORT_SYMBOL(inet_frag_find);
 
@@ -434,8 +223,7 @@ void inet_frag_maybe_warn_overflow(struc
 				   const char *prefix)
 {
 	static const char msg[] = "inet_frag_find: Fragment hash bucket"
-		" list length grew over limit " __stringify(INETFRAGS_MAXDEPTH)
-		". Dropping fragment.\n";
+		" list length grew over limit. Dropping fragment.\n";
 
 	if (PTR_ERR(q) == -ENOBUFS)
 		net_dbg_ratelimited("%s%s", prefix, msg);
--- a/net/ipv4/ip_fragment.c
+++ b/net/ipv4/ip_fragment.c
@@ -69,15 +69,9 @@ struct ipfrag_skb_cb
 struct ipq {
 	struct inet_frag_queue q;
 
-	u32		user;
-	__be32		saddr;
-	__be32		daddr;
-	__be16		id;
-	u8		protocol;
 	u8		ecn; /* RFC3168 support */
 	u16		max_df_size; /* largest frag with DF set seen */
 	int             iif;
-	int             vif;   /* L3 master device index */
 	unsigned int    rid;
 	struct inet_peer *peer;
 };
@@ -97,41 +91,6 @@ int ip_frag_mem(struct net *net)
 static int ip_frag_reasm(struct ipq *qp, struct sk_buff *prev,
 			 struct net_device *dev);
 
-struct ip4_create_arg {
-	struct iphdr *iph;
-	u32 user;
-	int vif;
-};
-
-static unsigned int ipqhashfn(__be16 id, __be32 saddr, __be32 daddr, u8 prot)
-{
-	net_get_random_once(&ip4_frags.rnd, sizeof(ip4_frags.rnd));
-	return jhash_3words((__force u32)id << 16 | prot,
-			    (__force u32)saddr, (__force u32)daddr,
-			    ip4_frags.rnd);
-}
-
-static unsigned int ip4_hashfn(const struct inet_frag_queue *q)
-{
-	const struct ipq *ipq;
-
-	ipq = container_of(q, struct ipq, q);
-	return ipqhashfn(ipq->id, ipq->saddr, ipq->daddr, ipq->protocol);
-}
-
-static bool ip4_frag_match(const struct inet_frag_queue *q, const void *a)
-{
-	const struct ipq *qp;
-	const struct ip4_create_arg *arg = a;
-
-	qp = container_of(q, struct ipq, q);
-	return	qp->id == arg->iph->id &&
-		qp->saddr == arg->iph->saddr &&
-		qp->daddr == arg->iph->daddr &&
-		qp->protocol == arg->iph->protocol &&
-		qp->user == arg->user &&
-		qp->vif == arg->vif;
-}
 
 static void ip4_frag_init(struct inet_frag_queue *q, const void *a)
 {
@@ -140,17 +99,12 @@ static void ip4_frag_init(struct inet_fr
 					       frags);
 	struct net *net = container_of(ipv4, struct net, ipv4);
 
-	const struct ip4_create_arg *arg = a;
+	const struct frag_v4_compare_key *key = a;
 
-	qp->protocol = arg->iph->protocol;
-	qp->id = arg->iph->id;
-	qp->ecn = ip4_frag_ecn(arg->iph->tos);
-	qp->saddr = arg->iph->saddr;
-	qp->daddr = arg->iph->daddr;
-	qp->vif = arg->vif;
-	qp->user = arg->user;
+	q->key.v4 = *key;
+	qp->ecn = 0;
 	qp->peer = q->net->max_dist ?
-		inet_getpeer_v4(net->ipv4.peers, arg->iph->saddr, arg->vif, 1) :
+		inet_getpeer_v4(net->ipv4.peers, key->saddr, key->vif, 1) :
 		NULL;
 }
 
@@ -234,7 +188,7 @@ static void ip_expire(struct timer_list
 		/* Only an end host needs to send an ICMP
 		 * "Fragment Reassembly Timeout" message, per RFC792.
 		 */
-		if (frag_expire_skip_icmp(qp->user) &&
+		if (frag_expire_skip_icmp(qp->q.key.v4.user) &&
 		    (skb_rtable(head)->rt_type != RTN_LOCAL))
 			goto out;
 
@@ -262,17 +216,17 @@ out_rcu_unlock:
 static struct ipq *ip_find(struct net *net, struct iphdr *iph,
 			   u32 user, int vif)
 {
+	struct frag_v4_compare_key key = {
+		.saddr = iph->saddr,
+		.daddr = iph->daddr,
+		.user = user,
+		.vif = vif,
+		.id = iph->id,
+		.protocol = iph->protocol,
+	};
 	struct inet_frag_queue *q;
-	struct ip4_create_arg arg;
-	unsigned int hash;
-
-	arg.iph = iph;
-	arg.user = user;
-	arg.vif = vif;
 
-	hash = ipqhashfn(iph->id, iph->saddr, iph->daddr, iph->protocol);
-
-	q = inet_frag_find(&net->ipv4.frags, &ip4_frags, &arg, hash);
+	q = inet_frag_find(&net->ipv4.frags, &key);
 	if (IS_ERR_OR_NULL(q)) {
 		inet_frag_maybe_warn_overflow(q, pr_fmt());
 		return NULL;
@@ -661,7 +615,7 @@ out_nomem:
 	err = -ENOMEM;
 	goto out_fail;
 out_oversize:
-	net_info_ratelimited("Oversized IP packet from %pI4\n", &qp->saddr);
+	net_info_ratelimited("Oversized IP packet from %pI4\n", &qp->q.key.v4.saddr);
 out_fail:
 	__IP_INC_STATS(net, IPSTATS_MIB_REASMFAILS);
 	return err;
@@ -899,15 +853,47 @@ static struct pernet_operations ip4_frag
 	.exit = ipv4_frags_exit_net,
 };
 
+
+static u32 ip4_key_hashfn(const void *data, u32 len, u32 seed)
+{
+	return jhash2(data,
+		      sizeof(struct frag_v4_compare_key) / sizeof(u32), seed);
+}
+
+static u32 ip4_obj_hashfn(const void *data, u32 len, u32 seed)
+{
+	const struct inet_frag_queue *fq = data;
+
+	return jhash2((const u32 *)&fq->key.v4,
+		      sizeof(struct frag_v4_compare_key) / sizeof(u32), seed);
+}
+
+static int ip4_obj_cmpfn(struct rhashtable_compare_arg *arg, const void *ptr)
+{
+	const struct frag_v4_compare_key *key = arg->key;
+	const struct inet_frag_queue *fq = ptr;
+
+	return !!memcmp(&fq->key, key, sizeof(*key));
+}
+
+static const struct rhashtable_params ip4_rhash_params = {
+	.head_offset		= offsetof(struct inet_frag_queue, node),
+	.key_offset		= offsetof(struct inet_frag_queue, key),
+	.key_len		= sizeof(struct frag_v4_compare_key),
+	.hashfn			= ip4_key_hashfn,
+	.obj_hashfn		= ip4_obj_hashfn,
+	.obj_cmpfn		= ip4_obj_cmpfn,
+	.automatic_shrinking	= true,
+};
+
 void __init ipfrag_init(void)
 {
-	ip4_frags.hashfn = ip4_hashfn;
 	ip4_frags.constructor = ip4_frag_init;
 	ip4_frags.destructor = ip4_frag_free;
 	ip4_frags.qsize = sizeof(struct ipq);
-	ip4_frags.match = ip4_frag_match;
 	ip4_frags.frag_expire = ip_expire;
 	ip4_frags.frags_cache_name = ip_frag_cache_name;
+	ip4_frags.rhash_params = ip4_rhash_params;
 	if (inet_frags_init(&ip4_frags))
 		panic("IP: failed to allocate ip4_frags cache\n");
 	ip4_frags_ctl_register();
--- a/net/ipv6/netfilter/nf_conntrack_reasm.c
+++ b/net/ipv6/netfilter/nf_conntrack_reasm.c
@@ -152,23 +152,6 @@ static inline u8 ip6_frag_ecn(const stru
 	return 1 << (ipv6_get_dsfield(ipv6h) & INET_ECN_MASK);
 }
 
-static unsigned int nf_hash_frag(__be32 id, const struct in6_addr *saddr,
-				 const struct in6_addr *daddr)
-{
-	net_get_random_once(&nf_frags.rnd, sizeof(nf_frags.rnd));
-	return jhash_3words(ipv6_addr_hash(saddr), ipv6_addr_hash(daddr),
-			    (__force u32)id, nf_frags.rnd);
-}
-
-
-static unsigned int nf_hashfn(const struct inet_frag_queue *q)
-{
-	const struct frag_queue *nq;
-
-	nq = container_of(q, struct frag_queue, q);
-	return nf_hash_frag(nq->id, &nq->saddr, &nq->daddr);
-}
-
 static void nf_ct_frag6_expire(struct timer_list *t)
 {
 	struct inet_frag_queue *frag = from_timer(frag, t, timer);
@@ -182,26 +165,19 @@ static void nf_ct_frag6_expire(struct ti
 }
 
 /* Creation primitives. */
-static inline struct frag_queue *fq_find(struct net *net, __be32 id,
-					 u32 user, struct in6_addr *src,
-					 struct in6_addr *dst, int iif, u8 ecn)
+static struct frag_queue *fq_find(struct net *net, __be32 id, u32 user,
+				  const struct ipv6hdr *hdr, int iif)
 {
+	struct frag_v6_compare_key key = {
+		.id = id,
+		.saddr = hdr->saddr,
+		.daddr = hdr->daddr,
+		.user = user,
+		.iif = iif,
+	};
 	struct inet_frag_queue *q;
-	struct ip6_create_arg arg;
-	unsigned int hash;
-
-	arg.id = id;
-	arg.user = user;
-	arg.src = src;
-	arg.dst = dst;
-	arg.iif = iif;
-	arg.ecn = ecn;
-
-	local_bh_disable();
-	hash = nf_hash_frag(id, src, dst);
 
-	q = inet_frag_find(&net->nf_frag.frags, &nf_frags, &arg, hash);
-	local_bh_enable();
+	q = inet_frag_find(&net->nf_frag.frags, &key);
 	if (IS_ERR_OR_NULL(q)) {
 		inet_frag_maybe_warn_overflow(q, pr_fmt());
 		return NULL;
@@ -593,8 +569,8 @@ int nf_ct_frag6_gather(struct net *net,
 	fhdr = (struct frag_hdr *)skb_transport_header(skb);
 
 	skb_orphan(skb);
-	fq = fq_find(net, fhdr->identification, user, &hdr->saddr, &hdr->daddr,
-		     skb->dev ? skb->dev->ifindex : 0, ip6_frag_ecn(hdr));
+	fq = fq_find(net, fhdr->identification, user, hdr,
+		     skb->dev ? skb->dev->ifindex : 0);
 	if (fq == NULL) {
 		pr_debug("Can't find and can't create new queue\n");
 		return -ENOMEM;
@@ -662,13 +638,12 @@ int nf_ct_frag6_init(void)
 {
 	int ret = 0;
 
-	nf_frags.hashfn = nf_hashfn;
 	nf_frags.constructor = ip6_frag_init;
 	nf_frags.destructor = NULL;
 	nf_frags.qsize = sizeof(struct frag_queue);
-	nf_frags.match = ip6_frag_match;
 	nf_frags.frag_expire = nf_ct_frag6_expire;
 	nf_frags.frags_cache_name = nf_frags_cache_name;
+	nf_frags.rhash_params = ip6_rhash_params;
 	ret = inet_frags_init(&nf_frags);
 	if (ret)
 		goto out;
--- a/net/ipv6/reassembly.c
+++ b/net/ipv6/reassembly.c
@@ -79,52 +79,13 @@ static struct inet_frags ip6_frags;
 static int ip6_frag_reasm(struct frag_queue *fq, struct sk_buff *prev,
 			  struct net_device *dev);
 
-/*
- * callers should be careful not to use the hash value outside the ipfrag_lock
- * as doing so could race with ipfrag_hash_rnd being recalculated.
- */
-static unsigned int inet6_hash_frag(__be32 id, const struct in6_addr *saddr,
-				    const struct in6_addr *daddr)
-{
-	net_get_random_once(&ip6_frags.rnd, sizeof(ip6_frags.rnd));
-	return jhash_3words(ipv6_addr_hash(saddr), ipv6_addr_hash(daddr),
-			    (__force u32)id, ip6_frags.rnd);
-}
-
-static unsigned int ip6_hashfn(const struct inet_frag_queue *q)
-{
-	const struct frag_queue *fq;
-
-	fq = container_of(q, struct frag_queue, q);
-	return inet6_hash_frag(fq->id, &fq->saddr, &fq->daddr);
-}
-
-bool ip6_frag_match(const struct inet_frag_queue *q, const void *a)
-{
-	const struct frag_queue *fq;
-	const struct ip6_create_arg *arg = a;
-
-	fq = container_of(q, struct frag_queue, q);
-	return	fq->id == arg->id &&
-		fq->user == arg->user &&
-		ipv6_addr_equal(&fq->saddr, arg->src) &&
-		ipv6_addr_equal(&fq->daddr, arg->dst) &&
-		(arg->iif == fq->iif ||
-		 !(ipv6_addr_type(arg->dst) & (IPV6_ADDR_MULTICAST |
-					       IPV6_ADDR_LINKLOCAL)));
-}
-EXPORT_SYMBOL(ip6_frag_match);
-
 void ip6_frag_init(struct inet_frag_queue *q, const void *a)
 {
 	struct frag_queue *fq = container_of(q, struct frag_queue, q);
-	const struct ip6_create_arg *arg = a;
+	const struct frag_v6_compare_key *key = a;
 
-	fq->id = arg->id;
-	fq->user = arg->user;
-	fq->saddr = *arg->src;
-	fq->daddr = *arg->dst;
-	fq->ecn = arg->ecn;
+	q->key.v6 = *key;
+	fq->ecn = 0;
 }
 EXPORT_SYMBOL(ip6_frag_init);
 
@@ -182,23 +143,22 @@ static void ip6_frag_expire(struct timer
 }
 
 static struct frag_queue *
-fq_find(struct net *net, __be32 id, const struct in6_addr *src,
-	const struct in6_addr *dst, int iif, u8 ecn)
+fq_find(struct net *net, __be32 id, const struct ipv6hdr *hdr, int iif)
 {
+	struct frag_v6_compare_key key = {
+		.id = id,
+		.saddr = hdr->saddr,
+		.daddr = hdr->daddr,
+		.user = IP6_DEFRAG_LOCAL_DELIVER,
+		.iif = iif,
+	};
 	struct inet_frag_queue *q;
-	struct ip6_create_arg arg;
-	unsigned int hash;
-
-	arg.id = id;
-	arg.user = IP6_DEFRAG_LOCAL_DELIVER;
-	arg.src = src;
-	arg.dst = dst;
-	arg.iif = iif;
-	arg.ecn = ecn;
 
-	hash = inet6_hash_frag(id, src, dst);
+	if (!(ipv6_addr_type(&hdr->daddr) & (IPV6_ADDR_MULTICAST |
+					    IPV6_ADDR_LINKLOCAL)))
+		key.iif = 0;
 
-	q = inet_frag_find(&net->ipv6.frags, &ip6_frags, &arg, hash);
+	q = inet_frag_find(&net->ipv6.frags, &key);
 	if (IS_ERR_OR_NULL(q)) {
 		inet_frag_maybe_warn_overflow(q, pr_fmt());
 		return NULL;
@@ -530,6 +490,7 @@ static int ipv6_frag_rcv(struct sk_buff
 	struct frag_queue *fq;
 	const struct ipv6hdr *hdr = ipv6_hdr(skb);
 	struct net *net = dev_net(skb_dst(skb)->dev);
+	int iif;
 
 	if (IP6CB(skb)->flags & IP6SKB_FRAGMENTED)
 		goto fail_hdr;
@@ -558,13 +519,14 @@ static int ipv6_frag_rcv(struct sk_buff
 		return 1;
 	}
 
-	fq = fq_find(net, fhdr->identification, &hdr->saddr, &hdr->daddr,
-		     skb->dev ? skb->dev->ifindex : 0, ip6_frag_ecn(hdr));
+	iif = skb->dev ? skb->dev->ifindex : 0;
+	fq = fq_find(net, fhdr->identification, hdr, iif);
 	if (fq) {
 		int ret;
 
 		spin_lock(&fq->q.lock);
 
+		fq->iif = iif;
 		ret = ip6_frag_queue(fq, skb, fhdr, IP6CB(skb)->nhoff);
 
 		spin_unlock(&fq->q.lock);
@@ -738,17 +700,47 @@ static struct pernet_operations ip6_frag
 	.exit = ipv6_frags_exit_net,
 };
 
+static u32 ip6_key_hashfn(const void *data, u32 len, u32 seed)
+{
+	return jhash2(data,
+		      sizeof(struct frag_v6_compare_key) / sizeof(u32), seed);
+}
+
+static u32 ip6_obj_hashfn(const void *data, u32 len, u32 seed)
+{
+	const struct inet_frag_queue *fq = data;
+
+	return jhash2((const u32 *)&fq->key.v6,
+		      sizeof(struct frag_v6_compare_key) / sizeof(u32), seed);
+}
+
+static int ip6_obj_cmpfn(struct rhashtable_compare_arg *arg, const void *ptr)
+{
+	const struct frag_v6_compare_key *key = arg->key;
+	const struct inet_frag_queue *fq = ptr;
+
+	return !!memcmp(&fq->key, key, sizeof(*key));
+}
+
+const struct rhashtable_params ip6_rhash_params = {
+	.head_offset		= offsetof(struct inet_frag_queue, node),
+	.hashfn			= ip6_key_hashfn,
+	.obj_hashfn		= ip6_obj_hashfn,
+	.obj_cmpfn		= ip6_obj_cmpfn,
+	.automatic_shrinking	= true,
+};
+EXPORT_SYMBOL(ip6_rhash_params);
+
 int __init ipv6_frag_init(void)
 {
 	int ret;
 
-	ip6_frags.hashfn = ip6_hashfn;
 	ip6_frags.constructor = ip6_frag_init;
 	ip6_frags.destructor = NULL;
 	ip6_frags.qsize = sizeof(struct frag_queue);
-	ip6_frags.match = ip6_frag_match;
 	ip6_frags.frag_expire = ip6_frag_expire;
 	ip6_frags.frags_cache_name = ip6_frag_cache_name;
+	ip6_frags.rhash_params = ip6_rhash_params;
 	ret = inet_frags_init(&ip6_frags);
 	if (ret)
 		goto out;
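
Reviewer note (not part of the patch): the conversion above relies on each
reassembly key being a flat, fixed-size struct that is hashed as an array
of 32-bit words and compared with memcmp(). What follows is a minimal
userspace sketch of that idea, not the kernel code. The demo_* names are
hypothetical, the field layout is only assumed to resemble
struct frag_v4_compare_key (its definition is not shown in this excerpt),
and the mixing function is a simple stand-in, not jhash2().

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for an IPv4 reassembly key. Every member is
 * naturally aligned and the size is a multiple of 4, so the struct has
 * no padding bytes that memcmp() or a word-wise hash could trip over.
 */
struct demo_v4_key {
	uint32_t saddr;
	uint32_t daddr;
	uint32_t user;
	uint32_t vif;
	uint16_t id;
	uint16_t protocol;
};

_Static_assert(sizeof(struct demo_v4_key) % sizeof(uint32_t) == 0,
	       "key must be a whole number of 32-bit words");

/* Stand-in mixer over 32-bit words (FNV-1a style). NOT jhash2(). */
static uint32_t demo_hash_words(const uint32_t *k, size_t nwords,
				uint32_t seed)
{
	uint32_t h = seed ^ 2166136261u;
	size_t i;

	for (i = 0; i < nwords; i++) {
		h ^= k[i];
		h *= 16777619u;
	}
	return h;
}

/* Hash the whole key, mirroring the "sizeof(key) / sizeof(u32)" word
 * count used by the *_key_hashfn() helpers in the patch.
 */
static uint32_t demo_key_hash(const struct demo_v4_key *key, uint32_t seed)
{
	uint32_t words[sizeof(*key) / sizeof(uint32_t)];

	memcpy(words, key, sizeof(words));
	return demo_hash_words(words, sizeof(words) / sizeof(words[0]), seed);
}

/* Same convention as the obj_cmpfn callbacks: 0 means "keys match". */
static int demo_key_cmp(const struct demo_v4_key *a,
			const struct demo_v4_key *b)
{
	return !!memcmp(a, b, sizeof(*a));
}
```

The point of the sketch is the design constraint rather than the hash
itself: a bytewise compare is only safe when the key has no uninitialized
padding, which is why a key struct with embedded sub-structs may need to
be zeroed explicitly before it is filled in.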




* [PATCH 4.14 101/126] inet: frags: remove some helpers
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (99 preceding siblings ...)
  2018-09-17 22:42 ` [PATCH 4.14 100/126] inet: frags: use rhashtables for reassembly units Greg Kroah-Hartman
@ 2018-09-17 22:42 ` Greg Kroah-Hartman
  2018-09-17 22:42 ` [PATCH 4.14 102/126] inet: frags: get rif of inet_frag_evicting() Greg Kroah-Hartman
                   ` (27 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:42 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Eric Dumazet, David S. Miller

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Eric Dumazet <edumazet@google.com>

Remove sum_frag_mem_limit(), ip_frag_mem() and ip6_frag_mem().

Also, since we now use rhashtable, we can bring back the number of fragments
in "grep FRAG /proc/net/sockstat /proc/net/sockstat6", which was
removed in commit 434d305405ab ("inet: frag: don't account number
of fragment queues").

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 6befe4a78b1553edb6eed3a78b4bcd9748526672)
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 include/net/inet_frag.h |    5 -----
 include/net/ip.h        |    1 -
 include/net/ipv6.h      |    7 -------
 net/ipv4/ip_fragment.c  |    5 -----
 net/ipv4/proc.c         |    6 +++---
 net/ipv6/proc.c         |    5 +++--
 6 files changed, 6 insertions(+), 23 deletions(-)
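
Reviewer note (not part of the commit): with this change the "inuse" field
is a real element count taken from the rhashtable, where it used to be only
0 or 1 (!!frag_mem). A minimal userspace sketch of reading that line is
shown here. The names struct frag_stat and parse_frag_line() are
hypothetical, and the memory field is read as unsigned long on the
assumption that the parser should keep working once the counter is widened
later in this series.

```c
#include <stdio.h>

/* Hypothetical container for the two values printed on the FRAG line. */
struct frag_stat {
	unsigned int inuse;	/* number of entries in the reassembly table */
	unsigned long memory;	/* bytes currently charged to reassembly */
};

/*
 * Parse one line in the "FRAG: inuse N memory M" format.
 * Returns 0 on success, -1 if the line does not match.
 */
static int parse_frag_line(const char *line, struct frag_stat *st)
{
	if (sscanf(line, "FRAG: inuse %u memory %lu",
		   &st->inuse, &st->memory) != 2)
		return -1;
	return 0;
}
```

A caller would feed this each line read from /proc/net/sockstat and ignore
the lines for which it returns -1.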

--- a/include/net/inet_frag.h
+++ b/include/net/inet_frag.h
@@ -141,11 +141,6 @@ static inline void add_frag_mem_limit(st
 	atomic_add(i, &nf->mem);
 }
 
-static inline int sum_frag_mem_limit(struct netns_frags *nf)
-{
-	return atomic_read(&nf->mem);
-}
-
 /* RFC 3168 support :
  * We want to check ECN values of all fragments, do detect invalid combinations.
  * In ipq->ecn, we store the OR value of each ip4_frag_ecn() fragment value.
--- a/include/net/ip.h
+++ b/include/net/ip.h
@@ -570,7 +570,6 @@ static inline struct sk_buff *ip_check_d
 	return skb;
 }
 #endif
-int ip_frag_mem(struct net *net);
 
 /*
  *	Functions provided by ip_forward.c
--- a/include/net/ipv6.h
+++ b/include/net/ipv6.h
@@ -331,13 +331,6 @@ static inline bool ipv6_accept_ra(struct
 	    idev->cnf.accept_ra;
 }
 
-#if IS_ENABLED(CONFIG_IPV6)
-static inline int ip6_frag_mem(struct net *net)
-{
-	return sum_frag_mem_limit(&net->ipv6.frags);
-}
-#endif
-
 #define IPV6_FRAG_HIGH_THRESH	(4 * 1024*1024)	/* 4194304 */
 #define IPV6_FRAG_LOW_THRESH	(3 * 1024*1024)	/* 3145728 */
 #define IPV6_FRAG_TIMEOUT	(60 * HZ)	/* 60 seconds */
--- a/net/ipv4/ip_fragment.c
+++ b/net/ipv4/ip_fragment.c
@@ -83,11 +83,6 @@ static u8 ip4_frag_ecn(u8 tos)
 
 static struct inet_frags ip4_frags;
 
-int ip_frag_mem(struct net *net)
-{
-	return sum_frag_mem_limit(&net->ipv4.frags);
-}
-
 static int ip_frag_reasm(struct ipq *qp, struct sk_buff *prev,
 			 struct net_device *dev);
 
--- a/net/ipv4/proc.c
+++ b/net/ipv4/proc.c
@@ -54,7 +54,6 @@
 static int sockstat_seq_show(struct seq_file *seq, void *v)
 {
 	struct net *net = seq->private;
-	unsigned int frag_mem;
 	int orphans, sockets;
 
 	orphans = percpu_counter_sum_positive(&tcp_orphan_count);
@@ -72,8 +71,9 @@ static int sockstat_seq_show(struct seq_
 		   sock_prot_inuse_get(net, &udplite_prot));
 	seq_printf(seq, "RAW: inuse %d\n",
 		   sock_prot_inuse_get(net, &raw_prot));
-	frag_mem = ip_frag_mem(net);
-	seq_printf(seq,  "FRAG: inuse %u memory %u\n", !!frag_mem, frag_mem);
+	seq_printf(seq,  "FRAG: inuse %u memory %u\n",
+		   atomic_read(&net->ipv4.frags.rhashtable.nelems),
+		   frag_mem_limit(&net->ipv4.frags));
 	return 0;
 }
 
--- a/net/ipv6/proc.c
+++ b/net/ipv6/proc.c
@@ -38,7 +38,6 @@
 static int sockstat6_seq_show(struct seq_file *seq, void *v)
 {
 	struct net *net = seq->private;
-	unsigned int frag_mem = ip6_frag_mem(net);
 
 	seq_printf(seq, "TCP6: inuse %d\n",
 		       sock_prot_inuse_get(net, &tcpv6_prot));
@@ -48,7 +47,9 @@ static int sockstat6_seq_show(struct seq
 			sock_prot_inuse_get(net, &udplitev6_prot));
 	seq_printf(seq, "RAW6: inuse %d\n",
 		       sock_prot_inuse_get(net, &rawv6_prot));
-	seq_printf(seq, "FRAG6: inuse %u memory %u\n", !!frag_mem, frag_mem);
+	seq_printf(seq, "FRAG6: inuse %u memory %u\n",
+		   atomic_read(&net->ipv6.frags.rhashtable.nelems),
+		   frag_mem_limit(&net->ipv6.frags));
 	return 0;
 }
 




* [PATCH 4.14 102/126] inet: frags: get rif of inet_frag_evicting()
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (100 preceding siblings ...)
  2018-09-17 22:42 ` [PATCH 4.14 101/126] inet: frags: remove some helpers Greg Kroah-Hartman
@ 2018-09-17 22:42 ` Greg Kroah-Hartman
  2018-09-17 22:42 ` [PATCH 4.14 103/126] inet: frags: remove inet_frag_maybe_warn_overflow() Greg Kroah-Hartman
                   ` (26 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:42 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Eric Dumazet, David S. Miller

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Eric Dumazet <edumazet@google.com>

This refactors ip_expire(), removing one indentation level.

Note: in the future, we should try hard to avoid the skb_clone()
since this is a serious performance cost.
Under DDOS, the ICMP message won't be sent because of rate limits.

The fact that ip6_expire_frag_queue() does not use skb_clone() is
disturbing too. Presumably IPv6 should have the same
issue as the one we fixed in commit ec4fbd64751d
("inet: frag: release spinlock before calling icmp_send()").

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 399d1404be660d355192ff4df5ccc3f4159ec1e4)
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 include/net/inet_frag.h |    5 ---
 net/ipv4/ip_fragment.c  |   65 +++++++++++++++++++++++-------------------------
 net/ipv6/reassembly.c   |    4 --
 3 files changed, 32 insertions(+), 42 deletions(-)

--- a/include/net/inet_frag.h
+++ b/include/net/inet_frag.h
@@ -119,11 +119,6 @@ static inline void inet_frag_put(struct
 		inet_frag_destroy(q);
 }
 
-static inline bool inet_frag_evicting(struct inet_frag_queue *q)
-{
-	return false;
-}
-
 /* Memory Tracking Functions. */
 
 static inline int frag_mem_limit(struct netns_frags *nf)
--- a/net/ipv4/ip_fragment.c
+++ b/net/ipv4/ip_fragment.c
@@ -143,8 +143,11 @@ static bool frag_expire_skip_icmp(u32 us
 static void ip_expire(struct timer_list *t)
 {
 	struct inet_frag_queue *frag = from_timer(frag, t, timer);
-	struct ipq *qp;
+	struct sk_buff *clone, *head;
+	const struct iphdr *iph;
 	struct net *net;
+	struct ipq *qp;
+	int err;
 
 	qp = container_of(frag, struct ipq, q);
 	net = container_of(qp->q.net, struct net, ipv4.frags);
@@ -158,45 +161,41 @@ static void ip_expire(struct timer_list
 	ipq_kill(qp);
 	__IP_INC_STATS(net, IPSTATS_MIB_REASMFAILS);
 
-	if (!inet_frag_evicting(&qp->q)) {
-		struct sk_buff *clone, *head = qp->q.fragments;
-		const struct iphdr *iph;
-		int err;
+	head = qp->q.fragments;
 
-		__IP_INC_STATS(net, IPSTATS_MIB_REASMTIMEOUT);
+	__IP_INC_STATS(net, IPSTATS_MIB_REASMTIMEOUT);
 
-		if (!(qp->q.flags & INET_FRAG_FIRST_IN) || !qp->q.fragments)
-			goto out;
+	if (!(qp->q.flags & INET_FRAG_FIRST_IN) || !head)
+		goto out;
 
-		head->dev = dev_get_by_index_rcu(net, qp->iif);
-		if (!head->dev)
-			goto out;
+	head->dev = dev_get_by_index_rcu(net, qp->iif);
+	if (!head->dev)
+		goto out;
 
 
-		/* skb has no dst, perform route lookup again */
-		iph = ip_hdr(head);
-		err = ip_route_input_noref(head, iph->daddr, iph->saddr,
+	/* skb has no dst, perform route lookup again */
+	iph = ip_hdr(head);
+	err = ip_route_input_noref(head, iph->daddr, iph->saddr,
 					   iph->tos, head->dev);
-		if (err)
-			goto out;
+	if (err)
+		goto out;
+
+	/* Only an end host needs to send an ICMP
+	 * "Fragment Reassembly Timeout" message, per RFC792.
+	 */
+	if (frag_expire_skip_icmp(qp->q.key.v4.user) &&
+	    (skb_rtable(head)->rt_type != RTN_LOCAL))
+		goto out;
+
+	clone = skb_clone(head, GFP_ATOMIC);
 
-		/* Only an end host needs to send an ICMP
-		 * "Fragment Reassembly Timeout" message, per RFC792.
-		 */
-		if (frag_expire_skip_icmp(qp->q.key.v4.user) &&
-		    (skb_rtable(head)->rt_type != RTN_LOCAL))
-			goto out;
-
-		clone = skb_clone(head, GFP_ATOMIC);
-
-		/* Send an ICMP "Fragment Reassembly Timeout" message. */
-		if (clone) {
-			spin_unlock(&qp->q.lock);
-			icmp_send(clone, ICMP_TIME_EXCEEDED,
-				  ICMP_EXC_FRAGTIME, 0);
-			consume_skb(clone);
-			goto out_rcu_unlock;
-		}
+	/* Send an ICMP "Fragment Reassembly Timeout" message. */
+	if (clone) {
+		spin_unlock(&qp->q.lock);
+		icmp_send(clone, ICMP_TIME_EXCEEDED,
+			  ICMP_EXC_FRAGTIME, 0);
+		consume_skb(clone);
+		goto out_rcu_unlock;
 	}
 out:
 	spin_unlock(&qp->q.lock);
--- a/net/ipv6/reassembly.c
+++ b/net/ipv6/reassembly.c
@@ -106,10 +106,6 @@ void ip6_expire_frag_queue(struct net *n
 		goto out_rcu_unlock;
 
 	__IP6_INC_STATS(net, __in6_dev_get(dev), IPSTATS_MIB_REASMFAILS);
-
-	if (inet_frag_evicting(&fq->q))
-		goto out_rcu_unlock;
-
 	__IP6_INC_STATS(net, __in6_dev_get(dev), IPSTATS_MIB_REASMTIMEOUT);
 
 	/* Don't send error if the first segment did not arrive. */




* [PATCH 4.14 103/126] inet: frags: remove inet_frag_maybe_warn_overflow()
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (101 preceding siblings ...)
  2018-09-17 22:42 ` [PATCH 4.14 102/126] inet: frags: get rif of inet_frag_evicting() Greg Kroah-Hartman
@ 2018-09-17 22:42 ` Greg Kroah-Hartman
  2018-09-17 22:42 ` [PATCH 4.14 104/126] inet: frags: break the 2GB limit for frags storage Greg Kroah-Hartman
                   ` (25 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:42 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Eric Dumazet, David S. Miller

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Eric Dumazet <edumazet@google.com>

This function is obsolete after the addition of rhashtable to inet defrag.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 2d44ed22e607f9a285b049de2263e3840673a260)
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 include/net/inet_frag.h                 |    2 --
 net/ieee802154/6lowpan/reassembly.c     |    5 ++---
 net/ipv4/inet_fragment.c                |   11 -----------
 net/ipv4/ip_fragment.c                  |    5 ++---
 net/ipv6/netfilter/nf_conntrack_reasm.c |    5 ++---
 net/ipv6/reassembly.c                   |    5 ++---
 6 files changed, 8 insertions(+), 25 deletions(-)

--- a/include/net/inet_frag.h
+++ b/include/net/inet_frag.h
@@ -110,8 +110,6 @@ void inet_frags_exit_net(struct netns_fr
 void inet_frag_kill(struct inet_frag_queue *q);
 void inet_frag_destroy(struct inet_frag_queue *q);
 struct inet_frag_queue *inet_frag_find(struct netns_frags *nf, void *key);
-void inet_frag_maybe_warn_overflow(struct inet_frag_queue *q,
-				   const char *prefix);
 
 static inline void inet_frag_put(struct inet_frag_queue *q)
 {
--- a/net/ieee802154/6lowpan/reassembly.c
+++ b/net/ieee802154/6lowpan/reassembly.c
@@ -84,10 +84,9 @@ fq_find(struct net *net, const struct lo
 	struct inet_frag_queue *q;
 
 	q = inet_frag_find(&ieee802154_lowpan->frags, &key);
-	if (IS_ERR_OR_NULL(q)) {
-		inet_frag_maybe_warn_overflow(q, pr_fmt());
+	if (!q)
 		return NULL;
-	}
+
 	return container_of(q, struct lowpan_frag_queue, q);
 }
 
--- a/net/ipv4/inet_fragment.c
+++ b/net/ipv4/inet_fragment.c
@@ -218,14 +218,3 @@ struct inet_frag_queue *inet_frag_find(s
 	return inet_frag_create(nf, key);
 }
 EXPORT_SYMBOL(inet_frag_find);
-
-void inet_frag_maybe_warn_overflow(struct inet_frag_queue *q,
-				   const char *prefix)
-{
-	static const char msg[] = "inet_frag_find: Fragment hash bucket"
-		" list length grew over limit. Dropping fragment.\n";
-
-	if (PTR_ERR(q) == -ENOBUFS)
-		net_dbg_ratelimited("%s%s", prefix, msg);
-}
-EXPORT_SYMBOL(inet_frag_maybe_warn_overflow);
--- a/net/ipv4/ip_fragment.c
+++ b/net/ipv4/ip_fragment.c
@@ -221,10 +221,9 @@ static struct ipq *ip_find(struct net *n
 	struct inet_frag_queue *q;
 
 	q = inet_frag_find(&net->ipv4.frags, &key);
-	if (IS_ERR_OR_NULL(q)) {
-		inet_frag_maybe_warn_overflow(q, pr_fmt());
+	if (!q)
 		return NULL;
-	}
+
 	return container_of(q, struct ipq, q);
 }
 
--- a/net/ipv6/netfilter/nf_conntrack_reasm.c
+++ b/net/ipv6/netfilter/nf_conntrack_reasm.c
@@ -178,10 +178,9 @@ static struct frag_queue *fq_find(struct
 	struct inet_frag_queue *q;
 
 	q = inet_frag_find(&net->nf_frag.frags, &key);
-	if (IS_ERR_OR_NULL(q)) {
-		inet_frag_maybe_warn_overflow(q, pr_fmt());
+	if (!q)
 		return NULL;
-	}
+
 	return container_of(q, struct frag_queue, q);
 }
 
--- a/net/ipv6/reassembly.c
+++ b/net/ipv6/reassembly.c
@@ -155,10 +155,9 @@ fq_find(struct net *net, __be32 id, cons
 		key.iif = 0;
 
 	q = inet_frag_find(&net->ipv6.frags, &key);
-	if (IS_ERR_OR_NULL(q)) {
-		inet_frag_maybe_warn_overflow(q, pr_fmt());
+	if (!q)
 		return NULL;
-	}
+
 	return container_of(q, struct frag_queue, q);
 }
 




* [PATCH 4.14 104/126] inet: frags: break the 2GB limit for frags storage
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (102 preceding siblings ...)
  2018-09-17 22:42 ` [PATCH 4.14 103/126] inet: frags: remove inet_frag_maybe_warn_overflow() Greg Kroah-Hartman
@ 2018-09-17 22:42 ` Greg Kroah-Hartman
  2018-09-17 22:42 ` [PATCH 4.14 105/126] inet: frags: do not clone skb in ip_expire() Greg Kroah-Hartman
                   ` (24 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:42 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Eric Dumazet, David S. Miller

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Eric Dumazet <edumazet@google.com>

Some users are willing to provision huge amounts of memory to be able
to perform reassembly reasonably well under pressure.

Current memory tracking is using one atomic_t and integers.

Switch to atomic_long_t so that 64bit arches can use more than 2GB,
without any cost for 32bit arches.

Note that this patch avoids an overflow error, if high_thresh was set
to ~2GB, since this test in inet_frag_alloc() was never true :

if (... || frag_mem_limit(nf) > nf->high_thresh)

Tested:

$ echo 16000000000 >/proc/sys/net/ipv4/ipfrag_high_thresh

<frag DDOS>

$ grep FRAG /proc/net/sockstat
FRAG: inuse 14705885 memory 16000002880

$ nstat -n ; sleep 1 ; nstat | grep Reas
IpReasmReqds                    3317150            0.0
IpReasmFails                    3317112            0.0

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 3e67f106f619dcfaf6f4e2039599bdb69848c714)
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 Documentation/networking/ip-sysctl.txt  |    4 ++--
 include/net/inet_frag.h                 |   20 ++++++++++----------
 net/ieee802154/6lowpan/reassembly.c     |   10 +++++-----
 net/ipv4/ip_fragment.c                  |   10 +++++-----
 net/ipv4/proc.c                         |    2 +-
 net/ipv6/netfilter/nf_conntrack_reasm.c |   10 +++++-----
 net/ipv6/proc.c                         |    2 +-
 net/ipv6/reassembly.c                   |    6 +++---
 8 files changed, 32 insertions(+), 32 deletions(-)
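
Reviewer note (not part of the commit): the overflow remark in the commit
message can be illustrated with a small userspace sketch. The helper names
are hypothetical, int32_t stands in for the old int fields, and int64_t
stands in for long on a 64-bit arch. With a 32-bit counter and a threshold
at the top of the int range, the "mem > high_thresh" test can never be
true. With 64-bit values, the figures from the test run in the commit
message compare as expected.

```c
#include <stdbool.h>
#include <stdint.h>

/* Pre-patch shape: counter and threshold are both 32-bit signed. */
static bool over_limit_32(int32_t mem, int32_t high_thresh)
{
	return mem > high_thresh;
}

/* Post-patch shape on 64-bit arches: both values are 64-bit signed. */
static bool over_limit_64(int64_t mem, int64_t high_thresh)
{
	return mem > high_thresh;
}
```

No int32_t value exceeds INT32_MAX, so over_limit_32(x, INT32_MAX) is false
for every x. This is only a model of the comparison, not of the kernel's
atomic accounting.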

--- a/Documentation/networking/ip-sysctl.txt
+++ b/Documentation/networking/ip-sysctl.txt
@@ -133,10 +133,10 @@ min_adv_mss - INTEGER
 
 IP Fragmentation:
 
-ipfrag_high_thresh - INTEGER
+ipfrag_high_thresh - LONG INTEGER
 	Maximum memory used to reassemble IP fragments.
 
-ipfrag_low_thresh - INTEGER
+ipfrag_low_thresh - LONG INTEGER
 	(Obsolete since linux-4.17)
 	Maximum memory used to reassemble IP fragments before the kernel
 	begins to remove incomplete fragment queues to free up resources.
--- a/include/net/inet_frag.h
+++ b/include/net/inet_frag.h
@@ -8,11 +8,11 @@ struct netns_frags {
 	struct rhashtable       rhashtable ____cacheline_aligned_in_smp;
 
 	/* Keep atomic mem on separate cachelines in structs that include it */
-	atomic_t		mem ____cacheline_aligned_in_smp;
+	atomic_long_t		mem ____cacheline_aligned_in_smp;
 	/* sysctls */
+	long			high_thresh;
+	long			low_thresh;
 	int			timeout;
-	int			high_thresh;
-	int			low_thresh;
 	int			max_dist;
 	struct inet_frags	*f;
 };
@@ -102,7 +102,7 @@ void inet_frags_fini(struct inet_frags *
 
 static inline int inet_frags_init_net(struct netns_frags *nf)
 {
-	atomic_set(&nf->mem, 0);
+	atomic_long_set(&nf->mem, 0);
 	return rhashtable_init(&nf->rhashtable, &nf->f->rhash_params);
 }
 void inet_frags_exit_net(struct netns_frags *nf);
@@ -119,19 +119,19 @@ static inline void inet_frag_put(struct
 
 /* Memory Tracking Functions. */
 
-static inline int frag_mem_limit(struct netns_frags *nf)
+static inline long frag_mem_limit(const struct netns_frags *nf)
 {
-	return atomic_read(&nf->mem);
+	return atomic_long_read(&nf->mem);
 }
 
-static inline void sub_frag_mem_limit(struct netns_frags *nf, int i)
+static inline void sub_frag_mem_limit(struct netns_frags *nf, long val)
 {
-	atomic_sub(i, &nf->mem);
+	atomic_long_sub(val, &nf->mem);
 }
 
-static inline void add_frag_mem_limit(struct netns_frags *nf, int i)
+static inline void add_frag_mem_limit(struct netns_frags *nf, long val)
 {
-	atomic_add(i, &nf->mem);
+	atomic_long_add(val, &nf->mem);
 }
 
 /* RFC 3168 support :
--- a/net/ieee802154/6lowpan/reassembly.c
+++ b/net/ieee802154/6lowpan/reassembly.c
@@ -411,23 +411,23 @@ err:
 }
 
 #ifdef CONFIG_SYSCTL
-static int zero;
+static long zero;
 
 static struct ctl_table lowpan_frags_ns_ctl_table[] = {
 	{
 		.procname	= "6lowpanfrag_high_thresh",
 		.data		= &init_net.ieee802154_lowpan.frags.high_thresh,
-		.maxlen		= sizeof(int),
+		.maxlen		= sizeof(unsigned long),
 		.mode		= 0644,
-		.proc_handler	= proc_dointvec_minmax,
+		.proc_handler	= proc_doulongvec_minmax,
 		.extra1		= &init_net.ieee802154_lowpan.frags.low_thresh
 	},
 	{
 		.procname	= "6lowpanfrag_low_thresh",
 		.data		= &init_net.ieee802154_lowpan.frags.low_thresh,
-		.maxlen		= sizeof(int),
+		.maxlen		= sizeof(unsigned long),
 		.mode		= 0644,
-		.proc_handler	= proc_dointvec_minmax,
+		.proc_handler	= proc_doulongvec_minmax,
 		.extra1		= &zero,
 		.extra2		= &init_net.ieee802154_lowpan.frags.high_thresh
 	},
--- a/net/ipv4/ip_fragment.c
+++ b/net/ipv4/ip_fragment.c
@@ -683,23 +683,23 @@ struct sk_buff *ip_check_defrag(struct n
 EXPORT_SYMBOL(ip_check_defrag);
 
 #ifdef CONFIG_SYSCTL
-static int zero;
+static long zero;
 
 static struct ctl_table ip4_frags_ns_ctl_table[] = {
 	{
 		.procname	= "ipfrag_high_thresh",
 		.data		= &init_net.ipv4.frags.high_thresh,
-		.maxlen		= sizeof(int),
+		.maxlen		= sizeof(unsigned long),
 		.mode		= 0644,
-		.proc_handler	= proc_dointvec_minmax,
+		.proc_handler	= proc_doulongvec_minmax,
 		.extra1		= &init_net.ipv4.frags.low_thresh
 	},
 	{
 		.procname	= "ipfrag_low_thresh",
 		.data		= &init_net.ipv4.frags.low_thresh,
-		.maxlen		= sizeof(int),
+		.maxlen		= sizeof(unsigned long),
 		.mode		= 0644,
-		.proc_handler	= proc_dointvec_minmax,
+		.proc_handler	= proc_doulongvec_minmax,
 		.extra1		= &zero,
 		.extra2		= &init_net.ipv4.frags.high_thresh
 	},
--- a/net/ipv4/proc.c
+++ b/net/ipv4/proc.c
@@ -71,7 +71,7 @@ static int sockstat_seq_show(struct seq_
 		   sock_prot_inuse_get(net, &udplite_prot));
 	seq_printf(seq, "RAW: inuse %d\n",
 		   sock_prot_inuse_get(net, &raw_prot));
-	seq_printf(seq,  "FRAG: inuse %u memory %u\n",
+	seq_printf(seq,  "FRAG: inuse %u memory %lu\n",
 		   atomic_read(&net->ipv4.frags.rhashtable.nelems),
 		   frag_mem_limit(&net->ipv4.frags));
 	return 0;
--- a/net/ipv6/netfilter/nf_conntrack_reasm.c
+++ b/net/ipv6/netfilter/nf_conntrack_reasm.c
@@ -63,7 +63,7 @@ struct nf_ct_frag6_skb_cb
 static struct inet_frags nf_frags;
 
 #ifdef CONFIG_SYSCTL
-static int zero;
+static long zero;
 
 static struct ctl_table nf_ct_frag6_sysctl_table[] = {
 	{
@@ -76,18 +76,18 @@ static struct ctl_table nf_ct_frag6_sysc
 	{
 		.procname	= "nf_conntrack_frag6_low_thresh",
 		.data		= &init_net.nf_frag.frags.low_thresh,
-		.maxlen		= sizeof(unsigned int),
+		.maxlen		= sizeof(unsigned long),
 		.mode		= 0644,
-		.proc_handler	= proc_dointvec_minmax,
+		.proc_handler	= proc_doulongvec_minmax,
 		.extra1		= &zero,
 		.extra2		= &init_net.nf_frag.frags.high_thresh
 	},
 	{
 		.procname	= "nf_conntrack_frag6_high_thresh",
 		.data		= &init_net.nf_frag.frags.high_thresh,
-		.maxlen		= sizeof(unsigned int),
+		.maxlen		= sizeof(unsigned long),
 		.mode		= 0644,
-		.proc_handler	= proc_dointvec_minmax,
+		.proc_handler	= proc_doulongvec_minmax,
 		.extra1		= &init_net.nf_frag.frags.low_thresh
 	},
 	{ }
--- a/net/ipv6/proc.c
+++ b/net/ipv6/proc.c
@@ -47,7 +47,7 @@ static int sockstat6_seq_show(struct seq
 			sock_prot_inuse_get(net, &udplitev6_prot));
 	seq_printf(seq, "RAW6: inuse %d\n",
 		       sock_prot_inuse_get(net, &rawv6_prot));
-	seq_printf(seq, "FRAG6: inuse %u memory %u\n",
+	seq_printf(seq, "FRAG6: inuse %u memory %lu\n",
 		   atomic_read(&net->ipv6.frags.rhashtable.nelems),
 		   frag_mem_limit(&net->ipv6.frags));
 	return 0;
--- a/net/ipv6/reassembly.c
+++ b/net/ipv6/reassembly.c
@@ -552,15 +552,15 @@ static struct ctl_table ip6_frags_ns_ctl
 	{
 		.procname	= "ip6frag_high_thresh",
 		.data		= &init_net.ipv6.frags.high_thresh,
-		.maxlen		= sizeof(int),
+		.maxlen		= sizeof(unsigned long),
 		.mode		= 0644,
-		.proc_handler	= proc_dointvec_minmax,
+		.proc_handler	= proc_doulongvec_minmax,
 		.extra1		= &init_net.ipv6.frags.low_thresh
 	},
 	{
 		.procname	= "ip6frag_low_thresh",
 		.data		= &init_net.ipv6.frags.low_thresh,
-		.maxlen		= sizeof(int),
+		.maxlen		= sizeof(unsigned long),
 		.mode		= 0644,
 		.proc_handler	= proc_dointvec_minmax,
 		.extra1		= &zero,




* [PATCH 4.14 105/126] inet: frags: do not clone skb in ip_expire()
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (103 preceding siblings ...)
  2018-09-17 22:42 ` [PATCH 4.14 104/126] inet: frags: break the 2GB limit for frags storage Greg Kroah-Hartman
@ 2018-09-17 22:42 ` Greg Kroah-Hartman
  2018-09-17 22:42 ` [PATCH 4.14 106/126] ipv6: frags: rewrite ip6_expire_frag_queue() Greg Kroah-Hartman
                   ` (23 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:42 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Eric Dumazet, David S. Miller

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Eric Dumazet <edumazet@google.com>

An skb_clone() was added in commit ec4fbd64751d ("inet: frag: release
spinlock before calling icmp_send()").

While fixing the bug at that time, it also added a very high cost
for DDOS frags, as the ICMP rate limit is applied after this
expensive operation (skb_clone() + consume_skb(), implying memory
allocations, copy, and freeing).

We can use skb_get(head) here; all we want is to make sure the skb won't
be freed by another cpu.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 1eec5d5670084ee644597bd26c25e22c69b9f748)
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/ipv4/ip_fragment.c |   16 ++++++----------
 1 file changed, 6 insertions(+), 10 deletions(-)
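
Reviewer note (not part of the commit): the idea is to pin the buffer with
an extra reference instead of copying it, then drop that reference when
done. A minimal userspace sketch of the pattern follows. struct buf,
buf_alloc(), buf_get() and buf_put() are hypothetical stand-ins, loosely
analogous to sk_buff, skb_get() and kfree_skb(), and are not kernel APIs.

```c
#include <stdatomic.h>
#include <stdlib.h>

/* Toy reference-counted buffer. */
struct buf {
	atomic_int users;	/* number of holders of this buffer */
	int payload;
};

/* Allocate a buffer owned by one holder. Returns NULL on failure. */
static struct buf *buf_alloc(int payload)
{
	struct buf *b = malloc(sizeof(*b));

	if (!b)
		return NULL;
	atomic_init(&b->users, 1);
	b->payload = payload;
	return b;
}

/* Take an extra reference: no allocation, no copy. */
static struct buf *buf_get(struct buf *b)
{
	atomic_fetch_add(&b->users, 1);
	return b;
}

/* Drop one reference; free on the last one. Returns 1 if freed. */
static int buf_put(struct buf *b)
{
	if (atomic_fetch_sub(&b->users, 1) == 1) {
		free(b);
		return 1;
	}
	return 0;
}
```

In the patch, the extra reference is taken while the queue lock is still
held, so the buffer stays valid after the lock is released and until the
final put.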

--- a/net/ipv4/ip_fragment.c
+++ b/net/ipv4/ip_fragment.c
@@ -143,8 +143,8 @@ static bool frag_expire_skip_icmp(u32 us
 static void ip_expire(struct timer_list *t)
 {
 	struct inet_frag_queue *frag = from_timer(frag, t, timer);
-	struct sk_buff *clone, *head;
 	const struct iphdr *iph;
+	struct sk_buff *head;
 	struct net *net;
 	struct ipq *qp;
 	int err;
@@ -187,16 +187,12 @@ static void ip_expire(struct timer_list
 	    (skb_rtable(head)->rt_type != RTN_LOCAL))
 		goto out;
 
-	clone = skb_clone(head, GFP_ATOMIC);
+	skb_get(head);
+	spin_unlock(&qp->q.lock);
+	icmp_send(head, ICMP_TIME_EXCEEDED, ICMP_EXC_FRAGTIME, 0);
+	kfree_skb(head);
+	goto out_rcu_unlock;
 
-	/* Send an ICMP "Fragment Reassembly Timeout" message. */
-	if (clone) {
-		spin_unlock(&qp->q.lock);
-		icmp_send(clone, ICMP_TIME_EXCEEDED,
-			  ICMP_EXC_FRAGTIME, 0);
-		consume_skb(clone);
-		goto out_rcu_unlock;
-	}
 out:
 	spin_unlock(&qp->q.lock);
 out_rcu_unlock:




* [PATCH 4.14 106/126] ipv6: frags: rewrite ip6_expire_frag_queue()
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (104 preceding siblings ...)
  2018-09-17 22:42 ` [PATCH 4.14 105/126] inet: frags: do not clone skb in ip_expire() Greg Kroah-Hartman
@ 2018-09-17 22:42 ` Greg Kroah-Hartman
  2018-09-17 22:42 ` [PATCH 4.14 107/126] rhashtable: reorganize struct rhashtable layout Greg Kroah-Hartman
                   ` (22 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:42 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Eric Dumazet, David S. Miller

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Eric Dumazet <edumazet@google.com>

Make it similar to IPv4 ip_expire(), and release the lock
before calling icmp functions.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 05c0b86b9696802fd0ce5676a92a63f1b455bdf3)
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/ipv6/reassembly.c |   24 ++++++++++++++++--------
 1 file changed, 16 insertions(+), 8 deletions(-)

--- a/net/ipv6/reassembly.c
+++ b/net/ipv6/reassembly.c
@@ -92,7 +92,9 @@ EXPORT_SYMBOL(ip6_frag_init);
 void ip6_expire_frag_queue(struct net *net, struct frag_queue *fq)
 {
 	struct net_device *dev = NULL;
+	struct sk_buff *head;
 
+	rcu_read_lock();
 	spin_lock(&fq->q.lock);
 
 	if (fq->q.flags & INET_FRAG_COMPLETE)
@@ -100,28 +102,34 @@ void ip6_expire_frag_queue(struct net *n
 
 	inet_frag_kill(&fq->q);
 
-	rcu_read_lock();
 	dev = dev_get_by_index_rcu(net, fq->iif);
 	if (!dev)
-		goto out_rcu_unlock;
+		goto out;
 
 	__IP6_INC_STATS(net, __in6_dev_get(dev), IPSTATS_MIB_REASMFAILS);
 	__IP6_INC_STATS(net, __in6_dev_get(dev), IPSTATS_MIB_REASMTIMEOUT);
 
 	/* Don't send error if the first segment did not arrive. */
-	if (!(fq->q.flags & INET_FRAG_FIRST_IN) || !fq->q.fragments)
-		goto out_rcu_unlock;
+	head = fq->q.fragments;
+	if (!(fq->q.flags & INET_FRAG_FIRST_IN) || !head)
+		goto out;
 
 	/* But use as source device on which LAST ARRIVED
 	 * segment was received. And do not use fq->dev
 	 * pointer directly, device might already disappeared.
 	 */
-	fq->q.fragments->dev = dev;
-	icmpv6_send(fq->q.fragments, ICMPV6_TIME_EXCEED, ICMPV6_EXC_FRAGTIME, 0);
-out_rcu_unlock:
-	rcu_read_unlock();
+	head->dev = dev;
+	skb_get(head);
+	spin_unlock(&fq->q.lock);
+
+	icmpv6_send(head, ICMPV6_TIME_EXCEED, ICMPV6_EXC_FRAGTIME, 0);
+	kfree_skb(head);
+	goto out_rcu_unlock;
+
 out:
 	spin_unlock(&fq->q.lock);
+out_rcu_unlock:
+	rcu_read_unlock();
 	inet_frag_put(&fq->q);
 }
 EXPORT_SYMBOL(ip6_expire_frag_queue);




* [PATCH 4.14 107/126] rhashtable: reorganize struct rhashtable layout
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (105 preceding siblings ...)
  2018-09-17 22:42 ` [PATCH 4.14 106/126] ipv6: frags: rewrite ip6_expire_frag_queue() Greg Kroah-Hartman
@ 2018-09-17 22:42 ` Greg Kroah-Hartman
  2018-09-17 22:42 ` [PATCH 4.14 108/126] inet: frags: reorganize struct netns_frags Greg Kroah-Hartman
                   ` (21 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:42 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Eric Dumazet, David S. Miller

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Eric Dumazet <edumazet@google.com>

While under frags DDOS I noticed unfortunate false sharing between
@nelems and @params.automatic_shrinking.

Move @nelems to the end of struct rhashtable so that the first cache line
can be shared between all cpus, because it is almost never dirtied.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit e5d672a0780d9e7118caad4c171ec88b8299398d)
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 include/linux/rhashtable.h |    8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)
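
Reviewer note (not part of the commit): false sharing depends on which
fields land in the same cache line, which can be checked with offsetof().
The sketch below uses hypothetical toy structs and an assumed 64-byte line.
It forces the separation with alignas, whereas the patch achieves it simply
by moving @nelems behind the larger members.

```c
#include <stdalign.h>
#include <stddef.h>

#define CACHE_LINE 64	/* assumed line size; real hardware varies */

/* Frequently written counter sits next to read-mostly fields. */
struct table_before {
	void *tbl;		/* read-mostly */
	int nelems;		/* written on every insert and remove */
	unsigned int key_len;	/* read-mostly */
};

/* Counter moved to the end and, for this demo, aligned to a new line. */
struct table_after {
	void *tbl;
	unsigned int key_len;
	alignas(CACHE_LINE) int nelems;
};

/* Index of the cache line that a given byte offset falls into. */
static size_t line_of(size_t offset)
{
	return offset / CACHE_LINE;
}
```

Comparing line_of(offsetof(...)) for the counter and for a read-mostly field
shows whether a write to the counter would also invalidate that field on
other cpus.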

--- a/include/linux/rhashtable.h
+++ b/include/linux/rhashtable.h
@@ -152,25 +152,25 @@ struct rhashtable_params {
 /**
  * struct rhashtable - Hash table handle
  * @tbl: Bucket table
- * @nelems: Number of elements in table
  * @key_len: Key length for hashfn
- * @p: Configuration parameters
  * @max_elems: Maximum number of elements in table
+ * @p: Configuration parameters
  * @rhlist: True if this is an rhltable
  * @run_work: Deferred worker to expand/shrink asynchronously
  * @mutex: Mutex to protect current/future table swapping
  * @lock: Spin lock to protect walker list
+ * @nelems: Number of elements in table
  */
 struct rhashtable {
 	struct bucket_table __rcu	*tbl;
-	atomic_t			nelems;
 	unsigned int			key_len;
-	struct rhashtable_params	p;
 	unsigned int			max_elems;
+	struct rhashtable_params	p;
 	bool				rhlist;
 	struct work_struct		run_work;
 	struct mutex                    mutex;
 	spinlock_t			lock;
+	atomic_t			nelems;
 };
 
 /**




* [PATCH 4.14 108/126] inet: frags: reorganize struct netns_frags
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (106 preceding siblings ...)
  2018-09-17 22:42 ` [PATCH 4.14 107/126] rhashtable: reorganize struct rhashtable layout Greg Kroah-Hartman
@ 2018-09-17 22:42 ` Greg Kroah-Hartman
  2018-09-17 22:42 ` [PATCH 4.14 109/126] inet: frags: get rid of ipfrag_skb_cb/FRAG_CB Greg Kroah-Hartman
                   ` (20 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:42 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Eric Dumazet, David S. Miller

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Eric Dumazet <edumazet@google.com>

Put the read-mostly fields in a separate cache line
at the beginning of struct netns_frags, to reduce
the false sharing noticed in inet_frag_kill().

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit c2615cf5a761b32bf74e85bddc223dfff3d9b9f0)
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 include/net/inet_frag.h |    9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

--- a/include/net/inet_frag.h
+++ b/include/net/inet_frag.h
@@ -5,16 +5,17 @@
 #include <linux/rhashtable.h>
 
 struct netns_frags {
-	struct rhashtable       rhashtable ____cacheline_aligned_in_smp;
-
-	/* Keep atomic mem on separate cachelines in structs that include it */
-	atomic_long_t		mem ____cacheline_aligned_in_smp;
 	/* sysctls */
 	long			high_thresh;
 	long			low_thresh;
 	int			timeout;
 	int			max_dist;
 	struct inet_frags	*f;
+
+	struct rhashtable       rhashtable ____cacheline_aligned_in_smp;
+
+	/* Keep atomic mem on separate cachelines in structs that include it */
+	atomic_long_t		mem ____cacheline_aligned_in_smp;
 };
 
 /**




* [PATCH 4.14 109/126] inet: frags: get rid of ipfrag_skb_cb/FRAG_CB
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (107 preceding siblings ...)
  2018-09-17 22:42 ` [PATCH 4.14 108/126] inet: frags: reorganize struct netns_frags Greg Kroah-Hartman
@ 2018-09-17 22:42 ` Greg Kroah-Hartman
  2018-09-17 22:42 ` [PATCH 4.14 110/126] inet: frags: fix ip6frag_low_thresh boundary Greg Kroah-Hartman
                   ` (19 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:42 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Eric Dumazet, David S. Miller

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Eric Dumazet <edumazet@google.com>

ip_defrag uses skb->cb[] to store the fragment offset, and unfortunately
this integer is currently in a different cache line than skb->next,
meaning that we use two cache lines per skb when finding the insertion point.

By aliasing skb->ip_defrag_offset and skb->dev, we pack all the fields
in a single cache line and save precious memory bandwidth.

Note that after the fast path added by Changli Gao in commit
d6bebca92c66 ("fragment: add fast path for in-order fragments")
this change won't help the fast path, since we still need
to access prev->len (2nd cache line), but it will show great
benefits when the slow path is entered, since we perform
a linear scan of a potentially long list.

Also note that this potentially long list is an attack vector;
we might eventually consider using an rb-tree there.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit bf66337140c64c27fa37222b7abca7e49d63fb57)
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 include/linux/skbuff.h |    1 +
 net/ipv4/ip_fragment.c |   35 ++++++++++++++---------------------
 2 files changed, 15 insertions(+), 21 deletions(-)

--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -678,6 +678,7 @@ struct sk_buff {
 		 * UDP receive path is one user.
 		 */
 		unsigned long		dev_scratch;
+		int			ip_defrag_offset;
 	};
 	/*
 	 * This is the control buffer. It is free to use for every
--- a/net/ipv4/ip_fragment.c
+++ b/net/ipv4/ip_fragment.c
@@ -57,14 +57,6 @@
  */
 static const char ip_frag_cache_name[] = "ip4-frags";
 
-struct ipfrag_skb_cb
-{
-	struct inet_skb_parm	h;
-	int			offset;
-};
-
-#define FRAG_CB(skb)	((struct ipfrag_skb_cb *)((skb)->cb))
-
 /* Describe an entry in the "incomplete datagrams" queue. */
 struct ipq {
 	struct inet_frag_queue q;
@@ -353,13 +345,13 @@ static int ip_frag_queue(struct ipq *qp,
 	 * this fragment, right?
 	 */
 	prev = qp->q.fragments_tail;
-	if (!prev || FRAG_CB(prev)->offset < offset) {
+	if (!prev || prev->ip_defrag_offset < offset) {
 		next = NULL;
 		goto found;
 	}
 	prev = NULL;
 	for (next = qp->q.fragments; next != NULL; next = next->next) {
-		if (FRAG_CB(next)->offset >= offset)
+		if (next->ip_defrag_offset >= offset)
 			break;	/* bingo! */
 		prev = next;
 	}
@@ -370,7 +362,7 @@ found:
 	 * any overlaps are eliminated.
 	 */
 	if (prev) {
-		int i = (FRAG_CB(prev)->offset + prev->len) - offset;
+		int i = (prev->ip_defrag_offset + prev->len) - offset;
 
 		if (i > 0) {
 			offset += i;
@@ -387,8 +379,8 @@ found:
 
 	err = -ENOMEM;
 
-	while (next && FRAG_CB(next)->offset < end) {
-		int i = end - FRAG_CB(next)->offset; /* overlap is 'i' bytes */
+	while (next && next->ip_defrag_offset < end) {
+		int i = end - next->ip_defrag_offset; /* overlap is 'i' bytes */
 
 		if (i < next->len) {
 			int delta = -next->truesize;
@@ -401,7 +393,7 @@ found:
 			delta += next->truesize;
 			if (delta)
 				add_frag_mem_limit(qp->q.net, delta);
-			FRAG_CB(next)->offset += i;
+			next->ip_defrag_offset += i;
 			qp->q.meat -= i;
 			if (next->ip_summed != CHECKSUM_UNNECESSARY)
 				next->ip_summed = CHECKSUM_NONE;
@@ -425,7 +417,13 @@ found:
 		}
 	}
 
-	FRAG_CB(skb)->offset = offset;
+	/* Note : skb->ip_defrag_offset and skb->dev share the same location */
+	dev = skb->dev;
+	if (dev)
+		qp->iif = dev->ifindex;
+	/* Makes sure compiler wont do silly aliasing games */
+	barrier();
+	skb->ip_defrag_offset = offset;
 
 	/* Insert this fragment in the chain of fragments. */
 	skb->next = next;
@@ -436,11 +434,6 @@ found:
 	else
 		qp->q.fragments = skb;
 
-	dev = skb->dev;
-	if (dev) {
-		qp->iif = dev->ifindex;
-		skb->dev = NULL;
-	}
 	qp->q.stamp = skb->tstamp;
 	qp->q.meat += skb->len;
 	qp->ecn |= ecn;
@@ -516,7 +509,7 @@ static int ip_frag_reasm(struct ipq *qp,
 	}
 
 	WARN_ON(!head);
-	WARN_ON(FRAG_CB(head)->offset != 0);
+	WARN_ON(head->ip_defrag_offset != 0);
 
 	/* Allocate a new buffer for the datagram. */
 	ihlen = ip_hdrlen(head);
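
The aliasing trick and the slow-path scan described in the changelog can be sketched outside the kernel. This is a reduced illustration only: struct frag, claim_offset() and find_slot() are hypothetical names, and the real sk_buff has many more fields. The union mirrors the idea that the device pointer is only needed until the fragment is queued, after which the same storage holds the offset.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily reduced stand-in for struct sk_buff. */
struct frag {
	struct frag *next;
	union {
		void	*dev;		/* valid until the fragment is queued */
		int	defrag_offset;	/* valid while the fragment is queued */
	};
	int len;
};

/* Read the aliased pointer first, then reuse its storage for the offset. */
static void *claim_offset(struct frag *f, int offset)
{
	void *dev = f->dev;

	f->defrag_offset = offset;
	return dev;
}

/*
 * Linear scan as in the slow path: return the first queued fragment
 * whose offset is >= offset, and store its predecessor in *prevp.
 */
static struct frag *find_slot(struct frag *head, int offset,
			      struct frag **prevp)
{
	struct frag *prev = NULL, *next;

	for (next = head; next != NULL; next = next->next) {
		if (next->defrag_offset >= offset)
			break;
		prev = next;
	}
	*prevp = prev;
	return next;
}
```

Because next and the offset sit side by side in this sketch, each step of the scan touches one small region of the node, which is the effect the patch is after.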




* [PATCH 4.14 110/126] inet: frags: fix ip6frag_low_thresh boundary
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (108 preceding siblings ...)
  2018-09-17 22:42 ` [PATCH 4.14 109/126] inet: frags: get rid of ipfrag_skb_cb/FRAG_CB Greg Kroah-Hartman
@ 2018-09-17 22:42 ` Greg Kroah-Hartman
  2018-09-17 22:42 ` [PATCH 4.14 111/126] ip: discard IPv4 datagrams with overlapping segments Greg Kroah-Hartman
                   ` (18 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:42 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Eric Dumazet,
	Maciej Żenczykowski, David S. Miller

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Eric Dumazet <edumazet@google.com>

Giving an integer to proc_doulongvec_minmax() is dangerous on 64-bit arches,
since the linker might place a non-zero value next to it, preventing a change
to ip6frag_low_thresh.

ip6frag_low_thresh is not used anymore in the kernel, but we do not
want to prematurely break user scripts wanting to change it.

Since specifying a minimal value of 0 for proc_doulongvec_minmax()
is moot, let's remove these zero values in all defrag units.

Fixes: 6e00f7dd5e4e ("ipv6: frags: fix /proc/sys/net/ipv6/ip6frag_low_thresh")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Maciej Żenczykowski <maze@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 3d23401283e80ceb03f765842787e0e79ff598b7)
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/ieee802154/6lowpan/reassembly.c     |    2 --
 net/ipv4/ip_fragment.c                  |    5 ++---
 net/ipv6/netfilter/nf_conntrack_reasm.c |    2 --
 net/ipv6/reassembly.c                   |    4 +---
 4 files changed, 3 insertions(+), 10 deletions(-)

--- a/net/ieee802154/6lowpan/reassembly.c
+++ b/net/ieee802154/6lowpan/reassembly.c
@@ -411,7 +411,6 @@ err:
 }
 
 #ifdef CONFIG_SYSCTL
-static long zero;
 
 static struct ctl_table lowpan_frags_ns_ctl_table[] = {
 	{
@@ -428,7 +427,6 @@ static struct ctl_table lowpan_frags_ns_
 		.maxlen		= sizeof(unsigned long),
 		.mode		= 0644,
 		.proc_handler	= proc_doulongvec_minmax,
-		.extra1		= &zero,
 		.extra2		= &init_net.ieee802154_lowpan.frags.high_thresh
 	},
 	{
--- a/net/ipv4/ip_fragment.c
+++ b/net/ipv4/ip_fragment.c
@@ -672,7 +672,7 @@ struct sk_buff *ip_check_defrag(struct n
 EXPORT_SYMBOL(ip_check_defrag);
 
 #ifdef CONFIG_SYSCTL
-static long zero;
+static int dist_min;
 
 static struct ctl_table ip4_frags_ns_ctl_table[] = {
 	{
@@ -689,7 +689,6 @@ static struct ctl_table ip4_frags_ns_ctl
 		.maxlen		= sizeof(unsigned long),
 		.mode		= 0644,
 		.proc_handler	= proc_doulongvec_minmax,
-		.extra1		= &zero,
 		.extra2		= &init_net.ipv4.frags.high_thresh
 	},
 	{
@@ -705,7 +704,7 @@ static struct ctl_table ip4_frags_ns_ctl
 		.maxlen		= sizeof(int),
 		.mode		= 0644,
 		.proc_handler	= proc_dointvec_minmax,
-		.extra1		= &zero
+		.extra1		= &dist_min,
 	},
 	{ }
 };
--- a/net/ipv6/netfilter/nf_conntrack_reasm.c
+++ b/net/ipv6/netfilter/nf_conntrack_reasm.c
@@ -63,7 +63,6 @@ struct nf_ct_frag6_skb_cb
 static struct inet_frags nf_frags;
 
 #ifdef CONFIG_SYSCTL
-static long zero;
 
 static struct ctl_table nf_ct_frag6_sysctl_table[] = {
 	{
@@ -79,7 +78,6 @@ static struct ctl_table nf_ct_frag6_sysc
 		.maxlen		= sizeof(unsigned long),
 		.mode		= 0644,
 		.proc_handler	= proc_doulongvec_minmax,
-		.extra1		= &zero,
 		.extra2		= &init_net.nf_frag.frags.high_thresh
 	},
 	{
--- a/net/ipv6/reassembly.c
+++ b/net/ipv6/reassembly.c
@@ -554,7 +554,6 @@ static const struct inet6_protocol frag_
 };
 
 #ifdef CONFIG_SYSCTL
-static int zero;
 
 static struct ctl_table ip6_frags_ns_ctl_table[] = {
 	{
@@ -570,8 +569,7 @@ static struct ctl_table ip6_frags_ns_ctl
 		.data		= &init_net.ipv6.frags.low_thresh,
 		.maxlen		= sizeof(unsigned long),
 		.mode		= 0644,
-		.proc_handler	= proc_dointvec_minmax,
-		.extra1		= &zero,
+		.proc_handler	= proc_doulongvec_minmax,
 		.extra2		= &init_net.ipv6.frags.high_thresh
 	},
 	{
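
The width mismatch behind this fix can be modelled in userspace. The sketch below is an illustration under assumptions: struct adjacent and min_seen_as_ulong() are hypothetical, and the struct merely models two 4-byte objects that a linker might happen to place side by side. It uses memcpy() so the demonstration itself stays well defined.

```c
#include <assert.h>
#include <string.h>

/* Two adjacent 4-byte objects, modelling a possible link-time layout. */
struct adjacent {
	int zero;	/* intended minimum, declared with the wrong width */
	int neighbour;	/* unrelated data that happens to follow it */
};

/*
 * Read the minimum the way a handler expecting unsigned long would:
 * sizeof(unsigned long) bytes starting at the address of 'zero'.
 */
static unsigned long min_seen_as_ulong(const struct adjacent *a)
{
	unsigned long v = 0;
	size_t n = sizeof(v) < sizeof(*a) ? sizeof(v) : sizeof(*a);

	memcpy(&v, a, n);
	return v;
}
```

Where unsigned long is 8 bytes, a non-zero neighbour makes the apparent minimum non-zero even though zero itself is 0, which matches the failure mode the changelog describes. Where unsigned long is 4 bytes the neighbour is not read at all.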




* [PATCH 4.14 111/126] ip: discard IPv4 datagrams with overlapping segments.
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (109 preceding siblings ...)
  2018-09-17 22:42 ` [PATCH 4.14 110/126] inet: frags: fix ip6frag_low_thresh boundary Greg Kroah-Hartman
@ 2018-09-17 22:42 ` Greg Kroah-Hartman
  2018-09-17 22:42 ` [PATCH 4.14 112/126] net: speed up skb_rbtree_purge() Greg Kroah-Hartman
                   ` (17 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:42 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, David S. Miller, Peter Oskolkov,
	Eric Dumazet, Florian Westphal, Stephen Hemminger

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Peter Oskolkov <posk@google.com>

This behavior is required in IPv6, and there is little need
to tolerate overlapping fragments in IPv4. This change
simplifies the code and eliminates potential DDoS attack vectors.

Tested: ran ip_defrag selftest (not yet available upstream).

Suggested-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Peter Oskolkov <posk@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Florian Westphal <fw@strlen.de>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 7969e5c40dfd04799d4341f1b7cd266b6e47f227)
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 include/uapi/linux/snmp.h |    1 
 net/ipv4/ip_fragment.c    |   77 +++++++++++-----------------------------------
 net/ipv4/proc.c           |    1 
 3 files changed, 22 insertions(+), 57 deletions(-)

--- a/include/uapi/linux/snmp.h
+++ b/include/uapi/linux/snmp.h
@@ -56,6 +56,7 @@ enum
 	IPSTATS_MIB_ECT1PKTS,			/* InECT1Pkts */
 	IPSTATS_MIB_ECT0PKTS,			/* InECT0Pkts */
 	IPSTATS_MIB_CEPKTS,			/* InCEPkts */
+	IPSTATS_MIB_REASM_OVERLAPS,		/* ReasmOverlaps */
 	__IPSTATS_MIB_MAX
 };
 
--- a/net/ipv4/ip_fragment.c
+++ b/net/ipv4/ip_fragment.c
@@ -277,6 +277,7 @@ static int ip_frag_reinit(struct ipq *qp
 /* Add new segment to existing queue. */
 static int ip_frag_queue(struct ipq *qp, struct sk_buff *skb)
 {
+	struct net *net = container_of(qp->q.net, struct net, ipv4.frags);
 	struct sk_buff *prev, *next;
 	struct net_device *dev;
 	unsigned int fragsize;
@@ -357,65 +358,23 @@ static int ip_frag_queue(struct ipq *qp,
 	}
 
 found:
-	/* We found where to put this one.  Check for overlap with
-	 * preceding fragment, and, if needed, align things so that
-	 * any overlaps are eliminated.
+	/* RFC5722, Section 4, amended by Errata ID : 3089
+	 *                          When reassembling an IPv6 datagram, if
+	 *   one or more its constituent fragments is determined to be an
+	 *   overlapping fragment, the entire datagram (and any constituent
+	 *   fragments) MUST be silently discarded.
+	 *
+	 * We do the same here for IPv4.
 	 */
-	if (prev) {
-		int i = (prev->ip_defrag_offset + prev->len) - offset;
 
-		if (i > 0) {
-			offset += i;
-			err = -EINVAL;
-			if (end <= offset)
-				goto err;
-			err = -ENOMEM;
-			if (!pskb_pull(skb, i))
-				goto err;
-			if (skb->ip_summed != CHECKSUM_UNNECESSARY)
-				skb->ip_summed = CHECKSUM_NONE;
-		}
-	}
-
-	err = -ENOMEM;
-
-	while (next && next->ip_defrag_offset < end) {
-		int i = end - next->ip_defrag_offset; /* overlap is 'i' bytes */
-
-		if (i < next->len) {
-			int delta = -next->truesize;
-
-			/* Eat head of the next overlapped fragment
-			 * and leave the loop. The next ones cannot overlap.
-			 */
-			if (!pskb_pull(next, i))
-				goto err;
-			delta += next->truesize;
-			if (delta)
-				add_frag_mem_limit(qp->q.net, delta);
-			next->ip_defrag_offset += i;
-			qp->q.meat -= i;
-			if (next->ip_summed != CHECKSUM_UNNECESSARY)
-				next->ip_summed = CHECKSUM_NONE;
-			break;
-		} else {
-			struct sk_buff *free_it = next;
-
-			/* Old fragment is completely overridden with
-			 * new one drop it.
-			 */
-			next = next->next;
-
-			if (prev)
-				prev->next = next;
-			else
-				qp->q.fragments = next;
-
-			qp->q.meat -= free_it->len;
-			sub_frag_mem_limit(qp->q.net, free_it->truesize);
-			kfree_skb(free_it);
-		}
-	}
+	/* Is there an overlap with the previous fragment? */
+	if (prev &&
+	    (prev->ip_defrag_offset + prev->len) > offset)
+		goto discard_qp;
+
+	/* Is there an overlap with the next fragment? */
+	if (next && next->ip_defrag_offset < end)
+		goto discard_qp;
 
 	/* Note : skb->ip_defrag_offset and skb->dev share the same location */
 	dev = skb->dev;
@@ -463,6 +422,10 @@ found:
 	skb_dst_drop(skb);
 	return -EINPROGRESS;
 
+discard_qp:
+	inet_frag_kill(&qp->q);
+	err = -EINVAL;
+	__IP_INC_STATS(net, IPSTATS_MIB_REASM_OVERLAPS);
 err:
 	kfree_skb(skb);
 	return err;
--- a/net/ipv4/proc.c
+++ b/net/ipv4/proc.c
@@ -132,6 +132,7 @@ static const struct snmp_mib snmp4_ipext
 	SNMP_MIB_ITEM("InECT1Pkts", IPSTATS_MIB_ECT1PKTS),
 	SNMP_MIB_ITEM("InECT0Pkts", IPSTATS_MIB_ECT0PKTS),
 	SNMP_MIB_ITEM("InCEPkts", IPSTATS_MIB_CEPKTS),
+	SNMP_MIB_ITEM("ReasmOverlaps", IPSTATS_MIB_REASM_OVERLAPS),
 	SNMP_MIB_SENTINEL
 };
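
The two overlap tests that replace the old trimming logic reduce to a small predicate. A minimal sketch follows, with a hypothetical struct frag_span and frag_overlaps() standing in for the sk_buff fields the patch reads; a NULL neighbour means there is no fragment on that side.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical view of a queued fragment: start offset and length. */
struct frag_span {
	int offset;
	int len;
};

/*
 * True when the new fragment [offset, end) overlaps the queued
 * fragment before it (prev) or after it (next).
 */
static bool frag_overlaps(const struct frag_span *prev,
			  const struct frag_span *next,
			  int offset, int end)
{
	if (prev && prev->offset + prev->len > offset)
		return true;
	if (next && next->offset < end)
		return true;
	return false;
}
```

In the patch a true result leads to discard_qp, which drops the whole queue instead of trimming the overlapping bytes.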
 




* [PATCH 4.14 112/126] net: speed up skb_rbtree_purge()
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (110 preceding siblings ...)
  2018-09-17 22:42 ` [PATCH 4.14 111/126] ip: discard IPv4 datagrams with overlapping segments Greg Kroah-Hartman
@ 2018-09-17 22:42 ` Greg Kroah-Hartman
  2018-09-17 22:42 ` [PATCH 4.14 113/126] net: modify skb_rbtree_purge to return the truesize of all purged skbs Greg Kroah-Hartman
                   ` (16 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:42 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, David S. Miller

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Eric Dumazet <edumazet@google.com>

As measured in my prior patch ("sch_netem: faster rb tree removal"),
rbtree_postorder_for_each_entry_safe() is nice looking but much slower
than using rb_next() directly, except when the tree is small enough
to fit in CPU caches (then the cost is the same).

Also note that there is not even an increase of text size :
$ size net/core/skbuff.o.before net/core/skbuff.o
   text	   data	    bss	    dec	    hex	filename
  40711	   1298	      0	  42009	   a419	net/core/skbuff.o.before
  40711	   1298	      0	  42009	   a419	net/core/skbuff.o

Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 7c90584c66cc4b033a3b684b0e0950f79e7b7166)
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/core/skbuff.c |   11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -2850,12 +2850,15 @@ EXPORT_SYMBOL(skb_queue_purge);
  */
 void skb_rbtree_purge(struct rb_root *root)
 {
-	struct sk_buff *skb, *next;
+	struct rb_node *p = rb_first(root);
 
-	rbtree_postorder_for_each_entry_safe(skb, next, root, rbnode)
-		kfree_skb(skb);
+	while (p) {
+		struct sk_buff *skb = rb_entry(p, struct sk_buff, rbnode);
 
-	*root = RB_ROOT;
+		p = rb_next(p);
+		rb_erase(&skb->rbnode, root);
+		kfree_skb(skb);
+	}
 }
 
 /**




* [PATCH 4.14 113/126] net: modify skb_rbtree_purge to return the truesize of all purged skbs.
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (111 preceding siblings ...)
  2018-09-17 22:42 ` [PATCH 4.14 112/126] net: speed up skb_rbtree_purge() Greg Kroah-Hartman
@ 2018-09-17 22:42 ` Greg Kroah-Hartman
  2018-09-17 22:42 ` [PATCH 4.14 114/126] ipv6: defrag: drop non-last frags smaller than min mtu Greg Kroah-Hartman
                   ` (15 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:42 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Eric Dumazet, Peter Oskolkov,
	Florian Westphal, David S. Miller

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Peter Oskolkov <posk@google.com>

Tested: see the next patch in the series.

Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Peter Oskolkov <posk@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 385114dec8a49b5e5945e77ba7de6356106713f4)
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 include/linux/skbuff.h |    2 +-
 net/core/skbuff.c      |    6 +++++-
 2 files changed, 6 insertions(+), 2 deletions(-)

--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -2581,7 +2581,7 @@ static inline void __skb_queue_purge(str
 		kfree_skb(skb);
 }
 
-void skb_rbtree_purge(struct rb_root *root);
+unsigned int skb_rbtree_purge(struct rb_root *root);
 
 void *netdev_alloc_frag(unsigned int fragsz);
 
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -2842,23 +2842,27 @@ EXPORT_SYMBOL(skb_queue_purge);
 /**
  *	skb_rbtree_purge - empty a skb rbtree
  *	@root: root of the rbtree to empty
+ *	Return value: the sum of truesizes of all purged skbs.
  *
  *	Delete all buffers on an &sk_buff rbtree. Each buffer is removed from
  *	the list and one reference dropped. This function does not take
  *	any lock. Synchronization should be handled by the caller (e.g., TCP
  *	out-of-order queue is protected by the socket lock).
  */
-void skb_rbtree_purge(struct rb_root *root)
+unsigned int skb_rbtree_purge(struct rb_root *root)
 {
 	struct rb_node *p = rb_first(root);
+	unsigned int sum = 0;
 
 	while (p) {
 		struct sk_buff *skb = rb_entry(p, struct sk_buff, rbnode);
 
 		p = rb_next(p);
 		rb_erase(&skb->rbnode, root);
+		sum += skb->truesize;
 		kfree_skb(skb);
 	}
+	return sum;
 }
 
 /**
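
The purge-and-sum pattern can be sketched in userspace. Note the assumptions: a plain unbalanced binary search tree stands in for the kernel rbtree, struct buf, buf_insert() and buf_tree_purge() are hypothetical, and the loop restarts from the root for each node rather than using rb_next(). What it keeps from the patch is the order of operations: unlink the node, read truesize, then free.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical buffer keyed by sequence number. */
struct buf {
	struct buf *left, *right;
	unsigned int key;
	unsigned int truesize;
};

/* Insert a new node; aborts on allocation failure to keep the sketch short. */
static void buf_insert(struct buf **root, unsigned int key,
		       unsigned int truesize)
{
	struct buf *b = calloc(1, sizeof(*b));

	if (!b)
		abort();
	b->key = key;
	b->truesize = truesize;
	while (*root)
		root = key < (*root)->key ? &(*root)->left : &(*root)->right;
	*root = b;
}

/* Free every node in ascending key order; return the summed truesize. */
static unsigned int buf_tree_purge(struct buf **root)
{
	unsigned int sum = 0;

	while (*root) {
		struct buf **link = root;
		struct buf *first;

		while ((*link)->left)		/* leftmost node has the smallest key */
			link = &(*link)->left;
		first = *link;
		*link = first->right;		/* unlink; its right subtree moves up */
		sum += first->truesize;		/* read before the node is freed */
		free(first);
	}
	return sum;
}
```

A caller that accounts memory per queue can subtract the returned sum in one step instead of walking the tree a second time.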




* [PATCH 4.14 114/126] ipv6: defrag: drop non-last frags smaller than min mtu
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (112 preceding siblings ...)
  2018-09-17 22:42 ` [PATCH 4.14 113/126] net: modify skb_rbtree_purge to return the truesize of all purged skbs Greg Kroah-Hartman
@ 2018-09-17 22:42 ` Greg Kroah-Hartman
  2018-09-17 22:42 ` [PATCH 4.14 115/126] net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends Greg Kroah-Hartman
                   ` (14 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:42 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Peter Oskolkov, Eric Dumazet,
	Florian Westphal, David S. Miller

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Florian Westphal <fw@strlen.de>

Don't bother with pathological cases, they only waste cycles.
IPv6 requires a minimum MTU of 1280, so we should never see fragments
smaller than this (except the last frag).

v3: don't use awkward "-offset + len"
v2: drop IPv4 part, which added same check w. IPV4_MIN_MTU (68).
    There were concerns that there could be even smaller frags
    generated by intermediate nodes, e.g. on radio networks.

Cc: Peter Oskolkov <posk@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 0ed4229b08c13c84a3c301a08defdc9e7f4467e6)
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/ipv6/netfilter/nf_conntrack_reasm.c |    4 ++++
 net/ipv6/reassembly.c                   |    4 ++++
 2 files changed, 8 insertions(+)

--- a/net/ipv6/netfilter/nf_conntrack_reasm.c
+++ b/net/ipv6/netfilter/nf_conntrack_reasm.c
@@ -565,6 +565,10 @@ int nf_ct_frag6_gather(struct net *net,
 	hdr = ipv6_hdr(skb);
 	fhdr = (struct frag_hdr *)skb_transport_header(skb);
 
+	if (skb->len - skb_network_offset(skb) < IPV6_MIN_MTU &&
+	    fhdr->frag_off & htons(IP6_MF))
+		return -EINVAL;
+
 	skb_orphan(skb);
 	fq = fq_find(net, fhdr->identification, user, hdr,
 		     skb->dev ? skb->dev->ifindex : 0);
--- a/net/ipv6/reassembly.c
+++ b/net/ipv6/reassembly.c
@@ -522,6 +522,10 @@ static int ipv6_frag_rcv(struct sk_buff
 		return 1;
 	}
 
+	if (skb->len - skb_network_offset(skb) < IPV6_MIN_MTU &&
+	    fhdr->frag_off & htons(IP6_MF))
+		goto fail_hdr;
+
 	iif = skb->dev ? skb->dev->ifindex : 0;
 	fq = fq_find(net, fhdr->identification, hdr, iif);
 	if (fq) {
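
The check added in both reassembly paths boils down to one condition. A minimal sketch, assuming the caller has already computed the packet length from the start of the IPv6 header and extracted the M (more fragments) flag; drop_small_fragment() and DEMO_IPV6_MIN_MTU are hypothetical names, with the constant set to the 1280 quoted in the changelog.

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_IPV6_MIN_MTU 1280	/* value quoted in the changelog above */

/*
 * pkt_len:    bytes counted from the start of the IPv6 header
 * more_frags: M flag of the fragment header (false for the last fragment)
 */
static bool drop_small_fragment(unsigned int pkt_len, bool more_frags)
{
	return more_frags && pkt_len < DEMO_IPV6_MIN_MTU;
}
```

The last fragment is exempt because it carries whatever is left of the datagram and can legitimately be short.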




* [PATCH 4.14 115/126] net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (113 preceding siblings ...)
  2018-09-17 22:42 ` [PATCH 4.14 114/126] ipv6: defrag: drop non-last frags smaller than min mtu Greg Kroah-Hartman
@ 2018-09-17 22:42 ` Greg Kroah-Hartman
  2018-09-17 22:42 ` [PATCH 4.14 116/126] net: add rb_to_skb() and other rb tree helpers Greg Kroah-Hartman
                   ` (13 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:42 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Eric Dumazet, David S. Miller

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Eric Dumazet <edumazet@google.com>

After working on IP defragmentation lately, I found that some large
packets defeat the CHECKSUM_COMPLETE optimization because the NIC adds
zero padding to the last (small) fragment.

While removing the padding with pskb_trim_rcsum(), we set skb->ip_summed
to CHECKSUM_NONE, forcing a full csum validation, even if all prior
fragments had CHECKSUM_COMPLETE set.

We can instead compute the checksum of the part we are trimming,
usually smaller than the part we keep.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 88078d98d1bb085d72af8437707279e203524fa5)
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 include/linux/skbuff.h |    5 ++---
 net/core/skbuff.c      |   14 ++++++++++++++
 2 files changed, 16 insertions(+), 3 deletions(-)

--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -3135,6 +3135,7 @@ static inline void *skb_push_rcsum(struc
 	return skb->data;
 }
 
+int pskb_trim_rcsum_slow(struct sk_buff *skb, unsigned int len);
 /**
  *	pskb_trim_rcsum - trim received skb and update checksum
  *	@skb: buffer to trim
@@ -3148,9 +3149,7 @@ static inline int pskb_trim_rcsum(struct
 {
 	if (likely(len >= skb->len))
 		return 0;
-	if (skb->ip_summed == CHECKSUM_COMPLETE)
-		skb->ip_summed = CHECKSUM_NONE;
-	return __pskb_trim(skb, len);
+	return pskb_trim_rcsum_slow(skb, len);
 }
 
 static inline int __skb_trim_rcsum(struct sk_buff *skb, unsigned int len)
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -1839,6 +1839,20 @@ done:
 }
 EXPORT_SYMBOL(___pskb_trim);
 
+/* Note : use pskb_trim_rcsum() instead of calling this directly
+ */
+int pskb_trim_rcsum_slow(struct sk_buff *skb, unsigned int len)
+{
+	if (skb->ip_summed == CHECKSUM_COMPLETE) {
+		int delta = skb->len - len;
+
+		skb->csum = csum_sub(skb->csum,
+				     skb_checksum(skb, len, delta, 0));
+	}
+	return __pskb_trim(skb, len);
+}
+EXPORT_SYMBOL(pskb_trim_rcsum_slow);
+
 /**
  *	__pskb_pull_tail - advance tail of skb header
  *	@skb: buffer to reallocate
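
The arithmetic behind the new helper can be checked with a small ones' complement sketch. This is not the kernel's csum API: csum_fold16(), csum16() and csum16_sub() are hypothetical helpers working on big-endian 16-bit words, and the example assumes the kept length is even so the trimmed tail starts on a word boundary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fold a 32-bit accumulator into a 16-bit ones' complement sum. */
static uint16_t csum_fold16(uint32_t sum)
{
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/*
 * Ones' complement sum of len bytes read as big-endian 16-bit words.
 * An odd trailing byte is padded with zero. Meant for short buffers;
 * the 32-bit accumulator is not protected against overflow.
 */
static uint16_t csum16(const uint8_t *p, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum += (uint32_t)p[i] << 8 | p[i + 1];
	if (len & 1)
		sum += (uint32_t)p[len - 1] << 8;
	return csum_fold16(sum);
}

/* a - b in ones' complement arithmetic: add the complement of b. */
static uint16_t csum16_sub(uint16_t a, uint16_t b)
{
	return csum_fold16((uint32_t)a + (uint16_t)~b);
}
```

Subtracting the sum of the trimmed tail from the sum of the whole buffer gives the sum of the part that is kept, so only the (usually short) tail has to be read again.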




* [PATCH 4.14 116/126] net: add rb_to_skb() and other rb tree helpers
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (114 preceding siblings ...)
  2018-09-17 22:42 ` [PATCH 4.14 115/126] net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends Greg Kroah-Hartman
@ 2018-09-17 22:42 ` Greg Kroah-Hartman
  2018-09-17 22:42 ` [PATCH 4.14 117/126] net: sk_buff rbnode reorg Greg Kroah-Hartman
                   ` (12 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:42 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Eric Dumazet, David S. Miller

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Eric Dumazet <edumazet@google.com>

Generalize the private netem_rb_to_skb() helper.

The TCP rtx queue will soon be converted to an rb-tree,
so we will need skb_rbtree_walk() helpers.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 18a4c0eab2623cc95be98a1e6af1ad18e7695977)
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 include/linux/skbuff.h  |   18 ++++++++++++++++++
 net/ipv4/tcp_fastopen.c |    8 +++-----
 net/ipv4/tcp_input.c    |   33 ++++++++++++---------------------
 net/sched/sch_netem.c   |   14 ++++----------
 4 files changed, 37 insertions(+), 36 deletions(-)

--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -3169,6 +3169,12 @@ static inline int __skb_grow_rcsum(struc
 
 #define rb_to_skb(rb) rb_entry_safe(rb, struct sk_buff, rbnode)
 
+#define rb_to_skb(rb) rb_entry_safe(rb, struct sk_buff, rbnode)
+#define skb_rb_first(root) rb_to_skb(rb_first(root))
+#define skb_rb_last(root)  rb_to_skb(rb_last(root))
+#define skb_rb_next(skb)   rb_to_skb(rb_next(&(skb)->rbnode))
+#define skb_rb_prev(skb)   rb_to_skb(rb_prev(&(skb)->rbnode))
+
 #define skb_queue_walk(queue, skb) \
 		for (skb = (queue)->next;					\
 		     skb != (struct sk_buff *)(queue);				\
@@ -3183,6 +3189,18 @@ static inline int __skb_grow_rcsum(struc
 		for (; skb != (struct sk_buff *)(queue);			\
 		     skb = skb->next)
 
+#define skb_rbtree_walk(skb, root)						\
+		for (skb = skb_rb_first(root); skb != NULL;			\
+		     skb = skb_rb_next(skb))
+
+#define skb_rbtree_walk_from(skb)						\
+		for (; skb != NULL;						\
+		     skb = skb_rb_next(skb))
+
+#define skb_rbtree_walk_from_safe(skb, tmp)					\
+		for (; tmp = skb ? skb_rb_next(skb) : NULL, (skb != NULL);	\
+		     skb = tmp)
+
 #define skb_queue_walk_from_safe(queue, skb, tmp)				\
 		for (tmp = skb->next;						\
 		     skb != (struct sk_buff *)(queue);				\
--- a/net/ipv4/tcp_fastopen.c
+++ b/net/ipv4/tcp_fastopen.c
@@ -458,17 +458,15 @@ bool tcp_fastopen_active_should_disable(
 void tcp_fastopen_active_disable_ofo_check(struct sock *sk)
 {
 	struct tcp_sock *tp = tcp_sk(sk);
-	struct rb_node *p;
-	struct sk_buff *skb;
 	struct dst_entry *dst;
+	struct sk_buff *skb;
 
 	if (!tp->syn_fastopen)
 		return;
 
 	if (!tp->data_segs_in) {
-		p = rb_first(&tp->out_of_order_queue);
-		if (p && !rb_next(p)) {
-			skb = rb_entry(p, struct sk_buff, rbnode);
+		skb = skb_rb_first(&tp->out_of_order_queue);
+		if (skb && !skb_rb_next(skb)) {
 			if (TCP_SKB_CB(skb)->tcp_flags & TCPHDR_FIN) {
 				tcp_fastopen_active_disable(sk);
 				return;
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -4372,7 +4372,7 @@ static void tcp_ofo_queue(struct sock *s
 
 	p = rb_first(&tp->out_of_order_queue);
 	while (p) {
-		skb = rb_entry(p, struct sk_buff, rbnode);
+		skb = rb_to_skb(p);
 		if (after(TCP_SKB_CB(skb)->seq, tp->rcv_nxt))
 			break;
 
@@ -4440,7 +4440,7 @@ static int tcp_try_rmem_schedule(struct
 static void tcp_data_queue_ofo(struct sock *sk, struct sk_buff *skb)
 {
 	struct tcp_sock *tp = tcp_sk(sk);
-	struct rb_node **p, *q, *parent;
+	struct rb_node **p, *parent;
 	struct sk_buff *skb1;
 	u32 seq, end_seq;
 	bool fragstolen;
@@ -4503,7 +4503,7 @@ coalesce_done:
 	parent = NULL;
 	while (*p) {
 		parent = *p;
-		skb1 = rb_entry(parent, struct sk_buff, rbnode);
+		skb1 = rb_to_skb(parent);
 		if (before(seq, TCP_SKB_CB(skb1)->seq)) {
 			p = &parent->rb_left;
 			continue;
@@ -4548,9 +4548,7 @@ insert:
 
 merge_right:
 	/* Remove other segments covered by skb. */
-	while ((q = rb_next(&skb->rbnode)) != NULL) {
-		skb1 = rb_entry(q, struct sk_buff, rbnode);
-
+	while ((skb1 = skb_rb_next(skb)) != NULL) {
 		if (!after(end_seq, TCP_SKB_CB(skb1)->seq))
 			break;
 		if (before(end_seq, TCP_SKB_CB(skb1)->end_seq)) {
@@ -4565,7 +4563,7 @@ merge_right:
 		tcp_drop(sk, skb1);
 	}
 	/* If there is no skb after us, we are the last_skb ! */
-	if (!q)
+	if (!skb1)
 		tp->ooo_last_skb = skb;
 
 add_sack:
@@ -4749,7 +4747,7 @@ static struct sk_buff *tcp_skb_next(stru
 	if (list)
 		return !skb_queue_is_last(list, skb) ? skb->next : NULL;
 
-	return rb_entry_safe(rb_next(&skb->rbnode), struct sk_buff, rbnode);
+	return skb_rb_next(skb);
 }
 
 static struct sk_buff *tcp_collapse_one(struct sock *sk, struct sk_buff *skb,
@@ -4778,7 +4776,7 @@ static void tcp_rbtree_insert(struct rb_
 
 	while (*p) {
 		parent = *p;
-		skb1 = rb_entry(parent, struct sk_buff, rbnode);
+		skb1 = rb_to_skb(parent);
 		if (before(TCP_SKB_CB(skb)->seq, TCP_SKB_CB(skb1)->seq))
 			p = &parent->rb_left;
 		else
@@ -4898,19 +4896,12 @@ static void tcp_collapse_ofo_queue(struc
 	struct tcp_sock *tp = tcp_sk(sk);
 	u32 range_truesize, sum_tiny = 0;
 	struct sk_buff *skb, *head;
-	struct rb_node *p;
 	u32 start, end;
 
-	p = rb_first(&tp->out_of_order_queue);
-	skb = rb_entry_safe(p, struct sk_buff, rbnode);
+	skb = skb_rb_first(&tp->out_of_order_queue);
 new_range:
 	if (!skb) {
-		p = rb_last(&tp->out_of_order_queue);
-		/* Note: This is possible p is NULL here. We do not
-		 * use rb_entry_safe(), as ooo_last_skb is valid only
-		 * if rbtree is not empty.
-		 */
-		tp->ooo_last_skb = rb_entry(p, struct sk_buff, rbnode);
+		tp->ooo_last_skb = skb_rb_last(&tp->out_of_order_queue);
 		return;
 	}
 	start = TCP_SKB_CB(skb)->seq;
@@ -4918,7 +4909,7 @@ new_range:
 	range_truesize = skb->truesize;
 
 	for (head = skb;;) {
-		skb = tcp_skb_next(skb, NULL);
+		skb = skb_rb_next(skb);
 
 		/* Range is terminated when we see a gap or when
 		 * we are at the queue end.
@@ -4974,7 +4965,7 @@ static bool tcp_prune_ofo_queue(struct s
 		prev = rb_prev(node);
 		rb_erase(node, &tp->out_of_order_queue);
 		goal -= rb_to_skb(node)->truesize;
-		tcp_drop(sk, rb_entry(node, struct sk_buff, rbnode));
+		tcp_drop(sk, rb_to_skb(node));
 		if (!prev || goal <= 0) {
 			sk_mem_reclaim(sk);
 			if (atomic_read(&sk->sk_rmem_alloc) <= sk->sk_rcvbuf &&
@@ -4984,7 +4975,7 @@ static bool tcp_prune_ofo_queue(struct s
 		}
 		node = prev;
 	} while (node);
-	tp->ooo_last_skb = rb_entry(prev, struct sk_buff, rbnode);
+	tp->ooo_last_skb = rb_to_skb(prev);
 
 	/* Reset SACK state.  A conforming SACK implementation will
 	 * do the same at a timeout based retransmit.  When a connection
--- a/net/sched/sch_netem.c
+++ b/net/sched/sch_netem.c
@@ -149,12 +149,6 @@ struct netem_skb_cb {
 	ktime_t		tstamp_save;
 };
 
-
-static struct sk_buff *netem_rb_to_skb(struct rb_node *rb)
-{
-	return rb_entry(rb, struct sk_buff, rbnode);
-}
-
 static inline struct netem_skb_cb *netem_skb_cb(struct sk_buff *skb)
 {
 	/* we assume we can use skb next/prev/tstamp as storage for rb_node */
@@ -365,7 +359,7 @@ static void tfifo_reset(struct Qdisc *sc
 	struct rb_node *p;
 
 	while ((p = rb_first(&q->t_root))) {
-		struct sk_buff *skb = netem_rb_to_skb(p);
+		struct sk_buff *skb = rb_to_skb(p);
 
 		rb_erase(p, &q->t_root);
 		rtnl_kfree_skbs(skb, skb);
@@ -382,7 +376,7 @@ static void tfifo_enqueue(struct sk_buff
 		struct sk_buff *skb;
 
 		parent = *p;
-		skb = netem_rb_to_skb(parent);
+		skb = rb_to_skb(parent);
 		if (tnext >= netem_skb_cb(skb)->time_to_send)
 			p = &parent->rb_right;
 		else
@@ -538,7 +532,7 @@ static int netem_enqueue(struct sk_buff
 				struct sk_buff *t_skb;
 				struct netem_skb_cb *t_last;
 
-				t_skb = netem_rb_to_skb(rb_last(&q->t_root));
+				t_skb = skb_rb_last(&q->t_root);
 				t_last = netem_skb_cb(t_skb);
 				if (!last ||
 				    t_last->time_to_send > last->time_to_send) {
@@ -618,7 +612,7 @@ deliver:
 	if (p) {
 		psched_time_t time_to_send;
 
-		skb = netem_rb_to_skb(p);
+		skb = rb_to_skb(p);
 
 		/* if more time remaining? */
 		time_to_send = netem_skb_cb(skb)->time_to_send;




* [PATCH 4.14 117/126] net: sk_buff rbnode reorg
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (115 preceding siblings ...)
  2018-09-17 22:42 ` [PATCH 4.14 116/126] net: add rb_to_skb() and other rb tree helpers Greg Kroah-Hartman
@ 2018-09-17 22:42 ` Greg Kroah-Hartman
  2018-10-04 20:13   ` Mitch Harder
  2018-09-17 22:42 ` [PATCH 4.14 118/126] ipv4: frags: precedence bug in ip_expire() Greg Kroah-Hartman
                   ` (11 subsequent siblings)
  128 siblings, 1 reply; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:42 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Eric Dumazet, Soheil Hassas Yeganeh,
	Wei Wang, Willem de Bruijn, David S. Miller

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Eric Dumazet <edumazet@google.com>

commit bffa72cf7f9df842f0016ba03586039296b4caaf upstream

skb->rbnode shares space with skb->next, skb->prev and skb->tstamp.

Current uses (TCP receive ofo queue and netem) need to save/restore
tstamp, while skb->dev is either NULL (TCP) or a constant for a given
queue (netem).

Since we plan to use an RB tree for the TCP retransmit queue to speed up
SACK processing with large BDP, this patch exchanges skb->dev and
skb->tstamp.

This saves some overhead in both TCP and netem.
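
To make the layout change concrete, here is a stand-alone sketch (not
kernel code) of the overlay after the exchange. All toy_* names are
hypothetical, and the three-word toy_rb_node is only an assumed stand-in
for struct rb_node. It shows why a caller has to save dev before linking
an skb into a tree, while tstamp is no longer disturbed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for struct rb_node (assumed: three machine words). */
struct toy_rb_node {
	unsigned long parent_color;
	struct toy_rb_node *right;
	struct toy_rb_node *left;
};

/* Toy stand-in for the reorganized head of struct sk_buff. */
struct toy_skb {
	union {
		struct {
			struct toy_skb *next;
			struct toy_skb *prev;
			void *dev;	/* shares storage with rbnode */
		};
		struct toy_rb_node rbnode;
	};
	long long tstamp;	/* outside the union after the exchange */
};

/* Pretend to link skb into a tree: save dev, then overwrite rbnode. */
static void *toy_link(struct toy_skb *skb)
{
	void *dev;

	assert(skb != NULL);
	dev = skb->dev;		/* must be read before the overlay is written */
	memset(&skb->rbnode, 0, sizeof(skb->rbnode));
	return dev;
}

/* Returns 1 if dev was clobbered by the link while tstamp survived. */
static int toy_union_demo(void)
{
	int device = 0;
	struct toy_skb skb;
	void *saved;

	memset(&skb, 0, sizeof(skb));
	skb.dev = &device;
	skb.tstamp = 42;
	saved = toy_link(&skb);
	return saved == &device && skb.dev == NULL && skb.tstamp == 42;
}
```

The ip_frag_queue() hunk below follows the same pattern: it reads
skb->dev into a local variable before touching skb->rbnode.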

v2: removes the swtstamp field from struct tcp_skb_cb

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Soheil Hassas Yeganeh <soheil@google.com>
Cc: Wei Wang <weiwan@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 include/linux/skbuff.h                  |   24 ++--
 include/net/inet_frag.h                 |    3 
 net/ipv4/inet_fragment.c                |   14 +-
 net/ipv4/ip_fragment.c                  |  182 +++++++++++++++++---------------
 net/ipv6/netfilter/nf_conntrack_reasm.c |    1 
 net/ipv6/reassembly.c                   |    1 
 6 files changed, 128 insertions(+), 97 deletions(-)

--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -663,23 +663,27 @@ struct sk_buff {
 			struct sk_buff		*prev;
 
 			union {
-				ktime_t		tstamp;
-				u64		skb_mstamp;
+				struct net_device	*dev;
+				/* Some protocols might use this space to store information,
+				 * while device pointer would be NULL.
+				 * UDP receive path is one user.
+				 */
+				unsigned long		dev_scratch;
 			};
 		};
-		struct rb_node	rbnode; /* used in netem & tcp stack */
+		struct rb_node		rbnode; /* used in netem, ip4 defrag, and tcp stack */
+		struct list_head	list;
 	};
-	struct sock		*sk;
 
 	union {
-		struct net_device	*dev;
-		/* Some protocols might use this space to store information,
-		 * while device pointer would be NULL.
-		 * UDP receive path is one user.
-		 */
-		unsigned long		dev_scratch;
+		struct sock		*sk;
 		int			ip_defrag_offset;
 	};
+
+	union {
+		ktime_t		tstamp;
+		u64		skb_mstamp;
+	};
 	/*
 	 * This is the control buffer. It is free to use for every
 	 * layer. Please put your private variables there. If you
--- a/include/net/inet_frag.h
+++ b/include/net/inet_frag.h
@@ -75,7 +75,8 @@ struct inet_frag_queue {
 	struct timer_list	timer;
 	spinlock_t		lock;
 	refcount_t		refcnt;
-	struct sk_buff		*fragments;
+	struct sk_buff		*fragments;  /* Used in IPv6. */
+	struct rb_root		rb_fragments; /* Used in IPv4. */
 	struct sk_buff		*fragments_tail;
 	ktime_t			stamp;
 	int			len;
--- a/net/ipv4/inet_fragment.c
+++ b/net/ipv4/inet_fragment.c
@@ -136,12 +136,16 @@ void inet_frag_destroy(struct inet_frag_
 	fp = q->fragments;
 	nf = q->net;
 	f = nf->f;
-	while (fp) {
-		struct sk_buff *xp = fp->next;
+	if (fp) {
+		do {
+			struct sk_buff *xp = fp->next;
 
-		sum_truesize += fp->truesize;
-		kfree_skb(fp);
-		fp = xp;
+			sum_truesize += fp->truesize;
+			kfree_skb(fp);
+			fp = xp;
+		} while (fp);
+	} else {
+		sum_truesize = skb_rbtree_purge(&q->rb_fragments);
 	}
 	sum = sum_truesize + f->qsize;
 
--- a/net/ipv4/ip_fragment.c
+++ b/net/ipv4/ip_fragment.c
@@ -136,7 +136,7 @@ static void ip_expire(struct timer_list
 {
 	struct inet_frag_queue *frag = from_timer(frag, t, timer);
 	const struct iphdr *iph;
-	struct sk_buff *head;
+	struct sk_buff *head = NULL;
 	struct net *net;
 	struct ipq *qp;
 	int err;
@@ -152,14 +152,31 @@ static void ip_expire(struct timer_list
 
 	ipq_kill(qp);
 	__IP_INC_STATS(net, IPSTATS_MIB_REASMFAILS);
-
-	head = qp->q.fragments;
-
 	__IP_INC_STATS(net, IPSTATS_MIB_REASMTIMEOUT);
 
-	if (!(qp->q.flags & INET_FRAG_FIRST_IN) || !head)
+	if (!qp->q.flags & INET_FRAG_FIRST_IN)
 		goto out;
 
+	/* sk_buff::dev and sk_buff::rbnode are unionized. So we
+	 * pull the head out of the tree in order to be able to
+	 * deal with head->dev.
+	 */
+	if (qp->q.fragments) {
+		head = qp->q.fragments;
+		qp->q.fragments = head->next;
+	} else {
+		head = skb_rb_first(&qp->q.rb_fragments);
+		if (!head)
+			goto out;
+		rb_erase(&head->rbnode, &qp->q.rb_fragments);
+		memset(&head->rbnode, 0, sizeof(head->rbnode));
+		barrier();
+	}
+	if (head == qp->q.fragments_tail)
+		qp->q.fragments_tail = NULL;
+
+	sub_frag_mem_limit(qp->q.net, head->truesize);
+
 	head->dev = dev_get_by_index_rcu(net, qp->iif);
 	if (!head->dev)
 		goto out;
@@ -179,16 +196,16 @@ static void ip_expire(struct timer_list
 	    (skb_rtable(head)->rt_type != RTN_LOCAL))
 		goto out;
 
-	skb_get(head);
 	spin_unlock(&qp->q.lock);
 	icmp_send(head, ICMP_TIME_EXCEEDED, ICMP_EXC_FRAGTIME, 0);
-	kfree_skb(head);
 	goto out_rcu_unlock;
 
 out:
 	spin_unlock(&qp->q.lock);
 out_rcu_unlock:
 	rcu_read_unlock();
+	if (head)
+		kfree_skb(head);
 	ipq_put(qp);
 }
 
@@ -231,7 +248,7 @@ static int ip_frag_too_far(struct ipq *q
 	end = atomic_inc_return(&peer->rid);
 	qp->rid = end;
 
-	rc = qp->q.fragments && (end - start) > max;
+	rc = qp->q.fragments_tail && (end - start) > max;
 
 	if (rc) {
 		struct net *net;
@@ -245,7 +262,6 @@ static int ip_frag_too_far(struct ipq *q
 
 static int ip_frag_reinit(struct ipq *qp)
 {
-	struct sk_buff *fp;
 	unsigned int sum_truesize = 0;
 
 	if (!mod_timer(&qp->q.timer, jiffies + qp->q.net->timeout)) {
@@ -253,20 +269,14 @@ static int ip_frag_reinit(struct ipq *qp
 		return -ETIMEDOUT;
 	}
 
-	fp = qp->q.fragments;
-	do {
-		struct sk_buff *xp = fp->next;
-
-		sum_truesize += fp->truesize;
-		kfree_skb(fp);
-		fp = xp;
-	} while (fp);
+	sum_truesize = skb_rbtree_purge(&qp->q.rb_fragments);
 	sub_frag_mem_limit(qp->q.net, sum_truesize);
 
 	qp->q.flags = 0;
 	qp->q.len = 0;
 	qp->q.meat = 0;
 	qp->q.fragments = NULL;
+	qp->q.rb_fragments = RB_ROOT;
 	qp->q.fragments_tail = NULL;
 	qp->iif = 0;
 	qp->ecn = 0;
@@ -278,7 +288,8 @@ static int ip_frag_reinit(struct ipq *qp
 static int ip_frag_queue(struct ipq *qp, struct sk_buff *skb)
 {
 	struct net *net = container_of(qp->q.net, struct net, ipv4.frags);
-	struct sk_buff *prev, *next;
+	struct rb_node **rbn, *parent;
+	struct sk_buff *skb1;
 	struct net_device *dev;
 	unsigned int fragsize;
 	int flags, offset;
@@ -341,58 +352,58 @@ static int ip_frag_queue(struct ipq *qp,
 	if (err)
 		goto err;
 
-	/* Find out which fragments are in front and at the back of us
-	 * in the chain of fragments so far.  We must know where to put
-	 * this fragment, right?
-	 */
-	prev = qp->q.fragments_tail;
-	if (!prev || prev->ip_defrag_offset < offset) {
-		next = NULL;
-		goto found;
-	}
-	prev = NULL;
-	for (next = qp->q.fragments; next != NULL; next = next->next) {
-		if (next->ip_defrag_offset >= offset)
-			break;	/* bingo! */
-		prev = next;
-	}
+	/* Note : skb->rbnode and skb->dev share the same location. */
+	dev = skb->dev;
+	/* Makes sure compiler wont do silly aliasing games */
+	barrier();
 
-found:
 	/* RFC5722, Section 4, amended by Errata ID : 3089
 	 *                          When reassembling an IPv6 datagram, if
 	 *   one or more its constituent fragments is determined to be an
 	 *   overlapping fragment, the entire datagram (and any constituent
 	 *   fragments) MUST be silently discarded.
 	 *
-	 * We do the same here for IPv4.
+	 * We do the same here for IPv4 (and increment an snmp counter).
 	 */
 
-	/* Is there an overlap with the previous fragment? */
-	if (prev &&
-	    (prev->ip_defrag_offset + prev->len) > offset)
-		goto discard_qp;
-
-	/* Is there an overlap with the next fragment? */
-	if (next && next->ip_defrag_offset < end)
-		goto discard_qp;
+	/* Find out where to put this fragment.  */
+	skb1 = qp->q.fragments_tail;
+	if (!skb1) {
+		/* This is the first fragment we've received. */
+		rb_link_node(&skb->rbnode, NULL, &qp->q.rb_fragments.rb_node);
+		qp->q.fragments_tail = skb;
+	} else if ((skb1->ip_defrag_offset + skb1->len) < end) {
+		/* This is the common/special case: skb goes to the end. */
+		/* Detect and discard overlaps. */
+		if (offset < (skb1->ip_defrag_offset + skb1->len))
+			goto discard_qp;
+		/* Insert after skb1. */
+		rb_link_node(&skb->rbnode, &skb1->rbnode, &skb1->rbnode.rb_right);
+		qp->q.fragments_tail = skb;
+	} else {
+		/* Binary search. Note that skb can become the first fragment, but
+		 * not the last (covered above). */
+		rbn = &qp->q.rb_fragments.rb_node;
+		do {
+			parent = *rbn;
+			skb1 = rb_to_skb(parent);
+			if (end <= skb1->ip_defrag_offset)
+				rbn = &parent->rb_left;
+			else if (offset >= skb1->ip_defrag_offset + skb1->len)
+				rbn = &parent->rb_right;
+			else /* Found an overlap with skb1. */
+				goto discard_qp;
+		} while (*rbn);
+		/* Here we have parent properly set, and rbn pointing to
+		 * one of its NULL left/right children. Insert skb. */
+		rb_link_node(&skb->rbnode, parent, rbn);
+	}
+	rb_insert_color(&skb->rbnode, &qp->q.rb_fragments);
 
-	/* Note : skb->ip_defrag_offset and skb->dev share the same location */
-	dev = skb->dev;
 	if (dev)
 		qp->iif = dev->ifindex;
-	/* Makes sure compiler wont do silly aliasing games */
-	barrier();
 	skb->ip_defrag_offset = offset;
 
-	/* Insert this fragment in the chain of fragments. */
-	skb->next = next;
-	if (!next)
-		qp->q.fragments_tail = skb;
-	if (prev)
-		prev->next = skb;
-	else
-		qp->q.fragments = skb;
-
 	qp->q.stamp = skb->tstamp;
 	qp->q.meat += skb->len;
 	qp->ecn |= ecn;
@@ -414,7 +425,7 @@ found:
 		unsigned long orefdst = skb->_skb_refdst;
 
 		skb->_skb_refdst = 0UL;
-		err = ip_frag_reasm(qp, prev, dev);
+		err = ip_frag_reasm(qp, skb, dev);
 		skb->_skb_refdst = orefdst;
 		return err;
 	}
@@ -431,15 +442,15 @@ err:
 	return err;
 }
 
-
 /* Build a new IP datagram from all its fragments. */
-
-static int ip_frag_reasm(struct ipq *qp, struct sk_buff *prev,
+static int ip_frag_reasm(struct ipq *qp, struct sk_buff *skb,
 			 struct net_device *dev)
 {
 	struct net *net = container_of(qp->q.net, struct net, ipv4.frags);
 	struct iphdr *iph;
-	struct sk_buff *fp, *head = qp->q.fragments;
+	struct sk_buff *fp, *head = skb_rb_first(&qp->q.rb_fragments);
+	struct sk_buff **nextp; /* To build frag_list. */
+	struct rb_node *rbn;
 	int len;
 	int ihlen;
 	int err;
@@ -453,25 +464,20 @@ static int ip_frag_reasm(struct ipq *qp,
 		goto out_fail;
 	}
 	/* Make the one we just received the head. */
-	if (prev) {
-		head = prev->next;
-		fp = skb_clone(head, GFP_ATOMIC);
+	if (head != skb) {
+		fp = skb_clone(skb, GFP_ATOMIC);
 		if (!fp)
 			goto out_nomem;
-
-		fp->next = head->next;
-		if (!fp->next)
+		rb_replace_node(&skb->rbnode, &fp->rbnode, &qp->q.rb_fragments);
+		if (qp->q.fragments_tail == skb)
 			qp->q.fragments_tail = fp;
-		prev->next = fp;
-
-		skb_morph(head, qp->q.fragments);
-		head->next = qp->q.fragments->next;
-
-		consume_skb(qp->q.fragments);
-		qp->q.fragments = head;
+		skb_morph(skb, head);
+		rb_replace_node(&head->rbnode, &skb->rbnode,
+				&qp->q.rb_fragments);
+		consume_skb(head);
+		head = skb;
 	}
 
-	WARN_ON(!head);
 	WARN_ON(head->ip_defrag_offset != 0);
 
 	/* Allocate a new buffer for the datagram. */
@@ -496,24 +502,35 @@ static int ip_frag_reasm(struct ipq *qp,
 		clone = alloc_skb(0, GFP_ATOMIC);
 		if (!clone)
 			goto out_nomem;
-		clone->next = head->next;
-		head->next = clone;
 		skb_shinfo(clone)->frag_list = skb_shinfo(head)->frag_list;
 		skb_frag_list_init(head);
 		for (i = 0; i < skb_shinfo(head)->nr_frags; i++)
 			plen += skb_frag_size(&skb_shinfo(head)->frags[i]);
 		clone->len = clone->data_len = head->data_len - plen;
-		head->data_len -= clone->len;
-		head->len -= clone->len;
+		skb->truesize += clone->truesize;
 		clone->csum = 0;
 		clone->ip_summed = head->ip_summed;
 		add_frag_mem_limit(qp->q.net, clone->truesize);
+		skb_shinfo(head)->frag_list = clone;
+		nextp = &clone->next;
+	} else {
+		nextp = &skb_shinfo(head)->frag_list;
 	}
 
-	skb_shinfo(head)->frag_list = head->next;
 	skb_push(head, head->data - skb_network_header(head));
 
-	for (fp=head->next; fp; fp = fp->next) {
+	/* Traverse the tree in order, to build frag_list. */
+	rbn = rb_next(&head->rbnode);
+	rb_erase(&head->rbnode, &qp->q.rb_fragments);
+	while (rbn) {
+		struct rb_node *rbnext = rb_next(rbn);
+		fp = rb_to_skb(rbn);
+		rb_erase(rbn, &qp->q.rb_fragments);
+		rbn = rbnext;
+		*nextp = fp;
+		nextp = &fp->next;
+		fp->prev = NULL;
+		memset(&fp->rbnode, 0, sizeof(fp->rbnode));
 		head->data_len += fp->len;
 		head->len += fp->len;
 		if (head->ip_summed != fp->ip_summed)
@@ -524,7 +541,9 @@ static int ip_frag_reasm(struct ipq *qp,
 	}
 	sub_frag_mem_limit(qp->q.net, head->truesize);
 
+	*nextp = NULL;
 	head->next = NULL;
+	head->prev = NULL;
 	head->dev = dev;
 	head->tstamp = qp->q.stamp;
 	IPCB(head)->frag_max_size = max(qp->max_df_size, qp->q.max_size);
@@ -552,6 +571,7 @@ static int ip_frag_reasm(struct ipq *qp,
 
 	__IP_INC_STATS(net, IPSTATS_MIB_REASMOKS);
 	qp->q.fragments = NULL;
+	qp->q.rb_fragments = RB_ROOT;
 	qp->q.fragments_tail = NULL;
 	return 0;
 
--- a/net/ipv6/netfilter/nf_conntrack_reasm.c
+++ b/net/ipv6/netfilter/nf_conntrack_reasm.c
@@ -471,6 +471,7 @@ nf_ct_frag6_reasm(struct frag_queue *fq,
 					  head->csum);
 
 	fq->q.fragments = NULL;
+	fq->q.rb_fragments = RB_ROOT;
 	fq->q.fragments_tail = NULL;
 
 	return true;
--- a/net/ipv6/reassembly.c
+++ b/net/ipv6/reassembly.c
@@ -472,6 +472,7 @@ static int ip6_frag_reasm(struct frag_qu
 	__IP6_INC_STATS(net, __in6_dev_get(dev), IPSTATS_MIB_REASMOKS);
 	rcu_read_unlock();
 	fq->q.fragments = NULL;
+	fq->q.rb_fragments = RB_ROOT;
 	fq->q.fragments_tail = NULL;
 	return 1;
 




* [PATCH 4.14 118/126] ipv4: frags: precedence bug in ip_expire()
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (116 preceding siblings ...)
  2018-09-17 22:42 ` [PATCH 4.14 117/126] net: sk_buff rbnode reorg Greg Kroah-Hartman
@ 2018-09-17 22:42 ` Greg Kroah-Hartman
  2018-09-17 22:42 ` [PATCH 4.14 119/126] ip: add helpers to process in-order fragments faster Greg Kroah-Hartman
                   ` (10 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:42 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, netdev@vger.kernel.org,
	stable@vger.kernel.org, edumazet@google.com, Dan Carpenter,
	David S. Miller, Dan Carpenter

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Dan Carpenter <dan.carpenter@oracle.com>

We accidentally removed the parentheses here, but they are required
because '!' has higher precedence than '&'.
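
A small stand-alone sketch (not kernel code) of the difference. The
flag names and values below are assumptions chosen for illustration,
not taken from this patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag values, assumed for illustration only. */
#define FLAG_FIRST_IN	0x1u
#define FLAG_LAST_IN	0x2u

/* How C parses "!flags & FLAG_FIRST_IN": the '!' binds first, so the
 * result is true only when flags is zero (given FLAG_FIRST_IN == 1).
 * The parentheses here just make that parse explicit.
 */
static bool first_in_missing_buggy(unsigned int flags)
{
	return (!flags) & FLAG_FIRST_IN;
}

/* Intended test: true exactly when the FIRST_IN bit is clear. */
static bool first_in_missing_fixed(unsigned int flags)
{
	return !(flags & FLAG_FIRST_IN);
}

/* Self-check: only LAST_IN is set, so the first fragment is missing,
 * but the buggy form does not notice.
 */
static void precedence_demo(void)
{
	assert(!first_in_missing_buggy(FLAG_LAST_IN));
	assert(first_in_missing_fixed(FLAG_LAST_IN));
}
```

With these assumed values the two forms agree when flags is zero and
disagree whenever some other bit is set while FIRST_IN is clear.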

Fixes: fa0f527358bd ("ip: use rb trees for IP frag queue.")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 70837ffe3085c9a91488b52ca13ac84424da1042)
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/ipv4/ip_fragment.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/net/ipv4/ip_fragment.c
+++ b/net/ipv4/ip_fragment.c
@@ -154,7 +154,7 @@ static void ip_expire(struct timer_list
 	__IP_INC_STATS(net, IPSTATS_MIB_REASMFAILS);
 	__IP_INC_STATS(net, IPSTATS_MIB_REASMTIMEOUT);
 
-	if (!qp->q.flags & INET_FRAG_FIRST_IN)
+	if (!(qp->q.flags & INET_FRAG_FIRST_IN))
 		goto out;
 
 	/* sk_buff::dev and sk_buff::rbnode are unionized. So we




* [PATCH 4.14 119/126] ip: add helpers to process in-order fragments faster.
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (117 preceding siblings ...)
  2018-09-17 22:42 ` [PATCH 4.14 118/126] ipv4: frags: precedence bug in ip_expire() Greg Kroah-Hartman
@ 2018-09-17 22:42 ` Greg Kroah-Hartman
  2018-09-17 22:42 ` [PATCH 4.14 120/126] ip: process in-order fragments efficiently Greg Kroah-Hartman
                   ` (9 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:42 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Willem de Bruijn, Peter Oskolkov,
	Eric Dumazet, Florian Westphal, David S. Miller

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Peter Oskolkov <posk@google.com>

This patch introduces several helper functions/macros that will be
used in the follow-up patch. No runtime changes yet.

The new logic (fully implemented in the second patch) is as follows:

* Nodes in the rb-tree will now contain not single fragments, but lists
  of consecutive fragments ("runs").

* At each point in time, the current "active" run at the tail is
  maintained/tracked. Fragments that arrive in-order, adjacent
  to the previous tail fragment, are added to this tail run without
  triggering the re-balancing of the rb-tree.

* If a fragment arrives out of order with the offset _before_ the tail run,
  it is inserted into the rb-tree as a single fragment.

* If a fragment arrives after the current tail fragment (with a gap),
  it starts a new "tail" run, and is inserted into the rb-tree
  at the end as the head of the new run.
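
The cases listed above can be sketched as a small decision function.
This is a stand-alone illustration, not the kernel code:
classify_fragment() and enum frag_action are hypothetical names, and
the order of the tests is modelled on ip_frag_queue() in the follow-up
patch. "tail_end" stands for the offset just past the current tail
fragment.

```c
#include <assert.h>

enum frag_action {
	FRAG_NEW_RUN,		/* start a new run at the tail (tree insert) */
	FRAG_APPEND_TO_RUN,	/* extend the tail run, tree untouched */
	FRAG_TREE_SEARCH,	/* out of order: find its place in the tree */
	FRAG_DISCARD,		/* overlaps the tail: drop the whole queue */
};

/* Decide what to do with a fragment covering [offset, end). */
static enum frag_action classify_fragment(int have_tail, int tail_end,
					   int offset, int end)
{
	assert(offset >= 0 && offset < end);

	if (!have_tail)
		return FRAG_NEW_RUN;		/* first fragment */
	if (tail_end < end) {
		if (offset < tail_end)
			return FRAG_DISCARD;	/* overlap with the tail */
		if (offset == tail_end)
			return FRAG_APPEND_TO_RUN; /* adjacent, in order */
		return FRAG_NEW_RUN;		/* after the tail, with a gap */
	}
	return FRAG_TREE_SEARCH;		/* before the tail run */
}
```

Only FRAG_NEW_RUN and FRAG_TREE_SEARCH touch the rb-tree in this model;
the in-order case is a plain list append.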

skb->cb is used to store additional information
needed here (suggested by Eric Dumazet).

Reported-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Peter Oskolkov <posk@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 353c9cb360874e737fb000545f783df756c06f9a)
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 include/net/inet_frag.h |    6 +++
 net/ipv4/ip_fragment.c  |   73 ++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 79 insertions(+)

--- a/include/net/inet_frag.h
+++ b/include/net/inet_frag.h
@@ -57,7 +57,9 @@ struct frag_v6_compare_key {
  * @lock: spinlock protecting this frag
  * @refcnt: reference count of the queue
  * @fragments: received fragments head
+ * @rb_fragments: received fragments rb-tree root
  * @fragments_tail: received fragments tail
+ * @last_run_head: the head of the last "run". see ip_fragment.c
  * @stamp: timestamp of the last received fragment
  * @len: total length of the original datagram
  * @meat: length of received fragments so far
@@ -78,6 +80,7 @@ struct inet_frag_queue {
 	struct sk_buff		*fragments;  /* Used in IPv6. */
 	struct rb_root		rb_fragments; /* Used in IPv4. */
 	struct sk_buff		*fragments_tail;
+	struct sk_buff		*last_run_head;
 	ktime_t			stamp;
 	int			len;
 	int			meat;
@@ -113,6 +116,9 @@ void inet_frag_kill(struct inet_frag_que
 void inet_frag_destroy(struct inet_frag_queue *q);
 struct inet_frag_queue *inet_frag_find(struct netns_frags *nf, void *key);
 
+/* Free all skbs in the queue; return the sum of their truesizes. */
+unsigned int inet_frag_rbtree_purge(struct rb_root *root);
+
 static inline void inet_frag_put(struct inet_frag_queue *q)
 {
 	if (refcount_dec_and_test(&q->refcnt))
--- a/net/ipv4/ip_fragment.c
+++ b/net/ipv4/ip_fragment.c
@@ -57,6 +57,57 @@
  */
 static const char ip_frag_cache_name[] = "ip4-frags";
 
+/* Use skb->cb to track consecutive/adjacent fragments coming at
+ * the end of the queue. Nodes in the rb-tree queue will
+ * contain "runs" of one or more adjacent fragments.
+ *
+ * Invariants:
+ * - next_frag is NULL at the tail of a "run";
+ * - the head of a "run" has the sum of all fragment lengths in frag_run_len.
+ */
+struct ipfrag_skb_cb {
+	struct inet_skb_parm	h;
+	struct sk_buff		*next_frag;
+	int			frag_run_len;
+};
+
+#define FRAG_CB(skb)		((struct ipfrag_skb_cb *)((skb)->cb))
+
+static void ip4_frag_init_run(struct sk_buff *skb)
+{
+	BUILD_BUG_ON(sizeof(struct ipfrag_skb_cb) > sizeof(skb->cb));
+
+	FRAG_CB(skb)->next_frag = NULL;
+	FRAG_CB(skb)->frag_run_len = skb->len;
+}
+
+/* Append skb to the last "run". */
+static void ip4_frag_append_to_last_run(struct inet_frag_queue *q,
+					struct sk_buff *skb)
+{
+	RB_CLEAR_NODE(&skb->rbnode);
+	FRAG_CB(skb)->next_frag = NULL;
+
+	FRAG_CB(q->last_run_head)->frag_run_len += skb->len;
+	FRAG_CB(q->fragments_tail)->next_frag = skb;
+	q->fragments_tail = skb;
+}
+
+/* Create a new "run" with the skb. */
+static void ip4_frag_create_run(struct inet_frag_queue *q, struct sk_buff *skb)
+{
+	if (q->last_run_head)
+		rb_link_node(&skb->rbnode, &q->last_run_head->rbnode,
+			     &q->last_run_head->rbnode.rb_right);
+	else
+		rb_link_node(&skb->rbnode, NULL, &q->rb_fragments.rb_node);
+	rb_insert_color(&skb->rbnode, &q->rb_fragments);
+
+	ip4_frag_init_run(skb);
+	q->fragments_tail = skb;
+	q->last_run_head = skb;
+}
+
 /* Describe an entry in the "incomplete datagrams" queue. */
 struct ipq {
 	struct inet_frag_queue q;
@@ -654,6 +705,28 @@ struct sk_buff *ip_check_defrag(struct n
 }
 EXPORT_SYMBOL(ip_check_defrag);
 
+unsigned int inet_frag_rbtree_purge(struct rb_root *root)
+{
+	struct rb_node *p = rb_first(root);
+	unsigned int sum = 0;
+
+	while (p) {
+		struct sk_buff *skb = rb_entry(p, struct sk_buff, rbnode);
+
+		p = rb_next(p);
+		rb_erase(&skb->rbnode, root);
+		while (skb) {
+			struct sk_buff *next = FRAG_CB(skb)->next_frag;
+
+			sum += skb->truesize;
+			kfree_skb(skb);
+			skb = next;
+		}
+	}
+	return sum;
+}
+EXPORT_SYMBOL(inet_frag_rbtree_purge);
+
 #ifdef CONFIG_SYSCTL
 static int dist_min;
 




* [PATCH 4.14 120/126] ip: process in-order fragments efficiently
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (118 preceding siblings ...)
  2018-09-17 22:42 ` [PATCH 4.14 119/126] ip: add helpers to process in-order fragments faster Greg Kroah-Hartman
@ 2018-09-17 22:42 ` Greg Kroah-Hartman
  2018-09-17 22:42 ` [PATCH 4.14 121/126] ip: frags: fix crash in ip_do_fragment() Greg Kroah-Hartman
                   ` (8 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:42 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Willem de Bruijn, Peter Oskolkov,
	Eric Dumazet, Florian Westphal, David S. Miller

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Peter Oskolkov <posk@google.com>

This patch changes the runtime behavior of the IP defrag queue:
incoming in-order fragments are added to the end of the current
list/"run" of in-order fragments at the tail.
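
A rough stand-alone model of the run bookkeeping, to show why in-order
traffic benefits. It is a sketch under simplifying assumptions, not the
kernel code: all toy_* names are hypothetical, fragments are assumed to
arrive with increasing offsets and no overlaps, and the rb-tree is
replaced by a counter of how many tree inserts would have been needed.

```c
#include <assert.h>
#include <stddef.h>

struct toy_frag {
	int offset;
	int len;
	struct toy_frag *next_frag;	/* NULL at the tail of a run */
	int run_len;			/* valid at the head of a run only */
};

struct toy_queue {
	struct toy_frag *last_run_head;
	struct toy_frag *tail;
	int tree_inserts;		/* stands in for rb-tree insertions */
};

/* Start a new run; in the real code this is a tree insert. */
static void toy_create_run(struct toy_queue *q, struct toy_frag *f)
{
	f->next_frag = NULL;
	f->run_len = f->len;
	q->last_run_head = f;
	q->tail = f;
	q->tree_inserts++;
}

/* Extend the tail run; the tree is not touched. */
static void toy_append_to_last_run(struct toy_queue *q, struct toy_frag *f)
{
	assert(q->tail != NULL && q->last_run_head != NULL);
	f->next_frag = NULL;
	q->last_run_head->run_len += f->len;
	q->tail->next_frag = f;
	q->tail = f;
}

/* Queue n fragments of 100 bytes each; "gap_at" (or -1 for none)
 * leaves a hole before that fragment. Returns the tree insert count.
 */
static int toy_count_tree_inserts(int n, int gap_at)
{
	struct toy_frag frags[8];
	struct toy_queue q = { NULL, NULL, 0 };
	int off = 0;

	assert(n > 0 && n <= 8);
	for (int i = 0; i < n; i++) {
		if (i == gap_at)
			off += 100;	/* skip ahead to create a gap */
		frags[i].offset = off;
		frags[i].len = 100;
		if (q.tail && off == q.tail->offset + q.tail->len)
			toy_append_to_last_run(&q, &frags[i]);
		else
			toy_create_run(&q, &frags[i]);
		off += 100;
	}
	return q.tree_inserts;
}
```

In this model a fully in-order datagram costs one tree insert no matter
how many fragments it has, and each gap adds one more.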

On some workloads, UDP stream performance is substantially improved:

RX: ./udp_stream -F 10 -T 2 -l 60
TX: ./udp_stream -c -H <host> -F 10 -T 5 -l 60

with this patchset applied on a 10Gbps receiver:

  throughput=9524.18
  throughput_units=Mbit/s

upstream (net-next):

  throughput=4608.93
  throughput_units=Mbit/s

Reported-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Peter Oskolkov <posk@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit a4fd284a1f8fd4b6c59aa59db2185b1e17c5c11c)
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/ipv4/inet_fragment.c |    2 
 net/ipv4/ip_fragment.c   |  110 +++++++++++++++++++++++++++++------------------
 2 files changed, 70 insertions(+), 42 deletions(-)

--- a/net/ipv4/inet_fragment.c
+++ b/net/ipv4/inet_fragment.c
@@ -145,7 +145,7 @@ void inet_frag_destroy(struct inet_frag_
 			fp = xp;
 		} while (fp);
 	} else {
-		sum_truesize = skb_rbtree_purge(&q->rb_fragments);
+		sum_truesize = inet_frag_rbtree_purge(&q->rb_fragments);
 	}
 	sum = sum_truesize + f->qsize;
 
--- a/net/ipv4/ip_fragment.c
+++ b/net/ipv4/ip_fragment.c
@@ -126,8 +126,8 @@ static u8 ip4_frag_ecn(u8 tos)
 
 static struct inet_frags ip4_frags;
 
-static int ip_frag_reasm(struct ipq *qp, struct sk_buff *prev,
-			 struct net_device *dev);
+static int ip_frag_reasm(struct ipq *qp, struct sk_buff *skb,
+			 struct sk_buff *prev_tail, struct net_device *dev);
 
 
 static void ip4_frag_init(struct inet_frag_queue *q, const void *a)
@@ -219,7 +219,12 @@ static void ip_expire(struct timer_list
 		head = skb_rb_first(&qp->q.rb_fragments);
 		if (!head)
 			goto out;
-		rb_erase(&head->rbnode, &qp->q.rb_fragments);
+		if (FRAG_CB(head)->next_frag)
+			rb_replace_node(&head->rbnode,
+					&FRAG_CB(head)->next_frag->rbnode,
+					&qp->q.rb_fragments);
+		else
+			rb_erase(&head->rbnode, &qp->q.rb_fragments);
 		memset(&head->rbnode, 0, sizeof(head->rbnode));
 		barrier();
 	}
@@ -320,7 +325,7 @@ static int ip_frag_reinit(struct ipq *qp
 		return -ETIMEDOUT;
 	}
 
-	sum_truesize = skb_rbtree_purge(&qp->q.rb_fragments);
+	sum_truesize = inet_frag_rbtree_purge(&qp->q.rb_fragments);
 	sub_frag_mem_limit(qp->q.net, sum_truesize);
 
 	qp->q.flags = 0;
@@ -329,6 +334,7 @@ static int ip_frag_reinit(struct ipq *qp
 	qp->q.fragments = NULL;
 	qp->q.rb_fragments = RB_ROOT;
 	qp->q.fragments_tail = NULL;
+	qp->q.last_run_head = NULL;
 	qp->iif = 0;
 	qp->ecn = 0;
 
@@ -340,7 +346,7 @@ static int ip_frag_queue(struct ipq *qp,
 {
 	struct net *net = container_of(qp->q.net, struct net, ipv4.frags);
 	struct rb_node **rbn, *parent;
-	struct sk_buff *skb1;
+	struct sk_buff *skb1, *prev_tail;
 	struct net_device *dev;
 	unsigned int fragsize;
 	int flags, offset;
@@ -418,38 +424,41 @@ static int ip_frag_queue(struct ipq *qp,
 	 */
 
 	/* Find out where to put this fragment.  */
-	skb1 = qp->q.fragments_tail;
-	if (!skb1) {
-		/* This is the first fragment we've received. */
-		rb_link_node(&skb->rbnode, NULL, &qp->q.rb_fragments.rb_node);
-		qp->q.fragments_tail = skb;
-	} else if ((skb1->ip_defrag_offset + skb1->len) < end) {
-		/* This is the common/special case: skb goes to the end. */
+	prev_tail = qp->q.fragments_tail;
+	if (!prev_tail)
+		ip4_frag_create_run(&qp->q, skb);  /* First fragment. */
+	else if (prev_tail->ip_defrag_offset + prev_tail->len < end) {
+		/* This is the common case: skb goes to the end. */
 		/* Detect and discard overlaps. */
-		if (offset < (skb1->ip_defrag_offset + skb1->len))
+		if (offset < prev_tail->ip_defrag_offset + prev_tail->len)
 			goto discard_qp;
-		/* Insert after skb1. */
-		rb_link_node(&skb->rbnode, &skb1->rbnode, &skb1->rbnode.rb_right);
-		qp->q.fragments_tail = skb;
+		if (offset == prev_tail->ip_defrag_offset + prev_tail->len)
+			ip4_frag_append_to_last_run(&qp->q, skb);
+		else
+			ip4_frag_create_run(&qp->q, skb);
 	} else {
-		/* Binary search. Note that skb can become the first fragment, but
-		 * not the last (covered above). */
+		/* Binary search. Note that skb can become the first fragment,
+		 * but not the last (covered above).
+		 */
 		rbn = &qp->q.rb_fragments.rb_node;
 		do {
 			parent = *rbn;
 			skb1 = rb_to_skb(parent);
 			if (end <= skb1->ip_defrag_offset)
 				rbn = &parent->rb_left;
-			else if (offset >= skb1->ip_defrag_offset + skb1->len)
+			else if (offset >= skb1->ip_defrag_offset +
+						FRAG_CB(skb1)->frag_run_len)
 				rbn = &parent->rb_right;
 			else /* Found an overlap with skb1. */
 				goto discard_qp;
 		} while (*rbn);
 		/* Here we have parent properly set, and rbn pointing to
-		 * one of its NULL left/right children. Insert skb. */
+		 * one of its NULL left/right children. Insert skb.
+		 */
+		ip4_frag_init_run(skb);
 		rb_link_node(&skb->rbnode, parent, rbn);
+		rb_insert_color(&skb->rbnode, &qp->q.rb_fragments);
 	}
-	rb_insert_color(&skb->rbnode, &qp->q.rb_fragments);
 
 	if (dev)
 		qp->iif = dev->ifindex;
@@ -476,7 +485,7 @@ static int ip_frag_queue(struct ipq *qp,
 		unsigned long orefdst = skb->_skb_refdst;
 
 		skb->_skb_refdst = 0UL;
-		err = ip_frag_reasm(qp, skb, dev);
+		err = ip_frag_reasm(qp, skb, prev_tail, dev);
 		skb->_skb_refdst = orefdst;
 		return err;
 	}
@@ -495,7 +504,7 @@ err:
 
 /* Build a new IP datagram from all its fragments. */
 static int ip_frag_reasm(struct ipq *qp, struct sk_buff *skb,
-			 struct net_device *dev)
+			 struct sk_buff *prev_tail, struct net_device *dev)
 {
 	struct net *net = container_of(qp->q.net, struct net, ipv4.frags);
 	struct iphdr *iph;
@@ -519,10 +528,16 @@ static int ip_frag_reasm(struct ipq *qp,
 		fp = skb_clone(skb, GFP_ATOMIC);
 		if (!fp)
 			goto out_nomem;
-		rb_replace_node(&skb->rbnode, &fp->rbnode, &qp->q.rb_fragments);
+		FRAG_CB(fp)->next_frag = FRAG_CB(skb)->next_frag;
+		if (RB_EMPTY_NODE(&skb->rbnode))
+			FRAG_CB(prev_tail)->next_frag = fp;
+		else
+			rb_replace_node(&skb->rbnode, &fp->rbnode,
+					&qp->q.rb_fragments);
 		if (qp->q.fragments_tail == skb)
 			qp->q.fragments_tail = fp;
 		skb_morph(skb, head);
+		FRAG_CB(skb)->next_frag = FRAG_CB(head)->next_frag;
 		rb_replace_node(&head->rbnode, &skb->rbnode,
 				&qp->q.rb_fragments);
 		consume_skb(head);
@@ -558,7 +573,7 @@ static int ip_frag_reasm(struct ipq *qp,
 		for (i = 0; i < skb_shinfo(head)->nr_frags; i++)
 			plen += skb_frag_size(&skb_shinfo(head)->frags[i]);
 		clone->len = clone->data_len = head->data_len - plen;
-		skb->truesize += clone->truesize;
+		head->truesize += clone->truesize;
 		clone->csum = 0;
 		clone->ip_summed = head->ip_summed;
 		add_frag_mem_limit(qp->q.net, clone->truesize);
@@ -571,24 +586,36 @@ static int ip_frag_reasm(struct ipq *qp,
 	skb_push(head, head->data - skb_network_header(head));
 
 	/* Traverse the tree in order, to build frag_list. */
+	fp = FRAG_CB(head)->next_frag;
 	rbn = rb_next(&head->rbnode);
 	rb_erase(&head->rbnode, &qp->q.rb_fragments);
-	while (rbn) {
-		struct rb_node *rbnext = rb_next(rbn);
-		fp = rb_to_skb(rbn);
-		rb_erase(rbn, &qp->q.rb_fragments);
-		rbn = rbnext;
-		*nextp = fp;
-		nextp = &fp->next;
-		fp->prev = NULL;
-		memset(&fp->rbnode, 0, sizeof(fp->rbnode));
-		head->data_len += fp->len;
-		head->len += fp->len;
-		if (head->ip_summed != fp->ip_summed)
-			head->ip_summed = CHECKSUM_NONE;
-		else if (head->ip_summed == CHECKSUM_COMPLETE)
-			head->csum = csum_add(head->csum, fp->csum);
-		head->truesize += fp->truesize;
+	while (rbn || fp) {
+		/* fp points to the next sk_buff in the current run;
+		 * rbn points to the next run.
+		 */
+		/* Go through the current run. */
+		while (fp) {
+			*nextp = fp;
+			nextp = &fp->next;
+			fp->prev = NULL;
+			memset(&fp->rbnode, 0, sizeof(fp->rbnode));
+			head->data_len += fp->len;
+			head->len += fp->len;
+			if (head->ip_summed != fp->ip_summed)
+				head->ip_summed = CHECKSUM_NONE;
+			else if (head->ip_summed == CHECKSUM_COMPLETE)
+				head->csum = csum_add(head->csum, fp->csum);
+			head->truesize += fp->truesize;
+			fp = FRAG_CB(fp)->next_frag;
+		}
+		/* Move to the next run. */
+		if (rbn) {
+			struct rb_node *rbnext = rb_next(rbn);
+
+			fp = rb_to_skb(rbn);
+			rb_erase(rbn, &qp->q.rb_fragments);
+			rbn = rbnext;
+		}
 	}
 	sub_frag_mem_limit(qp->q.net, head->truesize);
 
@@ -624,6 +651,7 @@ static int ip_frag_reasm(struct ipq *qp,
 	qp->q.fragments = NULL;
 	qp->q.rb_fragments = RB_ROOT;
 	qp->q.fragments_tail = NULL;
+	qp->q.last_run_head = NULL;
 	return 0;
 
 out_nomem:



^ permalink raw reply	[flat|nested] 134+ messages in thread

* [PATCH 4.14 121/126] ip: frags: fix crash in ip_do_fragment()
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (119 preceding siblings ...)
  2018-09-17 22:42 ` [PATCH 4.14 120/126] ip: process in-order fragments efficiently Greg Kroah-Hartman
@ 2018-09-17 22:42 ` Greg Kroah-Hartman
  2018-09-17 22:42 ` [PATCH 4.14 122/126] mtd: ubi: wl: Fix error return code in ubi_wl_init() Greg Kroah-Hartman
                   ` (7 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:42 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, netdev@vger.kernel.org,
	stable@vger.kernel.org, edumazet@google.com, Taehee Yoo,
	Eric Dumazet, David S. Miller, Taehee Yoo

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Taehee Yoo <ap420073@gmail.com>

commit 5d407b071dc369c26a38398326ee2be53651cfe4 upstream

A kernel crash occurs when a defragmented packet is fragmented
in ip_do_fragment().
In the defragment routine, skb_orphan() is called and
skb->ip_defrag_offset is set. But skb->sk and
skb->ip_defrag_offset are members of the same union, so
frag->sk is not NULL.
Hence the crash occurs in the skb->sk check in ip_do_fragment() when
the defragmented packet is fragmented.
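
The aliasing described above can be sketched in plain userspace C. This is
only an illustration under stated assumptions: struct mock_skb, struct
mock_sock and the mock_* helpers are hypothetical stand-ins, not the kernel's
struct sk_buff or its API. The sketch shows that writing the offset through
one union member leaves the pointer member non-NULL until it is cleared again.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a socket; only its address matters here. */
struct mock_sock { int unused; };

/* Hypothetical, simplified model of the relevant part of an skb:
 * the socket pointer and the defrag offset share the same storage.
 */
struct mock_skb {
	union {
		struct mock_sock *sk;
		int ip_defrag_offset;
	};
};

/* Models the defragment path: drop the socket, then reuse the storage. */
static inline void mock_defrag_queue(struct mock_skb *skb, int offset)
{
	skb->sk = NULL;                 /* effect of orphaning on skb->sk */
	skb->ip_defrag_offset = offset; /* overwrites the same bytes */
}

/* Models the fix: clear the aliased field again during reassembly. */
static inline void mock_reasm_fixup(struct mock_skb *fp)
{
	fp->sk = NULL;
}

/* Returns 1 if a later "is a socket attached?" check would fire. */
static inline int mock_sk_check_fires(const struct mock_skb *skb)
{
	return skb->sk != NULL;
}
```

With a non-zero offset stored by mock_defrag_queue(), mock_sk_check_fires()
reports a stale "socket" until mock_reasm_fixup() has run, which mirrors the
one-line fp->sk = NULL change in this patch.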

test commands:
   %iptables -t nat -I POSTROUTING -j MASQUERADE
   %hping3 192.168.4.2 -s 1000 -p 2000 -d 60000

splat looks like:
[  261.069429] kernel BUG at net/ipv4/ip_output.c:636!
[  261.075753] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI
[  261.083854] CPU: 1 PID: 1349 Comm: hping3 Not tainted 4.19.0-rc2+ #3
[  261.100977] RIP: 0010:ip_do_fragment+0x1613/0x2600
[  261.106945] Code: e8 e2 38 e3 fe 4c 8b 44 24 18 48 8b 74 24 08 e9 92 f6 ff ff 80 3c 02 00 0f 85 da 07 00 00 48 8b b5 d0 00 00 00 e9 25 f6 ff ff <0f> 0b 0f 0b 44 8b 54 24 58 4c 8b 4c 24 18 4c 8b 5c 24 60 4c 8b 6c
[  261.127015] RSP: 0018:ffff8801031cf2c0 EFLAGS: 00010202
[  261.134156] RAX: 1ffff1002297537b RBX: ffffed0020639e6e RCX: 0000000000000004
[  261.142156] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff880114ba9bd8
[  261.150157] RBP: ffff880114ba8a40 R08: ffffed0022975395 R09: ffffed0022975395
[  261.158157] R10: 0000000000000001 R11: ffffed0022975394 R12: ffff880114ba9ca4
[  261.166159] R13: 0000000000000010 R14: ffff880114ba9bc0 R15: dffffc0000000000
[  261.174169] FS:  00007fbae2199700(0000) GS:ffff88011b400000(0000) knlGS:0000000000000000
[  261.183012] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  261.189013] CR2: 00005579244fe000 CR3: 0000000119bf4000 CR4: 00000000001006e0
[  261.198158] Call Trace:
[  261.199018]  ? dst_output+0x180/0x180
[  261.205011]  ? save_trace+0x300/0x300
[  261.209018]  ? ip_copy_metadata+0xb00/0xb00
[  261.213034]  ? sched_clock_local+0xd4/0x140
[  261.218158]  ? kill_l4proto+0x120/0x120 [nf_conntrack]
[  261.223014]  ? rt_cpu_seq_stop+0x10/0x10
[  261.227014]  ? find_held_lock+0x39/0x1c0
[  261.233008]  ip_finish_output+0x51d/0xb50
[  261.237006]  ? ip_fragment.constprop.56+0x220/0x220
[  261.243011]  ? nf_ct_l4proto_register_one+0x5b0/0x5b0 [nf_conntrack]
[  261.250152]  ? rcu_is_watching+0x77/0x120
[  261.255010]  ? nf_nat_ipv4_out+0x1e/0x2b0 [nf_nat_ipv4]
[  261.261033]  ? nf_hook_slow+0xb1/0x160
[  261.265007]  ip_output+0x1c7/0x710
[  261.269005]  ? ip_mc_output+0x13f0/0x13f0
[  261.273002]  ? __local_bh_enable_ip+0xe9/0x1b0
[  261.278152]  ? ip_fragment.constprop.56+0x220/0x220
[  261.282996]  ? nf_hook_slow+0xb1/0x160
[  261.287007]  raw_sendmsg+0x21f9/0x4420
[  261.291008]  ? dst_output+0x180/0x180
[  261.297003]  ? sched_clock_cpu+0x126/0x170
[  261.301003]  ? find_held_lock+0x39/0x1c0
[  261.306155]  ? stop_critical_timings+0x420/0x420
[  261.311004]  ? check_flags.part.36+0x450/0x450
[  261.315005]  ? _raw_spin_unlock_irq+0x29/0x40
[  261.320995]  ? _raw_spin_unlock_irq+0x29/0x40
[  261.326142]  ? cyc2ns_read_end+0x10/0x10
[  261.330139]  ? raw_bind+0x280/0x280
[  261.334138]  ? sched_clock_cpu+0x126/0x170
[  261.338995]  ? check_flags.part.36+0x450/0x450
[  261.342991]  ? __lock_acquire+0x4500/0x4500
[  261.348994]  ? inet_sendmsg+0x11c/0x500
[  261.352989]  ? dst_output+0x180/0x180
[  261.357012]  inet_sendmsg+0x11c/0x500
[ ... ]

v2:
 - clear skb->sk in the reassembly routine. (Eric Dumazet)

Fixes: fa0f527358bd ("ip: use rb trees for IP frag queue.")
Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/ipv4/ip_fragment.c                  |    1 +
 net/ipv6/netfilter/nf_conntrack_reasm.c |    1 +
 2 files changed, 2 insertions(+)

--- a/net/ipv4/ip_fragment.c
+++ b/net/ipv4/ip_fragment.c
@@ -599,6 +599,7 @@ static int ip_frag_reasm(struct ipq *qp,
 			nextp = &fp->next;
 			fp->prev = NULL;
 			memset(&fp->rbnode, 0, sizeof(fp->rbnode));
+			fp->sk = NULL;
 			head->data_len += fp->len;
 			head->len += fp->len;
 			if (head->ip_summed != fp->ip_summed)
--- a/net/ipv6/netfilter/nf_conntrack_reasm.c
+++ b/net/ipv6/netfilter/nf_conntrack_reasm.c
@@ -453,6 +453,7 @@ nf_ct_frag6_reasm(struct frag_queue *fq,
 		else if (head->ip_summed == CHECKSUM_COMPLETE)
 			head->csum = csum_add(head->csum, fp->csum);
 		head->truesize += fp->truesize;
+		fp->sk = NULL;
 	}
 	sub_frag_mem_limit(fq->q.net, head->truesize);
 



^ permalink raw reply	[flat|nested] 134+ messages in thread

* [PATCH 4.14 122/126] mtd: ubi: wl: Fix error return code in ubi_wl_init()
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (120 preceding siblings ...)
  2018-09-17 22:42 ` [PATCH 4.14 121/126] ip: frags: fix crash in ip_do_fragment() Greg Kroah-Hartman
@ 2018-09-17 22:42 ` Greg Kroah-Hartman
  2018-09-17 22:42 ` [PATCH 4.14 123/126] tun: fix use after free for ptr_ring Greg Kroah-Hartman
                   ` (6 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:42 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Wei Yongjun, Boris Brezillon,
	Richard Weinberger, Ben Hutchings

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Wei Yongjun <weiyongjun1@huawei.com>

commit 7233982ade15eeac05c6f351e8d347406e6bcd2f upstream.

Fix to return error code -ENOMEM from the kmem_cache_alloc() error
handling case instead of 0, as done elsewhere in this function.
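
The bug pattern can be sketched outside the kernel as follows. This is a
minimal illustration, not the ubi_wl_init() code: mock_alloc() and the two
mock_init_* functions are hypothetical names, and malloc() stands in for
kmem_cache_alloc().

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical allocator hook so the failure path can be exercised. */
static inline void *mock_alloc(int should_fail)
{
	return should_fail ? NULL : malloc(16);
}

/* Buggy shape: err is still 0 when the allocation fails. */
static inline int mock_init_buggy(int should_fail)
{
	int err = 0;
	void *e = mock_alloc(should_fail);

	if (!e)
		goto out_free;
	free(e);
	return 0;
out_free:
	return err;	/* reports success despite the failure */
}

/* Fixed shape: set err before jumping to the cleanup label. */
static inline int mock_init_fixed(int should_fail)
{
	int err = 0;
	void *e = mock_alloc(should_fail);

	if (!e) {
		err = -ENOMEM;
		goto out_free;
	}
	free(e);
	return 0;
out_free:
	return err;
}
```

In the buggy shape the caller sees 0 and continues as if initialization had
succeeded; the fixed shape propagates -ENOMEM.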

Fixes: f78e5623f45b ("ubi: fastmap: Erase outdated anchor PEBs during
attach")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Reviewed-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
Cc: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/mtd/ubi/wl.c |    8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

--- a/drivers/mtd/ubi/wl.c
+++ b/drivers/mtd/ubi/wl.c
@@ -1615,8 +1615,10 @@ int ubi_wl_init(struct ubi_device *ubi,
 		cond_resched();
 
 		e = kmem_cache_alloc(ubi_wl_entry_slab, GFP_KERNEL);
-		if (!e)
+		if (!e) {
+			err = -ENOMEM;
 			goto out_free;
+		}
 
 		e->pnum = aeb->pnum;
 		e->ec = aeb->ec;
@@ -1635,8 +1637,10 @@ int ubi_wl_init(struct ubi_device *ubi,
 			cond_resched();
 
 			e = kmem_cache_alloc(ubi_wl_entry_slab, GFP_KERNEL);
-			if (!e)
+			if (!e) {
+				err = -ENOMEM;
 				goto out_free;
+			}
 
 			e->pnum = aeb->pnum;
 			e->ec = aeb->ec;



^ permalink raw reply	[flat|nested] 134+ messages in thread

* [PATCH 4.14 123/126] tun: fix use after free for ptr_ring
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (121 preceding siblings ...)
  2018-09-17 22:42 ` [PATCH 4.14 122/126] mtd: ubi: wl: Fix error return code in ubi_wl_init() Greg Kroah-Hartman
@ 2018-09-17 22:42 ` Greg Kroah-Hartman
  2018-09-17 22:42 ` [PATCH 4.14 124/126] tuntap: fix use after free during release Greg Kroah-Hartman
                   ` (5 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:42 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, syzbot+e8b902c3c3fadf0a9dba,
	Eric Dumazet, Cong Wang, Michael S. Tsirkin, Jason Wang,
	David S. Miller, Zubin Mithra

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Jason Wang <jasowang@redhat.com>

commit b196d88aba8ac72b775137854121097f4c4c6862 upstream.

We used to initialize ptr_ring during TUNSETIFF because its
size depends on the tx_queue_len of the netdevice, and we tried to clean it
up when the socket was detached from the netdevice. A race was spotted when
trying to do the uninit during a read, which leads to a use after free of
the pointer ring. Solve this by always initializing a zero-size ptr_ring
in open() and resizing it during TUNSETIFF; then we can safely do the
cleanup during close(). With this, there's no need for the workaround
that was introduced by commit 4df0bfc79904 ("tun: fix a memory leak
for tfile->tx_array").

Backport Note :-
Comparison with the upstream patch:
[1] A "semantic revert" of the changes made in
    4df0bfc799 ("tun: fix a memory leak for tfile->tx_array").
    4df0bfc799 was applied upstream, and then skb array was changed
    to use ptr_ring. The upstream patch then removes the changes introduced
    by 4df0bfc799. This backport does the same: it "reverts" the changes
    made by 4df0bfc799.
[2] xdp_rxq_info_unreg() being called in relevant locations.
    As xdp_rxq_info related patches are not present in 4.14, these
    changes are not needed in the backport.
[3] An instance of ptr_ring_init needs to be replaced by skb_array_init
    inside tun_attach().
[4] ptr_ring_cleanup needs to be replaced by skb_array_cleanup
    inside tun_chr_close().

Note that the backport for 7063efd33b ("tuntap: fix use after free during release")
needs to be applied on top of this patch.

Reported-by: syzbot+e8b902c3c3fadf0a9dba@syzkaller.appspotmail.com
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Fixes: 1576d9860599 ("tun: switch to use skb array for tx")
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Zubin Mithra <zsm@chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/net/tun.c |   21 +++++++--------------
 1 file changed, 7 insertions(+), 14 deletions(-)

--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -534,14 +534,6 @@ static void tun_queue_purge(struct tun_f
 	skb_queue_purge(&tfile->sk.sk_error_queue);
 }
 
-static void tun_cleanup_tx_array(struct tun_file *tfile)
-{
-	if (tfile->tx_array.ring.queue) {
-		skb_array_cleanup(&tfile->tx_array);
-		memset(&tfile->tx_array, 0, sizeof(tfile->tx_array));
-	}
-}
-
 static void __tun_detach(struct tun_file *tfile, bool clean)
 {
 	struct tun_file *ntfile;
@@ -583,7 +575,6 @@ static void __tun_detach(struct tun_file
 			    tun->dev->reg_state == NETREG_REGISTERED)
 				unregister_netdevice(tun->dev);
 		}
-		tun_cleanup_tx_array(tfile);
 		sock_put(&tfile->sk);
 	}
 }
@@ -623,13 +614,11 @@ static void tun_detach_all(struct net_de
 		/* Drop read queue */
 		tun_queue_purge(tfile);
 		sock_put(&tfile->sk);
-		tun_cleanup_tx_array(tfile);
 	}
 	list_for_each_entry_safe(tfile, tmp, &tun->disabled, next) {
 		tun_enable_queue(tfile);
 		tun_queue_purge(tfile);
 		sock_put(&tfile->sk);
-		tun_cleanup_tx_array(tfile);
 	}
 	BUG_ON(tun->numdisabled != 0);
 
@@ -675,7 +664,7 @@ static int tun_attach(struct tun_struct
 	}
 
 	if (!tfile->detached &&
-	    skb_array_init(&tfile->tx_array, dev->tx_queue_len, GFP_KERNEL)) {
+	    skb_array_resize(&tfile->tx_array, dev->tx_queue_len, GFP_KERNEL)) {
 		err = -ENOMEM;
 		goto out;
 	}
@@ -2624,6 +2613,11 @@ static int tun_chr_open(struct inode *in
 					    &tun_proto, 0);
 	if (!tfile)
 		return -ENOMEM;
+	if (skb_array_init(&tfile->tx_array, 0, GFP_KERNEL)) {
+		sk_free(&tfile->sk);
+		return -ENOMEM;
+	}
+
 	RCU_INIT_POINTER(tfile->tun, NULL);
 	tfile->flags = 0;
 	tfile->ifindex = 0;
@@ -2644,8 +2638,6 @@ static int tun_chr_open(struct inode *in
 
 	sock_set_flag(&tfile->sk, SOCK_ZEROCOPY);
 
-	memset(&tfile->tx_array, 0, sizeof(tfile->tx_array));
-
 	return 0;
 }
 
@@ -2654,6 +2646,7 @@ static int tun_chr_close(struct inode *i
 	struct tun_file *tfile = file->private_data;
 
 	tun_detach(tfile, true);
+	skb_array_cleanup(&tfile->tx_array);
 
 	return 0;
 }



^ permalink raw reply	[flat|nested] 134+ messages in thread

* [PATCH 4.14 124/126] tuntap: fix use after free during release
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (122 preceding siblings ...)
  2018-09-17 22:42 ` [PATCH 4.14 123/126] tun: fix use after free for ptr_ring Greg Kroah-Hartman
@ 2018-09-17 22:42 ` Greg Kroah-Hartman
  2018-09-17 22:42 ` [PATCH 4.14 125/126] autofs: fix autofs_sbi() does not check super block type Greg Kroah-Hartman
                   ` (4 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:42 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Andrei Vagin, Jason Wang,
	David S. Miller, Zubin Mithra

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Jason Wang <jasowang@redhat.com>

commit 7063efd33bb15abc0160347f89eb5aba6b7d000e upstream.

After commit b196d88aba8a ("tun: fix use after free for ptr_ring") we
need to clean up the tx ring during release(). Unfortunately, that commit
does the cleanup blindly after the socket has been destroyed, which leads
to another use-after-free. Fix this by doing the cleanup before dropping
the last reference of the socket in __tun_detach().
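
The ordering rule can be sketched with a toy refcounted object. Everything
here is an assumption made for illustration: struct mock_tfile, mock_put(),
mock_ring_cleanup() and mock_detach() are hypothetical and only model
"clean up embedded state while a reference is still held, then drop it".

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a refcounted object that embeds a ring. */
struct mock_tfile {
	int refcnt;
	int ring_cleaned;	/* set once the ring has been cleaned up */
};

static int mock_freed;	/* counts objects released by mock_put() */

/* Drop one reference; free the object when the count reaches zero. */
static inline void mock_put(struct mock_tfile *t)
{
	if (--t->refcnt == 0) {
		free(t);
		mock_freed++;
	}
}

static inline void mock_ring_cleanup(struct mock_tfile *t)
{
	t->ring_cleaned = 1;
}

/* Ordering used by the fix: clean up first, then drop the reference.
 * Touching *t after mock_put() could be a use-after-free.
 * Returns the ring_cleaned flag observed before the put.
 */
static inline int mock_detach(struct mock_tfile *t)
{
	int cleaned;

	mock_ring_cleanup(t);
	cleaned = t->ring_cleaned;
	mock_put(t);
	return cleaned;
}
```

Swapping the two calls in mock_detach() would dereference t after it may
have been freed, which is the shape of the bug this patch addresses.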

Backport Note :-
Upstream commit moves the ptr_ring_cleanup call from tun_chr_close to
__tun_detach. Upstream applied that patch after replacing skb_array with
ptr_ring. This patch moves the skb_array_cleanup call from
tun_chr_close to __tun_detach.

Reported-by: Andrei Vagin <avagin@virtuozzo.com>
Acked-by: Andrei Vagin <avagin@virtuozzo.com>
Fixes: b196d88aba8a ("tun: fix use after free for ptr_ring")
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Zubin Mithra <zsm@chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/net/tun.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -575,6 +575,7 @@ static void __tun_detach(struct tun_file
 			    tun->dev->reg_state == NETREG_REGISTERED)
 				unregister_netdevice(tun->dev);
 		}
+		skb_array_cleanup(&tfile->tx_array);
 		sock_put(&tfile->sk);
 	}
 }
@@ -2646,7 +2647,6 @@ static int tun_chr_close(struct inode *i
 	struct tun_file *tfile = file->private_data;
 
 	tun_detach(tfile, true);
-	skb_array_cleanup(&tfile->tx_array);
 
 	return 0;
 }



^ permalink raw reply	[flat|nested] 134+ messages in thread

* [PATCH 4.14 125/126] autofs: fix autofs_sbi() does not check super block type
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (123 preceding siblings ...)
  2018-09-17 22:42 ` [PATCH 4.14 124/126] tuntap: fix use after free during release Greg Kroah-Hartman
@ 2018-09-17 22:42 ` Greg Kroah-Hartman
  2018-09-17 22:42 ` [PATCH 4.14 126/126] mm: get rid of vmacache_flush_all() entirely Greg Kroah-Hartman
                   ` (3 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:42 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, syzbot+87c3c541582e56943277,
	Ian Kent, Andrew Morton, Linus Torvalds, Zubin Mithra

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Ian Kent <raven@themaw.net>

commit 0633da48f0793aeba27f82d30605624416723a91 upstream.

autofs_sbi() does not check the superblock magic number to verify it has
been given an autofs super block.
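
A userspace sketch of the check, assuming simplified stand-in types: struct
mock_super_block, struct mock_sb_info, mock_sbi() and the MOCK_AUTOFS_MAGIC
value are all hypothetical and only mirror the shape of the accessor changed
in the diff below.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative magic value for this sketch only. */
#define MOCK_AUTOFS_MAGIC 0x0187UL

struct mock_sb_info { int dummy; };

/* Hypothetical, reduced model of a super block. */
struct mock_super_block {
	unsigned long s_magic;
	void *s_fs_info;
};

/* Return the private info only if the super block carries our magic. */
static inline struct mock_sb_info *mock_sbi(struct mock_super_block *sb)
{
	return sb->s_magic != MOCK_AUTOFS_MAGIC ?
		NULL : (struct mock_sb_info *)sb->s_fs_info;
}
```

Callers then have to handle a NULL return rather than interpreting another
filesystem's s_fs_info as their own structure.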

Backport Note: autofs4 has been renamed to autofs upstream. As a result
the upstream patch does not apply cleanly onto 4.14.y.

Link: http://lkml.kernel.org/r/153475422934.17131.7563724552005298277.stgit@pluto.themaw.net
Reported-by: <syzbot+87c3c541582e56943277@syzkaller.appspotmail.com>
Signed-off-by: Ian Kent <raven@themaw.net>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Zubin Mithra <zsm@chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 fs/autofs4/autofs_i.h |    4 +++-
 fs/autofs4/inode.c    |    1 -
 2 files changed, 3 insertions(+), 2 deletions(-)

--- a/fs/autofs4/autofs_i.h
+++ b/fs/autofs4/autofs_i.h
@@ -26,6 +26,7 @@
 #include <linux/list.h>
 #include <linux/completion.h>
 #include <asm/current.h>
+#include <linux/magic.h>
 
 /* This is the range of ioctl() numbers we claim as ours */
 #define AUTOFS_IOC_FIRST     AUTOFS_IOC_READY
@@ -124,7 +125,8 @@ struct autofs_sb_info {
 
 static inline struct autofs_sb_info *autofs4_sbi(struct super_block *sb)
 {
-	return (struct autofs_sb_info *)(sb->s_fs_info);
+	return sb->s_magic != AUTOFS_SUPER_MAGIC ?
+		NULL : (struct autofs_sb_info *)(sb->s_fs_info);
 }
 
 static inline struct autofs_info *autofs4_dentry_ino(struct dentry *dentry)
--- a/fs/autofs4/inode.c
+++ b/fs/autofs4/inode.c
@@ -14,7 +14,6 @@
 #include <linux/pagemap.h>
 #include <linux/parser.h>
 #include <linux/bitops.h>
-#include <linux/magic.h>
 #include "autofs_i.h"
 #include <linux/module.h>
 



^ permalink raw reply	[flat|nested] 134+ messages in thread

* [PATCH 4.14 126/126] mm: get rid of vmacache_flush_all() entirely
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (124 preceding siblings ...)
  2018-09-17 22:42 ` [PATCH 4.14 125/126] autofs: fix autofs_sbi() does not check super block type Greg Kroah-Hartman
@ 2018-09-17 22:42 ` Greg Kroah-Hartman
  2018-09-17 23:59 ` [PATCH 4.14 000/126] 4.14.71-stable review Nathan Chancellor
                   ` (2 subsequent siblings)
  128 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-17 22:42 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Jann Horn, Will Deacon,
	Davidlohr Bueso, Oleg Nesterov, stable, Linus Torvalds

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Linus Torvalds <torvalds@linux-foundation.org>

commit 7a9cdebdcc17e426fb5287e4a82db1dfe86339b2 upstream.

Jann Horn points out that the vmacache_flush_all() function is not only
potentially expensive, it's buggy too.  It also happens to be entirely
unnecessary, because the sequence number overflow case can be avoided by
simply making the sequence number be 64-bit.  That doesn't even grow the
data structures in question, because the other adjacent fields are
already 64-bit.

So simplify the whole thing by just making the sequence number overflow
case go away entirely, which gets rid of all the complications and makes
the code faster too.  Win-win.
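
The overflow argument can be checked with a small sketch. The helper names
are hypothetical; the only point is that a 32-bit counter incremented from
its maximum value wraps to 0, while a 64-bit counter at the same value does
not.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if incrementing a 32-bit counter from 'start' wraps to 0. */
static inline int mock_wraps_u32(uint32_t start)
{
	uint32_t seq = start;

	seq++;
	return seq == 0;
}

/* Same check for a 64-bit counter. */
static inline int mock_wraps_u64(uint64_t start)
{
	uint64_t seq = start;

	seq++;
	return seq == 0;
}
```

As rough arithmetic under an assumed rate of one billion increments per
second, a 32-bit counter wraps after about 4.3 seconds, whereas a 64-bit
counter would need roughly 584 years.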

[ Oleg Nesterov points out that the VMACACHE_FULL_FLUSHES statistics
  also just goes away entirely with this ]

Reported-by: Jann Horn <jannh@google.com>
Suggested-by: Will Deacon <will.deacon@arm.com>
Acked-by: Davidlohr Bueso <dave@stgolabs.net>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 include/linux/mm_types.h      |    2 +-
 include/linux/mm_types_task.h |    2 +-
 include/linux/vm_event_item.h |    1 -
 include/linux/vmacache.h      |    5 -----
 mm/debug.c                    |    4 ++--
 mm/vmacache.c                 |   38 --------------------------------------
 6 files changed, 4 insertions(+), 48 deletions(-)

--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -354,7 +354,7 @@ struct kioctx_table;
 struct mm_struct {
 	struct vm_area_struct *mmap;		/* list of VMAs */
 	struct rb_root mm_rb;
-	u32 vmacache_seqnum;                   /* per-thread vmacache */
+	u64 vmacache_seqnum;                   /* per-thread vmacache */
 #ifdef CONFIG_MMU
 	unsigned long (*get_unmapped_area) (struct file *filp,
 				unsigned long addr, unsigned long len,
--- a/include/linux/mm_types_task.h
+++ b/include/linux/mm_types_task.h
@@ -32,7 +32,7 @@
 #define VMACACHE_MASK (VMACACHE_SIZE - 1)
 
 struct vmacache {
-	u32 seqnum;
+	u64 seqnum;
 	struct vm_area_struct *vmas[VMACACHE_SIZE];
 };
 
--- a/include/linux/vm_event_item.h
+++ b/include/linux/vm_event_item.h
@@ -105,7 +105,6 @@ enum vm_event_item { PGPGIN, PGPGOUT, PS
 #ifdef CONFIG_DEBUG_VM_VMACACHE
 		VMACACHE_FIND_CALLS,
 		VMACACHE_FIND_HITS,
-		VMACACHE_FULL_FLUSHES,
 #endif
 #ifdef CONFIG_SWAP
 		SWAP_RA,
--- a/include/linux/vmacache.h
+++ b/include/linux/vmacache.h
@@ -16,7 +16,6 @@ static inline void vmacache_flush(struct
 	memset(tsk->vmacache.vmas, 0, sizeof(tsk->vmacache.vmas));
 }
 
-extern void vmacache_flush_all(struct mm_struct *mm);
 extern void vmacache_update(unsigned long addr, struct vm_area_struct *newvma);
 extern struct vm_area_struct *vmacache_find(struct mm_struct *mm,
 						    unsigned long addr);
@@ -30,10 +29,6 @@ extern struct vm_area_struct *vmacache_f
 static inline void vmacache_invalidate(struct mm_struct *mm)
 {
 	mm->vmacache_seqnum++;
-
-	/* deal with overflows */
-	if (unlikely(mm->vmacache_seqnum == 0))
-		vmacache_flush_all(mm);
 }
 
 #endif /* __LINUX_VMACACHE_H */
--- a/mm/debug.c
+++ b/mm/debug.c
@@ -100,7 +100,7 @@ EXPORT_SYMBOL(dump_vma);
 
 void dump_mm(const struct mm_struct *mm)
 {
-	pr_emerg("mm %p mmap %p seqnum %d task_size %lu\n"
+	pr_emerg("mm %p mmap %p seqnum %llu task_size %lu\n"
 #ifdef CONFIG_MMU
 		"get_unmapped_area %p\n"
 #endif
@@ -128,7 +128,7 @@ void dump_mm(const struct mm_struct *mm)
 		"tlb_flush_pending %d\n"
 		"def_flags: %#lx(%pGv)\n",
 
-		mm, mm->mmap, mm->vmacache_seqnum, mm->task_size,
+		mm, mm->mmap, (long long) mm->vmacache_seqnum, mm->task_size,
 #ifdef CONFIG_MMU
 		mm->get_unmapped_area,
 #endif
--- a/mm/vmacache.c
+++ b/mm/vmacache.c
@@ -8,44 +8,6 @@
 #include <linux/vmacache.h>
 
 /*
- * Flush vma caches for threads that share a given mm.
- *
- * The operation is safe because the caller holds the mmap_sem
- * exclusively and other threads accessing the vma cache will
- * have mmap_sem held at least for read, so no extra locking
- * is required to maintain the vma cache.
- */
-void vmacache_flush_all(struct mm_struct *mm)
-{
-	struct task_struct *g, *p;
-
-	count_vm_vmacache_event(VMACACHE_FULL_FLUSHES);
-
-	/*
-	 * Single threaded tasks need not iterate the entire
-	 * list of process. We can avoid the flushing as well
-	 * since the mm's seqnum was increased and don't have
-	 * to worry about other threads' seqnum. Current's
-	 * flush will occur upon the next lookup.
-	 */
-	if (atomic_read(&mm->mm_users) == 1)
-		return;
-
-	rcu_read_lock();
-	for_each_process_thread(g, p) {
-		/*
-		 * Only flush the vmacache pointers as the
-		 * mm seqnum is already set and curr's will
-		 * be set upon invalidation when the next
-		 * lookup is done.
-		 */
-		if (mm == p->mm)
-			vmacache_flush(p);
-	}
-	rcu_read_unlock();
-}
-
-/*
  * This task may be accessing a foreign mm via (for example)
  * get_user_pages()->find_vma().  The vmacache is task-local and this
  * task's vmacache pertains to a different mm (ie, its own).  There is



^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [PATCH 4.14 000/126] 4.14.71-stable review
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (125 preceding siblings ...)
  2018-09-17 22:42 ` [PATCH 4.14 126/126] mm: get rid of vmacache_flush_all() entirely Greg Kroah-Hartman
@ 2018-09-17 23:59 ` Nathan Chancellor
  2018-09-18  7:44   ` Greg Kroah-Hartman
  2018-09-18 16:20 ` Guenter Roeck
  2018-09-18 16:53 ` Naresh Kamboju
  128 siblings, 1 reply; 134+ messages in thread
From: Nathan Chancellor @ 2018-09-17 23:59 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: linux-kernel, torvalds, akpm, linux, shuah, patches,
	ben.hutchings, lkft-triage, stable

On Tue, Sep 18, 2018 at 12:40:48AM +0200, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 4.14.71 release.
> There are 126 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
> 
> Responses should be made by Wed Sep 19 21:16:12 UTC 2018.
> Anything received after that time might be too late.
> 
> The whole patch series can be found in one patch at:
> 	https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.14.71-rc1.gz
> or in the git tree and branch at:
> 	git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.14.y
> and the diffstat can be found below.
> 
> thanks,
> 
> greg k-h
> 

Merged, compiled, and installed onto my Raspberry Pi.

No initial issues noticed in dmesg or general usage.

Thanks!
Nathan

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [PATCH 4.14 000/126] 4.14.71-stable review
  2018-09-17 23:59 ` [PATCH 4.14 000/126] 4.14.71-stable review Nathan Chancellor
@ 2018-09-18  7:44   ` Greg Kroah-Hartman
  0 siblings, 0 replies; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-09-18  7:44 UTC (permalink / raw)
  To: Nathan Chancellor
  Cc: linux-kernel, torvalds, akpm, linux, shuah, patches,
	ben.hutchings, lkft-triage, stable

On Mon, Sep 17, 2018 at 04:59:49PM -0700, Nathan Chancellor wrote:
> On Tue, Sep 18, 2018 at 12:40:48AM +0200, Greg Kroah-Hartman wrote:
> > This is the start of the stable review cycle for the 4.14.71 release.
> > There are 126 patches in this series, all will be posted as a response
> > to this one.  If anyone has any issues with these being applied, please
> > let me know.
> > 
> > Responses should be made by Wed Sep 19 21:16:12 UTC 2018.
> > Anything received after that time might be too late.
> > 
> > The whole patch series can be found in one patch at:
> > 	https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.14.71-rc1.gz
> > or in the git tree and branch at:
> > 	git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.14.y
> > and the diffstat can be found below.
> > 
> > thanks,
> > 
> > greg k-h
> > 
> 
> Merged, compiled, and installed onto my Raspberry Pi.
> 
> No initial issues noticed in dmesg or general usage.

Thanks for testing 3 of these and letting me know.

greg k-h

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [PATCH 4.14 000/126] 4.14.71-stable review
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (126 preceding siblings ...)
  2018-09-17 23:59 ` [PATCH 4.14 000/126] 4.14.71-stable review Nathan Chancellor
@ 2018-09-18 16:20 ` Guenter Roeck
  2018-09-18 16:53 ` Naresh Kamboju
  128 siblings, 0 replies; 134+ messages in thread
From: Guenter Roeck @ 2018-09-18 16:20 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: linux-kernel, torvalds, akpm, shuah, patches, ben.hutchings,
	lkft-triage, stable

On Tue, Sep 18, 2018 at 12:40:48AM +0200, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 4.14.71 release.
> There are 126 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
> 
> Responses should be made by Wed Sep 19 21:16:12 UTC 2018.
> Anything received after that time might be too late.
> 

Build results:
	total: 151 pass: 151 fail: 0
Qemu test results:
	total: 315 pass: 315 fail: 0

Details are available at https://kerneltests.org/builders/.

Guenter

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [PATCH 4.14 000/126] 4.14.71-stable review
  2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
                   ` (127 preceding siblings ...)
  2018-09-18 16:20 ` Guenter Roeck
@ 2018-09-18 16:53 ` Naresh Kamboju
  128 siblings, 0 replies; 134+ messages in thread
From: Naresh Kamboju @ 2018-09-18 16:53 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: open list, Linus Torvalds, Andrew Morton, Guenter Roeck,
	Shuah Khan, patches, Ben Hutchings, lkft-triage, linux- stable

On 18 September 2018 at 04:10, Greg Kroah-Hartman
<gregkh@linuxfoundation.org> wrote:
> This is the start of the stable review cycle for the 4.14.71 release.
> There are 126 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Wed Sep 19 21:16:12 UTC 2018.
> Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
>         https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.14.71-rc1.gz
> or in the git tree and branch at:
>         git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.14.y
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h

Results from Linaro’s test farm.
No regressions on arm64, arm, x86_64, and i386.

Summary
------------------------------------------------------------------------

kernel: 4.14.71-rc1
git repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
git branch: linux-4.14.y
git commit: 117adc51b16e0068c963f8c1339a0360ccf29f17
git describe: v4.14.70-127-g117adc51b16e
Test details: https://qa-reports.linaro.org/lkft/linux-stable-rc-4.14-oe/build/v4.14.70-127-g117adc51b16e

No regressions (compared to build v4.14.70)


Ran 19763 total tests in the following environments and test suites.

Environments
--------------
- dragonboard-410c - arm64
- hi6220-hikey - arm64
- i386
- juno-r2 - arm64
- qemu_arm
- qemu_arm64
- qemu_i386
- qemu_x86_64
- x15 - arm
- x86_64

Test Suites
-----------
* boot
* kselftest
* libhugetlbfs
* ltp-cap_bounds-tests
* ltp-containers-tests
* ltp-cve-tests
* ltp-fcntl-locktests-tests
* ltp-filecaps-tests
* ltp-fs-tests
* ltp-fs_bind-tests
* ltp-fs_perms_simple-tests
* ltp-fsx-tests
* ltp-hugetlb-tests
* ltp-io-tests
* ltp-ipc-tests
* ltp-math-tests
* ltp-nptl-tests
* ltp-pty-tests
* ltp-sched-tests
* ltp-securebits-tests
* ltp-syscalls-tests
* ltp-timers-tests
* ltp-open-posix-tests
* kselftest-vsyscall-mode-native
* kselftest-vsyscall-mode-none

-- 
Linaro LKFT
https://lkft.linaro.org

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [PATCH 4.14 117/126] net: sk_buff rbnode reorg
  2018-09-17 22:42 ` [PATCH 4.14 117/126] net: sk_buff rbnode reorg Greg Kroah-Hartman
@ 2018-10-04 20:13   ` Mitch Harder
  2018-11-29 10:33     ` Greg Kroah-Hartman
  0 siblings, 1 reply; 134+ messages in thread
From: Mitch Harder @ 2018-10-04 20:13 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Linux Kernel Mailing List, stable, Eric Dumazet,
	Soheil Hassas Yeganeh, Wei Wang, Willem de Bruijn,
	David S. Miller

On Mon, Sep 17, 2018 at 5:42 PM, Greg Kroah-Hartman
<gregkh@linuxfoundation.org> wrote:
> 4.14-stable review patch.  If anyone has any objections, please let me know.
>
> ------------------
>
> From: Eric Dumazet <edumazet@google.com>
>
> commit bffa72cf7f9df842f0016ba03586039296b4caaf upstream
>
> skb->rbnode shares space with skb->next, skb->prev and skb->tstamp
>
> Current uses (TCP receive ofo queue and netem) need to save/restore
> tstamp, while skb->dev is either NULL (TCP) or a constant for a given
> queue (netem).
>
> Since we plan using an RB tree for TCP retransmit queue to speedup SACK
> processing with large BDP, this patch exchanges skb->dev and
> skb->tstamp.
>
> This saves some overhead in both TCP and netem.
>
> v2: removes the swtstamp field from struct tcp_skb_cb
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Cc: Soheil Hassas Yeganeh <soheil@google.com>
> Cc: Wei Wang <weiwan@google.com>
> Cc: Willem de Bruijn <willemb@google.com>
> Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
> Signed-off-by: David S. Miller <davem@davemloft.net>
> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> ---
>  include/linux/skbuff.h                  |   24 ++--
>  include/net/inet_frag.h                 |    3
>  net/ipv4/inet_fragment.c                |   14 +-
>  net/ipv4/ip_fragment.c                  |  182 +++++++++++++++++---------------
>  net/ipv6/netfilter/nf_conntrack_reasm.c |    1
>  net/ipv6/reassembly.c                   |    1
>  6 files changed, 128 insertions(+), 97 deletions(-)
>
> --- a/include/linux/skbuff.h
> +++ b/include/linux/skbuff.h
> @@ -663,23 +663,27 @@ struct sk_buff {
>                         struct sk_buff          *prev;
>
>                         union {
> -                               ktime_t         tstamp;
> -                               u64             skb_mstamp;
> +                               struct net_device       *dev;
> +                               /* Some protocols might use this space to store information,
> +                                * while device pointer would be NULL.
> +                                * UDP receive path is one user.
> +                                */
> +                               unsigned long           dev_scratch;
>                         };
>                 };
> -               struct rb_node  rbnode; /* used in netem & tcp stack */
> +               struct rb_node          rbnode; /* used in netem, ip4 defrag, and tcp stack */
> +               struct list_head        list;
>         };
> -       struct sock             *sk;
>
>         union {
> -               struct net_device       *dev;
> -               /* Some protocols might use this space to store information,
> -                * while device pointer would be NULL.
> -                * UDP receive path is one user.
> -                */
> -               unsigned long           dev_scratch;
> +               struct sock             *sk;
>                 int                     ip_defrag_offset;
>         };
> +
> +       union {
> +               ktime_t         tstamp;
> +               u64             skb_mstamp;
> +       };
>         /*
>          * This is the control buffer. It is free to use for every
>          * layer. Please put your private variables there. If you
> --- a/include/net/inet_frag.h
> +++ b/include/net/inet_frag.h
> @@ -75,7 +75,8 @@ struct inet_frag_queue {
>         struct timer_list       timer;
>         spinlock_t              lock;
>         refcount_t              refcnt;
> -       struct sk_buff          *fragments;
> +       struct sk_buff          *fragments;  /* Used in IPv6. */
> +       struct rb_root          rb_fragments; /* Used in IPv4. */
>         struct sk_buff          *fragments_tail;
>         ktime_t                 stamp;
>         int                     len;
> --- a/net/ipv4/inet_fragment.c
> +++ b/net/ipv4/inet_fragment.c
> @@ -136,12 +136,16 @@ void inet_frag_destroy(struct inet_frag_
>         fp = q->fragments;
>         nf = q->net;
>         f = nf->f;
> -       while (fp) {
> -               struct sk_buff *xp = fp->next;
> +       if (fp) {
> +               do {
> +                       struct sk_buff *xp = fp->next;
>
> -               sum_truesize += fp->truesize;
> -               kfree_skb(fp);
> -               fp = xp;
> +                       sum_truesize += fp->truesize;
> +                       kfree_skb(fp);
> +                       fp = xp;
> +               } while (fp);
> +       } else {
> +               sum_truesize = skb_rbtree_purge(&q->rb_fragments);
>         }
>         sum = sum_truesize + f->qsize;
>
> --- a/net/ipv4/ip_fragment.c
> +++ b/net/ipv4/ip_fragment.c
> @@ -136,7 +136,7 @@ static void ip_expire(struct timer_list
>  {
>         struct inet_frag_queue *frag = from_timer(frag, t, timer);
>         const struct iphdr *iph;
> -       struct sk_buff *head;
> +       struct sk_buff *head = NULL;
>         struct net *net;
>         struct ipq *qp;
>         int err;
> @@ -152,14 +152,31 @@ static void ip_expire(struct timer_list
>
>         ipq_kill(qp);
>         __IP_INC_STATS(net, IPSTATS_MIB_REASMFAILS);
> -
> -       head = qp->q.fragments;
> -
>         __IP_INC_STATS(net, IPSTATS_MIB_REASMTIMEOUT);
>
> -       if (!(qp->q.flags & INET_FRAG_FIRST_IN) || !head)
> +       if (!qp->q.flags & INET_FRAG_FIRST_IN)
>                 goto out;
>
> +       /* sk_buff::dev and sk_buff::rbnode are unionized. So we
> +        * pull the head out of the tree in order to be able to
> +        * deal with head->dev.
> +        */
> +       if (qp->q.fragments) {
> +               head = qp->q.fragments;
> +               qp->q.fragments = head->next;
> +       } else {
> +               head = skb_rb_first(&qp->q.rb_fragments);
> +               if (!head)
> +                       goto out;
> +               rb_erase(&head->rbnode, &qp->q.rb_fragments);
> +               memset(&head->rbnode, 0, sizeof(head->rbnode));
> +               barrier();
> +       }
> +       if (head == qp->q.fragments_tail)
> +               qp->q.fragments_tail = NULL;
> +
> +       sub_frag_mem_limit(qp->q.net, head->truesize);
> +
>         head->dev = dev_get_by_index_rcu(net, qp->iif);
>         if (!head->dev)
>                 goto out;
> @@ -179,16 +196,16 @@ static void ip_expire(struct timer_list
>             (skb_rtable(head)->rt_type != RTN_LOCAL))
>                 goto out;
>
> -       skb_get(head);
>         spin_unlock(&qp->q.lock);
>         icmp_send(head, ICMP_TIME_EXCEEDED, ICMP_EXC_FRAGTIME, 0);
> -       kfree_skb(head);
>         goto out_rcu_unlock;
>
>  out:
>         spin_unlock(&qp->q.lock);
>  out_rcu_unlock:
>         rcu_read_unlock();
> +       if (head)
> +               kfree_skb(head);
>         ipq_put(qp);
>  }
>
> @@ -231,7 +248,7 @@ static int ip_frag_too_far(struct ipq *q
>         end = atomic_inc_return(&peer->rid);
>         qp->rid = end;
>
> -       rc = qp->q.fragments && (end - start) > max;
> +       rc = qp->q.fragments_tail && (end - start) > max;
>
>         if (rc) {
>                 struct net *net;
> @@ -245,7 +262,6 @@ static int ip_frag_too_far(struct ipq *q
>
>  static int ip_frag_reinit(struct ipq *qp)
>  {
> -       struct sk_buff *fp;
>         unsigned int sum_truesize = 0;
>
>         if (!mod_timer(&qp->q.timer, jiffies + qp->q.net->timeout)) {
> @@ -253,20 +269,14 @@ static int ip_frag_reinit(struct ipq *qp
>                 return -ETIMEDOUT;
>         }
>
> -       fp = qp->q.fragments;
> -       do {
> -               struct sk_buff *xp = fp->next;
> -
> -               sum_truesize += fp->truesize;
> -               kfree_skb(fp);
> -               fp = xp;
> -       } while (fp);
> +       sum_truesize = skb_rbtree_purge(&qp->q.rb_fragments);
>         sub_frag_mem_limit(qp->q.net, sum_truesize);
>
>         qp->q.flags = 0;
>         qp->q.len = 0;
>         qp->q.meat = 0;
>         qp->q.fragments = NULL;
> +       qp->q.rb_fragments = RB_ROOT;
>         qp->q.fragments_tail = NULL;
>         qp->iif = 0;
>         qp->ecn = 0;
> @@ -278,7 +288,8 @@ static int ip_frag_reinit(struct ipq *qp
>  static int ip_frag_queue(struct ipq *qp, struct sk_buff *skb)
>  {
>         struct net *net = container_of(qp->q.net, struct net, ipv4.frags);
> -       struct sk_buff *prev, *next;
> +       struct rb_node **rbn, *parent;
> +       struct sk_buff *skb1;
>         struct net_device *dev;
>         unsigned int fragsize;
>         int flags, offset;
> @@ -341,58 +352,58 @@ static int ip_frag_queue(struct ipq *qp,
>         if (err)
>                 goto err;
>
> -       /* Find out which fragments are in front and at the back of us
> -        * in the chain of fragments so far.  We must know where to put
> -        * this fragment, right?
> -        */
> -       prev = qp->q.fragments_tail;
> -       if (!prev || prev->ip_defrag_offset < offset) {
> -               next = NULL;
> -               goto found;
> -       }
> -       prev = NULL;
> -       for (next = qp->q.fragments; next != NULL; next = next->next) {
> -               if (next->ip_defrag_offset >= offset)
> -                       break;  /* bingo! */
> -               prev = next;
> -       }
> +       /* Note : skb->rbnode and skb->dev share the same location. */
> +       dev = skb->dev;
> +       /* Makes sure compiler wont do silly aliasing games */
> +       barrier();
>
> -found:
>         /* RFC5722, Section 4, amended by Errata ID : 3089
>          *                          When reassembling an IPv6 datagram, if
>          *   one or more its constituent fragments is determined to be an
>          *   overlapping fragment, the entire datagram (and any constituent
>          *   fragments) MUST be silently discarded.
>          *
> -        * We do the same here for IPv4.
> +        * We do the same here for IPv4 (and increment an snmp counter).
>          */
>
> -       /* Is there an overlap with the previous fragment? */
> -       if (prev &&
> -           (prev->ip_defrag_offset + prev->len) > offset)
> -               goto discard_qp;
> -
> -       /* Is there an overlap with the next fragment? */
> -       if (next && next->ip_defrag_offset < end)
> -               goto discard_qp;
> +       /* Find out where to put this fragment.  */
> +       skb1 = qp->q.fragments_tail;
> +       if (!skb1) {
> +               /* This is the first fragment we've received. */
> +               rb_link_node(&skb->rbnode, NULL, &qp->q.rb_fragments.rb_node);
> +               qp->q.fragments_tail = skb;
> +       } else if ((skb1->ip_defrag_offset + skb1->len) < end) {
> +               /* This is the common/special case: skb goes to the end. */
> +               /* Detect and discard overlaps. */
> +               if (offset < (skb1->ip_defrag_offset + skb1->len))
> +                       goto discard_qp;
> +               /* Insert after skb1. */
> +               rb_link_node(&skb->rbnode, &skb1->rbnode, &skb1->rbnode.rb_right);
> +               qp->q.fragments_tail = skb;
> +       } else {
> +               /* Binary search. Note that skb can become the first fragment, but
> +                * not the last (covered above). */
> +               rbn = &qp->q.rb_fragments.rb_node;
> +               do {
> +                       parent = *rbn;
> +                       skb1 = rb_to_skb(parent);
> +                       if (end <= skb1->ip_defrag_offset)
> +                               rbn = &parent->rb_left;
> +                       else if (offset >= skb1->ip_defrag_offset + skb1->len)
> +                               rbn = &parent->rb_right;
> +                       else /* Found an overlap with skb1. */
> +                               goto discard_qp;
> +               } while (*rbn);
> +               /* Here we have parent properly set, and rbn pointing to
> +                * one of its NULL left/right children. Insert skb. */
> +               rb_link_node(&skb->rbnode, parent, rbn);
> +       }
> +       rb_insert_color(&skb->rbnode, &qp->q.rb_fragments);
>
> -       /* Note : skb->ip_defrag_offset and skb->dev share the same location */
> -       dev = skb->dev;
>         if (dev)
>                 qp->iif = dev->ifindex;
> -       /* Makes sure compiler wont do silly aliasing games */
> -       barrier();
>         skb->ip_defrag_offset = offset;
>
> -       /* Insert this fragment in the chain of fragments. */
> -       skb->next = next;
> -       if (!next)
> -               qp->q.fragments_tail = skb;
> -       if (prev)
> -               prev->next = skb;
> -       else
> -               qp->q.fragments = skb;
> -
>         qp->q.stamp = skb->tstamp;
>         qp->q.meat += skb->len;
>         qp->ecn |= ecn;
> @@ -414,7 +425,7 @@ found:
>                 unsigned long orefdst = skb->_skb_refdst;
>
>                 skb->_skb_refdst = 0UL;
> -               err = ip_frag_reasm(qp, prev, dev);
> +               err = ip_frag_reasm(qp, skb, dev);
>                 skb->_skb_refdst = orefdst;
>                 return err;
>         }
> @@ -431,15 +442,15 @@ err:
>         return err;
>  }
>
> -
>  /* Build a new IP datagram from all its fragments. */
> -
> -static int ip_frag_reasm(struct ipq *qp, struct sk_buff *prev,
> +static int ip_frag_reasm(struct ipq *qp, struct sk_buff *skb,
>                          struct net_device *dev)
>  {
>         struct net *net = container_of(qp->q.net, struct net, ipv4.frags);
>         struct iphdr *iph;
> -       struct sk_buff *fp, *head = qp->q.fragments;
> +       struct sk_buff *fp, *head = skb_rb_first(&qp->q.rb_fragments);
> +       struct sk_buff **nextp; /* To build frag_list. */
> +       struct rb_node *rbn;
>         int len;
>         int ihlen;
>         int err;
> @@ -453,25 +464,20 @@ static int ip_frag_reasm(struct ipq *qp,
>                 goto out_fail;
>         }
>         /* Make the one we just received the head. */
> -       if (prev) {
> -               head = prev->next;
> -               fp = skb_clone(head, GFP_ATOMIC);
> +       if (head != skb) {
> +               fp = skb_clone(skb, GFP_ATOMIC);
>                 if (!fp)
>                         goto out_nomem;
> -
> -               fp->next = head->next;
> -               if (!fp->next)
> +               rb_replace_node(&skb->rbnode, &fp->rbnode, &qp->q.rb_fragments);
> +               if (qp->q.fragments_tail == skb)
>                         qp->q.fragments_tail = fp;
> -               prev->next = fp;
> -
> -               skb_morph(head, qp->q.fragments);
> -               head->next = qp->q.fragments->next;
> -
> -               consume_skb(qp->q.fragments);
> -               qp->q.fragments = head;
> +               skb_morph(skb, head);
> +               rb_replace_node(&head->rbnode, &skb->rbnode,
> +                               &qp->q.rb_fragments);
> +               consume_skb(head);
> +               head = skb;
>         }
>
> -       WARN_ON(!head);
>         WARN_ON(head->ip_defrag_offset != 0);
>
>         /* Allocate a new buffer for the datagram. */
> @@ -496,24 +502,35 @@ static int ip_frag_reasm(struct ipq *qp,
>                 clone = alloc_skb(0, GFP_ATOMIC);
>                 if (!clone)
>                         goto out_nomem;
> -               clone->next = head->next;
> -               head->next = clone;
>                 skb_shinfo(clone)->frag_list = skb_shinfo(head)->frag_list;
>                 skb_frag_list_init(head);
>                 for (i = 0; i < skb_shinfo(head)->nr_frags; i++)
>                         plen += skb_frag_size(&skb_shinfo(head)->frags[i]);
>                 clone->len = clone->data_len = head->data_len - plen;
> -               head->data_len -= clone->len;
> -               head->len -= clone->len;
> +               skb->truesize += clone->truesize;
>                 clone->csum = 0;
>                 clone->ip_summed = head->ip_summed;
>                 add_frag_mem_limit(qp->q.net, clone->truesize);
> +               skb_shinfo(head)->frag_list = clone;
> +               nextp = &clone->next;
> +       } else {
> +               nextp = &skb_shinfo(head)->frag_list;
>         }
>
> -       skb_shinfo(head)->frag_list = head->next;
>         skb_push(head, head->data - skb_network_header(head));
>
> -       for (fp=head->next; fp; fp = fp->next) {
> +       /* Traverse the tree in order, to build frag_list. */
> +       rbn = rb_next(&head->rbnode);
> +       rb_erase(&head->rbnode, &qp->q.rb_fragments);
> +       while (rbn) {
> +               struct rb_node *rbnext = rb_next(rbn);
> +               fp = rb_to_skb(rbn);
> +               rb_erase(rbn, &qp->q.rb_fragments);
> +               rbn = rbnext;
> +               *nextp = fp;
> +               nextp = &fp->next;
> +               fp->prev = NULL;
> +               memset(&fp->rbnode, 0, sizeof(fp->rbnode));
>                 head->data_len += fp->len;
>                 head->len += fp->len;
>                 if (head->ip_summed != fp->ip_summed)
> @@ -524,7 +541,9 @@ static int ip_frag_reasm(struct ipq *qp,
>         }
>         sub_frag_mem_limit(qp->q.net, head->truesize);
>
> +       *nextp = NULL;
>         head->next = NULL;
> +       head->prev = NULL;
>         head->dev = dev;
>         head->tstamp = qp->q.stamp;
>         IPCB(head)->frag_max_size = max(qp->max_df_size, qp->q.max_size);
> @@ -552,6 +571,7 @@ static int ip_frag_reasm(struct ipq *qp,
>
>         __IP_INC_STATS(net, IPSTATS_MIB_REASMOKS);
>         qp->q.fragments = NULL;
> +       qp->q.rb_fragments = RB_ROOT;
>         qp->q.fragments_tail = NULL;
>         return 0;
>
> --- a/net/ipv6/netfilter/nf_conntrack_reasm.c
> +++ b/net/ipv6/netfilter/nf_conntrack_reasm.c
> @@ -471,6 +471,7 @@ nf_ct_frag6_reasm(struct frag_queue *fq,
>                                           head->csum);
>
>         fq->q.fragments = NULL;
> +       fq->q.rb_fragments = RB_ROOT;
>         fq->q.fragments_tail = NULL;
>
>         return true;
> --- a/net/ipv6/reassembly.c
> +++ b/net/ipv6/reassembly.c
> @@ -472,6 +472,7 @@ static int ip6_frag_reasm(struct frag_qu
>         __IP6_INC_STATS(net, __in6_dev_get(dev), IPSTATS_MIB_REASMOKS);
>         rcu_read_unlock();
>         fq->q.fragments = NULL;
> +       fq->q.rb_fragments = RB_ROOT;
>         fq->q.fragments_tail = NULL;
>         return 1;
>
>
>

I'm getting a kernel panic on the >=4.14.71 stable kernels, and I've
isolated the problem back to this patch.

My 4.18.11 kernel seems to be OK.

Whenever I inject a delay into the interface with iproute2 tools, I get a panic.

Example command:
tc qdisc add dev eth0 root netem delay 35ms

The RIP is pointing at netif_skb_features+0x31/0x230

My efforts to get a transmittable copy of the panic have been thwarted.

There's some confusion between this patch and the upstream patch
referred to in the commit message.

The upstream commit patches net/sched/sch_netem.c which isn't even
touched in this commit.

Although the commit messages are the same, the two patches seem to
have a different purpose.

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/net/sched?id=bffa72cf7f9df842f0016ba03586039296b4caaf

The commit message seems more relevant to this patch.

The upstream commit bffa72cf7f9df842f0016ba03586039296b4caaf has not
yet been applied to the stable tree.

I decided to roll the dice, and apply the upstream patch
bffa72cf7f9df842f0016ba03586039296b4caaf (it's been in the main kernel
tree just over a year).

When I manually patch my 4.14.74 kernel with
bffa72cf7f9df842f0016ba03586039296b4caaf, my panic seems to be solved.

I'm uncertain if this is the proper solution, but I hope this points
in the direction of the issue.
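
For illustration, here is a minimal userspace sketch of how the layout
change in the skbuff.h hunk above could bite a queue that links skbs into
an rbtree. It assumes (this has not been confirmed in this thread) that
the panic comes from a dev pointer being overwritten while the skb sits
in the tree. All of the *_sketch names and the helper functions are
hypothetical stand-ins, not kernel code; only the field order mirrors the
patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct rb_node: three machine words. */
struct rb_node_sketch {
	unsigned long parent_color;
	struct rb_node_sketch *right;
	struct rb_node_sketch *left;
};

/* Hypothetical stand-in for struct net_device. */
struct net_device_sketch {
	int ifindex;
};

/* Hypothetical stand-in for the first union of struct sk_buff
 * as laid out after this patch: dev now shares space with rbnode.
 */
struct skb_sketch {
	union {
		struct {
			struct skb_sketch *next;
			struct skb_sketch *prev;
			struct net_device_sketch *dev;
		};
		struct rb_node_sketch rbnode;
	};
};

/* Returns 1 if dev lies inside the bytes occupied by rbnode. */
int dev_overlaps_rbnode(void)
{
	size_t dev_off = offsetof(struct skb_sketch, dev);
	size_t rb_off = offsetof(struct skb_sketch, rbnode);

	return dev_off >= rb_off &&
	       dev_off < rb_off + sizeof(struct rb_node_sketch);
}

/* Mimics what linking a node into a tree does to the node itself:
 * the parent word is set and both child pointers are cleared.
 */
static void link_node_sketch(struct skb_sketch *skb)
{
	skb->rbnode.parent_color = 0;
	skb->rbnode.left = NULL;
	skb->rbnode.right = NULL;
}

/* A queue that links the skb without saving dev first.
 * Returns 1 if skb->dev still holds the original pointer.
 */
int dev_survives_rb_link(struct skb_sketch *skb,
			 struct net_device_sketch *dev)
{
	skb->dev = dev;
	link_node_sketch(skb);
	return skb->dev == dev;
}

/* A queue that saves dev before linking and restores it afterwards,
 * in the spirit of what ip_frag_queue() and ip_frag_reasm() do above.
 * Returns 1 if skb->dev holds the original pointer at the end.
 */
int dev_survives_with_restore(struct skb_sketch *skb,
			      struct net_device_sketch *dev)
{
	struct net_device_sketch *saved;

	skb->dev = dev;
	saved = skb->dev;	/* save before the tree touches the node */
	link_node_sketch(skb);
	skb->dev = saved;	/* restore once the skb leaves the tree */
	return skb->dev == dev;
}

/* Exercises the three helpers; call this from a main() to run it. */
void sketch_self_check(void)
{
	struct skb_sketch skb;
	struct net_device_sketch dev = { .ifindex = 2 };

	assert(dev_overlaps_rbnode());
	assert(!dev_survives_rb_link(&skb, &dev));
	assert(dev_survives_with_restore(&skb, &dev));
}
```

If that assumption holds, any rbtree user that is not updated together
with the struct change would hand back an skb whose dev no longer points
at a device. That would be consistent with the manual application of the
upstream patch, which does touch net/sched/sch_netem.c, making the panic
go away.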

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [PATCH 4.14 117/126] net: sk_buff rbnode reorg
  2018-10-04 20:13   ` Mitch Harder
@ 2018-11-29 10:33     ` Greg Kroah-Hartman
  2018-11-29 15:07       ` Mitch Harder
  0 siblings, 1 reply; 134+ messages in thread
From: Greg Kroah-Hartman @ 2018-11-29 10:33 UTC (permalink / raw)
  To: Mitch Harder
  Cc: Linux Kernel Mailing List, stable, Eric Dumazet,
	Soheil Hassas Yeganeh, Wei Wang, Willem de Bruijn,
	David S. Miller

On Thu, Oct 04, 2018 at 03:13:56PM -0500, Mitch Harder wrote:
> On Mon, Sep 17, 2018 at 5:42 PM, Greg Kroah-Hartman
> <gregkh@linuxfoundation.org> wrote:
> > 4.14-stable review patch.  If anyone has any objections, please let me know.
> >
> > ------------------
> >
> > From: Eric Dumazet <edumazet@google.com>
> >
> > commit bffa72cf7f9df842f0016ba03586039296b4caaf upstream
> >
> > skb->rbnode shares space with skb->next, skb->prev and skb->tstamp
> >
> > Current uses (TCP receive ofo queue and netem) need to save/restore
> > tstamp, while skb->dev is either NULL (TCP) or a constant for a given
> > queue (netem).
> >
> > Since we plan using an RB tree for TCP retransmit queue to speedup SACK
> > processing with large BDP, this patch exchanges skb->dev and
> > skb->tstamp.
> >
> > This saves some overhead in both TCP and netem.
> >
> > v2: removes the swtstamp field from struct tcp_skb_cb
> >
> > Signed-off-by: Eric Dumazet <edumazet@google.com>
> > Cc: Soheil Hassas Yeganeh <soheil@google.com>
> > Cc: Wei Wang <weiwan@google.com>
> > Cc: Willem de Bruijn <willemb@google.com>
> > Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
> > Signed-off-by: David S. Miller <davem@davemloft.net>
> > Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> > ---
> >  include/linux/skbuff.h                  |   24 ++--
> >  include/net/inet_frag.h                 |    3
> >  net/ipv4/inet_fragment.c                |   14 +-
> >  net/ipv4/ip_fragment.c                  |  182 +++++++++++++++++---------------
> >  net/ipv6/netfilter/nf_conntrack_reasm.c |    1
> >  net/ipv6/reassembly.c                   |    1
> >  6 files changed, 128 insertions(+), 97 deletions(-)
> >
> > --- a/include/linux/skbuff.h
> > +++ b/include/linux/skbuff.h
> > @@ -663,23 +663,27 @@ struct sk_buff {
> >                         struct sk_buff          *prev;
> >
> >                         union {
> > -                               ktime_t         tstamp;
> > -                               u64             skb_mstamp;
> > +                               struct net_device       *dev;
> > +                               /* Some protocols might use this space to store information,
> > +                                * while device pointer would be NULL.
> > +                                * UDP receive path is one user.
> > +                                */
> > +                               unsigned long           dev_scratch;
> >                         };
> >                 };
> > -               struct rb_node  rbnode; /* used in netem & tcp stack */
> > +               struct rb_node          rbnode; /* used in netem, ip4 defrag, and tcp stack */
> > +               struct list_head        list;
> >         };
> > -       struct sock             *sk;
> >
> >         union {
> > -               struct net_device       *dev;
> > -               /* Some protocols might use this space to store information,
> > -                * while device pointer would be NULL.
> > -                * UDP receive path is one user.
> > -                */
> > -               unsigned long           dev_scratch;
> > +               struct sock             *sk;
> >                 int                     ip_defrag_offset;
> >         };
> > +
> > +       union {
> > +               ktime_t         tstamp;
> > +               u64             skb_mstamp;
> > +       };
> >         /*
> >          * This is the control buffer. It is free to use for every
> >          * layer. Please put your private variables there. If you
> > --- a/include/net/inet_frag.h
> > +++ b/include/net/inet_frag.h
> > @@ -75,7 +75,8 @@ struct inet_frag_queue {
> >         struct timer_list       timer;
> >         spinlock_t              lock;
> >         refcount_t              refcnt;
> > -       struct sk_buff          *fragments;
> > +       struct sk_buff          *fragments;  /* Used in IPv6. */
> > +       struct rb_root          rb_fragments; /* Used in IPv4. */
> >         struct sk_buff          *fragments_tail;
> >         ktime_t                 stamp;
> >         int                     len;
> > --- a/net/ipv4/inet_fragment.c
> > +++ b/net/ipv4/inet_fragment.c
> > @@ -136,12 +136,16 @@ void inet_frag_destroy(struct inet_frag_
> >         fp = q->fragments;
> >         nf = q->net;
> >         f = nf->f;
> > -       while (fp) {
> > -               struct sk_buff *xp = fp->next;
> > +       if (fp) {
> > +               do {
> > +                       struct sk_buff *xp = fp->next;
> >
> > -               sum_truesize += fp->truesize;
> > -               kfree_skb(fp);
> > -               fp = xp;
> > +                       sum_truesize += fp->truesize;
> > +                       kfree_skb(fp);
> > +                       fp = xp;
> > +               } while (fp);
> > +       } else {
> > +               sum_truesize = skb_rbtree_purge(&q->rb_fragments);
> >         }
> >         sum = sum_truesize + f->qsize;
> >
> > --- a/net/ipv4/ip_fragment.c
> > +++ b/net/ipv4/ip_fragment.c
> > @@ -136,7 +136,7 @@ static void ip_expire(struct timer_list
> >  {
> >         struct inet_frag_queue *frag = from_timer(frag, t, timer);
> >         const struct iphdr *iph;
> > -       struct sk_buff *head;
> > +       struct sk_buff *head = NULL;
> >         struct net *net;
> >         struct ipq *qp;
> >         int err;
> > @@ -152,14 +152,31 @@ static void ip_expire(struct timer_list
> >
> >         ipq_kill(qp);
> >         __IP_INC_STATS(net, IPSTATS_MIB_REASMFAILS);
> > -
> > -       head = qp->q.fragments;
> > -
> >         __IP_INC_STATS(net, IPSTATS_MIB_REASMTIMEOUT);
> >
> > -       if (!(qp->q.flags & INET_FRAG_FIRST_IN) || !head)
> > +       if (!qp->q.flags & INET_FRAG_FIRST_IN)
> >                 goto out;
> >
> > +       /* sk_buff::dev and sk_buff::rbnode are unionized. So we
> > +        * pull the head out of the tree in order to be able to
> > +        * deal with head->dev.
> > +        */
> > +       if (qp->q.fragments) {
> > +               head = qp->q.fragments;
> > +               qp->q.fragments = head->next;
> > +       } else {
> > +               head = skb_rb_first(&qp->q.rb_fragments);
> > +               if (!head)
> > +                       goto out;
> > +               rb_erase(&head->rbnode, &qp->q.rb_fragments);
> > +               memset(&head->rbnode, 0, sizeof(head->rbnode));
> > +               barrier();
> > +       }
> > +       if (head == qp->q.fragments_tail)
> > +               qp->q.fragments_tail = NULL;
> > +
> > +       sub_frag_mem_limit(qp->q.net, head->truesize);
> > +
> >         head->dev = dev_get_by_index_rcu(net, qp->iif);
> >         if (!head->dev)
> >                 goto out;
> > @@ -179,16 +196,16 @@ static void ip_expire(struct timer_list
> >             (skb_rtable(head)->rt_type != RTN_LOCAL))
> >                 goto out;
> >
> > -       skb_get(head);
> >         spin_unlock(&qp->q.lock);
> >         icmp_send(head, ICMP_TIME_EXCEEDED, ICMP_EXC_FRAGTIME, 0);
> > -       kfree_skb(head);
> >         goto out_rcu_unlock;
> >
> >  out:
> >         spin_unlock(&qp->q.lock);
> >  out_rcu_unlock:
> >         rcu_read_unlock();
> > +       if (head)
> > +               kfree_skb(head);
> >         ipq_put(qp);
> >  }
> >
> > @@ -231,7 +248,7 @@ static int ip_frag_too_far(struct ipq *q
> >         end = atomic_inc_return(&peer->rid);
> >         qp->rid = end;
> >
> > -       rc = qp->q.fragments && (end - start) > max;
> > +       rc = qp->q.fragments_tail && (end - start) > max;
> >
> >         if (rc) {
> >                 struct net *net;
> > @@ -245,7 +262,6 @@ static int ip_frag_too_far(struct ipq *q
> >
> >  static int ip_frag_reinit(struct ipq *qp)
> >  {
> > -       struct sk_buff *fp;
> >         unsigned int sum_truesize = 0;
> >
> >         if (!mod_timer(&qp->q.timer, jiffies + qp->q.net->timeout)) {
> > @@ -253,20 +269,14 @@ static int ip_frag_reinit(struct ipq *qp
> >                 return -ETIMEDOUT;
> >         }
> >
> > -       fp = qp->q.fragments;
> > -       do {
> > -               struct sk_buff *xp = fp->next;
> > -
> > -               sum_truesize += fp->truesize;
> > -               kfree_skb(fp);
> > -               fp = xp;
> > -       } while (fp);
> > +       sum_truesize = skb_rbtree_purge(&qp->q.rb_fragments);
> >         sub_frag_mem_limit(qp->q.net, sum_truesize);
> >
> >         qp->q.flags = 0;
> >         qp->q.len = 0;
> >         qp->q.meat = 0;
> >         qp->q.fragments = NULL;
> > +       qp->q.rb_fragments = RB_ROOT;
> >         qp->q.fragments_tail = NULL;
> >         qp->iif = 0;
> >         qp->ecn = 0;
> > @@ -278,7 +288,8 @@ static int ip_frag_reinit(struct ipq *qp
> >  static int ip_frag_queue(struct ipq *qp, struct sk_buff *skb)
> >  {
> >         struct net *net = container_of(qp->q.net, struct net, ipv4.frags);
> > -       struct sk_buff *prev, *next;
> > +       struct rb_node **rbn, *parent;
> > +       struct sk_buff *skb1;
> >         struct net_device *dev;
> >         unsigned int fragsize;
> >         int flags, offset;
> > @@ -341,58 +352,58 @@ static int ip_frag_queue(struct ipq *qp,
> >         if (err)
> >                 goto err;
> >
> > -       /* Find out which fragments are in front and at the back of us
> > -        * in the chain of fragments so far.  We must know where to put
> > -        * this fragment, right?
> > -        */
> > -       prev = qp->q.fragments_tail;
> > -       if (!prev || prev->ip_defrag_offset < offset) {
> > -               next = NULL;
> > -               goto found;
> > -       }
> > -       prev = NULL;
> > -       for (next = qp->q.fragments; next != NULL; next = next->next) {
> > -               if (next->ip_defrag_offset >= offset)
> > -                       break;  /* bingo! */
> > -               prev = next;
> > -       }
> > +       /* Note : skb->rbnode and skb->dev share the same location. */
> > +       dev = skb->dev;
> > +       /* Makes sure compiler wont do silly aliasing games */
> > +       barrier();
> >
> > -found:
> >         /* RFC5722, Section 4, amended by Errata ID : 3089
> >          *                          When reassembling an IPv6 datagram, if
> >          *   one or more its constituent fragments is determined to be an
> >          *   overlapping fragment, the entire datagram (and any constituent
> >          *   fragments) MUST be silently discarded.
> >          *
> > -        * We do the same here for IPv4.
> > +        * We do the same here for IPv4 (and increment an snmp counter).
> >          */
> >
> > -       /* Is there an overlap with the previous fragment? */
> > -       if (prev &&
> > -           (prev->ip_defrag_offset + prev->len) > offset)
> > -               goto discard_qp;
> > -
> > -       /* Is there an overlap with the next fragment? */
> > -       if (next && next->ip_defrag_offset < end)
> > -               goto discard_qp;
> > +       /* Find out where to put this fragment.  */
> > +       skb1 = qp->q.fragments_tail;
> > +       if (!skb1) {
> > +               /* This is the first fragment we've received. */
> > +               rb_link_node(&skb->rbnode, NULL, &qp->q.rb_fragments.rb_node);
> > +               qp->q.fragments_tail = skb;
> > +       } else if ((skb1->ip_defrag_offset + skb1->len) < end) {
> > +               /* This is the common/special case: skb goes to the end. */
> > +               /* Detect and discard overlaps. */
> > +               if (offset < (skb1->ip_defrag_offset + skb1->len))
> > +                       goto discard_qp;
> > +               /* Insert after skb1. */
> > +               rb_link_node(&skb->rbnode, &skb1->rbnode, &skb1->rbnode.rb_right);
> > +               qp->q.fragments_tail = skb;
> > +       } else {
> > +               /* Binary search. Note that skb can become the first fragment, but
> > +                * not the last (covered above). */
> > +               rbn = &qp->q.rb_fragments.rb_node;
> > +               do {
> > +                       parent = *rbn;
> > +                       skb1 = rb_to_skb(parent);
> > +                       if (end <= skb1->ip_defrag_offset)
> > +                               rbn = &parent->rb_left;
> > +                       else if (offset >= skb1->ip_defrag_offset + skb1->len)
> > +                               rbn = &parent->rb_right;
> > +                       else /* Found an overlap with skb1. */
> > +                               goto discard_qp;
> > +               } while (*rbn);
> > +               /* Here we have parent properly set, and rbn pointing to
> > +                * one of its NULL left/right children. Insert skb. */
> > +               rb_link_node(&skb->rbnode, parent, rbn);
> > +       }
> > +       rb_insert_color(&skb->rbnode, &qp->q.rb_fragments);
> >
> > -       /* Note : skb->ip_defrag_offset and skb->dev share the same location */
> > -       dev = skb->dev;
> >         if (dev)
> >                 qp->iif = dev->ifindex;
> > -       /* Makes sure compiler wont do silly aliasing games */
> > -       barrier();
> >         skb->ip_defrag_offset = offset;
> >
> > -       /* Insert this fragment in the chain of fragments. */
> > -       skb->next = next;
> > -       if (!next)
> > -               qp->q.fragments_tail = skb;
> > -       if (prev)
> > -               prev->next = skb;
> > -       else
> > -               qp->q.fragments = skb;
> > -
> >         qp->q.stamp = skb->tstamp;
> >         qp->q.meat += skb->len;
> >         qp->ecn |= ecn;
> > @@ -414,7 +425,7 @@ found:
> >                 unsigned long orefdst = skb->_skb_refdst;
> >
> >                 skb->_skb_refdst = 0UL;
> > -               err = ip_frag_reasm(qp, prev, dev);
> > +               err = ip_frag_reasm(qp, skb, dev);
> >                 skb->_skb_refdst = orefdst;
> >                 return err;
> >         }
> > @@ -431,15 +442,15 @@ err:
> >         return err;
> >  }
> >
> > -
> >  /* Build a new IP datagram from all its fragments. */
> > -
> > -static int ip_frag_reasm(struct ipq *qp, struct sk_buff *prev,
> > +static int ip_frag_reasm(struct ipq *qp, struct sk_buff *skb,
> >                          struct net_device *dev)
> >  {
> >         struct net *net = container_of(qp->q.net, struct net, ipv4.frags);
> >         struct iphdr *iph;
> > -       struct sk_buff *fp, *head = qp->q.fragments;
> > +       struct sk_buff *fp, *head = skb_rb_first(&qp->q.rb_fragments);
> > +       struct sk_buff **nextp; /* To build frag_list. */
> > +       struct rb_node *rbn;
> >         int len;
> >         int ihlen;
> >         int err;
> > @@ -453,25 +464,20 @@ static int ip_frag_reasm(struct ipq *qp,
> >                 goto out_fail;
> >         }
> >         /* Make the one we just received the head. */
> > -       if (prev) {
> > -               head = prev->next;
> > -               fp = skb_clone(head, GFP_ATOMIC);
> > +       if (head != skb) {
> > +               fp = skb_clone(skb, GFP_ATOMIC);
> >                 if (!fp)
> >                         goto out_nomem;
> > -
> > -               fp->next = head->next;
> > -               if (!fp->next)
> > +               rb_replace_node(&skb->rbnode, &fp->rbnode, &qp->q.rb_fragments);
> > +               if (qp->q.fragments_tail == skb)
> >                         qp->q.fragments_tail = fp;
> > -               prev->next = fp;
> > -
> > -               skb_morph(head, qp->q.fragments);
> > -               head->next = qp->q.fragments->next;
> > -
> > -               consume_skb(qp->q.fragments);
> > -               qp->q.fragments = head;
> > +               skb_morph(skb, head);
> > +               rb_replace_node(&head->rbnode, &skb->rbnode,
> > +                               &qp->q.rb_fragments);
> > +               consume_skb(head);
> > +               head = skb;
> >         }
> >
> > -       WARN_ON(!head);
> >         WARN_ON(head->ip_defrag_offset != 0);
> >
> >         /* Allocate a new buffer for the datagram. */
> > @@ -496,24 +502,35 @@ static int ip_frag_reasm(struct ipq *qp,
> >                 clone = alloc_skb(0, GFP_ATOMIC);
> >                 if (!clone)
> >                         goto out_nomem;
> > -               clone->next = head->next;
> > -               head->next = clone;
> >                 skb_shinfo(clone)->frag_list = skb_shinfo(head)->frag_list;
> >                 skb_frag_list_init(head);
> >                 for (i = 0; i < skb_shinfo(head)->nr_frags; i++)
> >                         plen += skb_frag_size(&skb_shinfo(head)->frags[i]);
> >                 clone->len = clone->data_len = head->data_len - plen;
> > -               head->data_len -= clone->len;
> > -               head->len -= clone->len;
> > +               skb->truesize += clone->truesize;
> >                 clone->csum = 0;
> >                 clone->ip_summed = head->ip_summed;
> >                 add_frag_mem_limit(qp->q.net, clone->truesize);
> > +               skb_shinfo(head)->frag_list = clone;
> > +               nextp = &clone->next;
> > +       } else {
> > +               nextp = &skb_shinfo(head)->frag_list;
> >         }
> >
> > -       skb_shinfo(head)->frag_list = head->next;
> >         skb_push(head, head->data - skb_network_header(head));
> >
> > -       for (fp=head->next; fp; fp = fp->next) {
> > +       /* Traverse the tree in order, to build frag_list. */
> > +       rbn = rb_next(&head->rbnode);
> > +       rb_erase(&head->rbnode, &qp->q.rb_fragments);
> > +       while (rbn) {
> > +               struct rb_node *rbnext = rb_next(rbn);
> > +               fp = rb_to_skb(rbn);
> > +               rb_erase(rbn, &qp->q.rb_fragments);
> > +               rbn = rbnext;
> > +               *nextp = fp;
> > +               nextp = &fp->next;
> > +               fp->prev = NULL;
> > +               memset(&fp->rbnode, 0, sizeof(fp->rbnode));
> >                 head->data_len += fp->len;
> >                 head->len += fp->len;
> >                 if (head->ip_summed != fp->ip_summed)
> > @@ -524,7 +541,9 @@ static int ip_frag_reasm(struct ipq *qp,
> >         }
> >         sub_frag_mem_limit(qp->q.net, head->truesize);
> >
> > +       *nextp = NULL;
> >         head->next = NULL;
> > +       head->prev = NULL;
> >         head->dev = dev;
> >         head->tstamp = qp->q.stamp;
> >         IPCB(head)->frag_max_size = max(qp->max_df_size, qp->q.max_size);
> > @@ -552,6 +571,7 @@ static int ip_frag_reasm(struct ipq *qp,
> >
> >         __IP_INC_STATS(net, IPSTATS_MIB_REASMOKS);
> >         qp->q.fragments = NULL;
> > +       qp->q.rb_fragments = RB_ROOT;
> >         qp->q.fragments_tail = NULL;
> >         return 0;
> >
> > --- a/net/ipv6/netfilter/nf_conntrack_reasm.c
> > +++ b/net/ipv6/netfilter/nf_conntrack_reasm.c
> > @@ -471,6 +471,7 @@ nf_ct_frag6_reasm(struct frag_queue *fq,
> >                                           head->csum);
> >
> >         fq->q.fragments = NULL;
> > +       fq->q.rb_fragments = RB_ROOT;
> >         fq->q.fragments_tail = NULL;
> >
> >         return true;
> > --- a/net/ipv6/reassembly.c
> > +++ b/net/ipv6/reassembly.c
> > @@ -472,6 +472,7 @@ static int ip6_frag_reasm(struct frag_qu
> >         __IP6_INC_STATS(net, __in6_dev_get(dev), IPSTATS_MIB_REASMOKS);
> >         rcu_read_unlock();
> >         fq->q.fragments = NULL;
> > +       fq->q.rb_fragments = RB_ROOT;
> >         fq->q.fragments_tail = NULL;
> >         return 1;
> >
> >
> >
> 
> I'm getting a kernel panic on the >=4.14.71 stable kernels, and I've
> isolated the problem back to this patch.
> 
> My 4.18.11 kernel seems to be OK.
> 
> Whenever I inject a delay into the interface with iproute2 tools, I get a panic.
> 
> Example command:
> tc qdisc add dev eth0 root netem delay 35ms
> 
> The RIP is pointing at netif_skb_features+0x31/0x230
> 
> My efforts to get a transmittable copy of the panic have been thwarted.
> 
> There's some confusion between this patch and the upstream patch
> referred to in the commit message.
> 
> The upstream commit patches net/sched/sch_netem.c which isn't even
> touched in this commit.
> 
> Although the commit messages are the same, the two patches seem to
> have a different purpose.
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/net/sched?id=bffa72cf7f9df842f0016ba03586039296b4caaf
> 
> The commit message seems more relevant to this patch.
> 
> The upstream commit bffa72cf7f9df842f0016ba03586039296b4caaf has not
> yet been applied to the stable tree.
> 
> I decided to roll the dice, and apply the upstream patch
> bffa72cf7f9df842f0016ba03586039296b4caaf (it's been in the main kernel
> tree just over a year).
> 
> When I manually patch my 4.14.74 kernel with
> bffa72cf7f9df842f0016ba03586039296b4caaf, my panic seems to be solved.

That is odd, as this commit is in the 4.14.71 kernel release, so it
should not be able to be applied to 4.14.74.

If something still needs to be done here for the 4.14.y kernel tree,
please let me know.

thanks,

greg k-h


* Re: [PATCH 4.14 117/126] net: sk_buff rbnode reorg
  2018-11-29 10:33     ` Greg Kroah-Hartman
@ 2018-11-29 15:07       ` Mitch Harder
  0 siblings, 0 replies; 134+ messages in thread
From: Mitch Harder @ 2018-11-29 15:07 UTC (permalink / raw)
  To: Greg KH
  Cc: Linux Kernel Mailing List, stable, Eric Dumazet,
	Soheil Hassas Yeganeh, Wei Wang, Willem de Bruijn,
	David S. Miller

On Thu, Nov 29, 2018 at 4:33 AM Greg Kroah-Hartman
<gregkh@linuxfoundation.org> wrote:
>
> On Thu, Oct 04, 2018 at 03:13:56PM -0500, Mitch Harder wrote:
> > On Mon, Sep 17, 2018 at 5:42 PM, Greg Kroah-Hartman
> > <gregkh@linuxfoundation.org> wrote:
> > > 4.14-stable review patch.  If anyone has any objections, please let me know.
> > >
> > > ------------------
> > >
> > > From: Eric Dumazet <edumazet@google.com>
> > >
> > > commit bffa72cf7f9df842f0016ba03586039296b4caaf upstream
> > >
> > > skb->rbnode shares space with skb->next, skb->prev and skb->tstamp
> > >
> > > Current uses (TCP receive ofo queue and netem) need to save/restore
> > > tstamp, while skb->dev is either NULL (TCP) or a constant for a given
> > > queue (netem).
> > >
> > > Since we plan using an RB tree for TCP retransmit queue to speedup SACK
> > > processing with large BDP, this patch exchanges skb->dev and
> > > skb->tstamp.
> > >
> > > This saves some overhead in both TCP and netem.
> > >
> > > v2: removes the swtstamp field from struct tcp_skb_cb
> > >
> > > Signed-off-by: Eric Dumazet <edumazet@google.com>
> > > Cc: Soheil Hassas Yeganeh <soheil@google.com>
> > > Cc: Wei Wang <weiwan@google.com>
> > > Cc: Willem de Bruijn <willemb@google.com>
> > > Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
> > > Signed-off-by: David S. Miller <davem@davemloft.net>
> > > Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> > > ---
> > >  include/linux/skbuff.h                  |   24 ++--
> > >  include/net/inet_frag.h                 |    3
> > >  net/ipv4/inet_fragment.c                |   14 +-
> > >  net/ipv4/ip_fragment.c                  |  182 +++++++++++++++++---------------
> > >  net/ipv6/netfilter/nf_conntrack_reasm.c |    1
> > >  net/ipv6/reassembly.c                   |    1
> > >  6 files changed, 128 insertions(+), 97 deletions(-)
> > >
> > > --- a/include/linux/skbuff.h
> > > +++ b/include/linux/skbuff.h
> > > @@ -663,23 +663,27 @@ struct sk_buff {
> > >                         struct sk_buff          *prev;
> > >
> > >                         union {
> > > -                               ktime_t         tstamp;
> > > -                               u64             skb_mstamp;
> > > +                               struct net_device       *dev;
> > > +                               /* Some protocols might use this space to store information,
> > > +                                * while device pointer would be NULL.
> > > +                                * UDP receive path is one user.
> > > +                                */
> > > +                               unsigned long           dev_scratch;
> > >                         };
> > >                 };
> > > -               struct rb_node  rbnode; /* used in netem & tcp stack */
> > > +               struct rb_node          rbnode; /* used in netem, ip4 defrag, and tcp stack */
> > > +               struct list_head        list;
> > >         };
> > > -       struct sock             *sk;
> > >
> > >         union {
> > > -               struct net_device       *dev;
> > > -               /* Some protocols might use this space to store information,
> > > -                * while device pointer would be NULL.
> > > -                * UDP receive path is one user.
> > > -                */
> > > -               unsigned long           dev_scratch;
> > > +               struct sock             *sk;
> > >                 int                     ip_defrag_offset;
> > >         };
> > > +
> > > +       union {
> > > +               ktime_t         tstamp;
> > > +               u64             skb_mstamp;
> > > +       };
> > >         /*
> > >          * This is the control buffer. It is free to use for every
> > >          * layer. Please put your private variables there. If you
> > > --- a/include/net/inet_frag.h
> > > +++ b/include/net/inet_frag.h
> > > @@ -75,7 +75,8 @@ struct inet_frag_queue {
> > >         struct timer_list       timer;
> > >         spinlock_t              lock;
> > >         refcount_t              refcnt;
> > > -       struct sk_buff          *fragments;
> > > +       struct sk_buff          *fragments;  /* Used in IPv6. */
> > > +       struct rb_root          rb_fragments; /* Used in IPv4. */
> > >         struct sk_buff          *fragments_tail;
> > >         ktime_t                 stamp;
> > >         int                     len;
> > > --- a/net/ipv4/inet_fragment.c
> > > +++ b/net/ipv4/inet_fragment.c
> > > @@ -136,12 +136,16 @@ void inet_frag_destroy(struct inet_frag_
> > >         fp = q->fragments;
> > >         nf = q->net;
> > >         f = nf->f;
> > > -       while (fp) {
> > > -               struct sk_buff *xp = fp->next;
> > > +       if (fp) {
> > > +               do {
> > > +                       struct sk_buff *xp = fp->next;
> > >
> > > -               sum_truesize += fp->truesize;
> > > -               kfree_skb(fp);
> > > -               fp = xp;
> > > +                       sum_truesize += fp->truesize;
> > > +                       kfree_skb(fp);
> > > +                       fp = xp;
> > > +               } while (fp);
> > > +       } else {
> > > +               sum_truesize = skb_rbtree_purge(&q->rb_fragments);
> > >         }
> > >         sum = sum_truesize + f->qsize;
> > >
> > > --- a/net/ipv4/ip_fragment.c
> > > +++ b/net/ipv4/ip_fragment.c
> > > @@ -136,7 +136,7 @@ static void ip_expire(struct timer_list
> > >  {
> > >         struct inet_frag_queue *frag = from_timer(frag, t, timer);
> > >         const struct iphdr *iph;
> > > -       struct sk_buff *head;
> > > +       struct sk_buff *head = NULL;
> > >         struct net *net;
> > >         struct ipq *qp;
> > >         int err;
> > > @@ -152,14 +152,31 @@ static void ip_expire(struct timer_list
> > >
> > >         ipq_kill(qp);
> > >         __IP_INC_STATS(net, IPSTATS_MIB_REASMFAILS);
> > > -
> > > -       head = qp->q.fragments;
> > > -
> > >         __IP_INC_STATS(net, IPSTATS_MIB_REASMTIMEOUT);
> > >
> > > -       if (!(qp->q.flags & INET_FRAG_FIRST_IN) || !head)
> > > +       if (!(qp->q.flags & INET_FRAG_FIRST_IN))
> > >                 goto out;
> > >
> > > +       /* sk_buff::dev and sk_buff::rbnode are unionized. So we
> > > +        * pull the head out of the tree in order to be able to
> > > +        * deal with head->dev.
> > > +        */
> > > +       if (qp->q.fragments) {
> > > +               head = qp->q.fragments;
> > > +               qp->q.fragments = head->next;
> > > +       } else {
> > > +               head = skb_rb_first(&qp->q.rb_fragments);
> > > +               if (!head)
> > > +                       goto out;
> > > +               rb_erase(&head->rbnode, &qp->q.rb_fragments);
> > > +               memset(&head->rbnode, 0, sizeof(head->rbnode));
> > > +               barrier();
> > > +       }
> > > +       if (head == qp->q.fragments_tail)
> > > +               qp->q.fragments_tail = NULL;
> > > +
> > > +       sub_frag_mem_limit(qp->q.net, head->truesize);
> > > +
> > >         head->dev = dev_get_by_index_rcu(net, qp->iif);
> > >         if (!head->dev)
> > >                 goto out;
> > > @@ -179,16 +196,16 @@ static void ip_expire(struct timer_list
> > >             (skb_rtable(head)->rt_type != RTN_LOCAL))
> > >                 goto out;
> > >
> > > -       skb_get(head);
> > >         spin_unlock(&qp->q.lock);
> > >         icmp_send(head, ICMP_TIME_EXCEEDED, ICMP_EXC_FRAGTIME, 0);
> > > -       kfree_skb(head);
> > >         goto out_rcu_unlock;
> > >
> > >  out:
> > >         spin_unlock(&qp->q.lock);
> > >  out_rcu_unlock:
> > >         rcu_read_unlock();
> > > +       if (head)
> > > +               kfree_skb(head);
> > >         ipq_put(qp);
> > >  }
> > >
> > > @@ -231,7 +248,7 @@ static int ip_frag_too_far(struct ipq *q
> > >         end = atomic_inc_return(&peer->rid);
> > >         qp->rid = end;
> > >
> > > -       rc = qp->q.fragments && (end - start) > max;
> > > +       rc = qp->q.fragments_tail && (end - start) > max;
> > >
> > >         if (rc) {
> > >                 struct net *net;
> > > @@ -245,7 +262,6 @@ static int ip_frag_too_far(struct ipq *q
> > >
> > >  static int ip_frag_reinit(struct ipq *qp)
> > >  {
> > > -       struct sk_buff *fp;
> > >         unsigned int sum_truesize = 0;
> > >
> > >         if (!mod_timer(&qp->q.timer, jiffies + qp->q.net->timeout)) {
> > > @@ -253,20 +269,14 @@ static int ip_frag_reinit(struct ipq *qp
> > >                 return -ETIMEDOUT;
> > >         }
> > >
> > > -       fp = qp->q.fragments;
> > > -       do {
> > > -               struct sk_buff *xp = fp->next;
> > > -
> > > -               sum_truesize += fp->truesize;
> > > -               kfree_skb(fp);
> > > -               fp = xp;
> > > -       } while (fp);
> > > +       sum_truesize = skb_rbtree_purge(&qp->q.rb_fragments);
> > >         sub_frag_mem_limit(qp->q.net, sum_truesize);
> > >
> > >         qp->q.flags = 0;
> > >         qp->q.len = 0;
> > >         qp->q.meat = 0;
> > >         qp->q.fragments = NULL;
> > > +       qp->q.rb_fragments = RB_ROOT;
> > >         qp->q.fragments_tail = NULL;
> > >         qp->iif = 0;
> > >         qp->ecn = 0;
> > > @@ -278,7 +288,8 @@ static int ip_frag_reinit(struct ipq *qp
> > >  static int ip_frag_queue(struct ipq *qp, struct sk_buff *skb)
> > >  {
> > >         struct net *net = container_of(qp->q.net, struct net, ipv4.frags);
> > > -       struct sk_buff *prev, *next;
> > > +       struct rb_node **rbn, *parent;
> > > +       struct sk_buff *skb1;
> > >         struct net_device *dev;
> > >         unsigned int fragsize;
> > >         int flags, offset;
> > > @@ -341,58 +352,58 @@ static int ip_frag_queue(struct ipq *qp,
> > >         if (err)
> > >                 goto err;
> > >
> > > -       /* Find out which fragments are in front and at the back of us
> > > -        * in the chain of fragments so far.  We must know where to put
> > > -        * this fragment, right?
> > > -        */
> > > -       prev = qp->q.fragments_tail;
> > > -       if (!prev || prev->ip_defrag_offset < offset) {
> > > -               next = NULL;
> > > -               goto found;
> > > -       }
> > > -       prev = NULL;
> > > -       for (next = qp->q.fragments; next != NULL; next = next->next) {
> > > -               if (next->ip_defrag_offset >= offset)
> > > -                       break;  /* bingo! */
> > > -               prev = next;
> > > -       }
> > > +       /* Note : skb->rbnode and skb->dev share the same location. */
> > > +       dev = skb->dev;
> > > +       /* Makes sure compiler wont do silly aliasing games */
> > > +       barrier();
> > >
> > > -found:
> > >         /* RFC5722, Section 4, amended by Errata ID : 3089
> > >          *                          When reassembling an IPv6 datagram, if
> > >          *   one or more its constituent fragments is determined to be an
> > >          *   overlapping fragment, the entire datagram (and any constituent
> > >          *   fragments) MUST be silently discarded.
> > >          *
> > > -        * We do the same here for IPv4.
> > > +        * We do the same here for IPv4 (and increment an snmp counter).
> > >          */
> > >
> > > -       /* Is there an overlap with the previous fragment? */
> > > -       if (prev &&
> > > -           (prev->ip_defrag_offset + prev->len) > offset)
> > > -               goto discard_qp;
> > > -
> > > -       /* Is there an overlap with the next fragment? */
> > > -       if (next && next->ip_defrag_offset < end)
> > > -               goto discard_qp;
> > > +       /* Find out where to put this fragment.  */
> > > +       skb1 = qp->q.fragments_tail;
> > > +       if (!skb1) {
> > > +               /* This is the first fragment we've received. */
> > > +               rb_link_node(&skb->rbnode, NULL, &qp->q.rb_fragments.rb_node);
> > > +               qp->q.fragments_tail = skb;
> > > +       } else if ((skb1->ip_defrag_offset + skb1->len) < end) {
> > > +               /* This is the common/special case: skb goes to the end. */
> > > +               /* Detect and discard overlaps. */
> > > +               if (offset < (skb1->ip_defrag_offset + skb1->len))
> > > +                       goto discard_qp;
> > > +               /* Insert after skb1. */
> > > +               rb_link_node(&skb->rbnode, &skb1->rbnode, &skb1->rbnode.rb_right);
> > > +               qp->q.fragments_tail = skb;
> > > +       } else {
> > > +               /* Binary search. Note that skb can become the first fragment, but
> > > +                * not the last (covered above). */
> > > +               rbn = &qp->q.rb_fragments.rb_node;
> > > +               do {
> > > +                       parent = *rbn;
> > > +                       skb1 = rb_to_skb(parent);
> > > +                       if (end <= skb1->ip_defrag_offset)
> > > +                               rbn = &parent->rb_left;
> > > +                       else if (offset >= skb1->ip_defrag_offset + skb1->len)
> > > +                               rbn = &parent->rb_right;
> > > +                       else /* Found an overlap with skb1. */
> > > +                               goto discard_qp;
> > > +               } while (*rbn);
> > > +               /* Here we have parent properly set, and rbn pointing to
> > > +                * one of its NULL left/right children. Insert skb. */
> > > +               rb_link_node(&skb->rbnode, parent, rbn);
> > > +       }
> > > +       rb_insert_color(&skb->rbnode, &qp->q.rb_fragments);
> > >
> > > -       /* Note : skb->ip_defrag_offset and skb->dev share the same location */
> > > -       dev = skb->dev;
> > >         if (dev)
> > >                 qp->iif = dev->ifindex;
> > > -       /* Makes sure compiler wont do silly aliasing games */
> > > -       barrier();
> > >         skb->ip_defrag_offset = offset;
> > >
> > > -       /* Insert this fragment in the chain of fragments. */
> > > -       skb->next = next;
> > > -       if (!next)
> > > -               qp->q.fragments_tail = skb;
> > > -       if (prev)
> > > -               prev->next = skb;
> > > -       else
> > > -               qp->q.fragments = skb;
> > > -
> > >         qp->q.stamp = skb->tstamp;
> > >         qp->q.meat += skb->len;
> > >         qp->ecn |= ecn;
> > > @@ -414,7 +425,7 @@ found:
> > >                 unsigned long orefdst = skb->_skb_refdst;
> > >
> > >                 skb->_skb_refdst = 0UL;
> > > -               err = ip_frag_reasm(qp, prev, dev);
> > > +               err = ip_frag_reasm(qp, skb, dev);
> > >                 skb->_skb_refdst = orefdst;
> > >                 return err;
> > >         }
> > > @@ -431,15 +442,15 @@ err:
> > >         return err;
> > >  }
> > >
> > > -
> > >  /* Build a new IP datagram from all its fragments. */
> > > -
> > > -static int ip_frag_reasm(struct ipq *qp, struct sk_buff *prev,
> > > +static int ip_frag_reasm(struct ipq *qp, struct sk_buff *skb,
> > >                          struct net_device *dev)
> > >  {
> > >         struct net *net = container_of(qp->q.net, struct net, ipv4.frags);
> > >         struct iphdr *iph;
> > > -       struct sk_buff *fp, *head = qp->q.fragments;
> > > +       struct sk_buff *fp, *head = skb_rb_first(&qp->q.rb_fragments);
> > > +       struct sk_buff **nextp; /* To build frag_list. */
> > > +       struct rb_node *rbn;
> > >         int len;
> > >         int ihlen;
> > >         int err;
> > > @@ -453,25 +464,20 @@ static int ip_frag_reasm(struct ipq *qp,
> > >                 goto out_fail;
> > >         }
> > >         /* Make the one we just received the head. */
> > > -       if (prev) {
> > > -               head = prev->next;
> > > -               fp = skb_clone(head, GFP_ATOMIC);
> > > +       if (head != skb) {
> > > +               fp = skb_clone(skb, GFP_ATOMIC);
> > >                 if (!fp)
> > >                         goto out_nomem;
> > > -
> > > -               fp->next = head->next;
> > > -               if (!fp->next)
> > > +               rb_replace_node(&skb->rbnode, &fp->rbnode, &qp->q.rb_fragments);
> > > +               if (qp->q.fragments_tail == skb)
> > >                         qp->q.fragments_tail = fp;
> > > -               prev->next = fp;
> > > -
> > > -               skb_morph(head, qp->q.fragments);
> > > -               head->next = qp->q.fragments->next;
> > > -
> > > -               consume_skb(qp->q.fragments);
> > > -               qp->q.fragments = head;
> > > +               skb_morph(skb, head);
> > > +               rb_replace_node(&head->rbnode, &skb->rbnode,
> > > +                               &qp->q.rb_fragments);
> > > +               consume_skb(head);
> > > +               head = skb;
> > >         }
> > >
> > > -       WARN_ON(!head);
> > >         WARN_ON(head->ip_defrag_offset != 0);
> > >
> > >         /* Allocate a new buffer for the datagram. */
> > > @@ -496,24 +502,35 @@ static int ip_frag_reasm(struct ipq *qp,
> > >                 clone = alloc_skb(0, GFP_ATOMIC);
> > >                 if (!clone)
> > >                         goto out_nomem;
> > > -               clone->next = head->next;
> > > -               head->next = clone;
> > >                 skb_shinfo(clone)->frag_list = skb_shinfo(head)->frag_list;
> > >                 skb_frag_list_init(head);
> > >                 for (i = 0; i < skb_shinfo(head)->nr_frags; i++)
> > >                         plen += skb_frag_size(&skb_shinfo(head)->frags[i]);
> > >                 clone->len = clone->data_len = head->data_len - plen;
> > > -               head->data_len -= clone->len;
> > > -               head->len -= clone->len;
> > > +               skb->truesize += clone->truesize;
> > >                 clone->csum = 0;
> > >                 clone->ip_summed = head->ip_summed;
> > >                 add_frag_mem_limit(qp->q.net, clone->truesize);
> > > +               skb_shinfo(head)->frag_list = clone;
> > > +               nextp = &clone->next;
> > > +       } else {
> > > +               nextp = &skb_shinfo(head)->frag_list;
> > >         }
> > >
> > > -       skb_shinfo(head)->frag_list = head->next;
> > >         skb_push(head, head->data - skb_network_header(head));
> > >
> > > -       for (fp=head->next; fp; fp = fp->next) {
> > > +       /* Traverse the tree in order, to build frag_list. */
> > > +       rbn = rb_next(&head->rbnode);
> > > +       rb_erase(&head->rbnode, &qp->q.rb_fragments);
> > > +       while (rbn) {
> > > +               struct rb_node *rbnext = rb_next(rbn);
> > > +               fp = rb_to_skb(rbn);
> > > +               rb_erase(rbn, &qp->q.rb_fragments);
> > > +               rbn = rbnext;
> > > +               *nextp = fp;
> > > +               nextp = &fp->next;
> > > +               fp->prev = NULL;
> > > +               memset(&fp->rbnode, 0, sizeof(fp->rbnode));
> > >                 head->data_len += fp->len;
> > >                 head->len += fp->len;
> > >                 if (head->ip_summed != fp->ip_summed)
> > > @@ -524,7 +541,9 @@ static int ip_frag_reasm(struct ipq *qp,
> > >         }
> > >         sub_frag_mem_limit(qp->q.net, head->truesize);
> > >
> > > +       *nextp = NULL;
> > >         head->next = NULL;
> > > +       head->prev = NULL;
> > >         head->dev = dev;
> > >         head->tstamp = qp->q.stamp;
> > >         IPCB(head)->frag_max_size = max(qp->max_df_size, qp->q.max_size);
> > > @@ -552,6 +571,7 @@ static int ip_frag_reasm(struct ipq *qp,
> > >
> > >         __IP_INC_STATS(net, IPSTATS_MIB_REASMOKS);
> > >         qp->q.fragments = NULL;
> > > +       qp->q.rb_fragments = RB_ROOT;
> > >         qp->q.fragments_tail = NULL;
> > >         return 0;
> > >
> > > --- a/net/ipv6/netfilter/nf_conntrack_reasm.c
> > > +++ b/net/ipv6/netfilter/nf_conntrack_reasm.c
> > > @@ -471,6 +471,7 @@ nf_ct_frag6_reasm(struct frag_queue *fq,
> > >                                           head->csum);
> > >
> > >         fq->q.fragments = NULL;
> > > +       fq->q.rb_fragments = RB_ROOT;
> > >         fq->q.fragments_tail = NULL;
> > >
> > >         return true;
> > > --- a/net/ipv6/reassembly.c
> > > +++ b/net/ipv6/reassembly.c
> > > @@ -472,6 +472,7 @@ static int ip6_frag_reasm(struct frag_qu
> > >         __IP6_INC_STATS(net, __in6_dev_get(dev), IPSTATS_MIB_REASMOKS);
> > >         rcu_read_unlock();
> > >         fq->q.fragments = NULL;
> > > +       fq->q.rb_fragments = RB_ROOT;
> > >         fq->q.fragments_tail = NULL;
> > >         return 1;
> > >
> > >
> > >
> >
> > I'm getting a kernel panic on the >=4.14.71 stable kernels, and I've
> > isolated the problem back to this patch.
> >
> > My 4.18.11 kernel seems to be OK.
> >
> > Whenever I inject a delay into the interface with iproute2 tools, I get a panic.
> >
> > Example command:
> > tc qdisc add dev eth0 root netem delay 35ms
> >
> > The RIP is pointing at netif_skb_features+0x31/0x230
> >
> > My efforts to get a transmittable copy of the panic have been thwarted.
> >
> > There's some confusion between this patch and the upstream patch
> > referred to in the commit message.
> >
> > The upstream commit patches net/sched/sch_netem.c which isn't even
> > touched in this commit.
> >
> > Although the commit messages are the same, the two patches seem to
> > have a different purpose.
> >
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/net/sched?id=bffa72cf7f9df842f0016ba03586039296b4caaf
> >
> > The commit message seems more relevant to this patch.
> >
> > The upstream commit bffa72cf7f9df842f0016ba03586039296b4caaf has not
> > yet been applied to the stable tree.
> >
> > I decided to roll the dice, and apply the upstream patch
> > bffa72cf7f9df842f0016ba03586039296b4caaf (it's been in the main kernel
> > tree just over a year).
> >
> > When I manually patch my 4.14.74 kernel with
> > bffa72cf7f9df842f0016ba03586039296b4caaf, my panic seems to be solved.
>
> That is odd, as this commit is in the 4.14.71 kernel release, so it
> should not be able to be applied to 4.14.74.
>
> If something still needs to be done here for the 4.14.y kernel tree,
> please let me know.
>
> thanks,
>
> greg k-h

Thanks for the response.

This issue was fixed in the 4.14.79 stable release with this patch:

sch_netem: restore skb->dev after dequeuing from the rbtree
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/net/sched/sch_netem.c?h=linux-4.14.y&id=bd6df7a19559f9b52049f97c3188a7d1544e16df
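
A minimal userspace sketch of why a fix with that title would be needed,
assuming (as I understand the "sk_buff rbnode reorg" change) that skb->dev
shares storage with skb->rbnode. Every fake_* name below is a hypothetical
stand-in, not a kernel definition, and the real netem code is more involved.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct net_device. */
struct fake_net_device {
	const char *name;
};

/* Hypothetical stand-in for struct rb_node: three pointer-sized words. */
struct fake_rb_node {
	unsigned long parent_color;
	struct fake_rb_node *right;
	struct fake_rb_node *left;
};

/*
 * Hypothetical stand-in for the head of struct sk_buff.
 * Assumption: the list pointers and dev overlay the rbtree node,
 * so only one of the two views is valid at any time.
 */
struct fake_skb {
	union {
		struct {
			struct fake_skb *next;
			struct fake_skb *prev;
			struct fake_net_device *dev;
		};
		struct fake_rb_node rbnode;
	};
};

/* Model linking the skb into a tree: all three node words get written. */
void fake_tree_enqueue(struct fake_skb *skb)
{
	assert(skb != NULL);
	skb->rbnode.parent_color = 0;
	skb->rbnode.right = NULL;
	skb->rbnode.left = NULL;	/* same storage as skb->dev */
}

/* Dequeue without the fix: dev still holds whatever the tree left there. */
struct fake_skb *fake_dequeue_buggy(struct fake_skb *skb)
{
	return skb;
}

/* Dequeue with the fix: reset the list view and restore dev explicitly. */
struct fake_skb *fake_dequeue_fixed(struct fake_skb *skb,
				    struct fake_net_device *qdisc_dev)
{
	skb->next = NULL;
	skb->prev = NULL;
	skb->dev = qdisc_dev;
	return skb;
}
```

In this model, a packet that went through fake_tree_enqueue() comes back
from fake_dequeue_buggy() with a NULL dev, so a later dereference of
skb->dev on the transmit path would fault. That would be consistent with
the RIP in netif_skb_features that I reported, though I have not confirmed
the exact faulting instruction.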


end of thread, other threads:[~2018-11-29 15:08 UTC | newest]

Thread overview: 134+ messages
2018-09-17 22:40 [PATCH 4.14 000/126] 4.14.71-stable review Greg Kroah-Hartman
2018-09-17 22:40 ` [PATCH 4.14 001/126] i2c: xiic: Make the start and the byte count write atomic Greg Kroah-Hartman
2018-09-17 22:40 ` [PATCH 4.14 002/126] i2c: i801: fix DNVs SMBCTRL register offset Greg Kroah-Hartman
2018-09-17 22:40 ` [PATCH 4.14 003/126] scsi: lpfc: Correct MDS diag and nvmet configuration Greg Kroah-Hartman
2018-09-17 22:40 ` [PATCH 4.14 004/126] nbd: dont allow invalid blocksize settings Greg Kroah-Hartman
2018-09-17 22:40 ` [PATCH 4.14 005/126] block: bfq: swap puts in bfqg_and_blkg_put Greg Kroah-Hartman
2018-09-17 22:40 ` [PATCH 4.14 006/126] android: binder: fix the race mmap and alloc_new_buf_locked Greg Kroah-Hartman
2018-09-17 22:40 ` [PATCH 4.14 007/126] MIPS: VDSO: Match data page cache colouring when D$ aliases Greg Kroah-Hartman
2018-09-17 22:40 ` [PATCH 4.14 008/126] SMB3: Backup intent flag missing for directory opens with backupuid mounts Greg Kroah-Hartman
2018-09-17 22:40 ` [PATCH 4.14 009/126] smb3: check for and properly advertise directory lease support Greg Kroah-Hartman
2018-09-17 22:40 ` [PATCH 4.14 010/126] Btrfs: fix data corruption when deduplicating between different files Greg Kroah-Hartman
2018-09-17 22:40 ` [PATCH 4.14 011/126] KVM: s390: vsie: copy wrapping keys to right place Greg Kroah-Hartman
2018-09-17 22:41 ` [PATCH 4.14 012/126] KVM: VMX: Do not allow reexecute_instruction() when skipping MMIO instr Greg Kroah-Hartman
2018-09-17 22:41 ` [PATCH 4.14 013/126] ALSA: hda - Fix cancel_work_sync() stall from jackpoll work Greg Kroah-Hartman
2018-09-17 22:41 ` [PATCH 4.14 014/126] cpu/hotplug: Adjust misplaced smb() in cpuhp_thread_fun() Greg Kroah-Hartman
2018-09-17 22:41 ` [PATCH 4.14 015/126] cpu/hotplug: Prevent state corruption on error rollback Greg Kroah-Hartman
2018-09-17 22:41 ` [PATCH 4.14 016/126] x86/microcode: Make sure boot_cpu_data.microcode is up-to-date Greg Kroah-Hartman
2018-09-17 22:41 ` [PATCH 4.14 017/126] x86/microcode: Update the new microcode revision unconditionally Greg Kroah-Hartman
2018-09-17 22:41 ` [PATCH 4.14 018/126] switchtec: Fix Spectre v1 vulnerability Greg Kroah-Hartman
2018-09-17 22:41 ` [PATCH 4.14 019/126] crypto: aes-generic - fix aes-generic regression on powerpc Greg Kroah-Hartman
2018-09-17 22:41 ` [PATCH 4.14 020/126] tpm: separate cmd_ready/go_idle from runtime_pm Greg Kroah-Hartman
2018-09-17 22:41 ` [PATCH 4.14 021/126] ARC: [plat-axs*]: Enable SWAP Greg Kroah-Hartman
2018-09-17 22:41 ` [PATCH 4.14 022/126] misc: mic: SCIF Fix scif_get_new_port() error handling Greg Kroah-Hartman
2018-09-17 22:41 ` [PATCH 4.14 023/126] ethtool: Remove trailing semicolon for static inline Greg Kroah-Hartman
2018-09-17 22:41 ` [PATCH 4.14 024/126] i2c: aspeed: Add an explicit type casting for *get_clk_reg_val Greg Kroah-Hartman
2018-09-17 22:41 ` [PATCH 4.14 025/126] Bluetooth: h5: Fix missing dependency on BT_HCIUART_SERDEV Greg Kroah-Hartman
2018-09-17 22:41 ` [PATCH 4.14 026/126] gpio: tegra: Move driver registration to subsys_init level Greg Kroah-Hartman
2018-09-17 22:41 ` [PATCH 4.14 027/126] powerpc/powernv: Fix concurrency issue with npu->mmio_atsd_usage Greg Kroah-Hartman
2018-09-17 22:41 ` [PATCH 4.14 028/126] selftests/bpf: fix a typo in map in map test Greg Kroah-Hartman
2018-09-17 22:41 ` [PATCH 4.14 029/126] media: davinci: vpif_display: Mix memory leak on probe error path Greg Kroah-Hartman
2018-09-17 22:41 ` [PATCH 4.14 030/126] media: dw2102: Fix memleak on sequence of probes Greg Kroah-Hartman
2018-09-17 22:41 ` [PATCH 4.14 031/126] net: phy: Fix the register offsets in Broadcom iProc mdio mux driver Greg Kroah-Hartman
2018-09-17 22:41 ` [PATCH 4.14 032/126] blk-mq: fix updating tags depth Greg Kroah-Hartman
2018-09-17 22:41 ` [PATCH 4.14 033/126] scsi: target: fix __transport_register_session locking Greg Kroah-Hartman
2018-09-17 22:41 ` [PATCH 4.14 034/126] md/raid5: fix data corruption of replacements after originals dropped Greg Kroah-Hartman
2018-09-17 22:41 ` [PATCH 4.14 035/126] timers: Clear timer_base::must_forward_clk with timer_base::lock held Greg Kroah-Hartman
2018-09-17 22:41 ` [PATCH 4.14 036/126] media: camss: csid: Configure data type and decode format properly Greg Kroah-Hartman
2018-09-17 22:41 ` [PATCH 4.14 037/126] gpu: ipu-v3: default to id 0 on missing OF alias Greg Kroah-Hartman
2018-09-17 22:41 ` [PATCH 4.14 038/126] misc: ti-st: Fix memory leak in the error path of probe() Greg Kroah-Hartman
2018-09-17 22:41 ` [PATCH 4.14 039/126] uio: potential double frees if __uio_register_device() fails Greg Kroah-Hartman
2018-09-17 22:41 ` [PATCH 4.14 040/126] firmware: vpd: Fix section enabled flag on vpd_section_destroy Greg Kroah-Hartman
2018-09-17 22:41 ` [PATCH 4.14 041/126] Drivers: hv: vmbus: Cleanup synic memory free path Greg Kroah-Hartman
2018-09-17 22:41 ` [PATCH 4.14 042/126] tty: rocket: Fix possible buffer overwrite on register_PCI Greg Kroah-Hartman
2018-09-17 22:41 ` [PATCH 4.14 043/126] f2fs: fix to active page in lru list for read path Greg Kroah-Hartman
2018-09-17 22:41 ` [PATCH 4.14 044/126] f2fs: do not set free of current section Greg Kroah-Hartman
2018-09-17 22:41 ` [PATCH 4.14 045/126] f2fs: fix defined but not used build warnings Greg Kroah-Hartman
2018-09-17 22:41 ` [PATCH 4.14 046/126] perf tools: Allow overriding MAX_NR_CPUS at compile time Greg Kroah-Hartman
2018-09-17 22:41 ` [PATCH 4.14 047/126] NFSv4.0 fix client reference leak in callback Greg Kroah-Hartman
2018-09-17 22:41 ` [PATCH 4.14 048/126] perf c2c report: Fix crash for empty browser Greg Kroah-Hartman
2018-09-17 22:41 ` [PATCH 4.14 049/126] perf evlist: Fix error out while applying initial delay and LBR Greg Kroah-Hartman
2018-09-17 22:41 ` [PATCH 4.14 050/126] macintosh/via-pmu: Add missing mmio accessors Greg Kroah-Hartman
2018-09-17 22:41 ` [PATCH 4.14 051/126] ath9k: report tx status on EOSP Greg Kroah-Hartman
2018-09-17 22:41 ` [PATCH 4.14 052/126] ath9k_hw: fix channel maximum power level test Greg Kroah-Hartman
2018-09-17 22:41 ` [PATCH 4.14 053/126] ath10k: prevent active scans on potential unusable channels Greg Kroah-Hartman
2018-09-17 22:41 ` [PATCH 4.14 054/126] wlcore: Set rx_status boottime_ns field on rx Greg Kroah-Hartman
2018-09-17 22:41 ` [PATCH 4.14 055/126] rpmsg: core: add support to power domains for devices Greg Kroah-Hartman
2018-09-17 22:41 ` [PATCH 4.14 056/126] MIPS: Fix ISA virt/bus conversion for non-zero PHYS_OFFSET Greg Kroah-Hartman
2018-09-17 22:41 ` [PATCH 4.14 057/126] ata: libahci: Allow reconfigure of DEVSLP register Greg Kroah-Hartman
2018-09-17 22:41 ` [PATCH 4.14 058/126] ata: libahci: Correct setting " Greg Kroah-Hartman
2018-09-17 22:41 ` [PATCH 4.14 059/126] scsi: 3ware: fix return 0 on the error path of probe Greg Kroah-Hartman
2018-09-17 22:41 ` [PATCH 4.14 060/126] tools/testing/nvdimm: kaddr and pfn can be NULL to ->direct_access() Greg Kroah-Hartman
2018-09-17 22:41 ` [PATCH 4.14 061/126] ath10k: disable bundle mgmt tx completion event support Greg Kroah-Hartman
2018-09-17 22:41 ` [PATCH 4.14 062/126] Bluetooth: hidp: Fix handling of strncpy for hid->name information Greg Kroah-Hartman
2018-09-17 22:41 ` [PATCH 4.14 063/126] x86/mm: Remove in_nmi() warning from vmalloc_fault() Greg Kroah-Hartman
2018-09-17 22:41 ` [PATCH 4.14 064/126] pinctrl: imx: off by one in imx_pinconf_group_dbg_show() Greg Kroah-Hartman
2018-09-17 22:41 ` [PATCH 4.14 065/126] gpio: ml-ioh: Fix buffer underwrite on probe error path Greg Kroah-Hartman
2018-09-17 22:41 ` [PATCH 4.14 066/126] pinctrl/amd: only handle irq if it is pending and unmasked Greg Kroah-Hartman
2018-09-17 22:41 ` [PATCH 4.14 067/126] net: mvneta: fix mtu change on port without link Greg Kroah-Hartman
2018-09-17 22:41 ` [PATCH 4.14 068/126] f2fs: try grabbing node page lock aggressively in sync scenario Greg Kroah-Hartman
2018-09-17 22:41 ` [PATCH 4.14 069/126] pktcdvd: Fix possible Spectre-v1 for pkt_devs Greg Kroah-Hartman
2018-09-17 22:41 ` [PATCH 4.14 070/126] f2fs: fix to skip GC if type in SSA and SIT is inconsistent Greg Kroah-Hartman
2018-09-17 22:41 ` [PATCH 4.14 071/126] tpm_tis_spi: Pass the SPI IRQ down to the driver Greg Kroah-Hartman
2018-09-17 22:42 ` [PATCH 4.14 072/126] tpm/tpm_i2c_infineon: switch to i2c_lock_bus(..., I2C_LOCK_SEGMENT) Greg Kroah-Hartman
2018-09-17 22:42 ` [PATCH 4.14 073/126] f2fs: fix to do sanity check with reserved blkaddr of inline inode Greg Kroah-Hartman
2018-09-17 22:42 ` [PATCH 4.14 074/126] MIPS: Octeon: add missing of_node_put() Greg Kroah-Hartman
2018-09-17 22:42 ` [PATCH 4.14 075/126] MIPS: generic: fix " Greg Kroah-Hartman
2018-09-17 22:42 ` [PATCH 4.14 076/126] net: dcb: For wild-card lookups, use priority -1, not 0 Greg Kroah-Hartman
2018-09-17 22:42 ` [PATCH 4.14 077/126] dm cache: only allow a single io_mode cache feature to be requested Greg Kroah-Hartman
2018-09-17 22:42 ` [PATCH 4.14 078/126] Input: atmel_mxt_ts - only use first T9 instance Greg Kroah-Hartman
2018-09-17 22:42 ` [PATCH 4.14 079/126] media: s5p-mfc: Fix buffer look up in s5p_mfc_handle_frame_{new, copy_time} functions Greg Kroah-Hartman
2018-09-17 22:42 ` [PATCH 4.14 080/126] partitions/aix: append null character to print data from disk Greg Kroah-Hartman
2018-09-17 22:42 ` [PATCH 4.14 081/126] partitions/aix: fix usage of uninitialized lv_info and lvname structures Greg Kroah-Hartman
2018-09-17 22:42 ` [PATCH 4.14 082/126] media: helene: fix xtal frequency setting at power on Greg Kroah-Hartman
2018-09-17 22:42 ` [PATCH 4.14 083/126] f2fs: fix to wait on page writeback before updating page Greg Kroah-Hartman
2018-09-17 22:42 ` [PATCH 4.14 084/126] f2fs: Fix uninitialized return in f2fs_ioc_shutdown() Greg Kroah-Hartman
2018-09-17 22:42 ` [PATCH 4.14 085/126] iommu/ipmmu-vmsa: Fix allocation in atomic context Greg Kroah-Hartman
2018-09-17 22:42 ` [PATCH 4.14 086/126] mfd: ti_am335x_tscadc: Fix struct clk memory leak Greg Kroah-Hartman
2018-09-17 22:42 ` [PATCH 4.14 087/126] f2fs: fix to do sanity check with {sit,nat}_ver_bitmap_bytesize Greg Kroah-Hartman
2018-09-17 22:42 ` [PATCH 4.14 088/126] NFSv4.1: Fix a potential layoutget/layoutrecall deadlock Greg Kroah-Hartman
2018-09-17 22:42 ` [PATCH 4.14 089/126] MIPS: WARN_ON invalid DMA cache maintenance, not BUG_ON Greg Kroah-Hartman
2018-09-17 22:42 ` [PATCH 4.14 090/126] RDMA/cma: Do not ignore net namespace for unbound cm_id Greg Kroah-Hartman
2018-09-17 22:42 ` [PATCH 4.14 091/126] drm/i915: set DP Main Stream Attribute for color range on DDI platforms Greg Kroah-Hartman
2018-09-17 22:42 ` [PATCH 4.14 092/126] inet: frags: change inet_frags_init_net() return value Greg Kroah-Hartman
2018-09-17 22:42 ` [PATCH 4.14 093/126] inet: frags: add a pointer to struct netns_frags Greg Kroah-Hartman
2018-09-17 22:42 ` [PATCH 4.14 094/126] inet: frags: refactor ipfrag_init() Greg Kroah-Hartman
2018-09-17 22:42 ` [PATCH 4.14 095/126] inet: frags: Convert timers to use timer_setup() Greg Kroah-Hartman
2018-09-17 22:42 ` [PATCH 4.14 096/126] inet: frags: refactor ipv6_frag_init() Greg Kroah-Hartman
2018-09-17 22:42 ` [PATCH 4.14 097/126] inet: frags: refactor lowpan_net_frag_init() Greg Kroah-Hartman
2018-09-17 22:42 ` [PATCH 4.14 098/126] ipv6: export ip6 fragments sysctl to unprivileged users Greg Kroah-Hartman
2018-09-17 22:42 ` [PATCH 4.14 099/126] rhashtable: add schedule points Greg Kroah-Hartman
2018-09-17 22:42 ` [PATCH 4.14 100/126] inet: frags: use rhashtables for reassembly units Greg Kroah-Hartman
2018-09-17 22:42 ` [PATCH 4.14 101/126] inet: frags: remove some helpers Greg Kroah-Hartman
2018-09-17 22:42 ` [PATCH 4.14 102/126] inet: frags: get rif of inet_frag_evicting() Greg Kroah-Hartman
2018-09-17 22:42 ` [PATCH 4.14 103/126] inet: frags: remove inet_frag_maybe_warn_overflow() Greg Kroah-Hartman
2018-09-17 22:42 ` [PATCH 4.14 104/126] inet: frags: break the 2GB limit for frags storage Greg Kroah-Hartman
2018-09-17 22:42 ` [PATCH 4.14 105/126] inet: frags: do not clone skb in ip_expire() Greg Kroah-Hartman
2018-09-17 22:42 ` [PATCH 4.14 106/126] ipv6: frags: rewrite ip6_expire_frag_queue() Greg Kroah-Hartman
2018-09-17 22:42 ` [PATCH 4.14 107/126] rhashtable: reorganize struct rhashtable layout Greg Kroah-Hartman
2018-09-17 22:42 ` [PATCH 4.14 108/126] inet: frags: reorganize struct netns_frags Greg Kroah-Hartman
2018-09-17 22:42 ` [PATCH 4.14 109/126] inet: frags: get rid of ipfrag_skb_cb/FRAG_CB Greg Kroah-Hartman
2018-09-17 22:42 ` [PATCH 4.14 110/126] inet: frags: fix ip6frag_low_thresh boundary Greg Kroah-Hartman
2018-09-17 22:42 ` [PATCH 4.14 111/126] ip: discard IPv4 datagrams with overlapping segments Greg Kroah-Hartman
2018-09-17 22:42 ` [PATCH 4.14 112/126] net: speed up skb_rbtree_purge() Greg Kroah-Hartman
2018-09-17 22:42 ` [PATCH 4.14 113/126] net: modify skb_rbtree_purge to return the truesize of all purged skbs Greg Kroah-Hartman
2018-09-17 22:42 ` [PATCH 4.14 114/126] ipv6: defrag: drop non-last frags smaller than min mtu Greg Kroah-Hartman
2018-09-17 22:42 ` [PATCH 4.14 115/126] net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends Greg Kroah-Hartman
2018-09-17 22:42 ` [PATCH 4.14 116/126] net: add rb_to_skb() and other rb tree helpers Greg Kroah-Hartman
2018-09-17 22:42 ` [PATCH 4.14 117/126] net: sk_buff rbnode reorg Greg Kroah-Hartman
2018-10-04 20:13   ` Mitch Harder
2018-11-29 10:33     ` Greg Kroah-Hartman
2018-11-29 15:07       ` Mitch Harder
2018-09-17 22:42 ` [PATCH 4.14 118/126] ipv4: frags: precedence bug in ip_expire() Greg Kroah-Hartman
2018-09-17 22:42 ` [PATCH 4.14 119/126] ip: add helpers to process in-order fragments faster Greg Kroah-Hartman
2018-09-17 22:42 ` [PATCH 4.14 120/126] ip: process in-order fragments efficiently Greg Kroah-Hartman
2018-09-17 22:42 ` [PATCH 4.14 121/126] ip: frags: fix crash in ip_do_fragment() Greg Kroah-Hartman
2018-09-17 22:42 ` [PATCH 4.14 122/126] mtd: ubi: wl: Fix error return code in ubi_wl_init() Greg Kroah-Hartman
2018-09-17 22:42 ` [PATCH 4.14 123/126] tun: fix use after free for ptr_ring Greg Kroah-Hartman
2018-09-17 22:42 ` [PATCH 4.14 124/126] tuntap: fix use after free during release Greg Kroah-Hartman
2018-09-17 22:42 ` [PATCH 4.14 125/126] autofs: fix autofs_sbi() does not check super block type Greg Kroah-Hartman
2018-09-17 22:42 ` [PATCH 4.14 126/126] mm: get rid of vmacache_flush_all() entirely Greg Kroah-Hartman
2018-09-17 23:59 ` [PATCH 4.14 000/126] 4.14.71-stable review Nathan Chancellor
2018-09-18  7:44   ` Greg Kroah-Hartman
2018-09-18 16:20 ` Guenter Roeck
2018-09-18 16:53 ` Naresh Kamboju
