stable.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 5.19 000/158] 5.19.6-rc1 review
@ 2022-08-29 10:57 Greg Kroah-Hartman
  2022-08-29 10:57 ` [PATCH 5.19 001/158] mm/gup: fix FOLL_FORCE COW security issue and remove FOLL_COW Greg Kroah-Hartman
                   ` (170 more replies)
  0 siblings, 171 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:57 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, torvalds, akpm, linux, shuah,
	patches, lkft-triage, pavel, jonathanh, f.fainelli,
	sudipm.mukherjee, slade

This is the start of the stable review cycle for the 5.19.6 release.
There are 158 patches in this series, all will be posted as a response
to this one.  If anyone has any issues with these being applied, please
let me know.

Responses should be made by Wed, 31 Aug 2022 10:57:37 +0000.
Anything received after that time might be too late.

The whole patch series can be found in one patch at:
	https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.19.6-rc1.gz
or in the git tree and branch at:
	git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.19.y
and the diffstat can be found below.

thanks,

greg k-h

-------------
Pseudo-Shortlog of commits:

Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    Linux 5.19.6-rc1

Daniel Borkmann <daniel@iogearbox.net>
    bpf: Don't use tnum_range on array range checking for poke descriptors

Conor Dooley <conor.dooley@microchip.com>
    riscv: dts: microchip: mpfs: remove pci axi address translation property

Conor Dooley <conor.dooley@microchip.com>
    riscv: dts: microchip: mpfs: remove bogus card-detect-delay

Conor Dooley <conor.dooley@microchip.com>
    riscv: dts: microchip: mpfs: remove ti,fifo-depth property

Conor Dooley <conor.dooley@microchip.com>
    riscv: dts: microchip: mpfs: fix incorrect pcie child node name

Mike Christie <michael.christie@oracle.com>
    scsi: core: Fix passthrough retry counter handling

Saurabh Sengar <ssengar@linux.microsoft.com>
    scsi: storvsc: Remove WQ_MEM_RECLAIM from storvsc_error_wq

Kiwoong Kim <kwmad.kim@samsung.com>
    scsi: ufs: core: Enable link lost interrupt

Mark Brown <broonie@kernel.org>
    arm64/sme: Don't flush SVE register state when handling SME traps

Mark Brown <broonie@kernel.org>
    arm64/sme: Don't flush SVE register state when allocating SME storage

Mark Brown <broonie@kernel.org>
    arm64/signal: Flush FPSIMD register state when disabling streaming mode

Mark Rutland <mark.rutland@arm.com>
    arm64: fix rodata=full

Ian Rogers <irogers@google.com>
    perf stat: Clear evsel->reset_group for each stat run

Stephane Eranian <eranian@google.com>
    perf/x86/intel/ds: Fix precise store latency handling

Stephane Eranian <eranian@google.com>
    perf/x86/intel/uncore: Fix broken read_counter() for SNB IMC PMU

James Clark <james.clark@arm.com>
    perf python: Fix build when PYTHON_CONFIG is user supplied

Yu Kuai <yukuai3@huawei.com>
    blk-mq: fix io hung due to missing commit_rqs

Salvatore Bonaccorso <carnil@debian.org>
    Documentation/ABI: Mention retbleed vulnerability info file for sysfs

Prike Liang <Prike.Liang@amd.com>
    drm/amdkfd: Fix isa version for the GC 10.3.7

Peter Zijlstra <peterz@infradead.org>
    x86/nospec: Fix i386 RSB stuffing

Liam Howlett <liam.howlett@oracle.com>
    binder_alloc: add missing mmap_lock calls when using the VMA

Zenghui Yu <yuzenghui@huawei.com>
    arm64: Fix match_list for erratum 1286807 on Arm Cortex-A76

Guoqing Jiang <guoqing.jiang@linux.dev>
    md: call __md_stop_writes in md_stop

Guoqing Jiang <guoqing.jiang@linux.dev>
    Revert "md-raid: destroy the bitmap after destroying the thread"

David Hildenbrand <david@redhat.com>
    mm/hugetlb: fix hugetlb not supporting softdirty tracking

Jens Axboe <axboe@kernel.dk>
    io_uring: fix issue with io_write() not always undoing sb_start_write()

Jiri Slaby <jirislaby@kernel.org>
    Revert "zram: remove double compression logic"

Heinrich Schuchardt <heinrich.schuchardt@canonical.com>
    riscv: dts: microchip: correct L2 cache interrupts

Conor Dooley <conor.dooley@microchip.com>
    riscv: traps: add missing prototype

Conor Dooley <conor.dooley@microchip.com>
    riscv: signal: fix missing prototype warning

Juergen Gross <jgross@suse.com>
    xen/privcmd: fix error exit of privcmd_ioctl_dm_op()

Heming Zhao <ocfs2-devel@oss.oracle.com>
    ocfs2: fix freeing uninitialized resource on ocfs2_dlm_shutdown

David Howells <dhowells@redhat.com>
    smb3: missing inode locks in punch hole

Karol Herbst <kherbst@redhat.com>
    nouveau: explicitly wait on the fence in nouveau_bo_move_m2mf

Riwen Lu <luriwen@kylinos.cn>
    ACPI: processor: Remove freq Qos request for all CPUs

Matthew Wilcox (Oracle) <willy@infradead.org>
    shmem: update folio if shmem_replace_page() updates the page

Shakeel Butt <shakeelb@google.com>
    Revert "memcg: cleanup racy sum avoidance code"

Shigeru Yoshida <syoshida@redhat.com>
    fbdev: fbcon: Properly revert changes when vc_resize() failed

Brian Foster <bfoster@redhat.com>
    s390: fix double free of GS and RI CBs on fork() failure

Paulo Alcantara <pc@cjr.nz>
    cifs: skip extra NULL byte in filenames

Peter Xu <peterx@redhat.com>
    mm/mprotect: only reference swap pfn page if type match

Miaohe Lin <linmiaohe@huawei.com>
    mm/hugetlb: avoid corrupting page->mapping in hugetlb_mcopy_atomic_pte

Liu Shixin <liushixin2@huawei.com>
    bootmem: remove the vmemmap pages from kmemleak in put_page_bootmem

Gerald Schaefer <gerald.schaefer@linux.ibm.com>
    s390/mm: do not trigger write fault when vma does not allow VM_WRITE

Badari Pulavarty <badari.pulavarty@intel.com>
    mm/damon/dbgfs: avoid duplicate context directory creation

Quanyang Wang <quanyang.wang@windriver.com>
    asm-generic: sections: refactor memory_intersects

Richard Guy Briggs <rgb@redhat.com>
    audit: move audit_return_fixup before the filters

Khazhismel Kumykov <khazhy@chromium.org>
    writeback: avoid use-after-free after removing device

Siddh Raman Pant <code@siddh.me>
    loop: Check for overflow while configuring loop

Jan Beulich <jbeulich@suse.com>
    x86/PAT: Have pat_enabled() properly reflect state when running on Xen

Peter Zijlstra <peterz@infradead.org>
    x86/nospec: Unwreck the RSB stuffing

Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
    x86/bugs: Add "unknown" reporting for MMIO Stale Data

Tom Lendacky <thomas.lendacky@amd.com>
    x86/sev: Don't use cc_platform_has() for early SEV-SNP calls

Chen Zhongjin <chenzhongjin@huawei.com>
    x86/unwind/orc: Unwind ftrace trampolines with correct ORC entry

Juergen Gross <jgross@suse.com>
    x86/entry: Fix entry_INT80_compat for Xen PV guests

Kan Liang <kan.liang@linux.intel.com>
    perf/x86/lbr: Enable the branch type for the Arch LBR by default

Kan Liang <kan.liang@linux.intel.com>
    perf/x86/intel: Fix pebs event constraints for ADL

Michael Roth <michael.roth@amd.com>
    x86/boot: Don't propagate uninitialized boot_params->cc_blob_address

Filipe Manana <fdmanana@suse.com>
    btrfs: update generation of hole file extent item when merging holes

Zixuan Fu <r33s3n6@gmail.com>
    btrfs: fix possible memory leak in btrfs_get_dev_args_from_path()

Goldwyn Rodrigues <rgoldwyn@suse.de>
    btrfs: check if root is readonly while setting security xattr

Omar Sandoval <osandov@fb.com>
    btrfs: fix space cache corruption and potential double allocations

Anand Jain <anand.jain@oracle.com>
    btrfs: add info when mount fails due to stale replace target

Anand Jain <anand.jain@oracle.com>
    btrfs: replace: drop assert for suspended replace

Filipe Manana <fdmanana@suse.com>
    btrfs: fix silent failure when deleting root reference

Aleksander Jan Bajkowski <olek2@wp.pl>
    net: lantiq_xrx200: restore buffer if memory allocation failed

Aleksander Jan Bajkowski <olek2@wp.pl>
    net: lantiq_xrx200: fix lock under memory pressure

Aleksander Jan Bajkowski <olek2@wp.pl>
    net: lantiq_xrx200: confirm skb is allocated before using

Heiner Kallweit <hkallweit1@gmail.com>
    net: stmmac: work around sporadic tx issue on link-up

R Mohamed Shah <mohamed@pensando.io>
    ionic: VF initial random MAC address if no assigned mac

Shannon Nelson <snelson@pensando.io>
    ionic: fix up issues with handling EAGAIN on FW cmds

Shannon Nelson <snelson@pensando.io>
    ionic: clear broken state on generation change

David Howells <dhowells@redhat.com>
    rxrpc: Fix locking in rxrpc's sendmsg

Lorenzo Bianconi <lorenzo@kernel.org>
    net: ethernet: mtk_eth_soc: fix hw hash reporting for MTK_NETSYS_V2

Lorenzo Bianconi <lorenzo@kernel.org>
    net: ethernet: mtk_eth_soc: enable rx cksum offload for MTK_NETSYS_V2

Sylwester Dziedziuch <sylwesterx.dziedziuch@intel.com>
    i40e: Fix incorrect address type for IPv6 flow rules

Jacob Keller <jacob.e.keller@intel.com>
    ixgbe: stop resetting SYSTIME in ixgbe_ptp_start_cyclecounter

Kuniyuki Iwashima <kuniyu@amazon.com>
    net: Fix a data-race around sysctl_somaxconn.

Kuniyuki Iwashima <kuniyu@amazon.com>
    net: Fix a data-race around netdev_unregister_timeout_secs.

Kuniyuki Iwashima <kuniyu@amazon.com>
    net: Fix a data-race around gro_normal_batch.

Kuniyuki Iwashima <kuniyu@amazon.com>
    net: Fix data-races around sysctl_devconf_inherit_init_net.

Kuniyuki Iwashima <kuniyu@amazon.com>
    net: Fix data-races around sysctl_fb_tunnels_only_for_init_net.

Kuniyuki Iwashima <kuniyu@amazon.com>
    net: Fix a data-race around netdev_budget_usecs.

Kuniyuki Iwashima <kuniyu@amazon.com>
    net: Fix data-races around sysctl_max_skb_frags.

Kuniyuki Iwashima <kuniyu@amazon.com>
    net: Fix a data-race around netdev_budget.

Kuniyuki Iwashima <kuniyu@amazon.com>
    net: Fix a data-race around sysctl_net_busy_read.

Kuniyuki Iwashima <kuniyu@amazon.com>
    net: Fix a data-race around sysctl_net_busy_poll.

Kuniyuki Iwashima <kuniyu@amazon.com>
    net: Fix a data-race around sysctl_tstamp_allow_data.

Kuniyuki Iwashima <kuniyu@amazon.com>
    net: Fix data-races around sysctl_optmem_max.

Kuniyuki Iwashima <kuniyu@amazon.com>
    ratelimit: Fix data-races in ___ratelimit().

Kuniyuki Iwashima <kuniyu@amazon.com>
    net: Fix data-races around netdev_tstamp_prequeue.

Kuniyuki Iwashima <kuniyu@amazon.com>
    net: Fix data-races around netdev_max_backlog.

Kuniyuki Iwashima <kuniyu@amazon.com>
    net: Fix data-races around weight_p and dev_weight_[rt]x_bias.

Kuniyuki Iwashima <kuniyu@amazon.com>
    net: Fix data-races around sysctl_[rw]mem_(max|default).

Pablo Neira Ayuso <pablo@netfilter.org>
    netfilter: flowtable: fix stuck flows on cleanup due to pending work

Pablo Neira Ayuso <pablo@netfilter.org>
    netfilter: flowtable: add function to invoke garbage collection immediately

Pablo Neira Ayuso <pablo@netfilter.org>
    netfilter: nf_tables: disallow binding to already bound chain

Pablo Neira Ayuso <pablo@netfilter.org>
    netfilter: nft_tunnel: restrict it to netdev family

Pablo Neira Ayuso <pablo@netfilter.org>
    netfilter: nft_osf: restrict osf to ipv4, ipv6 and inet families

Pablo Neira Ayuso <pablo@netfilter.org>
    netfilter: nf_tables: do not leave chain stats enabled on error

Pablo Neira Ayuso <pablo@netfilter.org>
    netfilter: nft_payload: do not truncate csum_offset and csum_type

Pablo Neira Ayuso <pablo@netfilter.org>
    netfilter: nft_payload: report ERANGE for too long offset and length

Pablo Neira Ayuso <pablo@netfilter.org>
    netfilter: nf_tables: make table handle allocation per-netns friendly

Pablo Neira Ayuso <pablo@netfilter.org>
    netfilter: nf_tables: disallow updates of implicit chain

Vikas Gupta <vikas.gupta@broadcom.com>
    bnxt_en: fix LRO/GRO_HW features in ndo_fix_features callback

Vikas Gupta <vikas.gupta@broadcom.com>
    bnxt_en: fix NQ resource accounting during vf creation on 57500 chips

Vikas Gupta <vikas.gupta@broadcom.com>
    bnxt_en: set missing reload flag in devlink features

Pavan Chebbi <pavan.chebbi@broadcom.com>
    bnxt_en: Use PAGE_SIZE to init buffer when multi buffer XDP is not in use

Florian Westphal <fw@strlen.de>
    netfilter: nft_tproxy: restrict to prerouting hook

Florian Westphal <fw@strlen.de>
    netfilter: ebtables: reject blobs that don't provide all entry points

Maciej Żenczykowski <maze@google.com>
    net: ipvtap - add __init/__exit annotations to module init/exit funcs

Jonathan Toppins <jtoppins@redhat.com>
    bonding: 802.3ad: fix no transmission of LACPDUs

Sergei Antonov <saproj@gmail.com>
    net: moxa: get rid of asymmetry in DMA mapping/unmapping

Xiaolei Wang <xiaolei.wang@windriver.com>
    net: phy: Don't WARN for PHY_READY state in mdio_bus_phy_resume()

Alex Elder <elder@linaro.org>
    net: ipa: don't assume SMEM is page-aligned

Vladimir Oltean <vladimir.oltean@nxp.com>
    net: dsa: microchip: keep compatibility with device tree blobs with no phy-mode

Arun Ramadoss <arun.ramadoss@microchip.com>
    net: dsa: microchip: update the ksz_phylink_get_caps

Arun Ramadoss <arun.ramadoss@microchip.com>
    net: dsa: microchip: move the port mirror to ksz_common

Arun Ramadoss <arun.ramadoss@microchip.com>
    net: dsa: microchip: move vlan functionality to ksz_common

Arun Ramadoss <arun.ramadoss@microchip.com>
    net: dsa: microchip: move tag_protocol to ksz_common

Arun Ramadoss <arun.ramadoss@microchip.com>
    net: dsa: microchip: move switch chip_id detection to ksz_common

Arun Ramadoss <arun.ramadoss@microchip.com>
    net: dsa: microchip: ksz9477: cleanup the ksz9477_switch_detect

Maor Dickman <maord@nvidia.com>
    net/mlx5e: Fix wrong tc flag used when set hw-tc-offload off

Aya Levin <ayal@nvidia.com>
    net/mlx5e: Fix wrong application of the LRO state

Moshe Shemesh <moshe@nvidia.com>
    net/mlx5: Avoid false positive lockdep warning by adding lock_class_key

Roy Novich <royno@nvidia.com>
    net/mlx5: Fix cmd error logging for manage pages cmd

Vlad Buslov <vladbu@nvidia.com>
    net/mlx5: Disable irq when locking lag_lock

Eli Cohen <elic@nvidia.com>
    net/mlx5: Eswitch, Fix forwarding decision to uplink

Eli Cohen <elic@nvidia.com>
    net/mlx5: LAG, fix logic over MLX5_LAG_FLAG_NDEVS_READY

Vlad Buslov <vladbu@nvidia.com>
    net/mlx5e: Properly disable vlan strip on non-UL reps

Maciej Fijalkowski <maciej.fijalkowski@intel.com>
    ice: xsk: use Rx ring's XDP ring when picking NAPI context

Maciej Fijalkowski <maciej.fijalkowski@intel.com>
    ice: xsk: prohibit usage of non-balanced queue id

Duoming Zhou <duoming@zju.edu.cn>
    nfc: pn533: Fix use-after-free bugs caused by pn532_cmd_timeout

Hayes Wang <hayeswang@realtek.com>
    r8152: fix the RX FIFO settings when suspending

Hayes Wang <hayeswang@realtek.com>
    r8152: fix the units of some registers for RTL8156A

Bernard Pidoux <f6bvp@free.fr>
    rose: check NULL rose_loopback_neigh->loopback

Christian Brauner <brauner@kernel.org>
    ntfs: fix acl handling

Peter Xu <peterx@redhat.com>
    mm/smaps: don't access young/dirty bit if pte unpresent

Trond Myklebust <trond.myklebust@hammerspace.com>
    SUNRPC: RPC level errors should set task->tk_rpc_status

Olga Kornievskaia <kolga@netapp.com>
    NFSv4.2 fix problems with __nfs42_ssc_open

Sabrina Dubroca <sd@queasysnail.net>
    Revert "net: macsec: update SCI upon MAC address change."

Seth Forshee <sforshee@digitalocean.com>
    fs: require CAP_SYS_ADMIN in target namespace for idmapped mounts

Nikolay Aleksandrov <razor@blackwall.org>
    xfrm: policy: fix metadata dst->dev xmit null pointer dereference

Herbert Xu <herbert@gondor.apana.org.au>
    af_key: Do not call xfrm_probe_algs in parallel

Antony Antony <antony.antony@secunet.com>
    xfrm: clone missing x->lastused in xfrm_do_migrate

Antony Antony <antony.antony@secunet.com>
    Revert "xfrm: update SA curlft.use_time"

Xin Xiong <xiongx18@fudan.edu.cn>
    xfrm: fix refcount leak in __xfrm_policy_check()

Deren Wu <deren.wu@mediatek.com>
    mt76: mt7921: fix command timeout in AP stop period

David Hildenbrand <david@redhat.com>
    mm/hugetlb: support write-faults in shared mappings

Peter Xu <peterx@redhat.com>
    mm/uffd: reset write protection when unregister with wp-mode

Kuniyuki Iwashima <kuniyu@amazon.com>
    kprobes: don't call disarm_kprobe() for disabled kprobes

Randy Dunlap <rdunlap@infradead.org>
    kernel/sys_ni: add compat entry for fadvise64_64

Helge Deller <deller@gmx.de>
    parisc: Fix exception handler for fldw and fstw instructions

Helge Deller <deller@gmx.de>
    parisc: Make CONFIG_64BIT available for ARCH=parisc64 only

Jing-Ting Wu <Jing-Ting.Wu@mediatek.com>
    cgroup: Fix race condition at rebind_subsystems()

Gaosheng Cui <cuigaosheng1@huawei.com>
    audit: fix potential double free on error path from fsnotify_add_inode_mark

Trond Myklebust <trond.myklebust@hammerspace.com>
    NFS: Fix another fsync() issue after a server reboot

David Hildenbrand <david@redhat.com>
    mm/gup: fix FOLL_FORCE COW security issue and remove FOLL_COW


-------------

Diffstat:

 Documentation/ABI/testing/sysfs-devices-system-cpu |   1 +
 .../hw-vuln/processor_mmio_stale_data.rst          |  14 ++
 Documentation/admin-guide/kernel-parameters.txt    |   2 +
 Documentation/admin-guide/sysctl/net.rst           |   2 +-
 Makefile                                           |   4 +-
 arch/arm64/include/asm/fpsimd.h                    |   4 +-
 arch/arm64/include/asm/setup.h                     |  17 ++
 arch/arm64/kernel/cpu_errata.c                     |   2 +
 arch/arm64/kernel/fpsimd.c                         |  21 +--
 arch/arm64/kernel/ptrace.c                         |   6 +-
 arch/arm64/kernel/signal.c                         |  12 +-
 arch/arm64/mm/mmu.c                                |  18 ---
 arch/parisc/Kconfig                                |  21 +--
 arch/parisc/kernel/unaligned.c                     |   2 +-
 arch/riscv/boot/dts/microchip/mpfs-icicle-kit.dts  |   3 -
 arch/riscv/boot/dts/microchip/mpfs-polarberry.dts  |   3 -
 arch/riscv/boot/dts/microchip/mpfs.dtsi            |   5 +-
 arch/riscv/include/asm/signal.h                    |  12 ++
 arch/riscv/include/asm/thread_info.h               |   2 +
 arch/riscv/kernel/signal.c                         |   1 +
 arch/riscv/kernel/traps.c                          |   3 +-
 arch/s390/kernel/process.c                         |  22 ++-
 arch/s390/mm/fault.c                               |   4 +-
 arch/x86/boot/compressed/misc.h                    |  12 +-
 arch/x86/boot/compressed/sev.c                     |   8 +
 arch/x86/entry/entry_64_compat.S                   |   2 +-
 arch/x86/events/intel/ds.c                         |  12 +-
 arch/x86/events/intel/lbr.c                        |   8 +
 arch/x86/events/intel/uncore_snb.c                 |  18 ++-
 arch/x86/include/asm/cpufeatures.h                 |   5 +-
 arch/x86/include/asm/nospec-branch.h               |  92 ++++++-----
 arch/x86/kernel/cpu/bugs.c                         |  14 +-
 arch/x86/kernel/cpu/common.c                       |  42 +++--
 arch/x86/kernel/sev.c                              |  16 +-
 arch/x86/kernel/unwind_orc.c                       |  15 +-
 arch/x86/mm/pat/memtype.c                          |  10 +-
 block/blk-mq.c                                     |   5 +-
 drivers/acpi/processor_thermal.c                   |   2 +-
 drivers/android/binder_alloc.c                     |  31 ++--
 drivers/block/loop.c                               |   5 +
 drivers/block/zram/zram_drv.c                      |  42 +++--
 drivers/block/zram/zram_drv.h                      |   1 +
 drivers/gpu/drm/amd/amdkfd/kfd_device.c            |   6 +-
 drivers/gpu/drm/nouveau/nouveau_bo.c               |   9 ++
 drivers/md/md.c                                    |   3 +-
 drivers/net/bonding/bond_3ad.c                     |  38 ++---
 drivers/net/dsa/microchip/ksz8795.c                | 102 +++---------
 drivers/net/dsa/microchip/ksz8795_reg.h            |  16 --
 drivers/net/dsa/microchip/ksz9477.c                |  89 +++--------
 drivers/net/dsa/microchip/ksz9477_reg.h            |   1 -
 drivers/net/dsa/microchip/ksz_common.c             | 173 ++++++++++++++++++++-
 drivers/net/dsa/microchip/ksz_common.h             |  47 +++++-
 drivers/net/ethernet/broadcom/bnxt/bnxt.c          |   5 +-
 drivers/net/ethernet/broadcom/bnxt/bnxt.h          |   1 +
 drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c  |   1 +
 drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c    |   2 +-
 drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c      |  10 +-
 drivers/net/ethernet/intel/i40e/i40e_ethtool.c     |   2 +-
 drivers/net/ethernet/intel/ice/ice.h               |  36 +++--
 drivers/net/ethernet/intel/ice/ice_lib.c           |   4 +-
 drivers/net/ethernet/intel/ice/ice_main.c          |  25 ++-
 drivers/net/ethernet/intel/ice/ice_xsk.c           |  18 ++-
 drivers/net/ethernet/intel/ixgbe/ixgbe_ptp.c       |  59 +++++--
 drivers/net/ethernet/lantiq_xrx200.c               |   9 +-
 drivers/net/ethernet/mediatek/mtk_eth_soc.c        |  29 ++--
 drivers/net/ethernet/mediatek/mtk_eth_soc.h        |   5 +
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c  |  12 +-
 drivers/net/ethernet/mellanox/mlx5/core/en_rep.c   |   2 +
 .../ethernet/mellanox/mlx5/core/eswitch_offloads.c |   3 +-
 drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c  |  57 ++++---
 drivers/net/ethernet/mellanox/mlx5/core/main.c     |   4 +
 .../net/ethernet/mellanox/mlx5/core/pagealloc.c    |   9 +-
 drivers/net/ethernet/moxa/moxart_ether.c           |  11 +-
 drivers/net/ethernet/pensando/ionic/ionic_lif.c    |  95 ++++++++++-
 drivers/net/ethernet/pensando/ionic/ionic_main.c   |   4 +-
 drivers/net/ethernet/stmicro/stmmac/dwmac_lib.c    |   8 +-
 drivers/net/ethernet/stmicro/stmmac/stmmac_main.c  |   9 +-
 drivers/net/ipa/ipa_mem.c                          |   2 +-
 drivers/net/ipvlan/ipvtap.c                        |   4 +-
 drivers/net/macsec.c                               |  11 +-
 drivers/net/phy/phy_device.c                       |   8 +-
 drivers/net/usb/r8152.c                            |  27 ++--
 .../net/wireless/mediatek/mt76/mt76_connac_mcu.c   |   2 +
 drivers/net/wireless/mediatek/mt76/mt7921/main.c   |  47 ++++--
 drivers/net/wireless/mediatek/mt76/mt7921/mcu.c    |   5 +-
 drivers/net/wireless/mediatek/mt76/mt7921/mt7921.h |   2 +
 drivers/nfc/pn533/uart.c                           |   1 +
 drivers/scsi/scsi_lib.c                            |   3 +-
 drivers/scsi/storvsc_drv.c                         |   2 +-
 drivers/video/fbdev/core/fbcon.c                   |  27 +++-
 drivers/xen/privcmd.c                              |  21 +--
 fs/btrfs/block-group.c                             |  47 ++----
 fs/btrfs/block-group.h                             |   4 +-
 fs/btrfs/ctree.h                                   |   1 -
 fs/btrfs/dev-replace.c                             |   5 +-
 fs/btrfs/extent-tree.c                             |  30 +---
 fs/btrfs/file.c                                    |   2 +
 fs/btrfs/root-tree.c                               |   5 +-
 fs/btrfs/volumes.c                                 |   5 +-
 fs/btrfs/xattr.c                                   |   3 +
 fs/cifs/smb2ops.c                                  |  12 +-
 fs/cifs/smb2pdu.c                                  |  16 +-
 fs/fs-writeback.c                                  |  12 +-
 fs/namespace.c                                     |   7 +
 fs/nfs/file.c                                      |  15 +-
 fs/nfs/inode.c                                     |   1 +
 fs/nfs/nfs4file.c                                  |   6 +
 fs/nfs/write.c                                     |   6 +-
 fs/ntfs3/xattr.c                                   |  16 +-
 fs/ocfs2/dlmglue.c                                 |   8 +-
 fs/ocfs2/super.c                                   |   3 +-
 fs/proc/task_mmu.c                                 |   7 +-
 fs/userfaultfd.c                                   |   4 +
 include/asm-generic/sections.h                     |   7 +-
 include/linux/memcontrol.h                         |  15 +-
 include/linux/mlx5/driver.h                        |   1 +
 include/linux/mm.h                                 |   1 -
 include/linux/netdevice.h                          |  20 ++-
 include/linux/netfilter_bridge/ebtables.h          |   4 -
 include/linux/nfs_fs.h                             |   1 +
 include/linux/userfaultfd_k.h                      |   2 +
 include/net/busy_poll.h                            |   2 +-
 include/net/gro.h                                  |   2 +-
 include/net/netfilter/nf_flow_table.h              |   3 +
 include/net/netfilter/nf_tables.h                  |   1 +
 include/ufs/ufshci.h                               |   6 +-
 init/main.c                                        |  18 ++-
 io_uring/io_uring.c                                |   7 +-
 kernel/audit_fsnotify.c                            |   1 +
 kernel/auditsc.c                                   |   4 +-
 kernel/bpf/verifier.c                              |  10 +-
 kernel/cgroup/cgroup.c                             |   1 +
 kernel/kprobes.c                                   |   9 +-
 kernel/sys_ni.c                                    |   1 +
 lib/ratelimit.c                                    |  12 +-
 mm/backing-dev.c                                   |  10 +-
 mm/bootmem_info.c                                  |   2 +
 mm/damon/dbgfs.c                                   |   3 +
 mm/gup.c                                           |  69 +++++---
 mm/huge_memory.c                                   |  65 +++++---
 mm/hugetlb.c                                       |  28 +++-
 mm/mmap.c                                          |   8 +-
 mm/mprotect.c                                      |   3 +-
 mm/page-writeback.c                                |   6 +-
 mm/shmem.c                                         |   1 +
 mm/userfaultfd.c                                   |  29 ++--
 net/bridge/netfilter/ebtable_broute.c              |   8 -
 net/bridge/netfilter/ebtable_filter.c              |   8 -
 net/bridge/netfilter/ebtable_nat.c                 |   8 -
 net/bridge/netfilter/ebtables.c                    |   8 +-
 net/core/bpf_sk_storage.c                          |   5 +-
 net/core/dev.c                                     |  20 +--
 net/core/filter.c                                  |  13 +-
 net/core/gro_cells.c                               |   2 +-
 net/core/skbuff.c                                  |   2 +-
 net/core/sock.c                                    |  18 ++-
 net/core/sysctl_net_core.c                         |  15 +-
 net/ipv4/devinet.c                                 |  16 +-
 net/ipv4/ip_output.c                               |   2 +-
 net/ipv4/ip_sockglue.c                             |   6 +-
 net/ipv4/tcp.c                                     |   4 +-
 net/ipv4/tcp_output.c                              |   2 +-
 net/ipv6/addrconf.c                                |   5 +-
 net/ipv6/ipv6_sockglue.c                           |   4 +-
 net/key/af_key.c                                   |   3 +
 net/mptcp/protocol.c                               |   2 +-
 net/netfilter/ipvs/ip_vs_sync.c                    |   4 +-
 net/netfilter/nf_flow_table_core.c                 |  15 +-
 net/netfilter/nf_flow_table_offload.c              |   8 +
 net/netfilter/nf_tables_api.c                      |  14 +-
 net/netfilter/nft_osf.c                            |  18 ++-
 net/netfilter/nft_payload.c                        |  29 +++-
 net/netfilter/nft_tproxy.c                         |   8 +
 net/netfilter/nft_tunnel.c                         |   1 +
 net/rose/rose_loopback.c                           |   3 +-
 net/rxrpc/call_object.c                            |   4 +-
 net/rxrpc/sendmsg.c                                |  92 ++++++-----
 net/sched/sch_generic.c                            |   2 +-
 net/socket.c                                       |   2 +-
 net/sunrpc/clnt.c                                  |   2 +-
 net/xfrm/espintcp.c                                |   2 +-
 net/xfrm/xfrm_input.c                              |   3 +-
 net/xfrm/xfrm_output.c                             |   1 -
 net/xfrm/xfrm_policy.c                             |   3 +-
 net/xfrm/xfrm_state.c                              |   1 +
 tools/perf/Makefile.config                         |   2 +-
 tools/perf/builtin-stat.c                          |   1 +
 187 files changed, 1603 insertions(+), 917 deletions(-)



^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH 5.19 001/158] mm/gup: fix FOLL_FORCE COW security issue and remove FOLL_COW
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
@ 2022-08-29 10:57 ` Greg Kroah-Hartman
  2022-08-29 10:57 ` [PATCH 5.19 002/158] NFS: Fix another fsync() issue after a server reboot Greg Kroah-Hartman
                   ` (169 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:57 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, David Hildenbrand, Axel Rasmussen,
	Nadav Amit, Peter Xu, Hugh Dickins, Andrea Arcangeli,
	Matthew Wilcox, Vlastimil Babka, John Hubbard, Jason Gunthorpe,
	David Laight, Andrew Morton

From: David Hildenbrand <david@redhat.com>

commit 5535be3099717646781ce1540cf725965d680e7b upstream.

Ever since the Dirty COW (CVE-2016-5195) security issue happened, we know
that FOLL_FORCE can be possibly dangerous, especially if there are races
that can be exploited by user space.

Right now, it would be sufficient to have some code that sets a PTE of a
R/O-mapped shared page dirty, in order for it to erroneously become
writable by FOLL_FORCE.  The implications of setting a write-protected PTE
dirty might not be immediately obvious to everyone.

And in fact ever since commit 9ae0f87d009c ("mm/shmem: unconditionally set
pte dirty in mfill_atomic_install_pte"), we can use UFFDIO_CONTINUE to map
a shmem page R/O while marking the pte dirty.  This can be used by
unprivileged user space to modify tmpfs/shmem file content even if the
user does not have write permissions to the file, and to bypass memfd
write sealing -- Dirty COW restricted to tmpfs/shmem (CVE-2022-2590).

To fix such security issues for good, the insight is that we really only
need that fancy retry logic (FOLL_COW) for COW mappings that are not
writable (!VM_WRITE).  And in a COW mapping, we really only broke COW if
we have an exclusive anonymous page mapped.  If we have something else
mapped, or the mapped anonymous page might be shared (!PageAnonExclusive),
we have to trigger a write fault to break COW.  If we don't find an
exclusive anonymous page when we retry, we have to trigger COW breaking
once again because something intervened.

Let's move away from this mandatory-retry + dirty handling and rely on our
PageAnonExclusive() flag for making a similar decision, to use the same
COW logic as in other kernel parts here as well.  In case we stumble over
a PTE in a COW mapping that does not map an exclusive anonymous page, COW
was not properly broken and we have to trigger a fake write-fault to break
COW.

Just like we do in can_change_pte_writable() added via commit 64fe24a3e05e
("mm/mprotect: try avoiding write faults for exclusive anonymous pages
when changing protection") and commit 76aefad628aa ("mm/mprotect: fix
soft-dirty check in can_change_pte_writable()"), take care of softdirty
and uffd-wp manually.

For example, a write() via /proc/self/mem to a uffd-wp-protected range has
to fail instead of silently granting write access and bypassing the
userspace fault handler.  Note that FOLL_FORCE is not only used for debug
access, but also triggered by applications without debug intentions, for
example, when pinning pages via RDMA.

This fixes CVE-2022-2590. Note that only x86_64 and aarch64 are
affected, because only those support CONFIG_HAVE_ARCH_USERFAULTFD_MINOR.

Fortunately, FOLL_COW is no longer required to handle FOLL_FORCE. So
let's just get rid of it.

Thanks to Nadav Amit for pointing out that the pte_dirty() check in
FOLL_FORCE code is problematic and might be exploitable.

Note 1: We don't check for the PTE being dirty because it doesn't matter
	for making a "was COWed" decision anymore, and whoever modifies the
	page has to set the page dirty either way.

Note 2: Kernels before extended uffd-wp support and before
	PageAnonExclusive (< 5.19) can simply revert the problematic
	commit instead and be safe regarding UFFDIO_CONTINUE. A backport to
	v5.19 requires minor adjustments due to lack of
	vma_soft_dirty_enabled().

Link: https://lkml.kernel.org/r/20220809205640.70916-1-david@redhat.com
Fixes: 9ae0f87d009c ("mm/shmem: unconditionally set pte dirty in mfill_atomic_install_pte")
Signed-off-by: David Hildenbrand <david@redhat.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: Nadav Amit <nadav.amit@gmail.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: David Laight <David.Laight@ACULAB.COM>
Cc: <stable@vger.kernel.org>	[5.16]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 include/linux/mm.h |    1 
 mm/gup.c           |   69 ++++++++++++++++++++++++++++++++++++-----------------
 mm/huge_memory.c   |   65 +++++++++++++++++++++++++++++++++----------------
 3 files changed, 91 insertions(+), 44 deletions(-)

--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2939,7 +2939,6 @@ struct page *follow_page(struct vm_area_
 #define FOLL_MIGRATION	0x400	/* wait for page to replace migration entry */
 #define FOLL_TRIED	0x800	/* a retry, previous pass started an IO */
 #define FOLL_REMOTE	0x2000	/* we are working on non-current tsk/mm */
-#define FOLL_COW	0x4000	/* internal GUP flag */
 #define FOLL_ANON	0x8000	/* don't do file mappings */
 #define FOLL_LONGTERM	0x10000	/* mapping lifetime is indefinite: see below */
 #define FOLL_SPLIT_PMD	0x20000	/* split huge pmd before returning */
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -478,14 +478,43 @@ static int follow_pfn_pte(struct vm_area
 	return -EEXIST;
 }
 
-/*
- * FOLL_FORCE can write to even unwritable pte's, but only
- * after we've gone through a COW cycle and they are dirty.
- */
-static inline bool can_follow_write_pte(pte_t pte, unsigned int flags)
+/* FOLL_FORCE can write to even unwritable PTEs in COW mappings. */
+static inline bool can_follow_write_pte(pte_t pte, struct page *page,
+					struct vm_area_struct *vma,
+					unsigned int flags)
 {
-	return pte_write(pte) ||
-		((flags & FOLL_FORCE) && (flags & FOLL_COW) && pte_dirty(pte));
+	/* If the pte is writable, we can write to the page. */
+	if (pte_write(pte))
+		return true;
+
+	/* Maybe FOLL_FORCE is set to override it? */
+	if (!(flags & FOLL_FORCE))
+		return false;
+
+	/* But FOLL_FORCE has no effect on shared mappings */
+	if (vma->vm_flags & (VM_MAYSHARE | VM_SHARED))
+		return false;
+
+	/* ... or read-only private ones */
+	if (!(vma->vm_flags & VM_MAYWRITE))
+		return false;
+
+	/* ... or already writable ones that just need to take a write fault */
+	if (vma->vm_flags & VM_WRITE)
+		return false;
+
+	/*
+	 * See can_change_pte_writable(): we broke COW and could map the page
+	 * writable if we have an exclusive anonymous page ...
+	 */
+	if (!page || !PageAnon(page) || !PageAnonExclusive(page))
+		return false;
+
+	/* ... and a write-fault isn't required for other reasons. */
+	if (IS_ENABLED(CONFIG_MEM_SOFT_DIRTY) &&
+	    !(vma->vm_flags & VM_SOFTDIRTY) && !pte_soft_dirty(pte))
+		return false;
+	return !userfaultfd_pte_wp(vma, pte);
 }
 
 static struct page *follow_page_pte(struct vm_area_struct *vma,
@@ -528,12 +557,19 @@ retry:
 	}
 	if ((flags & FOLL_NUMA) && pte_protnone(pte))
 		goto no_page;
-	if ((flags & FOLL_WRITE) && !can_follow_write_pte(pte, flags)) {
-		pte_unmap_unlock(ptep, ptl);
-		return NULL;
-	}
 
 	page = vm_normal_page(vma, address, pte);
+
+	/*
+	 * We only care about anon pages in can_follow_write_pte() and don't
+	 * have to worry about pte_devmap() because they are never anon.
+	 */
+	if ((flags & FOLL_WRITE) &&
+	    !can_follow_write_pte(pte, page, vma, flags)) {
+		page = NULL;
+		goto out;
+	}
+
 	if (!page && pte_devmap(pte) && (flags & (FOLL_GET | FOLL_PIN))) {
 		/*
 		 * Only return device mapping pages in the FOLL_GET or FOLL_PIN
@@ -967,17 +1003,6 @@ static int faultin_page(struct vm_area_s
 		return -EBUSY;
 	}
 
-	/*
-	 * The VM_FAULT_WRITE bit tells us that do_wp_page has broken COW when
-	 * necessary, even if maybe_mkwrite decided not to set pte_write. We
-	 * can thus safely do subsequent page lookups as if they were reads.
-	 * But only do so when looping for pte_write is futile: in some cases
-	 * userspace may also be wanting to write to the gotten user page,
-	 * which a read fault here might prevent (a readonly page might get
-	 * reCOWed by userspace write).
-	 */
-	if ((ret & VM_FAULT_WRITE) && !(vma->vm_flags & VM_WRITE))
-		*flags |= FOLL_COW;
 	return 0;
 }
 
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -978,12 +978,6 @@ struct page *follow_devmap_pmd(struct vm
 
 	assert_spin_locked(pmd_lockptr(mm, pmd));
 
-	/*
-	 * When we COW a devmap PMD entry, we split it into PTEs, so we should
-	 * not be in this function with `flags & FOLL_COW` set.
-	 */
-	WARN_ONCE(flags & FOLL_COW, "mm: In follow_devmap_pmd with FOLL_COW set");
-
 	/* FOLL_GET and FOLL_PIN are mutually exclusive. */
 	if (WARN_ON_ONCE((flags & (FOLL_PIN | FOLL_GET)) ==
 			 (FOLL_PIN | FOLL_GET)))
@@ -1349,14 +1343,43 @@ fallback:
 	return VM_FAULT_FALLBACK;
 }
 
-/*
- * FOLL_FORCE can write to even unwritable pmd's, but only
- * after we've gone through a COW cycle and they are dirty.
- */
-static inline bool can_follow_write_pmd(pmd_t pmd, unsigned int flags)
+/* FOLL_FORCE can write to even unwritable PMDs in COW mappings. */
+static inline bool can_follow_write_pmd(pmd_t pmd, struct page *page,
+					struct vm_area_struct *vma,
+					unsigned int flags)
 {
-	return pmd_write(pmd) ||
-	       ((flags & FOLL_FORCE) && (flags & FOLL_COW) && pmd_dirty(pmd));
+	/* If the pmd is writable, we can write to the page. */
+	if (pmd_write(pmd))
+		return true;
+
+	/* Maybe FOLL_FORCE is set to override it? */
+	if (!(flags & FOLL_FORCE))
+		return false;
+
+	/* But FOLL_FORCE has no effect on shared mappings */
+	if (vma->vm_flags & (VM_MAYSHARE | VM_SHARED))
+		return false;
+
+	/* ... or read-only private ones */
+	if (!(vma->vm_flags & VM_MAYWRITE))
+		return false;
+
+	/* ... or already writable ones that just need to take a write fault */
+	if (vma->vm_flags & VM_WRITE)
+		return false;
+
+	/*
+	 * See can_change_pte_writable(): we broke COW and could map the page
+	 * writable if we have an exclusive anonymous page ...
+	 */
+	if (!page || !PageAnon(page) || !PageAnonExclusive(page))
+		return false;
+
+	/* ... and a write-fault isn't required for other reasons. */
+	if (IS_ENABLED(CONFIG_MEM_SOFT_DIRTY) &&
+	    !(vma->vm_flags & VM_SOFTDIRTY) && !pmd_soft_dirty(pmd))
+		return false;
+	return !userfaultfd_huge_pmd_wp(vma, pmd);
 }
 
 struct page *follow_trans_huge_pmd(struct vm_area_struct *vma,
@@ -1365,12 +1388,16 @@ struct page *follow_trans_huge_pmd(struc
 				   unsigned int flags)
 {
 	struct mm_struct *mm = vma->vm_mm;
-	struct page *page = NULL;
+	struct page *page;
 
 	assert_spin_locked(pmd_lockptr(mm, pmd));
 
-	if (flags & FOLL_WRITE && !can_follow_write_pmd(*pmd, flags))
-		goto out;
+	page = pmd_page(*pmd);
+	VM_BUG_ON_PAGE(!PageHead(page) && !is_zone_device_page(page), page);
+
+	if ((flags & FOLL_WRITE) &&
+	    !can_follow_write_pmd(*pmd, page, vma, flags))
+		return NULL;
 
 	/* Avoid dumping huge zero page */
 	if ((flags & FOLL_DUMP) && is_huge_zero_pmd(*pmd))
@@ -1378,10 +1405,7 @@ struct page *follow_trans_huge_pmd(struc
 
 	/* Full NUMA hinting faults to serialise migration in fault paths */
 	if ((flags & FOLL_NUMA) && pmd_protnone(*pmd))
-		goto out;
-
-	page = pmd_page(*pmd);
-	VM_BUG_ON_PAGE(!PageHead(page) && !is_zone_device_page(page), page);
+		return NULL;
 
 	if (!pmd_write(*pmd) && gup_must_unshare(flags, page))
 		return ERR_PTR(-EMLINK);
@@ -1398,7 +1422,6 @@ struct page *follow_trans_huge_pmd(struc
 	page += (addr & ~HPAGE_PMD_MASK) >> PAGE_SHIFT;
 	VM_BUG_ON_PAGE(!PageCompound(page) && !is_zone_device_page(page), page);
 
-out:
 	return page;
 }
 



^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH 5.19 002/158] NFS: Fix another fsync() issue after a server reboot
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
  2022-08-29 10:57 ` [PATCH 5.19 001/158] mm/gup: fix FOLL_FORCE COW security issue and remove FOLL_COW Greg Kroah-Hartman
@ 2022-08-29 10:57 ` Greg Kroah-Hartman
  2022-08-29 10:57 ` [PATCH 5.19 003/158] audit: fix potential double free on error path from fsnotify_add_inode_mark Greg Kroah-Hartman
                   ` (168 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:57 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Trond Myklebust

From: Trond Myklebust <trond.myklebust@hammerspace.com>

commit 67f4b5dc49913abcdb5cc736e73674e2f352f81d upstream.

Currently, when the writeback code detects a server reboot, it redirties
any pages that were not committed to disk, and it sets the flag
NFS_CONTEXT_RESEND_WRITES in the nfs_open_context of the file descriptor
that dirtied the file. While this allows the file descriptor in question
to redrive its own writes, it violates the fsync() requirement that we
should be synchronising all writes to disk.
While the problem is infrequent, we do see corner cases where an
untimely server reboot causes the fsync() call to abandon its attempt to
sync data to disk and causing data corruption issues due to missed error
conditions or similar.

In order to tighted up the client's ability to deal with this situation
without introducing livelocks, add a counter that records the number of
times pages are redirtied due to a server reboot-like condition, and use
that in fsync() to redrive the sync to disk.

Fixes: 2197e9b06c22 ("NFS: Fix up fsync() when the server rebooted")
Cc: stable@vger.kernel.org
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 fs/nfs/file.c          |   15 ++++++---------
 fs/nfs/inode.c         |    1 +
 fs/nfs/write.c         |    6 ++++--
 include/linux/nfs_fs.h |    1 +
 4 files changed, 12 insertions(+), 11 deletions(-)

--- a/fs/nfs/file.c
+++ b/fs/nfs/file.c
@@ -221,8 +221,10 @@ nfs_file_fsync_commit(struct file *file,
 int
 nfs_file_fsync(struct file *file, loff_t start, loff_t end, int datasync)
 {
-	struct nfs_open_context *ctx = nfs_file_open_context(file);
 	struct inode *inode = file_inode(file);
+	struct nfs_inode *nfsi = NFS_I(inode);
+	long save_nredirtied = atomic_long_read(&nfsi->redirtied_pages);
+	long nredirtied;
 	int ret;
 
 	trace_nfs_fsync_enter(inode);
@@ -237,15 +239,10 @@ nfs_file_fsync(struct file *file, loff_t
 		ret = pnfs_sync_inode(inode, !!datasync);
 		if (ret != 0)
 			break;
-		if (!test_and_clear_bit(NFS_CONTEXT_RESEND_WRITES, &ctx->flags))
+		nredirtied = atomic_long_read(&nfsi->redirtied_pages);
+		if (nredirtied == save_nredirtied)
 			break;
-		/*
-		 * If nfs_file_fsync_commit detected a server reboot, then
-		 * resend all dirty pages that might have been covered by
-		 * the NFS_CONTEXT_RESEND_WRITES flag
-		 */
-		start = 0;
-		end = LLONG_MAX;
+		save_nredirtied = nredirtied;
 	}
 
 	trace_nfs_fsync_exit(inode, ret);
--- a/fs/nfs/inode.c
+++ b/fs/nfs/inode.c
@@ -426,6 +426,7 @@ nfs_ilookup(struct super_block *sb, stru
 static void nfs_inode_init_regular(struct nfs_inode *nfsi)
 {
 	atomic_long_set(&nfsi->nrequests, 0);
+	atomic_long_set(&nfsi->redirtied_pages, 0);
 	INIT_LIST_HEAD(&nfsi->commit_info.list);
 	atomic_long_set(&nfsi->commit_info.ncommit, 0);
 	atomic_set(&nfsi->commit_info.rpcs_out, 0);
--- a/fs/nfs/write.c
+++ b/fs/nfs/write.c
@@ -1419,10 +1419,12 @@ static void nfs_initiate_write(struct nf
  */
 static void nfs_redirty_request(struct nfs_page *req)
 {
+	struct nfs_inode *nfsi = NFS_I(page_file_mapping(req->wb_page)->host);
+
 	/* Bump the transmission count */
 	req->wb_nio++;
 	nfs_mark_request_dirty(req);
-	set_bit(NFS_CONTEXT_RESEND_WRITES, &nfs_req_openctx(req)->flags);
+	atomic_long_inc(&nfsi->redirtied_pages);
 	nfs_end_page_writeback(req);
 	nfs_release_request(req);
 }
@@ -1892,7 +1894,7 @@ static void nfs_commit_release_pages(str
 		/* We have a mismatch. Write the page again */
 		dprintk_cont(" mismatch\n");
 		nfs_mark_request_dirty(req);
-		set_bit(NFS_CONTEXT_RESEND_WRITES, &nfs_req_openctx(req)->flags);
+		atomic_long_inc(&NFS_I(data->inode)->redirtied_pages);
 	next:
 		nfs_unlock_and_release_request(req);
 		/* Latency breaker */
--- a/include/linux/nfs_fs.h
+++ b/include/linux/nfs_fs.h
@@ -182,6 +182,7 @@ struct nfs_inode {
 		/* Regular file */
 		struct {
 			atomic_long_t	nrequests;
+			atomic_long_t	redirtied_pages;
 			struct nfs_mds_commit_info commit_info;
 			struct mutex	commit_mutex;
 		};



^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH 5.19 003/158] audit: fix potential double free on error path from fsnotify_add_inode_mark
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
  2022-08-29 10:57 ` [PATCH 5.19 001/158] mm/gup: fix FOLL_FORCE COW security issue and remove FOLL_COW Greg Kroah-Hartman
  2022-08-29 10:57 ` [PATCH 5.19 002/158] NFS: Fix another fsync() issue after a server reboot Greg Kroah-Hartman
@ 2022-08-29 10:57 ` Greg Kroah-Hartman
  2022-08-29 10:57 ` [PATCH 5.19 004/158] cgroup: Fix race condition at rebind_subsystems() Greg Kroah-Hartman
                   ` (167 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:57 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Gaosheng Cui, Jan Kara, Paul Moore

From: Gaosheng Cui <cuigaosheng1@huawei.com>

commit ad982c3be4e60c7d39c03f782733503cbd88fd2a upstream.

Audit_alloc_mark() assign pathname to audit_mark->path, on error path
from fsnotify_add_inode_mark(), fsnotify_put_mark will free memory
of audit_mark->path, but the caller of audit_alloc_mark will free
the pathname again, so there will be double free problem.

Fix this by resetting audit_mark->path to NULL pointer on error path
from fsnotify_add_inode_mark().

Cc: stable@vger.kernel.org
Fixes: 7b1293234084d ("fsnotify: Add group pointer in fsnotify_init_mark()")
Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 kernel/audit_fsnotify.c |    1 +
 1 file changed, 1 insertion(+)

--- a/kernel/audit_fsnotify.c
+++ b/kernel/audit_fsnotify.c
@@ -102,6 +102,7 @@ struct audit_fsnotify_mark *audit_alloc_
 
 	ret = fsnotify_add_inode_mark(&audit_mark->mark, inode, 0);
 	if (ret < 0) {
+		audit_mark->path = NULL;
 		fsnotify_put_mark(&audit_mark->mark);
 		audit_mark = ERR_PTR(ret);
 	}



^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH 5.19 004/158] cgroup: Fix race condition at rebind_subsystems()
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (2 preceding siblings ...)
  2022-08-29 10:57 ` [PATCH 5.19 003/158] audit: fix potential double free on error path from fsnotify_add_inode_mark Greg Kroah-Hartman
@ 2022-08-29 10:57 ` Greg Kroah-Hartman
  2022-08-29 10:57 ` [PATCH 5.19 005/158] parisc: Make CONFIG_64BIT available for ARCH=parisc64 only Greg Kroah-Hartman
                   ` (166 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:57 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Jing-Ting Wu, Michal Koutný,
	Mukesh Ojha, Tejun Heo

From: Jing-Ting Wu <Jing-Ting.Wu@mediatek.com>

commit 763f4fb76e24959c370cdaa889b2492ba6175580 upstream.

Root cause:
The rebind_subsystems() is no lock held when move css object from A
list to B list,then let B's head be treated as css node at
list_for_each_entry_rcu().

Solution:
Add grace period before invalidating the removed rstat_css_node.

Reported-by: Jing-Ting Wu <jing-ting.wu@mediatek.com>
Suggested-by: Michal Koutný <mkoutny@suse.com>
Signed-off-by: Jing-Ting Wu <jing-ting.wu@mediatek.com>
Tested-by: Jing-Ting Wu <jing-ting.wu@mediatek.com>
Link: https://lore.kernel.org/linux-arm-kernel/d8f0bc5e2fb6ed259f9334c83279b4c011283c41.camel@mediatek.com/T/
Acked-by: Mukesh Ojha <quic_mojha@quicinc.com>
Fixes: a7df69b81aac ("cgroup: rstat: support cgroup1")
Cc: stable@vger.kernel.org # v5.13+
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 kernel/cgroup/cgroup.c |    1 +
 1 file changed, 1 insertion(+)

--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -1811,6 +1811,7 @@ int rebind_subsystems(struct cgroup_root
 
 		if (ss->css_rstat_flush) {
 			list_del_rcu(&css->rstat_css_node);
+			synchronize_rcu();
 			list_add_rcu(&css->rstat_css_node,
 				     &dcgrp->rstat_css_list);
 		}



^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH 5.19 005/158] parisc: Make CONFIG_64BIT available for ARCH=parisc64 only
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (3 preceding siblings ...)
  2022-08-29 10:57 ` [PATCH 5.19 004/158] cgroup: Fix race condition at rebind_subsystems() Greg Kroah-Hartman
@ 2022-08-29 10:57 ` Greg Kroah-Hartman
  2022-08-29 10:57 ` [PATCH 5.19 006/158] parisc: Fix exception handler for fldw and fstw instructions Greg Kroah-Hartman
                   ` (165 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:57 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Masahiro Yamada, Helge Deller, Randy Dunlap

From: Helge Deller <deller@gmx.de>

commit 3dcfb729b5f4a0c9b50742865cd5e6c4dbcc80dc upstream.

With this patch the ARCH= parameter decides if the
CONFIG_64BIT option will be set or not. This means, the
ARCH= parameter will give:

	ARCH=parisc	-> 32-bit kernel
	ARCH=parisc64	-> 64-bit kernel

This simplifies the usage of the other config options like
randconfig, allmodconfig and allyesconfig a lot and produces
the output which is expected for parisc64 (64-bit) vs. parisc (32-bit).

Suggested-by: Masahiro Yamada <masahiroy@kernel.org>
Signed-off-by: Helge Deller <deller@gmx.de>
Tested-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Cc: <stable@vger.kernel.org> # 5.15+
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/parisc/Kconfig |   21 ++++++---------------
 1 file changed, 6 insertions(+), 15 deletions(-)

--- a/arch/parisc/Kconfig
+++ b/arch/parisc/Kconfig
@@ -147,10 +147,10 @@ menu "Processor type and features"
 
 choice
 	prompt "Processor type"
-	default PA7000
+	default PA7000 if "$(ARCH)" = "parisc"
 
 config PA7000
-	bool "PA7000/PA7100"
+	bool "PA7000/PA7100" if "$(ARCH)" = "parisc"
 	help
 	  This is the processor type of your CPU.  This information is
 	  used for optimizing purposes.  In order to compile a kernel
@@ -161,21 +161,21 @@ config PA7000
 	  which is required on some machines.
 
 config PA7100LC
-	bool "PA7100LC"
+	bool "PA7100LC" if "$(ARCH)" = "parisc"
 	help
 	  Select this option for the PCX-L processor, as used in the
 	  712, 715/64, 715/80, 715/100, 715/100XC, 725/100, 743, 748,
 	  D200, D210, D300, D310 and E-class
 
 config PA7200
-	bool "PA7200"
+	bool "PA7200" if "$(ARCH)" = "parisc"
 	help
 	  Select this option for the PCX-T' processor, as used in the
 	  C100, C110, J100, J110, J210XC, D250, D260, D350, D360,
 	  K100, K200, K210, K220, K400, K410 and K420
 
 config PA7300LC
-	bool "PA7300LC"
+	bool "PA7300LC" if "$(ARCH)" = "parisc"
 	help
 	  Select this option for the PCX-L2 processor, as used in the
 	  744, A180, B132L, B160L, B180L, C132L, C160L, C180L,
@@ -225,17 +225,8 @@ config MLONGCALLS
 	  Enabling this option will probably slow down your kernel.
 
 config 64BIT
-	bool "64-bit kernel"
+	def_bool "$(ARCH)" = "parisc64"
 	depends on PA8X00
-	help
-	  Enable this if you want to support 64bit kernel on PA-RISC platform.
-
-	  At the moment, only people willing to use more than 2GB of RAM,
-	  or having a 64bit-only capable PA-RISC machine should say Y here.
-
-	  Since there is no 64bit userland on PA-RISC, there is no point to
-	  enable this option otherwise. The 64bit kernel is significantly bigger
-	  and slower than the 32bit one.
 
 choice
 	prompt "Kernel page size"



^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH 5.19 006/158] parisc: Fix exception handler for fldw and fstw instructions
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (4 preceding siblings ...)
  2022-08-29 10:57 ` [PATCH 5.19 005/158] parisc: Make CONFIG_64BIT available for ARCH=parisc64 only Greg Kroah-Hartman
@ 2022-08-29 10:57 ` Greg Kroah-Hartman
  2022-08-29 10:57 ` [PATCH 5.19 007/158] kernel/sys_ni: add compat entry for fadvise64_64 Greg Kroah-Hartman
                   ` (164 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:57 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Helge Deller

From: Helge Deller <deller@gmx.de>

commit 7ae1f5508d9a33fd58ed3059bd2d569961e3b8bd upstream.

The exception handler is broken for unaligned memory acceses with fldw
and fstw instructions, because it trashes or uses randomly some other
floating point register than the one specified in the instruction word
on loads and stores.

The instruction "fldw 0(addr),%fr22L" (and the other fldw/fstw
instructions) encode the target register (%fr22) in the rightmost 5 bits
of the instruction word. The 7th rightmost bit of the instruction word
defines if the left or right half of %fr22 should be used.

While processing unaligned address accesses, the FR3() define is used to
extract the offset into the local floating-point register set.  But the
calculation in FR3() was buggy, so that for example instead of %fr22,
register %fr12 [((22 * 2) & 0x1f) = 12] was used.

This bug has been since forever in the parisc kernel and I wonder why it
wasn't detected earlier. Interestingly I noticed this bug just because
the libime debian package failed to build on *native* hardware, while it
successfully built in qemu.

This patch corrects the bitshift and masking calculation in FR3().

Signed-off-by: Helge Deller <deller@gmx.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/parisc/kernel/unaligned.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/arch/parisc/kernel/unaligned.c
+++ b/arch/parisc/kernel/unaligned.c
@@ -93,7 +93,7 @@
 #define R1(i) (((i)>>21)&0x1f)
 #define R2(i) (((i)>>16)&0x1f)
 #define R3(i) ((i)&0x1f)
-#define FR3(i) ((((i)<<1)&0x1f)|(((i)>>6)&1))
+#define FR3(i) ((((i)&0x1f)<<1)|(((i)>>6)&1))
 #define IM(i,n) (((i)>>1&((1<<(n-1))-1))|((i)&1?((0-1L)<<(n-1)):0))
 #define IM5_2(i) IM((i)>>16,5)
 #define IM5_3(i) IM((i),5)



^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH 5.19 007/158] kernel/sys_ni: add compat entry for fadvise64_64
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (5 preceding siblings ...)
  2022-08-29 10:57 ` [PATCH 5.19 006/158] parisc: Fix exception handler for fldw and fstw instructions Greg Kroah-Hartman
@ 2022-08-29 10:57 ` Greg Kroah-Hartman
  2022-08-29 10:57 ` [PATCH 5.19 008/158] kprobes: dont call disarm_kprobe() for disabled kprobes Greg Kroah-Hartman
                   ` (163 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:57 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Randy Dunlap, Arnd Bergmann,
	Josh Triplett, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Andrew Morton

From: Randy Dunlap <rdunlap@infradead.org>

commit a8faed3a02eeb75857a3b5d660fa80fe79db77a3 upstream.

When CONFIG_ADVISE_SYSCALLS is not set/enabled and CONFIG_COMPAT is
set/enabled, the riscv compat_syscall_table references
'compat_sys_fadvise64_64', which is not defined:

riscv64-linux-ld: arch/riscv/kernel/compat_syscall_table.o:(.rodata+0x6f8):
undefined reference to `compat_sys_fadvise64_64'

Add 'fadvise64_64' to kernel/sys_ni.c as a conditional COMPAT function so
that when CONFIG_ADVISE_SYSCALLS is not set, there is a fallback function
available.

Link: https://lkml.kernel.org/r/20220807220934.5689-1-rdunlap@infradead.org
Fixes: d3ac21cacc24 ("mm: Support compiling out madvise and fadvise")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Suggested-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 kernel/sys_ni.c |    1 +
 1 file changed, 1 insertion(+)

--- a/kernel/sys_ni.c
+++ b/kernel/sys_ni.c
@@ -277,6 +277,7 @@ COND_SYSCALL(landlock_restrict_self);
 
 /* mm/fadvise.c */
 COND_SYSCALL(fadvise64_64);
+COND_SYSCALL_COMPAT(fadvise64_64);
 
 /* mm/, CONFIG_MMU only */
 COND_SYSCALL(swapon);



^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH 5.19 008/158] kprobes: dont call disarm_kprobe() for disabled kprobes
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (6 preceding siblings ...)
  2022-08-29 10:57 ` [PATCH 5.19 007/158] kernel/sys_ni: add compat entry for fadvise64_64 Greg Kroah-Hartman
@ 2022-08-29 10:57 ` Greg Kroah-Hartman
  2022-08-29 10:57 ` [PATCH 5.19 009/158] mm/uffd: reset write protection when unregister with wp-mode Greg Kroah-Hartman
                   ` (162 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:57 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Kuniyuki Iwashima, Ayushman Dutta,
	Naveen N. Rao, Anil S Keshavamurthy, David S. Miller,
	Masami Hiramatsu, Wang Nan, Kuniyuki Iwashima, Andrew Morton

From: Kuniyuki Iwashima <kuniyu@amazon.com>

commit 9c80e79906b4ca440d09e7f116609262bb747909 upstream.

The assumption in __disable_kprobe() is wrong, and it could try to disarm
an already disarmed kprobe and fire the WARN_ONCE() below. [0]  We can
easily reproduce this issue.

1. Write 0 to /sys/kernel/debug/kprobes/enabled.

  # echo 0 > /sys/kernel/debug/kprobes/enabled

2. Run execsnoop.  At this time, one kprobe is disabled.

  # /usr/share/bcc/tools/execsnoop &
  [1] 2460
  PCOMM            PID    PPID   RET ARGS

  # cat /sys/kernel/debug/kprobes/list
  ffffffff91345650  r  __x64_sys_execve+0x0    [FTRACE]
  ffffffff91345650  k  __x64_sys_execve+0x0    [DISABLED][FTRACE]

3. Write 1 to /sys/kernel/debug/kprobes/enabled, which changes
   kprobes_all_disarmed to false but does not arm the disabled kprobe.

  # echo 1 > /sys/kernel/debug/kprobes/enabled

  # cat /sys/kernel/debug/kprobes/list
  ffffffff91345650  r  __x64_sys_execve+0x0    [FTRACE]
  ffffffff91345650  k  __x64_sys_execve+0x0    [DISABLED][FTRACE]

4. Kill execsnoop, when __disable_kprobe() calls disarm_kprobe() for the
   disabled kprobe and hits the WARN_ONCE() in __disarm_kprobe_ftrace().

  # fg
  /usr/share/bcc/tools/execsnoop
  ^C

Actually, WARN_ONCE() is fired twice, and __unregister_kprobe_top() misses
some cleanups and leaves the aggregated kprobe in the hash table.  Then,
__unregister_trace_kprobe() initialises tk->rp.kp.list and creates an
infinite loop like this.

  aggregated kprobe.list -> kprobe.list -.
                                     ^    |
                                     '.__.'

In this situation, these commands fall into the infinite loop and result
in RCU stall or soft lockup.

  cat /sys/kernel/debug/kprobes/list : show_kprobe_addr() enters into the
                                       infinite loop with RCU.

  /usr/share/bcc/tools/execsnoop : warn_kprobe_rereg() holds kprobe_mutex,
                                   and __get_valid_kprobe() is stuck in
				   the loop.

To avoid the issue, make sure we don't call disarm_kprobe() for disabled
kprobes.

[0]
Failed to disarm kprobe-ftrace at __x64_sys_execve+0x0/0x40 (error -2)
WARNING: CPU: 6 PID: 2460 at kernel/kprobes.c:1130 __disarm_kprobe_ftrace.isra.19 (kernel/kprobes.c:1129)
Modules linked in: ena
CPU: 6 PID: 2460 Comm: execsnoop Not tainted 5.19.0+ #28
Hardware name: Amazon EC2 c5.2xlarge/, BIOS 1.0 10/16/2017
RIP: 0010:__disarm_kprobe_ftrace.isra.19 (kernel/kprobes.c:1129)
Code: 24 8b 02 eb c1 80 3d c4 83 f2 01 00 75 d4 48 8b 75 00 89 c2 48 c7 c7 90 fa 0f 92 89 04 24 c6 05 ab 83 01 e8 e4 94 f0 ff <0f> 0b 8b 04 24 eb b1 89 c6 48 c7 c7 60 fa 0f 92 89 04 24 e8 cc 94
RSP: 0018:ffff9e6ec154bd98 EFLAGS: 00010282
RAX: 0000000000000000 RBX: ffffffff930f7b00 RCX: 0000000000000001
RDX: 0000000080000001 RSI: ffffffff921461c5 RDI: 00000000ffffffff
RBP: ffff89c504286da8 R08: 0000000000000000 R09: c0000000fffeffff
R10: 0000000000000000 R11: ffff9e6ec154bc28 R12: ffff89c502394e40
R13: ffff89c502394c00 R14: ffff9e6ec154bc00 R15: 0000000000000000
FS:  00007fe800398740(0000) GS:ffff89c812d80000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000c00057f010 CR3: 0000000103b54006 CR4: 00000000007706e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
PKRU: 55555554
Call Trace:
<TASK>
 __disable_kprobe (kernel/kprobes.c:1716)
 disable_kprobe (kernel/kprobes.c:2392)
 __disable_trace_kprobe (kernel/trace/trace_kprobe.c:340)
 disable_trace_kprobe (kernel/trace/trace_kprobe.c:429)
 perf_trace_event_unreg.isra.2 (./include/linux/tracepoint.h:93 kernel/trace/trace_event_perf.c:168)
 perf_kprobe_destroy (kernel/trace/trace_event_perf.c:295)
 _free_event (kernel/events/core.c:4971)
 perf_event_release_kernel (kernel/events/core.c:5176)
 perf_release (kernel/events/core.c:5186)
 __fput (fs/file_table.c:321)
 task_work_run (./include/linux/sched.h:2056 (discriminator 1) kernel/task_work.c:179 (discriminator 1))
 exit_to_user_mode_prepare (./include/linux/resume_user_mode.h:49 kernel/entry/common.c:169 kernel/entry/common.c:201)
 syscall_exit_to_user_mode (./arch/x86/include/asm/jump_label.h:55 ./arch/x86/include/asm/nospec-branch.h:384 ./arch/x86/include/asm/entry-common.h:94 kernel/entry/common.c:133 kernel/entry/common.c:296)
 do_syscall_64 (arch/x86/entry/common.c:87)
 entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:120)
RIP: 0033:0x7fe7ff210654
Code: 15 79 89 20 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb be 0f 1f 00 8b 05 9a cd 20 00 48 63 ff 85 c0 75 11 b8 03 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 3a f3 c3 48 83 ec 18 48 89 7c 24 08 e8 34 fc
RSP: 002b:00007ffdbd1d3538 EFLAGS: 00000246 ORIG_RAX: 0000000000000003
RAX: 0000000000000000 RBX: 0000000000000008 RCX: 00007fe7ff210654
RDX: 0000000000000000 RSI: 0000000000002401 RDI: 0000000000000008
RBP: 0000000000000000 R08: 94ae31d6fda838a4 R0900007fe8001c9d30
R10: 00007ffdbd1d34b0 R11: 0000000000000246 R12: 00007ffdbd1d3600
R13: 0000000000000000 R14: fffffffffffffffc R15: 00007ffdbd1d3560
</TASK>

Link: https://lkml.kernel.org/r/20220813020509.90805-1-kuniyu@amazon.com
Fixes: 69d54b916d83 ("kprobes: makes kprobes/enabled works correctly for optimized kprobes.")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reported-by: Ayushman Dutta <ayudutta@amazon.com>
Cc: "Naveen N. Rao" <naveen.n.rao@linux.ibm.com>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Kuniyuki Iwashima <kuniyu@amazon.com>
Cc: Kuniyuki Iwashima <kuni1840@gmail.com>
Cc: Ayushman Dutta <ayudutta@amazon.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 kernel/kprobes.c |    9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -1707,11 +1707,12 @@ static struct kprobe *__disable_kprobe(s
 		/* Try to disarm and disable this/parent probe */
 		if (p == orig_p || aggr_kprobe_disabled(orig_p)) {
 			/*
-			 * If 'kprobes_all_disarmed' is set, 'orig_p'
-			 * should have already been disarmed, so
-			 * skip unneed disarming process.
+			 * Don't be lazy here.  Even if 'kprobes_all_disarmed'
+			 * is false, 'orig_p' might not have been armed yet.
+			 * Note arm_all_kprobes() __tries__ to arm all kprobes
+			 * on the best effort basis.
 			 */
-			if (!kprobes_all_disarmed) {
+			if (!kprobes_all_disarmed && !kprobe_disabled(orig_p)) {
 				ret = disarm_kprobe(orig_p, true);
 				if (ret) {
 					p->flags &= ~KPROBE_FLAG_DISABLED;



^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH 5.19 009/158] mm/uffd: reset write protection when unregister with wp-mode
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (7 preceding siblings ...)
  2022-08-29 10:57 ` [PATCH 5.19 008/158] kprobes: dont call disarm_kprobe() for disabled kprobes Greg Kroah-Hartman
@ 2022-08-29 10:57 ` Greg Kroah-Hartman
  2022-08-29 10:57 ` [PATCH 5.19 010/158] mm/hugetlb: support write-faults in shared mappings Greg Kroah-Hartman
                   ` (161 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:57 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Peter Xu, David Hildenbrand,
	Mike Rapoport, Mike Kravetz, Andrea Arcangeli, Nadav Amit,
	Axel Rasmussen, Andrew Morton

From: Peter Xu <peterx@redhat.com>

commit f369b07c861435bd812a9d14493f71b34132ed6f upstream.

The motivation of this patch comes from a recent report and patchfix from
David Hildenbrand on hugetlb shared handling of wr-protected page [1].

With the reproducer provided in commit message of [1], one can leverage
the uffd-wp lazy-reset of ptes to trigger a hugetlb issue which can affect
not only the attacker process, but also the whole system.

The lazy-reset mechanism of uffd-wp was used to make unregister faster,
meanwhile it has an assumption that any leftover pgtable entries should
only affect the process on its own, so not only the user should be aware
of anything it does, but also it should not affect outside of the process.

But it seems that this is not true, and it can also be utilized to make
some exploit easier.

So far there's no clue showing that the lazy-reset is important to any
userfaultfd users because normally the unregister will only happen once
for a specific range of memory of the lifecycle of the process.

Considering all above, what this patch proposes is to do explicit pte
resets when unregister an uffd region with wr-protect mode enabled.

It should be the same as calling ioctl(UFFDIO_WRITEPROTECT, wp=false)
right before ioctl(UFFDIO_UNREGISTER) for the user.  So potentially it'll
make the unregister slower.  From that pov it's a very slight abi change,
but hopefully nothing should break with this change either.

Regarding to the change itself - core of uffd write [un]protect operation
is moved into a separate function (uffd_wp_range()) and it is reused in
the unregister code path.

Note that the new function will not check for anything, e.g.  ranges or
memory types, because they should have been checked during the previous
UFFDIO_REGISTER or it should have failed already.  It also doesn't check
mmap_changing because we're with mmap write lock held anyway.

I added a Fixes upon introducing of uffd-wp shmem+hugetlbfs because that's
the only issue reported so far and that's the commit David's reproducer
will start working (v5.19+).  But the whole idea actually applies to not
only file memories but also anonymous.  It's just that we don't need to
fix anonymous prior to v5.19- because there's no known way to exploit.

IOW, this patch can also fix the issue reported in [1] as the patch 2 does.

[1] https://lore.kernel.org/all/20220811103435.188481-3-david@redhat.com/

Link: https://lkml.kernel.org/r/20220811201340.39342-1-peterx@redhat.com
Fixes: b1f9e876862d ("mm/uffd: enable write protection for shmem & hugetlbfs")
Signed-off-by: Peter Xu <peterx@redhat.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Nadav Amit <nadav.amit@gmail.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 fs/userfaultfd.c              |  4 ++++
 include/linux/userfaultfd_k.h |  2 ++
 mm/userfaultfd.c              | 29 ++++++++++++++++++-----------
 3 files changed, 24 insertions(+), 11 deletions(-)

diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
index 1c44bf75f916..175de70e3adf 100644
--- a/fs/userfaultfd.c
+++ b/fs/userfaultfd.c
@@ -1601,6 +1601,10 @@ static int userfaultfd_unregister(struct userfaultfd_ctx *ctx,
 			wake_userfault(vma->vm_userfaultfd_ctx.ctx, &range);
 		}
 
+		/* Reset ptes for the whole vma range if wr-protected */
+		if (userfaultfd_wp(vma))
+			uffd_wp_range(mm, vma, start, vma_end - start, false);
+
 		new_flags = vma->vm_flags & ~__VM_UFFD_FLAGS;
 		prev = vma_merge(mm, prev, start, vma_end, new_flags,
 				 vma->anon_vma, vma->vm_file, vma->vm_pgoff,
diff --git a/include/linux/userfaultfd_k.h b/include/linux/userfaultfd_k.h
index 732b522bacb7..e1b8a915e9e9 100644
--- a/include/linux/userfaultfd_k.h
+++ b/include/linux/userfaultfd_k.h
@@ -73,6 +73,8 @@ extern ssize_t mcopy_continue(struct mm_struct *dst_mm, unsigned long dst_start,
 extern int mwriteprotect_range(struct mm_struct *dst_mm,
 			       unsigned long start, unsigned long len,
 			       bool enable_wp, atomic_t *mmap_changing);
+extern void uffd_wp_range(struct mm_struct *dst_mm, struct vm_area_struct *vma,
+			  unsigned long start, unsigned long len, bool enable_wp);
 
 /* mm helpers */
 static inline bool is_mergeable_vm_userfaultfd_ctx(struct vm_area_struct *vma,
diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
index 07d3befc80e4..7327b2573f7c 100644
--- a/mm/userfaultfd.c
+++ b/mm/userfaultfd.c
@@ -703,14 +703,29 @@ ssize_t mcopy_continue(struct mm_struct *dst_mm, unsigned long start,
 			      mmap_changing, 0);
 }
 
+void uffd_wp_range(struct mm_struct *dst_mm, struct vm_area_struct *dst_vma,
+		   unsigned long start, unsigned long len, bool enable_wp)
+{
+	struct mmu_gather tlb;
+	pgprot_t newprot;
+
+	if (enable_wp)
+		newprot = vm_get_page_prot(dst_vma->vm_flags & ~(VM_WRITE));
+	else
+		newprot = vm_get_page_prot(dst_vma->vm_flags);
+
+	tlb_gather_mmu(&tlb, dst_mm);
+	change_protection(&tlb, dst_vma, start, start + len, newprot,
+			  enable_wp ? MM_CP_UFFD_WP : MM_CP_UFFD_WP_RESOLVE);
+	tlb_finish_mmu(&tlb);
+}
+
 int mwriteprotect_range(struct mm_struct *dst_mm, unsigned long start,
 			unsigned long len, bool enable_wp,
 			atomic_t *mmap_changing)
 {
 	struct vm_area_struct *dst_vma;
 	unsigned long page_mask;
-	struct mmu_gather tlb;
-	pgprot_t newprot;
 	int err;
 
 	/*
@@ -750,15 +765,7 @@ int mwriteprotect_range(struct mm_struct *dst_mm, unsigned long start,
 			goto out_unlock;
 	}
 
-	if (enable_wp)
-		newprot = vm_get_page_prot(dst_vma->vm_flags & ~(VM_WRITE));
-	else
-		newprot = vm_get_page_prot(dst_vma->vm_flags);
-
-	tlb_gather_mmu(&tlb, dst_mm);
-	change_protection(&tlb, dst_vma, start, start + len, newprot,
-			  enable_wp ? MM_CP_UFFD_WP : MM_CP_UFFD_WP_RESOLVE);
-	tlb_finish_mmu(&tlb);
+	uffd_wp_range(dst_mm, dst_vma, start, len, enable_wp);
 
 	err = 0;
 out_unlock:
-- 
2.37.2




^ permalink raw reply related	[flat|nested] 174+ messages in thread

* [PATCH 5.19 010/158] mm/hugetlb: support write-faults in shared mappings
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (8 preceding siblings ...)
  2022-08-29 10:57 ` [PATCH 5.19 009/158] mm/uffd: reset write protection when unregister with wp-mode Greg Kroah-Hartman
@ 2022-08-29 10:57 ` Greg Kroah-Hartman
  2022-08-29 10:57 ` [PATCH 5.19 011/158] mt76: mt7921: fix command timeout in AP stop period Greg Kroah-Hartman
                   ` (160 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:57 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, David Hildenbrand, Mike Kravetz,
	Bjorn Helgaas, Cyrill Gorcunov, Hugh Dickins, Jamie Liu,
	Kirill A. Shutemov, Muchun Song, Naoya Horiguchi,
	Pavel Emelyanov, Peter Feiner, Peter Xu, Andrew Morton

From: David Hildenbrand <david@redhat.com>

commit 1d8d14641fd94a01b20a4abbf2749fd8eddcf57b upstream.

If we ever get a write-fault on a write-protected page in a shared
mapping, we'd be in trouble (again).  Instead, we can simply map the page
writable.

And in fact, there is even a way right now to trigger that code via
uffd-wp ever since we stared to support it for shmem in 5.19:

--------------------------------------------------------------------------
 #include <stdio.h>
 #include <stdlib.h>
 #include <string.h>
 #include <fcntl.h>
 #include <unistd.h>
 #include <errno.h>
 #include <sys/mman.h>
 #include <sys/syscall.h>
 #include <sys/ioctl.h>
 #include <linux/userfaultfd.h>

 #define HUGETLB_SIZE (2 * 1024 * 1024u)

 static char *map;
 int uffd;

 static int temp_setup_uffd(void)
 {
 	struct uffdio_api uffdio_api;
 	struct uffdio_register uffdio_register;
 	struct uffdio_writeprotect uffd_writeprotect;
 	struct uffdio_range uffd_range;

 	uffd = syscall(__NR_userfaultfd,
 		       O_CLOEXEC | O_NONBLOCK | UFFD_USER_MODE_ONLY);
 	if (uffd < 0) {
 		fprintf(stderr, "syscall() failed: %d\n", errno);
 		return -errno;
 	}

 	uffdio_api.api = UFFD_API;
 	uffdio_api.features = UFFD_FEATURE_PAGEFAULT_FLAG_WP;
 	if (ioctl(uffd, UFFDIO_API, &uffdio_api) < 0) {
 		fprintf(stderr, "UFFDIO_API failed: %d\n", errno);
 		return -errno;
 	}

 	if (!(uffdio_api.features & UFFD_FEATURE_PAGEFAULT_FLAG_WP)) {
 		fprintf(stderr, "UFFD_FEATURE_WRITEPROTECT missing\n");
 		return -ENOSYS;
 	}

 	/* Register UFFD-WP */
 	uffdio_register.range.start = (unsigned long) map;
 	uffdio_register.range.len = HUGETLB_SIZE;
 	uffdio_register.mode = UFFDIO_REGISTER_MODE_WP;
 	if (ioctl(uffd, UFFDIO_REGISTER, &uffdio_register) < 0) {
 		fprintf(stderr, "UFFDIO_REGISTER failed: %d\n", errno);
 		return -errno;
 	}

 	/* Writeprotect a single page. */
 	uffd_writeprotect.range.start = (unsigned long) map;
 	uffd_writeprotect.range.len = HUGETLB_SIZE;
 	uffd_writeprotect.mode = UFFDIO_WRITEPROTECT_MODE_WP;
 	if (ioctl(uffd, UFFDIO_WRITEPROTECT, &uffd_writeprotect)) {
 		fprintf(stderr, "UFFDIO_WRITEPROTECT failed: %d\n", errno);
 		return -errno;
 	}

 	/* Unregister UFFD-WP without prior writeunprotection. */
 	uffd_range.start = (unsigned long) map;
 	uffd_range.len = HUGETLB_SIZE;
 	if (ioctl(uffd, UFFDIO_UNREGISTER, &uffd_range)) {
 		fprintf(stderr, "UFFDIO_UNREGISTER failed: %d\n", errno);
 		return -errno;
 	}

 	return 0;
 }

 int main(int argc, char **argv)
 {
 	int fd;

 	fd = open("/dev/hugepages/tmp", O_RDWR | O_CREAT);
 	if (!fd) {
 		fprintf(stderr, "open() failed\n");
 		return -errno;
 	}
 	if (ftruncate(fd, HUGETLB_SIZE)) {
 		fprintf(stderr, "ftruncate() failed\n");
 		return -errno;
 	}

 	map = mmap(NULL, HUGETLB_SIZE, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
 	if (map == MAP_FAILED) {
 		fprintf(stderr, "mmap() failed\n");
 		return -errno;
 	}

 	*map = 0;

 	if (temp_setup_uffd())
 		return 1;

 	*map = 0;

 	return 0;
 }
--------------------------------------------------------------------------

Above test fails with SIGBUS when there is only a single free hugetlb page.
 # echo 1 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
 # ./test
 Bus error (core dumped)

And worse, with sufficient free hugetlb pages it will map an anonymous page
into a shared mapping, for example, messing up accounting during unmap
and breaking MAP_SHARED semantics:
 # echo 2 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
 # ./test
 # cat /proc/meminfo | grep HugePages_
 HugePages_Total:       2
 HugePages_Free:        1
 HugePages_Rsvd:    18446744073709551615
 HugePages_Surp:        0

Reason is that uffd-wp doesn't clear the uffd-wp PTE bit when
unregistering and consequently keeps the PTE writeprotected.  Reason for
this is to avoid the additional overhead when unregistering.  Note that
this is the case also for !hugetlb and that we will end up with writable
PTEs that still have the uffd-wp PTE bit set once we return from
hugetlb_wp().  I'm not touching the uffd-wp PTE bit for now, because it
seems to be a generic thing -- wp_page_reuse() also doesn't clear it.

VM_MAYSHARE handling in hugetlb_fault() for FAULT_FLAG_WRITE indicates
that MAP_SHARED handling was at least envisioned, but could never have
worked as expected.

While at it, make sure that we never end up in hugetlb_wp() on write
faults without VM_WRITE, because we don't support maybe_mkwrite()
semantics as commonly used in the !hugetlb case -- for example, in
wp_page_reuse().

Note that there is no need to do any kind of reservation in
hugetlb_fault() in this case ...  because we already have a hugetlb page
mapped R/O that we will simply map writable and we are not dealing with
COW/unsharing.

Link: https://lkml.kernel.org/r/20220811103435.188481-3-david@redhat.com
Fixes: b1f9e876862d ("mm/uffd: enable write protection for shmem & hugetlbfs")
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jamie Liu <jamieliu@google.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Peter Feiner <pfeiner@google.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: <stable@vger.kernel.org>	[5.19]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 mm/hugetlb.c |   26 +++++++++++++++++++-------
 1 file changed, 19 insertions(+), 7 deletions(-)

--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -5232,6 +5232,21 @@ static vm_fault_t hugetlb_wp(struct mm_s
 	VM_BUG_ON(unshare && (flags & FOLL_WRITE));
 	VM_BUG_ON(!unshare && !(flags & FOLL_WRITE));
 
+	/*
+	 * hugetlb does not support FOLL_FORCE-style write faults that keep the
+	 * PTE mapped R/O such as maybe_mkwrite() would do.
+	 */
+	if (WARN_ON_ONCE(!unshare && !(vma->vm_flags & VM_WRITE)))
+		return VM_FAULT_SIGSEGV;
+
+	/* Let's take out MAP_SHARED mappings first. */
+	if (vma->vm_flags & VM_MAYSHARE) {
+		if (unlikely(unshare))
+			return 0;
+		set_huge_ptep_writable(vma, haddr, ptep);
+		return 0;
+	}
+
 	pte = huge_ptep_get(ptep);
 	old_page = pte_page(pte);
 
@@ -5766,12 +5781,11 @@ vm_fault_t hugetlb_fault(struct mm_struc
 	 * If we are going to COW/unshare the mapping later, we examine the
 	 * pending reservations for this page now. This will ensure that any
 	 * allocations necessary to record that reservation occur outside the
-	 * spinlock. For private mappings, we also lookup the pagecache
-	 * page now as it is used to determine if a reservation has been
-	 * consumed.
+	 * spinlock. Also lookup the pagecache page now as it is used to
+	 * determine if a reservation has been consumed.
 	 */
 	if ((flags & (FAULT_FLAG_WRITE|FAULT_FLAG_UNSHARE)) &&
-	    !huge_pte_write(entry)) {
+	    !(vma->vm_flags & VM_MAYSHARE) && !huge_pte_write(entry)) {
 		if (vma_needs_reservation(h, vma, haddr) < 0) {
 			ret = VM_FAULT_OOM;
 			goto out_mutex;
@@ -5779,9 +5793,7 @@ vm_fault_t hugetlb_fault(struct mm_struc
 		/* Just decrements count, does not deallocate */
 		vma_end_reservation(h, vma, haddr);
 
-		if (!(vma->vm_flags & VM_MAYSHARE))
-			pagecache_page = hugetlbfs_pagecache_page(h,
-								vma, haddr);
+		pagecache_page = hugetlbfs_pagecache_page(h, vma, haddr);
 	}
 
 	ptl = huge_pte_lock(h, mm, ptep);



^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH 5.19 011/158] mt76: mt7921: fix command timeout in AP stop period
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (9 preceding siblings ...)
  2022-08-29 10:57 ` [PATCH 5.19 010/158] mm/hugetlb: support write-faults in shared mappings Greg Kroah-Hartman
@ 2022-08-29 10:57 ` Greg Kroah-Hartman
  2022-08-29 10:57 ` [PATCH 5.19 012/158] xfrm: fix refcount leak in __xfrm_policy_check() Greg Kroah-Hartman
                   ` (159 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:57 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Sean Wang, Deren Wu, Felix Fietkau

From: Deren Wu <deren.wu@mediatek.com>

commit 9d958b60ebc2434f2b7eae83d77849e22d1059eb upstream.

Due to AP stop improperly, mt7921 driver would face random command timeout
by chip fw problem. Migrate AP start/stop process to .start_ap/.stop_ap and
congiure BSS network settings in both hooks.

The new flow is shown below.
* AP start
    .start_ap()
      configure BSS network resource
      set BSS to connected state
    .bss_info_changed()
      enable fw beacon offload

* AP stop
    .bss_info_changed()
      disable fw beacon offload (skip this command)
    .stop_ap()
      set BSS to disconnected state (beacon offload disabled automatically)
      destroy BSS network resource

Fixes: 116c69603b01 ("mt76: mt7921: Add AP mode support")
Signed-off-by: Sean Wang <sean.wang@mediatek.com>
Signed-off-by: Deren Wu <deren.wu@mediatek.com>
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/net/wireless/mediatek/mt76/mt76_connac_mcu.c |    2 
 drivers/net/wireless/mediatek/mt76/mt7921/main.c     |   47 +++++++++++++++----
 drivers/net/wireless/mediatek/mt76/mt7921/mcu.c      |    5 --
 drivers/net/wireless/mediatek/mt76/mt7921/mt7921.h   |    2 
 4 files changed, 43 insertions(+), 13 deletions(-)

--- a/drivers/net/wireless/mediatek/mt76/mt76_connac_mcu.c
+++ b/drivers/net/wireless/mediatek/mt76/mt76_connac_mcu.c
@@ -1403,6 +1403,8 @@ int mt76_connac_mcu_uni_add_bss(struct m
 		else
 			conn_type = CONNECTION_INFRA_AP;
 		basic_req.basic.conn_type = cpu_to_le32(conn_type);
+		/* Fully active/deactivate BSS network in AP mode only */
+		basic_req.basic.active = enable;
 		break;
 	case NL80211_IFTYPE_STATION:
 		if (vif->p2p)
--- a/drivers/net/wireless/mediatek/mt76/mt7921/main.c
+++ b/drivers/net/wireless/mediatek/mt76/mt7921/main.c
@@ -653,15 +653,6 @@ static void mt7921_bss_info_changed(stru
 		}
 	}
 
-	if (changed & BSS_CHANGED_BEACON_ENABLED && info->enable_beacon) {
-		struct mt7921_vif *mvif = (struct mt7921_vif *)vif->drv_priv;
-
-		mt76_connac_mcu_uni_add_bss(phy->mt76, vif, &mvif->sta.wcid,
-					    true);
-		mt7921_mcu_sta_update(dev, NULL, vif, true,
-				      MT76_STA_INFO_STATE_NONE);
-	}
-
 	if (changed & (BSS_CHANGED_BEACON |
 		       BSS_CHANGED_BEACON_ENABLED))
 		mt7921_mcu_uni_add_beacon_offload(dev, hw, vif,
@@ -1500,6 +1491,42 @@ mt7921_channel_switch_beacon(struct ieee
 	mt7921_mutex_release(dev);
 }
 
+static int
+mt7921_start_ap(struct ieee80211_hw *hw, struct ieee80211_vif *vif)
+{
+	struct mt7921_vif *mvif = (struct mt7921_vif *)vif->drv_priv;
+	struct mt7921_phy *phy = mt7921_hw_phy(hw);
+	struct mt7921_dev *dev = mt7921_hw_dev(hw);
+	int err;
+
+	err = mt76_connac_mcu_uni_add_bss(phy->mt76, vif, &mvif->sta.wcid,
+					  true);
+	if (err)
+		return err;
+
+	err = mt7921_mcu_set_bss_pm(dev, vif, true);
+	if (err)
+		return err;
+
+	return mt7921_mcu_sta_update(dev, NULL, vif, true,
+				     MT76_STA_INFO_STATE_NONE);
+}
+
+static void
+mt7921_stop_ap(struct ieee80211_hw *hw, struct ieee80211_vif *vif)
+{
+	struct mt7921_vif *mvif = (struct mt7921_vif *)vif->drv_priv;
+	struct mt7921_phy *phy = mt7921_hw_phy(hw);
+	struct mt7921_dev *dev = mt7921_hw_dev(hw);
+	int err;
+
+	err = mt7921_mcu_set_bss_pm(dev, vif, false);
+	if (err)
+		return;
+
+	mt76_connac_mcu_uni_add_bss(phy->mt76, vif, &mvif->sta.wcid, false);
+}
+
 const struct ieee80211_ops mt7921_ops = {
 	.tx = mt7921_tx,
 	.start = mt7921_start,
@@ -1510,6 +1537,8 @@ const struct ieee80211_ops mt7921_ops =
 	.conf_tx = mt7921_conf_tx,
 	.configure_filter = mt7921_configure_filter,
 	.bss_info_changed = mt7921_bss_info_changed,
+	.start_ap = mt7921_start_ap,
+	.stop_ap = mt7921_stop_ap,
 	.sta_state = mt7921_sta_state,
 	.sta_pre_rcu_remove = mt76_sta_pre_rcu_remove,
 	.set_key = mt7921_set_key,
--- a/drivers/net/wireless/mediatek/mt76/mt7921/mcu.c
+++ b/drivers/net/wireless/mediatek/mt76/mt7921/mcu.c
@@ -1020,7 +1020,7 @@ mt7921_mcu_uni_bss_bcnft(struct mt7921_d
 				 &bcnft_req, sizeof(bcnft_req), true);
 }
 
-static int
+int
 mt7921_mcu_set_bss_pm(struct mt7921_dev *dev, struct ieee80211_vif *vif,
 		      bool enable)
 {
@@ -1049,9 +1049,6 @@ mt7921_mcu_set_bss_pm(struct mt7921_dev
 	};
 	int err;
 
-	if (vif->type != NL80211_IFTYPE_STATION)
-		return 0;
-
 	err = mt76_mcu_send_msg(&dev->mt76, MCU_CE_CMD(SET_BSS_ABORT),
 				&req_hdr, sizeof(req_hdr), false);
 	if (err < 0 || !enable)
--- a/drivers/net/wireless/mediatek/mt76/mt7921/mt7921.h
+++ b/drivers/net/wireless/mediatek/mt76/mt7921/mt7921.h
@@ -280,6 +280,8 @@ int mt7921_wpdma_reset(struct mt7921_dev
 int mt7921_wpdma_reinit_cond(struct mt7921_dev *dev);
 void mt7921_dma_cleanup(struct mt7921_dev *dev);
 int mt7921_run_firmware(struct mt7921_dev *dev);
+int mt7921_mcu_set_bss_pm(struct mt7921_dev *dev, struct ieee80211_vif *vif,
+			  bool enable);
 int mt7921_mcu_sta_update(struct mt7921_dev *dev, struct ieee80211_sta *sta,
 			  struct ieee80211_vif *vif, bool enable,
 			  enum mt76_sta_info_state state);



^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH 5.19 012/158] xfrm: fix refcount leak in __xfrm_policy_check()
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (10 preceding siblings ...)
  2022-08-29 10:57 ` [PATCH 5.19 011/158] mt76: mt7921: fix command timeout in AP stop period Greg Kroah-Hartman
@ 2022-08-29 10:57 ` Greg Kroah-Hartman
  2022-08-29 10:57 ` [PATCH 5.19 013/158] Revert "xfrm: update SA curlft.use_time" Greg Kroah-Hartman
                   ` (158 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:57 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Xin Xiong, Xin Tan, Steffen Klassert,
	Sasha Levin

From: Xin Xiong <xiongx18@fudan.edu.cn>

[ Upstream commit 9c9cb23e00ddf45679b21b4dacc11d1ae7961ebe ]

The issue happens on an error path in __xfrm_policy_check(). When the
fetching process of the object `pols[1]` fails, the function simply
returns 0, forgetting to decrement the reference count of `pols[0]`,
which is incremented earlier by either xfrm_sk_policy_lookup() or
xfrm_policy_lookup(). This may result in memory leaks.

Fix it by decreasing the reference count of `pols[0]` in that path.

Fixes: 134b0fc544ba ("IPsec: propagate security module errors up from flow_cache_lookup")
Signed-off-by: Xin Xiong <xiongx18@fudan.edu.cn>
Signed-off-by: Xin Tan <tanxin.ctf@gmail.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 net/xfrm/xfrm_policy.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/net/xfrm/xfrm_policy.c b/net/xfrm/xfrm_policy.c
index f1a0bab920a55..4f8bbb825abcb 100644
--- a/net/xfrm/xfrm_policy.c
+++ b/net/xfrm/xfrm_policy.c
@@ -3599,6 +3599,7 @@ int __xfrm_policy_check(struct sock *sk, int dir, struct sk_buff *skb,
 		if (pols[1]) {
 			if (IS_ERR(pols[1])) {
 				XFRM_INC_STATS(net, LINUX_MIB_XFRMINPOLERROR);
+				xfrm_pol_put(pols[0]);
 				return 0;
 			}
 			pols[1]->curlft.use_time = ktime_get_real_seconds();
-- 
2.35.1




^ permalink raw reply related	[flat|nested] 174+ messages in thread

* [PATCH 5.19 013/158] Revert "xfrm: update SA curlft.use_time"
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (11 preceding siblings ...)
  2022-08-29 10:57 ` [PATCH 5.19 012/158] xfrm: fix refcount leak in __xfrm_policy_check() Greg Kroah-Hartman
@ 2022-08-29 10:57 ` Greg Kroah-Hartman
  2022-08-29 10:57 ` [PATCH 5.19 014/158] xfrm: clone missing x->lastused in xfrm_do_migrate Greg Kroah-Hartman
                   ` (157 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:57 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Antony Antony, Steffen Klassert, Sasha Levin

From: Antony Antony <antony.antony@secunet.com>

[ Upstream commit 717ada9f10f2de8c4f4d72ad045f3b67a7ced715 ]

This reverts commit af734a26a1a95a9fda51f2abb0c22a7efcafd5ca.

The abvoce commit is a regression according RFC 2367. A better fix would be
use x->lastused. Which will be propsed later.

according to RFC 2367 use_time == sadb_lifetime_usetime.

"sadb_lifetime_usetime
                   For CURRENT, the time, in seconds, when association
                   was first used. For HARD and SOFT, the number of
                   seconds after the first use of the association until
                   it expires."

Fixes: af734a26a1a9 ("xfrm: update SA curlft.use_time")
Signed-off-by: Antony Antony <antony.antony@secunet.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 net/xfrm/xfrm_input.c  | 1 -
 net/xfrm/xfrm_output.c | 1 -
 2 files changed, 2 deletions(-)

diff --git a/net/xfrm/xfrm_input.c b/net/xfrm/xfrm_input.c
index 144238a50f3d4..70a8c36f0ba6e 100644
--- a/net/xfrm/xfrm_input.c
+++ b/net/xfrm/xfrm_input.c
@@ -669,7 +669,6 @@ int xfrm_input(struct sk_buff *skb, int nexthdr, __be32 spi, int encap_type)
 
 		x->curlft.bytes += skb->len;
 		x->curlft.packets++;
-		x->curlft.use_time = ktime_get_real_seconds();
 
 		spin_unlock(&x->lock);
 
diff --git a/net/xfrm/xfrm_output.c b/net/xfrm/xfrm_output.c
index 555ab35cd119a..9a5e79a38c679 100644
--- a/net/xfrm/xfrm_output.c
+++ b/net/xfrm/xfrm_output.c
@@ -534,7 +534,6 @@ static int xfrm_output_one(struct sk_buff *skb, int err)
 
 		x->curlft.bytes += skb->len;
 		x->curlft.packets++;
-		x->curlft.use_time = ktime_get_real_seconds();
 
 		spin_unlock_bh(&x->lock);
 
-- 
2.35.1




^ permalink raw reply related	[flat|nested] 174+ messages in thread

* [PATCH 5.19 014/158] xfrm: clone missing x->lastused in xfrm_do_migrate
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (12 preceding siblings ...)
  2022-08-29 10:57 ` [PATCH 5.19 013/158] Revert "xfrm: update SA curlft.use_time" Greg Kroah-Hartman
@ 2022-08-29 10:57 ` Greg Kroah-Hartman
  2022-08-29 10:57 ` [PATCH 5.19 015/158] af_key: Do not call xfrm_probe_algs in parallel Greg Kroah-Hartman
                   ` (156 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:57 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Antony Antony, Steffen Klassert, Sasha Levin

From: Antony Antony <antony.antony@secunet.com>

[ Upstream commit 6aa811acdb76facca0b705f4e4c1d948ccb6af8b ]

x->lastused was not cloned in xfrm_do_migrate. Add it to clone during
migrate.

Fixes: 80c9abaabf42 ("[XFRM]: Extension for dynamic update of endpoint address(es)")
Signed-off-by: Antony Antony <antony.antony@secunet.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 net/xfrm/xfrm_state.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c
index ccfb172eb5b8d..11d89af9cb55a 100644
--- a/net/xfrm/xfrm_state.c
+++ b/net/xfrm/xfrm_state.c
@@ -1592,6 +1592,7 @@ static struct xfrm_state *xfrm_state_clone(struct xfrm_state *orig,
 	x->replay = orig->replay;
 	x->preplay = orig->preplay;
 	x->mapping_maxage = orig->mapping_maxage;
+	x->lastused = orig->lastused;
 	x->new_mapping = 0;
 	x->new_mapping_sport = 0;
 
-- 
2.35.1




^ permalink raw reply related	[flat|nested] 174+ messages in thread

* [PATCH 5.19 015/158] af_key: Do not call xfrm_probe_algs in parallel
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (13 preceding siblings ...)
  2022-08-29 10:57 ` [PATCH 5.19 014/158] xfrm: clone missing x->lastused in xfrm_do_migrate Greg Kroah-Hartman
@ 2022-08-29 10:57 ` Greg Kroah-Hartman
  2022-08-29 10:57 ` [PATCH 5.19 016/158] xfrm: policy: fix metadata dst->dev xmit null pointer dereference Greg Kroah-Hartman
                   ` (155 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:57 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Abhishek Shah, Herbert Xu,
	Steffen Klassert, Sasha Levin

From: Herbert Xu <herbert@gondor.apana.org.au>

[ Upstream commit ba953a9d89a00c078b85f4b190bc1dde66fe16b5 ]

When namespace support was added to xfrm/afkey, it caused the
previously single-threaded call to xfrm_probe_algs to become
multi-threaded.  This is buggy and needs to be fixed with a mutex.

Reported-by: Abhishek Shah <abhishek.shah@columbia.edu>
Fixes: 283bc9f35bbb ("xfrm: Namespacify xfrm state/policy locks")
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 net/key/af_key.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/net/key/af_key.c b/net/key/af_key.c
index fb16d7c4e1b8d..20e73643b9c89 100644
--- a/net/key/af_key.c
+++ b/net/key/af_key.c
@@ -1697,9 +1697,12 @@ static int pfkey_register(struct sock *sk, struct sk_buff *skb, const struct sad
 		pfk->registered |= (1<<hdr->sadb_msg_satype);
 	}
 
+	mutex_lock(&pfkey_mutex);
 	xfrm_probe_algs();
 
 	supp_skb = compose_sadb_supported(hdr, GFP_KERNEL | __GFP_ZERO);
+	mutex_unlock(&pfkey_mutex);
+
 	if (!supp_skb) {
 		if (hdr->sadb_msg_satype != SADB_SATYPE_UNSPEC)
 			pfk->registered &= ~(1<<hdr->sadb_msg_satype);
-- 
2.35.1




^ permalink raw reply related	[flat|nested] 174+ messages in thread

* [PATCH 5.19 016/158] xfrm: policy: fix metadata dst->dev xmit null pointer dereference
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (14 preceding siblings ...)
  2022-08-29 10:57 ` [PATCH 5.19 015/158] af_key: Do not call xfrm_probe_algs in parallel Greg Kroah-Hartman
@ 2022-08-29 10:57 ` Greg Kroah-Hartman
  2022-08-29 10:57 ` [PATCH 5.19 017/158] fs: require CAP_SYS_ADMIN in target namespace for idmapped mounts Greg Kroah-Hartman
                   ` (154 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:57 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Steffen Klassert, Daniel Borkmann,
	Nikolay Aleksandrov, Sasha Levin

From: Nikolay Aleksandrov <razor@blackwall.org>

[ Upstream commit 17ecd4a4db4783392edd4944f5e8268205083f70 ]

When we try to transmit an skb with metadata_dst attached (i.e. dst->dev
== NULL) through xfrm interface we can hit a null pointer dereference[1]
in xfrmi_xmit2() -> xfrm_lookup_with_ifid() due to the check for a
loopback skb device when there's no policy which dereferences dst->dev
unconditionally. Not having dst->dev can be interepreted as it not being
a loopback device, so just add a check for a null dst_orig->dev.

With this fix xfrm interface's Tx error counters go up as usual.

[1] net-next calltrace captured via netconsole:
  BUG: kernel NULL pointer dereference, address: 00000000000000c0
  #PF: supervisor read access in kernel mode
  #PF: error_code(0x0000) - not-present page
  PGD 0 P4D 0
  Oops: 0000 [#1] PREEMPT SMP
  CPU: 1 PID: 7231 Comm: ping Kdump: loaded Not tainted 5.19.0+ #24
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.0-1.fc36 04/01/2014
  RIP: 0010:xfrm_lookup_with_ifid+0x5eb/0xa60
  Code: 8d 74 24 38 e8 26 a4 37 00 48 89 c1 e9 12 fc ff ff 49 63 ed 41 83 fd be 0f 85 be 01 00 00 41 be ff ff ff ff 45 31 ed 48 8b 03 <f6> 80 c0 00 00 00 08 75 0f 41 80 bc 24 19 0d 00 00 01 0f 84 1e 02
  RSP: 0018:ffffb0db82c679f0 EFLAGS: 00010246
  RAX: 0000000000000000 RBX: ffffd0db7fcad430 RCX: ffffb0db82c67a10
  RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffb0db82c67a80
  RBP: ffffb0db82c67a80 R08: ffffb0db82c67a14 R09: 0000000000000000
  R10: 0000000000000000 R11: ffff8fa449667dc8 R12: ffffffff966db880
  R13: 0000000000000000 R14: 00000000ffffffff R15: 0000000000000000
  FS:  00007ff35c83f000(0000) GS:ffff8fa478480000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00000000000000c0 CR3: 000000001ebb7000 CR4: 0000000000350ee0
  Call Trace:
   <TASK>
   xfrmi_xmit+0xde/0x460
   ? tcf_bpf_act+0x13d/0x2a0
   dev_hard_start_xmit+0x72/0x1e0
   __dev_queue_xmit+0x251/0xd30
   ip_finish_output2+0x140/0x550
   ip_push_pending_frames+0x56/0x80
   raw_sendmsg+0x663/0x10a0
   ? try_charge_memcg+0x3fd/0x7a0
   ? __mod_memcg_lruvec_state+0x93/0x110
   ? sock_sendmsg+0x30/0x40
   sock_sendmsg+0x30/0x40
   __sys_sendto+0xeb/0x130
   ? handle_mm_fault+0xae/0x280
   ? do_user_addr_fault+0x1e7/0x680
   ? kvm_read_and_reset_apf_flags+0x3b/0x50
   __x64_sys_sendto+0x20/0x30
   do_syscall_64+0x34/0x80
   entry_SYSCALL_64_after_hwframe+0x46/0xb0
  RIP: 0033:0x7ff35cac1366
  Code: eb 0b 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b8 0f 1f 00 41 89 ca 64 8b 04 25 18 00 00 00 85 c0 75 11 b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 72 c3 90 55 48 83 ec 30 44 89 4c 24 2c 4c 89
  RSP: 002b:00007fff738e4028 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
  RAX: ffffffffffffffda RBX: 00007fff738e57b0 RCX: 00007ff35cac1366
  RDX: 0000000000000040 RSI: 0000557164e4b450 RDI: 0000000000000003
  RBP: 0000557164e4b450 R08: 00007fff738e7a2c R09: 0000000000000010
  R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000040
  R13: 00007fff738e5770 R14: 00007fff738e4030 R15: 0000001d00000001
   </TASK>
  Modules linked in: netconsole veth br_netfilter bridge bonding virtio_net [last unloaded: netconsole]
  CR2: 00000000000000c0

CC: Steffen Klassert <steffen.klassert@secunet.com>
CC: Daniel Borkmann <daniel@iogearbox.net>
Fixes: 2d151d39073a ("xfrm: Add possibility to set the default to block if we have no policy")
Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 net/xfrm/xfrm_policy.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/xfrm/xfrm_policy.c b/net/xfrm/xfrm_policy.c
index 4f8bbb825abcb..cc6ab79609e29 100644
--- a/net/xfrm/xfrm_policy.c
+++ b/net/xfrm/xfrm_policy.c
@@ -3162,7 +3162,7 @@ struct dst_entry *xfrm_lookup_with_ifid(struct net *net,
 	return dst;
 
 nopol:
-	if (!(dst_orig->dev->flags & IFF_LOOPBACK) &&
+	if ((!dst_orig->dev || !(dst_orig->dev->flags & IFF_LOOPBACK)) &&
 	    net->xfrm.policy_default[dir] == XFRM_USERPOLICY_BLOCK) {
 		err = -EPERM;
 		goto error;
-- 
2.35.1




^ permalink raw reply related	[flat|nested] 174+ messages in thread

* [PATCH 5.19 017/158] fs: require CAP_SYS_ADMIN in target namespace for idmapped mounts
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (15 preceding siblings ...)
  2022-08-29 10:57 ` [PATCH 5.19 016/158] xfrm: policy: fix metadata dst->dev xmit null pointer dereference Greg Kroah-Hartman
@ 2022-08-29 10:57 ` Greg Kroah-Hartman
  2022-08-29 10:57 ` [PATCH 5.19 018/158] Revert "net: macsec: update SCI upon MAC address change." Greg Kroah-Hartman
                   ` (153 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:57 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Seth Forshee,
	Christian Brauner (Microsoft),
	Sasha Levin

From: Seth Forshee <sforshee@digitalocean.com>

[ Upstream commit bf1ac16edf6770a92bc75cf2373f1f9feea398a4 ]

Idmapped mounts should not allow a user to map file ownsership into a
range of ids which is not under the control of that user. However, we
currently don't check whether the mounter is privileged wrt to the
target user namespace.

Currently no FS_USERNS_MOUNT filesystems support idmapped mounts, thus
this is not a problem as only CAP_SYS_ADMIN in init_user_ns is allowed
to set up idmapped mounts. But this could change in the future, so add a
check to refuse to create idmapped mounts when the mounter does not have
CAP_SYS_ADMIN in the target user namespace.

Fixes: bd303368b776 ("fs: support mapped mounts of mapped filesystems")
Signed-off-by: Seth Forshee <sforshee@digitalocean.com>
Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org>
Link: https://lore.kernel.org/r/20220816164752.2595240-1-sforshee@digitalocean.com
Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 fs/namespace.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/fs/namespace.c b/fs/namespace.c
index e6a7e769d25dd..a59f8d645654a 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -4238,6 +4238,13 @@ static int build_mount_idmapped(const struct mount_attr *attr, size_t usize,
 		err = -EPERM;
 		goto out_fput;
 	}
+
+	/* We're not controlling the target namespace. */
+	if (!ns_capable(mnt_userns, CAP_SYS_ADMIN)) {
+		err = -EPERM;
+		goto out_fput;
+	}
+
 	kattr->mnt_userns = get_user_ns(mnt_userns);
 
 out_fput:
-- 
2.35.1




^ permalink raw reply related	[flat|nested] 174+ messages in thread

* [PATCH 5.19 018/158] Revert "net: macsec: update SCI upon MAC address change."
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (16 preceding siblings ...)
  2022-08-29 10:57 ` [PATCH 5.19 017/158] fs: require CAP_SYS_ADMIN in target namespace for idmapped mounts Greg Kroah-Hartman
@ 2022-08-29 10:57 ` Greg Kroah-Hartman
  2022-08-29 10:57 ` [PATCH 5.19 019/158] NFSv4.2 fix problems with __nfs42_ssc_open Greg Kroah-Hartman
                   ` (152 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:57 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Sabrina Dubroca, Jakub Kicinski, Sasha Levin

From: Sabrina Dubroca <sd@queasysnail.net>

[ Upstream commit e82c649e851c9c25367fb7a2a6cf3479187de467 ]

This reverts commit 6fc498bc82929ee23aa2f35a828c6178dfd3f823.

Commit 6fc498bc8292 states:

    SCI should be updated, because it contains MAC in its first 6
    octets.

That's not entirely correct. The SCI can be based on the MAC address,
but doesn't have to be. We can also use any 64-bit number as the
SCI. When the SCI based on the MAC address, it uses a 16-bit "port
number" provided by userspace, which commit 6fc498bc8292 overwrites
with 1.

In addition, changing the SCI after macsec has been setup can just
confuse the receiver. If we configure the RXSC on the peer based on
the original SCI, we should keep the same SCI on TX.

When the macsec device is being managed by a userspace key negotiation
daemon such as wpa_supplicant, commit 6fc498bc8292 would also
overwrite the SCI defined by userspace.

Fixes: 6fc498bc8292 ("net: macsec: update SCI upon MAC address change.")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Link: https://lore.kernel.org/r/9b1a9d28327e7eb54550a92eebda45d25e54dd0d.1660667033.git.sd@queasysnail.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/macsec.c | 11 +++++------
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/drivers/net/macsec.c b/drivers/net/macsec.c
index f354fad05714a..5b0b23e55fa76 100644
--- a/drivers/net/macsec.c
+++ b/drivers/net/macsec.c
@@ -449,11 +449,6 @@ static struct macsec_eth_header *macsec_ethhdr(struct sk_buff *skb)
 	return (struct macsec_eth_header *)skb_mac_header(skb);
 }
 
-static sci_t dev_to_sci(struct net_device *dev, __be16 port)
-{
-	return make_sci(dev->dev_addr, port);
-}
-
 static void __macsec_pn_wrapped(struct macsec_secy *secy,
 				struct macsec_tx_sa *tx_sa)
 {
@@ -3622,7 +3617,6 @@ static int macsec_set_mac_address(struct net_device *dev, void *p)
 
 out:
 	eth_hw_addr_set(dev, addr->sa_data);
-	macsec->secy.sci = dev_to_sci(dev, MACSEC_PORT_ES);
 
 	/* If h/w offloading is available, propagate to the device */
 	if (macsec_is_offloaded(macsec)) {
@@ -3960,6 +3954,11 @@ static bool sci_exists(struct net_device *dev, sci_t sci)
 	return false;
 }
 
+static sci_t dev_to_sci(struct net_device *dev, __be16 port)
+{
+	return make_sci(dev->dev_addr, port);
+}
+
 static int macsec_add_dev(struct net_device *dev, sci_t sci, u8 icv_len)
 {
 	struct macsec_dev *macsec = macsec_priv(dev);
-- 
2.35.1




^ permalink raw reply related	[flat|nested] 174+ messages in thread

* [PATCH 5.19 019/158] NFSv4.2 fix problems with __nfs42_ssc_open
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (17 preceding siblings ...)
  2022-08-29 10:57 ` [PATCH 5.19 018/158] Revert "net: macsec: update SCI upon MAC address change." Greg Kroah-Hartman
@ 2022-08-29 10:57 ` Greg Kroah-Hartman
  2022-08-29 10:57 ` [PATCH 5.19 020/158] SUNRPC: RPC level errors should set task->tk_rpc_status Greg Kroah-Hartman
                   ` (151 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:57 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Al Viro, Olga Kornievskaia,
	Trond Myklebust, Sasha Levin

From: Olga Kornievskaia <kolga@netapp.com>

[ Upstream commit fcfc8be1e9cf2f12b50dce8b579b3ae54443a014 ]

A destination server while doing a COPY shouldn't accept using the
passed in filehandle if its not a regular filehandle.

If alloc_file_pseudo() has failed, we need to decrement a reference
on the newly created inode, otherwise it leaks.

Reported-by: Al Viro <viro@zeniv.linux.org.uk>
Fixes: ec4b092508982 ("NFS: inter ssc open")
Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 fs/nfs/nfs4file.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/fs/nfs/nfs4file.c b/fs/nfs/nfs4file.c
index e88f6b18445ec..9eb1812878795 100644
--- a/fs/nfs/nfs4file.c
+++ b/fs/nfs/nfs4file.c
@@ -340,6 +340,11 @@ static struct file *__nfs42_ssc_open(struct vfsmount *ss_mnt,
 		goto out;
 	}
 
+	if (!S_ISREG(fattr->mode)) {
+		res = ERR_PTR(-EBADF);
+		goto out;
+	}
+
 	res = ERR_PTR(-ENOMEM);
 	len = strlen(SSC_READ_NAME_BODY) + 16;
 	read_name = kzalloc(len, GFP_KERNEL);
@@ -357,6 +362,7 @@ static struct file *__nfs42_ssc_open(struct vfsmount *ss_mnt,
 				     r_ino->i_fop);
 	if (IS_ERR(filep)) {
 		res = ERR_CAST(filep);
+		iput(r_ino);
 		goto out_free_name;
 	}
 
-- 
2.35.1




^ permalink raw reply related	[flat|nested] 174+ messages in thread

* [PATCH 5.19 020/158] SUNRPC: RPC level errors should set task->tk_rpc_status
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (18 preceding siblings ...)
  2022-08-29 10:57 ` [PATCH 5.19 019/158] NFSv4.2 fix problems with __nfs42_ssc_open Greg Kroah-Hartman
@ 2022-08-29 10:57 ` Greg Kroah-Hartman
  2022-08-29 10:57 ` [PATCH 5.19 021/158] mm/smaps: dont access young/dirty bit if pte unpresent Greg Kroah-Hartman
                   ` (150 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:57 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Trond Myklebust, Sasha Levin

From: Trond Myklebust <trond.myklebust@hammerspace.com>

[ Upstream commit ed06fce0b034b2e25bd93430f5c4cbb28036cc1a ]

Fix up a case in call_encode() where we're failing to set
task->tk_rpc_status when an RPC level error occurred.

Fixes: 9c5948c24869 ("SUNRPC: task should be exit if encode return EKEYEXPIRED more times")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 net/sunrpc/clnt.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c
index 733f9f2260926..c1a01947530f0 100644
--- a/net/sunrpc/clnt.c
+++ b/net/sunrpc/clnt.c
@@ -1888,7 +1888,7 @@ call_encode(struct rpc_task *task)
 			break;
 		case -EKEYEXPIRED:
 			if (!task->tk_cred_retry) {
-				rpc_exit(task, task->tk_status);
+				rpc_call_rpcerror(task, task->tk_status);
 			} else {
 				task->tk_action = call_refresh;
 				task->tk_cred_retry--;
-- 
2.35.1




^ permalink raw reply related	[flat|nested] 174+ messages in thread

* [PATCH 5.19 021/158] mm/smaps: dont access young/dirty bit if pte unpresent
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (19 preceding siblings ...)
  2022-08-29 10:57 ` [PATCH 5.19 020/158] SUNRPC: RPC level errors should set task->tk_rpc_status Greg Kroah-Hartman
@ 2022-08-29 10:57 ` Greg Kroah-Hartman
  2022-08-29 10:57 ` [PATCH 5.19 022/158] ntfs: fix acl handling Greg Kroah-Hartman
                   ` (149 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:57 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Peter Xu, Vlastimil Babka,
	David Hildenbrand, Yang Shi, Konstantin Khlebnikov, Huang Ying,
	Andrew Morton, Sasha Levin

From: Peter Xu <peterx@redhat.com>

[ Upstream commit efd4149342db2df41b1bbe68972ead853b30e444 ]

These bits should only be valid when the ptes are present.  Introducing
two booleans for it and set it to false when !pte_present() for both pte
and pmd accountings.

The bug is found during code reading and no real world issue reported, but
logically such an error can cause incorrect readings for either smaps or
smaps_rollup output on quite a few fields.

For example, it could cause over-estimate on values like Shared_Dirty,
Private_Dirty, Referenced.  Or it could also cause under-estimate on
values like LazyFree, Shared_Clean, Private_Clean.

Link: https://lkml.kernel.org/r/20220805160003.58929-1-peterx@redhat.com
Fixes: b1d4d9e0cbd0 ("proc/smaps: carefully handle migration entries")
Fixes: c94b6923fa0a ("/proc/PID/smaps: Add PMD migration entry parsing")
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Yang Shi <shy828301@gmail.com>
Cc: Konstantin Khlebnikov <khlebnikov@openvz.org>
Cc: Huang Ying <ying.huang@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 fs/proc/task_mmu.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 2d04e3470d4cd..313788bc0c307 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -525,10 +525,12 @@ static void smaps_pte_entry(pte_t *pte, unsigned long addr,
 	struct vm_area_struct *vma = walk->vma;
 	bool locked = !!(vma->vm_flags & VM_LOCKED);
 	struct page *page = NULL;
-	bool migration = false;
+	bool migration = false, young = false, dirty = false;
 
 	if (pte_present(*pte)) {
 		page = vm_normal_page(vma, addr, *pte);
+		young = pte_young(*pte);
+		dirty = pte_dirty(*pte);
 	} else if (is_swap_pte(*pte)) {
 		swp_entry_t swpent = pte_to_swp_entry(*pte);
 
@@ -558,8 +560,7 @@ static void smaps_pte_entry(pte_t *pte, unsigned long addr,
 	if (!page)
 		return;
 
-	smaps_account(mss, page, false, pte_young(*pte), pte_dirty(*pte),
-		      locked, migration);
+	smaps_account(mss, page, false, young, dirty, locked, migration);
 }
 
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
-- 
2.35.1




^ permalink raw reply related	[flat|nested] 174+ messages in thread

* [PATCH 5.19 022/158] ntfs: fix acl handling
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (20 preceding siblings ...)
  2022-08-29 10:57 ` [PATCH 5.19 021/158] mm/smaps: dont access young/dirty bit if pte unpresent Greg Kroah-Hartman
@ 2022-08-29 10:57 ` Greg Kroah-Hartman
  2022-08-29 10:57 ` [PATCH 5.19 023/158] rose: check NULL rose_loopback_neigh->loopback Greg Kroah-Hartman
                   ` (148 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:57 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Konstantin Komarov, ntfs3,
	Christian Brauner (Microsoft),
	Sasha Levin

From: Christian Brauner <brauner@kernel.org>

[ Upstream commit 0c3bc7899e6dfb52df1c46118a5a670ae619645f ]

While looking at our current POSIX ACL handling in the context of some
overlayfs work I went through a range of other filesystems checking how they
handle them currently and encountered ntfs3.

The posic_acl_{from,to}_xattr() helpers always need to operate on the
filesystem idmapping. Since ntfs3 can only be mounted in the initial user
namespace the relevant idmapping is init_user_ns.

The posix_acl_{from,to}_xattr() helpers are concerned with translating between
the kernel internal struct posix_acl{_entry} and the uapi struct
posix_acl_xattr_{header,entry} and the kernel internal data structure is cached
filesystem wide.

Additional idmappings such as the caller's idmapping or the mount's idmapping
are handled higher up in the VFS. Individual filesystems usually do not need to
concern themselves with these.

The posix_acl_valid() helper is concerned with checking whether the values in
the kernel internal struct posix_acl can be represented in the filesystem's
idmapping. IOW, if they can be written to disk. So this helper too needs to
take the filesystem's idmapping.

Fixes: be71b5cba2e6 ("fs/ntfs3: Add attrib operations")
Cc: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
Cc: ntfs3@lists.linux.dev
Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 fs/ntfs3/xattr.c | 16 +++++++---------
 1 file changed, 7 insertions(+), 9 deletions(-)

diff --git a/fs/ntfs3/xattr.c b/fs/ntfs3/xattr.c
index 1b8c89dbf6684..3629049decac1 100644
--- a/fs/ntfs3/xattr.c
+++ b/fs/ntfs3/xattr.c
@@ -478,8 +478,7 @@ static noinline int ntfs_set_ea(struct inode *inode, const char *name,
 }
 
 #ifdef CONFIG_NTFS3_FS_POSIX_ACL
-static struct posix_acl *ntfs_get_acl_ex(struct user_namespace *mnt_userns,
-					 struct inode *inode, int type,
+static struct posix_acl *ntfs_get_acl_ex(struct inode *inode, int type,
 					 int locked)
 {
 	struct ntfs_inode *ni = ntfs_i(inode);
@@ -514,7 +513,7 @@ static struct posix_acl *ntfs_get_acl_ex(struct user_namespace *mnt_userns,
 
 	/* Translate extended attribute to acl. */
 	if (err >= 0) {
-		acl = posix_acl_from_xattr(mnt_userns, buf, err);
+		acl = posix_acl_from_xattr(&init_user_ns, buf, err);
 	} else if (err == -ENODATA) {
 		acl = NULL;
 	} else {
@@ -537,8 +536,7 @@ struct posix_acl *ntfs_get_acl(struct inode *inode, int type, bool rcu)
 	if (rcu)
 		return ERR_PTR(-ECHILD);
 
-	/* TODO: init_user_ns? */
-	return ntfs_get_acl_ex(&init_user_ns, inode, type, 0);
+	return ntfs_get_acl_ex(inode, type, 0);
 }
 
 static noinline int ntfs_set_acl_ex(struct user_namespace *mnt_userns,
@@ -590,7 +588,7 @@ static noinline int ntfs_set_acl_ex(struct user_namespace *mnt_userns,
 		value = kmalloc(size, GFP_NOFS);
 		if (!value)
 			return -ENOMEM;
-		err = posix_acl_to_xattr(mnt_userns, acl, value, size);
+		err = posix_acl_to_xattr(&init_user_ns, acl, value, size);
 		if (err < 0)
 			goto out;
 		flags = 0;
@@ -641,7 +639,7 @@ static int ntfs_xattr_get_acl(struct user_namespace *mnt_userns,
 	if (!acl)
 		return -ENODATA;
 
-	err = posix_acl_to_xattr(mnt_userns, acl, buffer, size);
+	err = posix_acl_to_xattr(&init_user_ns, acl, buffer, size);
 	posix_acl_release(acl);
 
 	return err;
@@ -665,12 +663,12 @@ static int ntfs_xattr_set_acl(struct user_namespace *mnt_userns,
 	if (!value) {
 		acl = NULL;
 	} else {
-		acl = posix_acl_from_xattr(mnt_userns, value, size);
+		acl = posix_acl_from_xattr(&init_user_ns, value, size);
 		if (IS_ERR(acl))
 			return PTR_ERR(acl);
 
 		if (acl) {
-			err = posix_acl_valid(mnt_userns, acl);
+			err = posix_acl_valid(&init_user_ns, acl);
 			if (err)
 				goto release_and_out;
 		}
-- 
2.35.1




^ permalink raw reply related	[flat|nested] 174+ messages in thread

* [PATCH 5.19 023/158] rose: check NULL rose_loopback_neigh->loopback
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (21 preceding siblings ...)
  2022-08-29 10:57 ` [PATCH 5.19 022/158] ntfs: fix acl handling Greg Kroah-Hartman
@ 2022-08-29 10:57 ` Greg Kroah-Hartman
  2022-08-29 10:57 ` [PATCH 5.19 024/158] r8152: fix the units of some registers for RTL8156A Greg Kroah-Hartman
                   ` (147 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:57 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Bernard Pidoux, Francois Romieu,
	Thomas DL9SAU Osterried, David S. Miller, Sasha Levin

From: Bernard Pidoux <f6bvp@free.fr>

[ Upstream commit 3c53cd65dece47dd1f9d3a809f32e59d1d87b2b8 ]

Commit 3b3fd068c56e3fbea30090859216a368398e39bf added NULL check for
`rose_loopback_neigh->dev` in rose_loopback_timer() but omitted to
check rose_loopback_neigh->loopback.

It thus prevents *all* rose connect.

The reason is that a special rose_neigh loopback has a NULL device.

/proc/net/rose_neigh illustrates it via rose_neigh_show() function :
[...]
seq_printf(seq, "%05d %-9s %-4s   %3d %3d  %3s     %3s %3lu %3lu",
	   rose_neigh->number,
	   (rose_neigh->loopback) ? "RSLOOP-0" : ax2asc(buf, &rose_neigh->callsign),
	   rose_neigh->dev ? rose_neigh->dev->name : "???",
	   rose_neigh->count,

/proc/net/rose_neigh displays special rose_loopback_neigh->loopback as
callsign RSLOOP-0:

addr  callsign  dev  count use mode restart  t0  tf digipeaters
00001 RSLOOP-0  ???      1   2  DCE     yes   0   0

By checking rose_loopback_neigh->loopback, rose_rx_call_request() is called
even in case rose_loopback_neigh->dev is NULL. This repairs rose connections.

Verification with rose client application FPAC:

FPAC-Node v 4.1.3 (built Aug  5 2022) for LINUX (help = h)
F6BVP-4 (Commands = ?) : u
Users - AX.25 Level 2 sessions :
Port   Callsign     Callsign  AX.25 state  ROSE state  NetRom status
axudp  F6BVP-5   -> F6BVP-9   Connected    Connected   ---------

Fixes: 3b3fd068c56e ("rose: Fix Null pointer dereference in rose_send_frame()")
Signed-off-by: Bernard Pidoux <f6bvp@free.fr>
Suggested-by: Francois Romieu <romieu@fr.zoreil.com>
Cc: Thomas DL9SAU Osterried <thomas@osterried.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 net/rose/rose_loopback.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/rose/rose_loopback.c b/net/rose/rose_loopback.c
index 11c45c8c6c164..036d92c0ad794 100644
--- a/net/rose/rose_loopback.c
+++ b/net/rose/rose_loopback.c
@@ -96,7 +96,8 @@ static void rose_loopback_timer(struct timer_list *unused)
 		}
 
 		if (frametype == ROSE_CALL_REQUEST) {
-			if (!rose_loopback_neigh->dev) {
+			if (!rose_loopback_neigh->dev &&
+			    !rose_loopback_neigh->loopback) {
 				kfree_skb(skb);
 				continue;
 			}
-- 
2.35.1




^ permalink raw reply related	[flat|nested] 174+ messages in thread

* [PATCH 5.19 024/158] r8152: fix the units of some registers for RTL8156A
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (22 preceding siblings ...)
  2022-08-29 10:57 ` [PATCH 5.19 023/158] rose: check NULL rose_loopback_neigh->loopback Greg Kroah-Hartman
@ 2022-08-29 10:57 ` Greg Kroah-Hartman
  2022-08-29 10:57 ` [PATCH 5.19 025/158] r8152: fix the RX FIFO settings when suspending Greg Kroah-Hartman
                   ` (146 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:57 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Hayes Wang, David S. Miller, Sasha Levin

From: Hayes Wang <hayeswang@realtek.com>

[ Upstream commit 6dc4df12d741c0fe8f885778a43039e0619b9cd9 ]

The units of PLA_RX_FIFO_FULL and PLA_RX_FIFO_EMPTY are 16 bytes.

Fixes: 195aae321c82 ("r8152: support new chips")
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/usb/r8152.c | 17 ++---------------
 1 file changed, 2 insertions(+), 15 deletions(-)

diff --git a/drivers/net/usb/r8152.c b/drivers/net/usb/r8152.c
index 0f6efaabaa32b..46c7954d27629 100644
--- a/drivers/net/usb/r8152.c
+++ b/drivers/net/usb/r8152.c
@@ -6431,21 +6431,8 @@ static void r8156_fc_parameter(struct r8152 *tp)
 	u32 pause_on = tp->fc_pause_on ? tp->fc_pause_on : fc_pause_on_auto(tp);
 	u32 pause_off = tp->fc_pause_off ? tp->fc_pause_off : fc_pause_off_auto(tp);
 
-	switch (tp->version) {
-	case RTL_VER_10:
-	case RTL_VER_11:
-		ocp_write_word(tp, MCU_TYPE_PLA, PLA_RX_FIFO_FULL, pause_on / 8);
-		ocp_write_word(tp, MCU_TYPE_PLA, PLA_RX_FIFO_EMPTY, pause_off / 8);
-		break;
-	case RTL_VER_12:
-	case RTL_VER_13:
-	case RTL_VER_15:
-		ocp_write_word(tp, MCU_TYPE_PLA, PLA_RX_FIFO_FULL, pause_on / 16);
-		ocp_write_word(tp, MCU_TYPE_PLA, PLA_RX_FIFO_EMPTY, pause_off / 16);
-		break;
-	default:
-		break;
-	}
+	ocp_write_word(tp, MCU_TYPE_PLA, PLA_RX_FIFO_FULL, pause_on / 16);
+	ocp_write_word(tp, MCU_TYPE_PLA, PLA_RX_FIFO_EMPTY, pause_off / 16);
 }
 
 static void rtl8156_change_mtu(struct r8152 *tp)
-- 
2.35.1




^ permalink raw reply related	[flat|nested] 174+ messages in thread

* [PATCH 5.19 025/158] r8152: fix the RX FIFO settings when suspending
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (23 preceding siblings ...)
  2022-08-29 10:57 ` [PATCH 5.19 024/158] r8152: fix the units of some registers for RTL8156A Greg Kroah-Hartman
@ 2022-08-29 10:57 ` Greg Kroah-Hartman
  2022-08-29 10:57 ` [PATCH 5.19 026/158] nfc: pn533: Fix use-after-free bugs caused by pn532_cmd_timeout Greg Kroah-Hartman
                   ` (145 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:57 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Mark Blakeney, Hayes Wang,
	David S. Miller, Sasha Levin

From: Hayes Wang <hayeswang@realtek.com>

[ Upstream commit b75d612014447e04abdf0e37ffb8f2fd8b0b49d6 ]

The RX FIFO would be changed when suspending, so the related settings
have to be modified, too. Otherwise, the flow control would work
abnormally.

BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=216333
Reported-by: Mark Blakeney <mark.blakeney@bullet-systems.net>
Fixes: cdf0b86b250f ("r8152: fix a WOL issue")
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/usb/r8152.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/drivers/net/usb/r8152.c b/drivers/net/usb/r8152.c
index 46c7954d27629..d142ac8fcf6e2 100644
--- a/drivers/net/usb/r8152.c
+++ b/drivers/net/usb/r8152.c
@@ -5906,6 +5906,11 @@ static void r8153_enter_oob(struct r8152 *tp)
 	ocp_data &= ~NOW_IS_OOB;
 	ocp_write_byte(tp, MCU_TYPE_PLA, PLA_OOB_CTRL, ocp_data);
 
+	/* RX FIFO settings for OOB */
+	ocp_write_dword(tp, MCU_TYPE_PLA, PLA_RXFIFO_CTRL0, RXFIFO_THR1_OOB);
+	ocp_write_word(tp, MCU_TYPE_PLA, PLA_RXFIFO_CTRL1, RXFIFO_THR2_OOB);
+	ocp_write_word(tp, MCU_TYPE_PLA, PLA_RXFIFO_CTRL2, RXFIFO_THR3_OOB);
+
 	rtl_disable(tp);
 	rtl_reset_bmu(tp);
 
@@ -6544,6 +6549,11 @@ static void rtl8156_down(struct r8152 *tp)
 	ocp_data &= ~NOW_IS_OOB;
 	ocp_write_byte(tp, MCU_TYPE_PLA, PLA_OOB_CTRL, ocp_data);
 
+	/* RX FIFO settings for OOB */
+	ocp_write_word(tp, MCU_TYPE_PLA, PLA_RXFIFO_FULL, 64 / 16);
+	ocp_write_word(tp, MCU_TYPE_PLA, PLA_RX_FIFO_FULL, 1024 / 16);
+	ocp_write_word(tp, MCU_TYPE_PLA, PLA_RX_FIFO_EMPTY, 4096 / 16);
+
 	rtl_disable(tp);
 	rtl_reset_bmu(tp);
 
-- 
2.35.1




^ permalink raw reply related	[flat|nested] 174+ messages in thread

* [PATCH 5.19 026/158] nfc: pn533: Fix use-after-free bugs caused by pn532_cmd_timeout
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (24 preceding siblings ...)
  2022-08-29 10:57 ` [PATCH 5.19 025/158] r8152: fix the RX FIFO settings when suspending Greg Kroah-Hartman
@ 2022-08-29 10:57 ` Greg Kroah-Hartman
  2022-08-29 10:57 ` [PATCH 5.19 027/158] ice: xsk: prohibit usage of non-balanced queue id Greg Kroah-Hartman
                   ` (144 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:57 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Duoming Zhou, David S. Miller, Sasha Levin

From: Duoming Zhou <duoming@zju.edu.cn>

[ Upstream commit f1e941dbf80a9b8bab0bffbc4cbe41cc7f4c6fb6 ]

When the pn532 uart device is detaching, the pn532_uart_remove()
is called. But there are no functions in pn532_uart_remove() that
could delete the cmd_timeout timer, which will cause use-after-free
bugs. The process is shown below:

    (thread 1)                  |        (thread 2)
                                |  pn532_uart_send_frame
pn532_uart_remove               |    mod_timer(&pn532->cmd_timeout,...)
  ...                           |    (wait a time)
  kfree(pn532) //FREE           |    pn532_cmd_timeout
                                |      pn532_uart_send_frame
                                |        pn532->... //USE

This patch adds del_timer_sync() in pn532_uart_remove() in order to
prevent the use-after-free bugs. What's more, the pn53x_unregister_nfc()
is well synchronized, it sets nfc_dev->shutting_down to true and there
are no syscalls could restart the cmd_timeout timer.

Fixes: c656aa4c27b1 ("nfc: pn533: add UART phy driver")
Signed-off-by: Duoming Zhou <duoming@zju.edu.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/nfc/pn533/uart.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/nfc/pn533/uart.c b/drivers/nfc/pn533/uart.c
index 2caf997f9bc94..07596bf5f7d6d 100644
--- a/drivers/nfc/pn533/uart.c
+++ b/drivers/nfc/pn533/uart.c
@@ -310,6 +310,7 @@ static void pn532_uart_remove(struct serdev_device *serdev)
 	pn53x_unregister_nfc(pn532->priv);
 	serdev_device_close(serdev);
 	pn53x_common_clean(pn532->priv);
+	del_timer_sync(&pn532->cmd_timeout);
 	kfree_skb(pn532->recv_skb);
 	kfree(pn532);
 }
-- 
2.35.1




^ permalink raw reply related	[flat|nested] 174+ messages in thread

* [PATCH 5.19 027/158] ice: xsk: prohibit usage of non-balanced queue id
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (25 preceding siblings ...)
  2022-08-29 10:57 ` [PATCH 5.19 026/158] nfc: pn533: Fix use-after-free bugs caused by pn532_cmd_timeout Greg Kroah-Hartman
@ 2022-08-29 10:57 ` Greg Kroah-Hartman
  2022-08-29 10:57 ` [PATCH 5.19 028/158] ice: xsk: use Rx rings XDP ring when picking NAPI context Greg Kroah-Hartman
                   ` (143 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:57 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Maciej Fijalkowski,
	George Kuruvinakunnel, Tony Nguyen, Sasha Levin

From: Maciej Fijalkowski <maciej.fijalkowski@intel.com>

[ Upstream commit 5a42f112d367bb4700a8a41f5c12724fde6bfbb9 ]

Fix the following scenario:
1. ethtool -L $IFACE rx 8 tx 96
2. xdpsock -q 10 -t -z

Above refers to a case where user would like to attach XSK socket in
txonly mode at a queue id that does not have a corresponding Rx queue.
At this moment ice's XSK logic is tightly bound to act on a "queue pair",
e.g. both Tx and Rx queues at a given queue id are disabled/enabled and
both of them will get XSK pool assigned, which is broken for the presented
queue configuration. This results in the splat included at the bottom,
which is basically an OOB access to Rx ring array.

To fix this, allow using the ids only in scope of "combined" queues
reported by ethtool. However, logic should be rewritten to allow such
configurations later on, which would end up as a complete rewrite of the
control path, so let us go with this temporary fix.

[420160.558008] BUG: kernel NULL pointer dereference, address: 0000000000000082
[420160.566359] #PF: supervisor read access in kernel mode
[420160.572657] #PF: error_code(0x0000) - not-present page
[420160.579002] PGD 0 P4D 0
[420160.582756] Oops: 0000 [#1] PREEMPT SMP NOPTI
[420160.588396] CPU: 10 PID: 21232 Comm: xdpsock Tainted: G           OE     5.19.0-rc7+ #10
[420160.597893] Hardware name: Intel Corporation S2600WFT/S2600WFT, BIOS SE5C620.86B.02.01.0008.031920191559 03/19/2019
[420160.609894] RIP: 0010:ice_xsk_pool_setup+0x44/0x7d0 [ice]
[420160.616968] Code: f3 48 83 ec 40 48 8b 4f 20 48 8b 3f 65 48 8b 04 25 28 00 00 00 48 89 44 24 38 31 c0 48 8d 04 ed 00 00 00 00 48 01 c1 48 8b 11 <0f> b7 92 82 00 00 00 48 85 d2 0f 84 2d 75 00 00 48 8d 72 ff 48 85
[420160.639421] RSP: 0018:ffffc9002d2afd48 EFLAGS: 00010282
[420160.646650] RAX: 0000000000000050 RBX: ffff88811d8bdd00 RCX: ffff888112c14ff8
[420160.655893] RDX: 0000000000000000 RSI: ffff88811d8bdd00 RDI: ffff888109861000
[420160.665166] RBP: 000000000000000a R08: 000000000000000a R09: 0000000000000000
[420160.674493] R10: 000000000000889f R11: 0000000000000000 R12: 000000000000000a
[420160.683833] R13: 000000000000000a R14: 0000000000000000 R15: ffff888117611828
[420160.693211] FS:  00007fa869fc1f80(0000) GS:ffff8897e0880000(0000) knlGS:0000000000000000
[420160.703645] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[420160.711783] CR2: 0000000000000082 CR3: 00000001d076c001 CR4: 00000000007706e0
[420160.721399] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[420160.731045] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[420160.740707] PKRU: 55555554
[420160.745960] Call Trace:
[420160.750962]  <TASK>
[420160.755597]  ? kmalloc_large_node+0x79/0x90
[420160.762703]  ? __kmalloc_node+0x3f5/0x4b0
[420160.769341]  xp_assign_dev+0xfd/0x210
[420160.775661]  ? shmem_file_read_iter+0x29a/0x420
[420160.782896]  xsk_bind+0x152/0x490
[420160.788943]  __sys_bind+0xd0/0x100
[420160.795097]  ? exit_to_user_mode_prepare+0x20/0x120
[420160.802801]  __x64_sys_bind+0x16/0x20
[420160.809298]  do_syscall_64+0x38/0x90
[420160.815741]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
[420160.823731] RIP: 0033:0x7fa86a0dd2fb
[420160.830264] Code: c3 66 0f 1f 44 00 00 48 8b 15 69 8b 0c 00 f7 d8 64 89 02 b8 ff ff ff ff eb bc 0f 1f 44 00 00 f3 0f 1e fa b8 31 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 3d 8b 0c 00 f7 d8 64 89 01 48
[420160.855410] RSP: 002b:00007ffc1146f618 EFLAGS: 00000246 ORIG_RAX: 0000000000000031
[420160.866366] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fa86a0dd2fb
[420160.876957] RDX: 0000000000000010 RSI: 00007ffc1146f680 RDI: 0000000000000003
[420160.887604] RBP: 000055d7113a0520 R08: 00007fa868fb8000 R09: 0000000080000000
[420160.898293] R10: 0000000000008001 R11: 0000000000000246 R12: 000055d7113a04e0
[420160.909038] R13: 000055d7113a0320 R14: 000000000000000a R15: 0000000000000000
[420160.919817]  </TASK>
[420160.925659] Modules linked in: ice(OE) af_packet binfmt_misc nls_iso8859_1 ipmi_ssif intel_rapl_msr intel_rapl_common x86_pkg_temp_thermal intel_powerclamp mei_me coretemp ioatdma mei ipmi_si wmi ipmi_msghandler acpi_pad acpi_power_meter ip_tables x_tables autofs4 ixgbe i40e crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel crypto_simd cryptd ahci mdio dca libahci lpc_ich [last unloaded: ice]
[420160.977576] CR2: 0000000000000082
[420160.985037] ---[ end trace 0000000000000000 ]---
[420161.097724] RIP: 0010:ice_xsk_pool_setup+0x44/0x7d0 [ice]
[420161.107341] Code: f3 48 83 ec 40 48 8b 4f 20 48 8b 3f 65 48 8b 04 25 28 00 00 00 48 89 44 24 38 31 c0 48 8d 04 ed 00 00 00 00 48 01 c1 48 8b 11 <0f> b7 92 82 00 00 00 48 85 d2 0f 84 2d 75 00 00 48 8d 72 ff 48 85
[420161.134741] RSP: 0018:ffffc9002d2afd48 EFLAGS: 00010282
[420161.144274] RAX: 0000000000000050 RBX: ffff88811d8bdd00 RCX: ffff888112c14ff8
[420161.155690] RDX: 0000000000000000 RSI: ffff88811d8bdd00 RDI: ffff888109861000
[420161.168088] RBP: 000000000000000a R08: 000000000000000a R09: 0000000000000000
[420161.179295] R10: 000000000000889f R11: 0000000000000000 R12: 000000000000000a
[420161.190420] R13: 000000000000000a R14: 0000000000000000 R15: ffff888117611828
[420161.201505] FS:  00007fa869fc1f80(0000) GS:ffff8897e0880000(0000) knlGS:0000000000000000
[420161.213628] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[420161.223413] CR2: 0000000000000082 CR3: 00000001d076c001 CR4: 00000000007706e0
[420161.234653] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[420161.245893] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[420161.257052] PKRU: 55555554

Fixes: 2d4238f55697 ("ice: Add support for AF_XDP")
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Tested-by: George Kuruvinakunnel <george.kuruvinakunnel@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/ethernet/intel/ice/ice_xsk.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/net/ethernet/intel/ice/ice_xsk.c b/drivers/net/ethernet/intel/ice/ice_xsk.c
index 49ba8bfdbf047..45f88e6ec25e8 100644
--- a/drivers/net/ethernet/intel/ice/ice_xsk.c
+++ b/drivers/net/ethernet/intel/ice/ice_xsk.c
@@ -329,6 +329,12 @@ int ice_xsk_pool_setup(struct ice_vsi *vsi, struct xsk_buff_pool *pool, u16 qid)
 	bool if_running, pool_present = !!pool;
 	int ret = 0, pool_failure = 0;
 
+	if (qid >= vsi->num_rxq || qid >= vsi->num_txq) {
+		netdev_err(vsi->netdev, "Please use queue id in scope of combined queues count\n");
+		pool_failure = -EINVAL;
+		goto failure;
+	}
+
 	if (!is_power_of_2(vsi->rx_rings[qid]->count) ||
 	    !is_power_of_2(vsi->tx_rings[qid]->count)) {
 		netdev_err(vsi->netdev, "Please align ring sizes to power of 2\n");
-- 
2.35.1




^ permalink raw reply related	[flat|nested] 174+ messages in thread

* [PATCH 5.19 028/158] ice: xsk: use Rx rings XDP ring when picking NAPI context
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (26 preceding siblings ...)
  2022-08-29 10:57 ` [PATCH 5.19 027/158] ice: xsk: prohibit usage of non-balanced queue id Greg Kroah-Hartman
@ 2022-08-29 10:57 ` Greg Kroah-Hartman
  2022-08-29 10:57 ` [PATCH 5.19 029/158] net/mlx5e: Properly disable vlan strip on non-UL reps Greg Kroah-Hartman
                   ` (142 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:57 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Maciej Fijalkowski,
	George Kuruvinakunnel, Tony Nguyen, Sasha Levin

From: Maciej Fijalkowski <maciej.fijalkowski@intel.com>

[ Upstream commit 9ead7e74bfd6dd54db12ef133b8604add72511de ]

Ice driver allocates per cpu XDP queues so that redirect path can safely
use smp_processor_id() as an index to the array. At the same time
though, XDP rings are used to pick NAPI context to call napi_schedule()
or set NAPIF_STATE_MISSED. When user reduces queue count, say to 8, and
num_possible_cpus() of underlying platform is 44, then this means queue
vectors with correlated NAPI contexts will carry several XDP queues.

This in turn can result in a broken behavior where NAPI context of
interest will never be scheduled and AF_XDP socket will not process any
traffic.

To fix this, let us change the way how XDP rings are assigned to Rx
rings and use this information later on when setting
ice_tx_ring::xsk_pool pointer. For each Rx ring, grab the associated
queue vector and walk through Tx ring's linked list. Once we stumble
upon XDP ring in it, assign this ring to ice_rx_ring::xdp_ring.

Previous [0] approach of fixing this issue was for txonly scenario
because of the described grouping of XDP rings across queue vectors. So,
relying on Rx ring meant that NAPI context could be scheduled with a
queue vector without XDP ring with associated XSK pool.

[0]: https://lore.kernel.org/netdev/20220707161128.54215-1-maciej.fijalkowski@intel.com/

Fixes: 2d4238f55697 ("ice: Add support for AF_XDP")
Fixes: 22bf877e528f ("ice: introduce XDP_TX fallback path")
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Tested-by: George Kuruvinakunnel <george.kuruvinakunnel@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/ethernet/intel/ice/ice.h      | 36 +++++++++++++++--------
 drivers/net/ethernet/intel/ice/ice_lib.c  |  4 +--
 drivers/net/ethernet/intel/ice/ice_main.c | 25 +++++++++++-----
 drivers/net/ethernet/intel/ice/ice_xsk.c  | 12 ++++----
 4 files changed, 48 insertions(+), 29 deletions(-)

diff --git a/drivers/net/ethernet/intel/ice/ice.h b/drivers/net/ethernet/intel/ice/ice.h
index 60453b3b8d233..6911cbb7afa50 100644
--- a/drivers/net/ethernet/intel/ice/ice.h
+++ b/drivers/net/ethernet/intel/ice/ice.h
@@ -684,8 +684,8 @@ static inline void ice_set_ring_xdp(struct ice_tx_ring *ring)
  * ice_xsk_pool - get XSK buffer pool bound to a ring
  * @ring: Rx ring to use
  *
- * Returns a pointer to xdp_umem structure if there is a buffer pool present,
- * NULL otherwise.
+ * Returns a pointer to xsk_buff_pool structure if there is a buffer pool
+ * present, NULL otherwise.
  */
 static inline struct xsk_buff_pool *ice_xsk_pool(struct ice_rx_ring *ring)
 {
@@ -699,23 +699,33 @@ static inline struct xsk_buff_pool *ice_xsk_pool(struct ice_rx_ring *ring)
 }
 
 /**
- * ice_tx_xsk_pool - get XSK buffer pool bound to a ring
- * @ring: Tx ring to use
+ * ice_tx_xsk_pool - assign XSK buff pool to XDP ring
+ * @vsi: pointer to VSI
+ * @qid: index of a queue to look at XSK buff pool presence
  *
- * Returns a pointer to xdp_umem structure if there is a buffer pool present,
- * NULL otherwise. Tx equivalent of ice_xsk_pool.
+ * Sets XSK buff pool pointer on XDP ring.
+ *
+ * XDP ring is picked from Rx ring, whereas Rx ring is picked based on provided
+ * queue id. Reason for doing so is that queue vectors might have assigned more
+ * than one XDP ring, e.g. when user reduced the queue count on netdev; Rx ring
+ * carries a pointer to one of these XDP rings for its own purposes, such as
+ * handling XDP_TX action, therefore we can piggyback here on the
+ * rx_ring->xdp_ring assignment that was done during XDP rings initialization.
  */
-static inline struct xsk_buff_pool *ice_tx_xsk_pool(struct ice_tx_ring *ring)
+static inline void ice_tx_xsk_pool(struct ice_vsi *vsi, u16 qid)
 {
-	struct ice_vsi *vsi = ring->vsi;
-	u16 qid;
+	struct ice_tx_ring *ring;
 
-	qid = ring->q_index - vsi->alloc_txq;
+	ring = vsi->rx_rings[qid]->xdp_ring;
+	if (!ring)
+		return;
 
-	if (!ice_is_xdp_ena_vsi(vsi) || !test_bit(qid, vsi->af_xdp_zc_qps))
-		return NULL;
+	if (!ice_is_xdp_ena_vsi(vsi) || !test_bit(qid, vsi->af_xdp_zc_qps)) {
+		ring->xsk_pool = NULL;
+		return;
+	}
 
-	return xsk_get_pool_from_qid(vsi->netdev, qid);
+	ring->xsk_pool = xsk_get_pool_from_qid(vsi->netdev, qid);
 }
 
 /**
diff --git a/drivers/net/ethernet/intel/ice/ice_lib.c b/drivers/net/ethernet/intel/ice/ice_lib.c
index d6aafa272fb0b..6c4e1d45235ef 100644
--- a/drivers/net/ethernet/intel/ice/ice_lib.c
+++ b/drivers/net/ethernet/intel/ice/ice_lib.c
@@ -1983,8 +1983,8 @@ int ice_vsi_cfg_xdp_txqs(struct ice_vsi *vsi)
 	if (ret)
 		return ret;
 
-	ice_for_each_xdp_txq(vsi, i)
-		vsi->xdp_rings[i]->xsk_pool = ice_tx_xsk_pool(vsi->xdp_rings[i]);
+	ice_for_each_rxq(vsi, i)
+		ice_tx_xsk_pool(vsi, i);
 
 	return ret;
 }
diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
index bfd97a9a8f2e0..3d45e075204e3 100644
--- a/drivers/net/ethernet/intel/ice/ice_main.c
+++ b/drivers/net/ethernet/intel/ice/ice_main.c
@@ -2581,7 +2581,6 @@ static int ice_xdp_alloc_setup_rings(struct ice_vsi *vsi)
 		if (ice_setup_tx_ring(xdp_ring))
 			goto free_xdp_rings;
 		ice_set_ring_xdp(xdp_ring);
-		xdp_ring->xsk_pool = ice_tx_xsk_pool(xdp_ring);
 		spin_lock_init(&xdp_ring->tx_lock);
 		for (j = 0; j < xdp_ring->count; j++) {
 			tx_desc = ICE_TX_DESC(xdp_ring, j);
@@ -2589,13 +2588,6 @@ static int ice_xdp_alloc_setup_rings(struct ice_vsi *vsi)
 		}
 	}
 
-	ice_for_each_rxq(vsi, i) {
-		if (static_key_enabled(&ice_xdp_locking_key))
-			vsi->rx_rings[i]->xdp_ring = vsi->xdp_rings[i % vsi->num_xdp_txq];
-		else
-			vsi->rx_rings[i]->xdp_ring = vsi->xdp_rings[i];
-	}
-
 	return 0;
 
 free_xdp_rings:
@@ -2685,6 +2677,23 @@ int ice_prepare_xdp_rings(struct ice_vsi *vsi, struct bpf_prog *prog)
 		xdp_rings_rem -= xdp_rings_per_v;
 	}
 
+	ice_for_each_rxq(vsi, i) {
+		if (static_key_enabled(&ice_xdp_locking_key)) {
+			vsi->rx_rings[i]->xdp_ring = vsi->xdp_rings[i % vsi->num_xdp_txq];
+		} else {
+			struct ice_q_vector *q_vector = vsi->rx_rings[i]->q_vector;
+			struct ice_tx_ring *ring;
+
+			ice_for_each_tx_ring(ring, q_vector->tx) {
+				if (ice_ring_is_xdp(ring)) {
+					vsi->rx_rings[i]->xdp_ring = ring;
+					break;
+				}
+			}
+		}
+		ice_tx_xsk_pool(vsi, i);
+	}
+
 	/* omit the scheduler update if in reset path; XDP queues will be
 	 * taken into account at the end of ice_vsi_rebuild, where
 	 * ice_cfg_vsi_lan is being called
diff --git a/drivers/net/ethernet/intel/ice/ice_xsk.c b/drivers/net/ethernet/intel/ice/ice_xsk.c
index 45f88e6ec25e8..e48e29258450f 100644
--- a/drivers/net/ethernet/intel/ice/ice_xsk.c
+++ b/drivers/net/ethernet/intel/ice/ice_xsk.c
@@ -243,7 +243,7 @@ static int ice_qp_ena(struct ice_vsi *vsi, u16 q_idx)
 		if (err)
 			goto free_buf;
 		ice_set_ring_xdp(xdp_ring);
-		xdp_ring->xsk_pool = ice_tx_xsk_pool(xdp_ring);
+		ice_tx_xsk_pool(vsi, q_idx);
 	}
 
 	err = ice_vsi_cfg_rxq(rx_ring);
@@ -359,7 +359,7 @@ int ice_xsk_pool_setup(struct ice_vsi *vsi, struct xsk_buff_pool *pool, u16 qid)
 	if (if_running) {
 		ret = ice_qp_ena(vsi, qid);
 		if (!ret && pool_present)
-			napi_schedule(&vsi->xdp_rings[qid]->q_vector->napi);
+			napi_schedule(&vsi->rx_rings[qid]->xdp_ring->q_vector->napi);
 		else if (ret)
 			netdev_err(vsi->netdev, "ice_qp_ena error = %d\n", ret);
 	}
@@ -950,13 +950,13 @@ ice_xsk_wakeup(struct net_device *netdev, u32 queue_id,
 	if (!ice_is_xdp_ena_vsi(vsi))
 		return -EINVAL;
 
-	if (queue_id >= vsi->num_txq)
+	if (queue_id >= vsi->num_txq || queue_id >= vsi->num_rxq)
 		return -EINVAL;
 
-	if (!vsi->xdp_rings[queue_id]->xsk_pool)
-		return -EINVAL;
+	ring = vsi->rx_rings[queue_id]->xdp_ring;
 
-	ring = vsi->xdp_rings[queue_id];
+	if (!ring->xsk_pool)
+		return -EINVAL;
 
 	/* The idea here is that if NAPI is running, mark a miss, so
 	 * it will run again. If not, trigger an interrupt and
-- 
2.35.1




^ permalink raw reply related	[flat|nested] 174+ messages in thread

* [PATCH 5.19 029/158] net/mlx5e: Properly disable vlan strip on non-UL reps
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (27 preceding siblings ...)
  2022-08-29 10:57 ` [PATCH 5.19 028/158] ice: xsk: use Rx rings XDP ring when picking NAPI context Greg Kroah-Hartman
@ 2022-08-29 10:57 ` Greg Kroah-Hartman
  2022-08-29 10:58 ` [PATCH 5.19 030/158] net/mlx5: LAG, fix logic over MLX5_LAG_FLAG_NDEVS_READY Greg Kroah-Hartman
                   ` (141 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:57 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Vlad Buslov, Roi Dayan,
	Saeed Mahameed, Sasha Levin

From: Vlad Buslov <vladbu@nvidia.com>

[ Upstream commit f37044fd759b6bc40b6398a978e0b1acdf717372 ]

When querying mlx5 non-uplink representors capabilities with ethtool
rx-vlan-offload is marked as "off [fixed]". However, it is actually always
enabled because mlx5e_params->vlan_strip_disable is 0 by default when
initializing struct mlx5e_params instance. Fix the issue by explicitly
setting the vlan_strip_disable to 'true' for non-uplink representors.

Fixes: cb67b832921c ("net/mlx5e: Introduce SRIOV VF representors")
Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/ethernet/mellanox/mlx5/core/en_rep.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
index f797fd97d305b..7da3dc6261929 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
@@ -662,6 +662,8 @@ static void mlx5e_build_rep_params(struct net_device *netdev)
 
 	params->mqprio.num_tc       = 1;
 	params->tunneled_offload_en = false;
+	if (rep->vport != MLX5_VPORT_UPLINK)
+		params->vlan_strip_disable = true;
 
 	mlx5_query_min_inline(mdev, &params->tx_min_inline_mode);
 }
-- 
2.35.1




^ permalink raw reply related	[flat|nested] 174+ messages in thread

* [PATCH 5.19 030/158] net/mlx5: LAG, fix logic over MLX5_LAG_FLAG_NDEVS_READY
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (28 preceding siblings ...)
  2022-08-29 10:57 ` [PATCH 5.19 029/158] net/mlx5e: Properly disable vlan strip on non-UL reps Greg Kroah-Hartman
@ 2022-08-29 10:58 ` Greg Kroah-Hartman
  2022-08-29 10:58 ` [PATCH 5.19 031/158] net/mlx5: Eswitch, Fix forwarding decision to uplink Greg Kroah-Hartman
                   ` (140 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:58 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Eli Cohen, Maor Dickman, Mark Bloch,
	Saeed Mahameed, Sasha Levin

From: Eli Cohen <elic@nvidia.com>

[ Upstream commit a6e675a66175869b7d87c0e1dd0ddf93e04f8098 ]

Only set MLX5_LAG_FLAG_NDEVS_READY if both netdevices are registered.
Doing so guarantees that both ldev->pf[MLX5_LAG_P0].dev and
ldev->pf[MLX5_LAG_P1].dev have valid pointers when
MLX5_LAG_FLAG_NDEVS_READY is set.

The core issue is asymmetry in setting MLX5_LAG_FLAG_NDEVS_READY and
clearing it. Setting it is done wrongly when both
ldev->pf[MLX5_LAG_P0].dev and ldev->pf[MLX5_LAG_P1].dev are set;
clearing it is done right when either of ldev->pf[i].netdev is cleared.

Consider the following scenario:
1. PF0 loads and sets ldev->pf[MLX5_LAG_P0].dev to a valid pointer
2. PF1 loads and sets both ldev->pf[MLX5_LAG_P1].dev and
   ldev->pf[MLX5_LAG_P1].netdev with valid pointers. This results in
   MLX5_LAG_FLAG_NDEVS_READY is set.
3. PF0 is unloaded before setting dev->pf[MLX5_LAG_P0].netdev.
   MLX5_LAG_FLAG_NDEVS_READY remains set.

Further execution of mlx5_do_bond() will result in null pointer
dereference when calling mlx5_lag_is_multipath()

This patch fixes the following call trace actually encountered:

[ 1293.475195] BUG: kernel NULL pointer dereference, address: 00000000000009a8
[ 1293.478756] #PF: supervisor read access in kernel mode
[ 1293.481320] #PF: error_code(0x0000) - not-present page
[ 1293.483686] PGD 0 P4D 0
[ 1293.484434] Oops: 0000 [#1] SMP PTI
[ 1293.485377] CPU: 1 PID: 23690 Comm: kworker/u16:2 Not tainted 5.18.0-rc5_for_upstream_min_debug_2022_05_05_10_13 #1
[ 1293.488039] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
[ 1293.490836] Workqueue: mlx5_lag mlx5_do_bond_work [mlx5_core]
[ 1293.492448] RIP: 0010:mlx5_lag_is_multipath+0x5/0x50 [mlx5_core]
[ 1293.494044] Code: e8 70 40 ff e0 48 8b 14 24 48 83 05 5c 1a 1b 00 01 e9 19 ff ff ff 48 83 05 47 1a 1b 00 01 eb d7 0f 1f 44 00 00 0f 1f 44 00 00 <48> 8b 87 a8 09 00 00 48 85 c0 74 26 48 83 05 a7 1b 1b 00 01 41 b8
[ 1293.498673] RSP: 0018:ffff88811b2fbe40 EFLAGS: 00010202
[ 1293.500152] RAX: ffff88818a94e1c0 RBX: ffff888165eca6c0 RCX: 0000000000000000
[ 1293.501841] RDX: 0000000000000001 RSI: ffff88818a94e1c0 RDI: 0000000000000000
[ 1293.503585] RBP: 0000000000000000 R08: ffff888119886740 R09: ffff888165eca73c
[ 1293.505286] R10: 0000000000000018 R11: 0000000000000018 R12: ffff88818a94e1c0
[ 1293.506979] R13: ffff888112729800 R14: 0000000000000000 R15: ffff888112729858
[ 1293.508753] FS:  0000000000000000(0000) GS:ffff88852cc40000(0000) knlGS:0000000000000000
[ 1293.510782] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1293.512265] CR2: 00000000000009a8 CR3: 00000001032d4002 CR4: 0000000000370ea0
[ 1293.514001] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1293.515806] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400

Fixes: 8a66e4585979 ("net/mlx5: Change ownership model for lag")
Signed-off-by: Eli Cohen <elic@nvidia.com>
Reviewed-by: Maor Dickman <maord@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c b/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c
index 5d41e19378e09..c520edb942ca5 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c
@@ -1234,7 +1234,7 @@ void mlx5_lag_add_netdev(struct mlx5_core_dev *dev,
 	mlx5_ldev_add_netdev(ldev, dev, netdev);
 
 	for (i = 0; i < ldev->ports; i++)
-		if (!ldev->pf[i].dev)
+		if (!ldev->pf[i].netdev)
 			break;
 
 	if (i >= ldev->ports)
-- 
2.35.1




^ permalink raw reply related	[flat|nested] 174+ messages in thread

* [PATCH 5.19 031/158] net/mlx5: Eswitch, Fix forwarding decision to uplink
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (29 preceding siblings ...)
  2022-08-29 10:58 ` [PATCH 5.19 030/158] net/mlx5: LAG, fix logic over MLX5_LAG_FLAG_NDEVS_READY Greg Kroah-Hartman
@ 2022-08-29 10:58 ` Greg Kroah-Hartman
  2022-08-29 10:58 ` [PATCH 5.19 032/158] net/mlx5: Disable irq when locking lag_lock Greg Kroah-Hartman
                   ` (139 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:58 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Eli Cohen, Maor Dickman,
	Saeed Mahameed, Sasha Levin

From: Eli Cohen <elic@nvidia.com>

[ Upstream commit 942fca7e762be39204e5926e91a288a343a97c72 ]

Make sure to modify the rule for uplink forwarding only for the case
where destination vport number is MLX5_VPORT_UPLINK.

Fixes: 94db33177819 ("net/mlx5: Support multiport eswitch mode")
Signed-off-by: Eli Cohen <elic@nvidia.com>
Reviewed-by: Maor Dickman <maord@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
index eb79810199d3e..d04739cb793e5 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
@@ -427,7 +427,8 @@ esw_setup_vport_dest(struct mlx5_flow_destination *dest, struct mlx5_flow_act *f
 		dest[dest_idx].vport.vhca_id =
 			MLX5_CAP_GEN(esw_attr->dests[attr_idx].mdev, vhca_id);
 		dest[dest_idx].vport.flags |= MLX5_FLOW_DEST_VPORT_VHCA_ID;
-		if (mlx5_lag_mpesw_is_activated(esw->dev))
+		if (dest[dest_idx].vport.num == MLX5_VPORT_UPLINK &&
+		    mlx5_lag_mpesw_is_activated(esw->dev))
 			dest[dest_idx].type = MLX5_FLOW_DESTINATION_TYPE_UPLINK;
 	}
 	if (esw_attr->dests[attr_idx].flags & MLX5_ESW_DEST_ENCAP) {
-- 
2.35.1




^ permalink raw reply related	[flat|nested] 174+ messages in thread

* [PATCH 5.19 032/158] net/mlx5: Disable irq when locking lag_lock
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (30 preceding siblings ...)
  2022-08-29 10:58 ` [PATCH 5.19 031/158] net/mlx5: Eswitch, Fix forwarding decision to uplink Greg Kroah-Hartman
@ 2022-08-29 10:58 ` Greg Kroah-Hartman
  2022-08-29 10:58 ` [PATCH 5.19 033/158] net/mlx5: Fix cmd error logging for manage pages cmd Greg Kroah-Hartman
                   ` (138 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:58 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Vlad Buslov, Mark Bloch,
	Saeed Mahameed, Sasha Levin

From: Vlad Buslov <vladbu@nvidia.com>

[ Upstream commit 8e93f29422ffe968d7161f91acdf0d47f5323727 ]

The lag_lock is taken from both process and softirq contexts which results
lockdep warning[0] about potential deadlock. However, just disabling
softirqs by using *_bh spinlock API is not enough since it will cause
warning in some contexts where the lock is obtained with hard irqs
disabled. To fix the issue save current irq state, disable them before
obtaining the lock an re-enable irqs from saved state after releasing it.

[0]:

[Sun Aug  7 13:12:29 2022] ================================
[Sun Aug  7 13:12:29 2022] WARNING: inconsistent lock state
[Sun Aug  7 13:12:29 2022] 5.19.0_for_upstream_debug_2022_08_04_16_06 #1 Not tainted
[Sun Aug  7 13:12:29 2022] --------------------------------
[Sun Aug  7 13:12:29 2022] inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
[Sun Aug  7 13:12:29 2022] swapper/0/0 [HC0[0]:SC1[1]:HE1:SE0] takes:
[Sun Aug  7 13:12:29 2022] ffffffffa06dc0d8 (lag_lock){+.?.}-{2:2}, at: mlx5_lag_is_shared_fdb+0x1f/0x120 [mlx5_core]
[Sun Aug  7 13:12:29 2022] {SOFTIRQ-ON-W} state was registered at:
[Sun Aug  7 13:12:29 2022]   lock_acquire+0x1c1/0x550
[Sun Aug  7 13:12:29 2022]   _raw_spin_lock+0x2c/0x40
[Sun Aug  7 13:12:29 2022]   mlx5_lag_add_netdev+0x13b/0x480 [mlx5_core]
[Sun Aug  7 13:12:29 2022]   mlx5e_nic_enable+0x114/0x470 [mlx5_core]
[Sun Aug  7 13:12:29 2022]   mlx5e_attach_netdev+0x30e/0x6a0 [mlx5_core]
[Sun Aug  7 13:12:29 2022]   mlx5e_resume+0x105/0x160 [mlx5_core]
[Sun Aug  7 13:12:29 2022]   mlx5e_probe+0xac3/0x14f0 [mlx5_core]
[Sun Aug  7 13:12:29 2022]   auxiliary_bus_probe+0x9d/0xe0
[Sun Aug  7 13:12:29 2022]   really_probe+0x1e0/0xaa0
[Sun Aug  7 13:12:29 2022]   __driver_probe_device+0x219/0x480
[Sun Aug  7 13:12:29 2022]   driver_probe_device+0x49/0x130
[Sun Aug  7 13:12:29 2022]   __driver_attach+0x1e4/0x4d0
[Sun Aug  7 13:12:29 2022]   bus_for_each_dev+0x11e/0x1a0
[Sun Aug  7 13:12:29 2022]   bus_add_driver+0x3f4/0x5a0
[Sun Aug  7 13:12:29 2022]   driver_register+0x20f/0x390
[Sun Aug  7 13:12:29 2022]   __auxiliary_driver_register+0x14e/0x260
[Sun Aug  7 13:12:29 2022]   mlx5e_init+0x38/0x90 [mlx5_core]
[Sun Aug  7 13:12:29 2022]   vhost_iotlb_itree_augment_rotate+0xcb/0x180 [vhost_iotlb]
[Sun Aug  7 13:12:29 2022]   do_one_initcall+0xc4/0x400
[Sun Aug  7 13:12:29 2022]   do_init_module+0x18a/0x620
[Sun Aug  7 13:12:29 2022]   load_module+0x563a/0x7040
[Sun Aug  7 13:12:29 2022]   __do_sys_finit_module+0x122/0x1d0
[Sun Aug  7 13:12:29 2022]   do_syscall_64+0x3d/0x90
[Sun Aug  7 13:12:29 2022]   entry_SYSCALL_64_after_hwframe+0x46/0xb0
[Sun Aug  7 13:12:29 2022] irq event stamp: 3596508
[Sun Aug  7 13:12:29 2022] hardirqs last  enabled at (3596508): [<ffffffff813687c2>] __local_bh_enable_ip+0xa2/0x100
[Sun Aug  7 13:12:29 2022] hardirqs last disabled at (3596507): [<ffffffff813687da>] __local_bh_enable_ip+0xba/0x100
[Sun Aug  7 13:12:29 2022] softirqs last  enabled at (3596488): [<ffffffff81368a2a>] irq_exit_rcu+0x11a/0x170
[Sun Aug  7 13:12:29 2022] softirqs last disabled at (3596495): [<ffffffff81368a2a>] irq_exit_rcu+0x11a/0x170
[Sun Aug  7 13:12:29 2022]
                           other info that might help us debug this:
[Sun Aug  7 13:12:29 2022]  Possible unsafe locking scenario:

[Sun Aug  7 13:12:29 2022]        CPU0
[Sun Aug  7 13:12:29 2022]        ----
[Sun Aug  7 13:12:29 2022]   lock(lag_lock);
[Sun Aug  7 13:12:29 2022]   <Interrupt>
[Sun Aug  7 13:12:29 2022]     lock(lag_lock);
[Sun Aug  7 13:12:29 2022]
                            *** DEADLOCK ***

[Sun Aug  7 13:12:29 2022] 4 locks held by swapper/0/0:
[Sun Aug  7 13:12:29 2022]  #0: ffffffff84643260 (rcu_read_lock){....}-{1:2}, at: mlx5e_napi_poll+0x43/0x20a0 [mlx5_core]
[Sun Aug  7 13:12:29 2022]  #1: ffffffff84643260 (rcu_read_lock){....}-{1:2}, at: netif_receive_skb_list_internal+0x2d7/0xd60
[Sun Aug  7 13:12:29 2022]  #2: ffff888144a18b58 (&br->hash_lock){+.-.}-{2:2}, at: br_fdb_update+0x301/0x570
[Sun Aug  7 13:12:29 2022]  #3: ffffffff84643260 (rcu_read_lock){....}-{1:2}, at: atomic_notifier_call_chain+0x5/0x1d0
[Sun Aug  7 13:12:29 2022]
                           stack backtrace:
[Sun Aug  7 13:12:29 2022] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.19.0_for_upstream_debug_2022_08_04_16_06 #1
[Sun Aug  7 13:12:29 2022] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
[Sun Aug  7 13:12:29 2022] Call Trace:
[Sun Aug  7 13:12:29 2022]  <IRQ>
[Sun Aug  7 13:12:29 2022]  dump_stack_lvl+0x57/0x7d
[Sun Aug  7 13:12:29 2022]  mark_lock.part.0.cold+0x5f/0x92
[Sun Aug  7 13:12:29 2022]  ? lock_chain_count+0x20/0x20
[Sun Aug  7 13:12:29 2022]  ? unwind_next_frame+0x1c4/0x1b50
[Sun Aug  7 13:12:29 2022]  ? secondary_startup_64_no_verify+0xcd/0xdb
[Sun Aug  7 13:12:29 2022]  ? mlx5e_napi_poll+0x4e9/0x20a0 [mlx5_core]
[Sun Aug  7 13:12:29 2022]  ? mlx5e_napi_poll+0x4e9/0x20a0 [mlx5_core]
[Sun Aug  7 13:12:29 2022]  ? stack_access_ok+0x1d0/0x1d0
[Sun Aug  7 13:12:29 2022]  ? start_kernel+0x3a7/0x3c5
[Sun Aug  7 13:12:29 2022]  __lock_acquire+0x1260/0x6720
[Sun Aug  7 13:12:29 2022]  ? lock_chain_count+0x20/0x20
[Sun Aug  7 13:12:29 2022]  ? lock_chain_count+0x20/0x20
[Sun Aug  7 13:12:29 2022]  ? register_lock_class+0x1880/0x1880
[Sun Aug  7 13:12:29 2022]  ? mark_lock.part.0+0xed/0x3060
[Sun Aug  7 13:12:29 2022]  ? stack_trace_save+0x91/0xc0
[Sun Aug  7 13:12:29 2022]  lock_acquire+0x1c1/0x550
[Sun Aug  7 13:12:29 2022]  ? mlx5_lag_is_shared_fdb+0x1f/0x120 [mlx5_core]
[Sun Aug  7 13:12:29 2022]  ? lockdep_hardirqs_on_prepare+0x400/0x400
[Sun Aug  7 13:12:29 2022]  ? __lock_acquire+0xd6f/0x6720
[Sun Aug  7 13:12:29 2022]  _raw_spin_lock+0x2c/0x40
[Sun Aug  7 13:12:29 2022]  ? mlx5_lag_is_shared_fdb+0x1f/0x120 [mlx5_core]
[Sun Aug  7 13:12:29 2022]  mlx5_lag_is_shared_fdb+0x1f/0x120 [mlx5_core]
[Sun Aug  7 13:12:29 2022]  mlx5_esw_bridge_rep_vport_num_vhca_id_get+0x1a0/0x600 [mlx5_core]
[Sun Aug  7 13:12:29 2022]  ? mlx5_esw_bridge_update_work+0x90/0x90 [mlx5_core]
[Sun Aug  7 13:12:29 2022]  ? lock_acquire+0x1c1/0x550
[Sun Aug  7 13:12:29 2022]  mlx5_esw_bridge_switchdev_event+0x185/0x8f0 [mlx5_core]
[Sun Aug  7 13:12:29 2022]  ? mlx5_esw_bridge_port_obj_attr_set+0x3e0/0x3e0 [mlx5_core]
[Sun Aug  7 13:12:29 2022]  ? check_chain_key+0x24a/0x580
[Sun Aug  7 13:12:29 2022]  atomic_notifier_call_chain+0xd7/0x1d0
[Sun Aug  7 13:12:29 2022]  br_switchdev_fdb_notify+0xea/0x100
[Sun Aug  7 13:12:29 2022]  ? br_switchdev_set_port_flag+0x310/0x310
[Sun Aug  7 13:12:29 2022]  fdb_notify+0x11b/0x150
[Sun Aug  7 13:12:29 2022]  br_fdb_update+0x34c/0x570
[Sun Aug  7 13:12:29 2022]  ? lock_chain_count+0x20/0x20
[Sun Aug  7 13:12:29 2022]  ? br_fdb_add_local+0x50/0x50
[Sun Aug  7 13:12:29 2022]  ? br_allowed_ingress+0x5f/0x1070
[Sun Aug  7 13:12:29 2022]  ? check_chain_key+0x24a/0x580
[Sun Aug  7 13:12:29 2022]  br_handle_frame_finish+0x786/0x18e0
[Sun Aug  7 13:12:29 2022]  ? check_chain_key+0x24a/0x580
[Sun Aug  7 13:12:29 2022]  ? br_handle_local_finish+0x20/0x20
[Sun Aug  7 13:12:29 2022]  ? __lock_acquire+0xd6f/0x6720
[Sun Aug  7 13:12:29 2022]  ? sctp_inet_bind_verify+0x4d/0x190
[Sun Aug  7 13:12:29 2022]  ? xlog_unpack_data+0x2e0/0x310
[Sun Aug  7 13:12:29 2022]  ? br_handle_local_finish+0x20/0x20
[Sun Aug  7 13:12:29 2022]  br_nf_hook_thresh+0x227/0x380 [br_netfilter]
[Sun Aug  7 13:12:29 2022]  ? setup_pre_routing+0x460/0x460 [br_netfilter]
[Sun Aug  7 13:12:29 2022]  ? br_handle_local_finish+0x20/0x20
[Sun Aug  7 13:12:29 2022]  ? br_nf_pre_routing_ipv6+0x48b/0x69c [br_netfilter]
[Sun Aug  7 13:12:29 2022]  br_nf_pre_routing_finish_ipv6+0x5c2/0xbf0 [br_netfilter]
[Sun Aug  7 13:12:29 2022]  ? br_handle_local_finish+0x20/0x20
[Sun Aug  7 13:12:29 2022]  br_nf_pre_routing_ipv6+0x4c6/0x69c [br_netfilter]
[Sun Aug  7 13:12:29 2022]  ? br_validate_ipv6+0x9e0/0x9e0 [br_netfilter]
[Sun Aug  7 13:12:29 2022]  ? br_nf_forward_arp+0xb70/0xb70 [br_netfilter]
[Sun Aug  7 13:12:29 2022]  ? br_nf_pre_routing+0xacf/0x1160 [br_netfilter]
[Sun Aug  7 13:12:29 2022]  br_handle_frame+0x8a9/0x1270
[Sun Aug  7 13:12:29 2022]  ? br_handle_frame_finish+0x18e0/0x18e0
[Sun Aug  7 13:12:29 2022]  ? register_lock_class+0x1880/0x1880
[Sun Aug  7 13:12:29 2022]  ? br_handle_local_finish+0x20/0x20
[Sun Aug  7 13:12:29 2022]  ? bond_handle_frame+0xf9/0xac0 [bonding]
[Sun Aug  7 13:12:29 2022]  ? br_handle_frame_finish+0x18e0/0x18e0
[Sun Aug  7 13:12:29 2022]  __netif_receive_skb_core+0x7c0/0x2c70
[Sun Aug  7 13:12:29 2022]  ? check_chain_key+0x24a/0x580
[Sun Aug  7 13:12:29 2022]  ? generic_xdp_tx+0x5b0/0x5b0
[Sun Aug  7 13:12:29 2022]  ? __lock_acquire+0xd6f/0x6720
[Sun Aug  7 13:12:29 2022]  ? register_lock_class+0x1880/0x1880
[Sun Aug  7 13:12:29 2022]  ? check_chain_key+0x24a/0x580
[Sun Aug  7 13:12:29 2022]  __netif_receive_skb_list_core+0x2d7/0x8a0
[Sun Aug  7 13:12:29 2022]  ? lock_acquire+0x1c1/0x550
[Sun Aug  7 13:12:29 2022]  ? process_backlog+0x960/0x960
[Sun Aug  7 13:12:29 2022]  ? lockdep_hardirqs_on_prepare+0x129/0x400
[Sun Aug  7 13:12:29 2022]  ? kvm_clock_get_cycles+0x14/0x20
[Sun Aug  7 13:12:29 2022]  netif_receive_skb_list_internal+0x5f4/0xd60
[Sun Aug  7 13:12:29 2022]  ? do_xdp_generic+0x150/0x150
[Sun Aug  7 13:12:29 2022]  ? mlx5e_poll_rx_cq+0xf6b/0x2960 [mlx5_core]
[Sun Aug  7 13:12:29 2022]  ? mlx5e_poll_ico_cq+0x3d/0x1590 [mlx5_core]
[Sun Aug  7 13:12:29 2022]  napi_complete_done+0x188/0x710
[Sun Aug  7 13:12:29 2022]  mlx5e_napi_poll+0x4e9/0x20a0 [mlx5_core]
[Sun Aug  7 13:12:29 2022]  ? __queue_work+0x53c/0xeb0
[Sun Aug  7 13:12:29 2022]  __napi_poll+0x9f/0x540
[Sun Aug  7 13:12:29 2022]  net_rx_action+0x420/0xb70
[Sun Aug  7 13:12:29 2022]  ? napi_threaded_poll+0x470/0x470
[Sun Aug  7 13:12:29 2022]  ? __common_interrupt+0x79/0x1a0
[Sun Aug  7 13:12:29 2022]  __do_softirq+0x271/0x92c
[Sun Aug  7 13:12:29 2022]  irq_exit_rcu+0x11a/0x170
[Sun Aug  7 13:12:29 2022]  common_interrupt+0x7d/0xa0
[Sun Aug  7 13:12:29 2022]  </IRQ>
[Sun Aug  7 13:12:29 2022]  <TASK>
[Sun Aug  7 13:12:29 2022]  asm_common_interrupt+0x22/0x40
[Sun Aug  7 13:12:29 2022] RIP: 0010:default_idle+0x42/0x60
[Sun Aug  7 13:12:29 2022] Code: c1 83 e0 07 48 c1 e9 03 83 c0 03 0f b6 14 11 38 d0 7c 04 84 d2 75 14 8b 05 6b f1 22 02 85 c0 7e 07 0f 00 2d 80 3b 4a 00 fb f4 <c3> 48 c7 c7 e0 07 7e 85 e8 21 bd 40 fe eb de 66 66 2e 0f 1f 84 00
[Sun Aug  7 13:12:29 2022] RSP: 0018:ffffffff84407e18 EFLAGS: 00000242
[Sun Aug  7 13:12:29 2022] RAX: 0000000000000001 RBX: ffffffff84ec4a68 RCX: 1ffffffff0afc0fc
[Sun Aug  7 13:12:29 2022] RDX: 0000000000000004 RSI: 0000000000000000 RDI: ffffffff835b1fac
[Sun Aug  7 13:12:29 2022] RBP: 0000000000000000 R08: 0000000000000001 R09: ffff8884d2c44ac3
[Sun Aug  7 13:12:29 2022] R10: ffffed109a588958 R11: 00000000ffffffff R12: 0000000000000000
[Sun Aug  7 13:12:29 2022] R13: ffffffff84efac20 R14: 0000000000000000 R15: dffffc0000000000
[Sun Aug  7 13:12:29 2022]  ? default_idle_call+0xcc/0x460
[Sun Aug  7 13:12:29 2022]  default_idle_call+0xec/0x460
[Sun Aug  7 13:12:29 2022]  do_idle+0x394/0x450
[Sun Aug  7 13:12:29 2022]  ? arch_cpu_idle_exit+0x40/0x40
[Sun Aug  7 13:12:29 2022]  cpu_startup_entry+0x19/0x20
[Sun Aug  7 13:12:29 2022]  rest_init+0x156/0x250
[Sun Aug  7 13:12:29 2022]  arch_call_rest_init+0xf/0x15
[Sun Aug  7 13:12:29 2022]  start_kernel+0x3a7/0x3c5
[Sun Aug  7 13:12:29 2022]  secondary_startup_64_no_verify+0xcd/0xdb
[Sun Aug  7 13:12:29 2022]  </TASK>

Fixes: ff9b7521468b ("net/mlx5: Bridge, support LAG")
Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 .../net/ethernet/mellanox/mlx5/core/lag/lag.c | 55 +++++++++++--------
 1 file changed, 33 insertions(+), 22 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c b/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c
index c520edb942ca5..d98acd68af2ec 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c
@@ -1067,30 +1067,32 @@ static void mlx5_ldev_add_netdev(struct mlx5_lag *ldev,
 				 struct net_device *netdev)
 {
 	unsigned int fn = mlx5_get_dev_index(dev);
+	unsigned long flags;
 
 	if (fn >= ldev->ports)
 		return;
 
-	spin_lock(&lag_lock);
+	spin_lock_irqsave(&lag_lock, flags);
 	ldev->pf[fn].netdev = netdev;
 	ldev->tracker.netdev_state[fn].link_up = 0;
 	ldev->tracker.netdev_state[fn].tx_enabled = 0;
-	spin_unlock(&lag_lock);
+	spin_unlock_irqrestore(&lag_lock, flags);
 }
 
 static void mlx5_ldev_remove_netdev(struct mlx5_lag *ldev,
 				    struct net_device *netdev)
 {
+	unsigned long flags;
 	int i;
 
-	spin_lock(&lag_lock);
+	spin_lock_irqsave(&lag_lock, flags);
 	for (i = 0; i < ldev->ports; i++) {
 		if (ldev->pf[i].netdev == netdev) {
 			ldev->pf[i].netdev = NULL;
 			break;
 		}
 	}
-	spin_unlock(&lag_lock);
+	spin_unlock_irqrestore(&lag_lock, flags);
 }
 
 static void mlx5_ldev_add_mdev(struct mlx5_lag *ldev,
@@ -1246,12 +1248,13 @@ void mlx5_lag_add_netdev(struct mlx5_core_dev *dev,
 bool mlx5_lag_is_roce(struct mlx5_core_dev *dev)
 {
 	struct mlx5_lag *ldev;
+	unsigned long flags;
 	bool res;
 
-	spin_lock(&lag_lock);
+	spin_lock_irqsave(&lag_lock, flags);
 	ldev = mlx5_lag_dev(dev);
 	res  = ldev && __mlx5_lag_is_roce(ldev);
-	spin_unlock(&lag_lock);
+	spin_unlock_irqrestore(&lag_lock, flags);
 
 	return res;
 }
@@ -1260,12 +1263,13 @@ EXPORT_SYMBOL(mlx5_lag_is_roce);
 bool mlx5_lag_is_active(struct mlx5_core_dev *dev)
 {
 	struct mlx5_lag *ldev;
+	unsigned long flags;
 	bool res;
 
-	spin_lock(&lag_lock);
+	spin_lock_irqsave(&lag_lock, flags);
 	ldev = mlx5_lag_dev(dev);
 	res  = ldev && __mlx5_lag_is_active(ldev);
-	spin_unlock(&lag_lock);
+	spin_unlock_irqrestore(&lag_lock, flags);
 
 	return res;
 }
@@ -1274,13 +1278,14 @@ EXPORT_SYMBOL(mlx5_lag_is_active);
 bool mlx5_lag_is_master(struct mlx5_core_dev *dev)
 {
 	struct mlx5_lag *ldev;
+	unsigned long flags;
 	bool res;
 
-	spin_lock(&lag_lock);
+	spin_lock_irqsave(&lag_lock, flags);
 	ldev = mlx5_lag_dev(dev);
 	res = ldev && __mlx5_lag_is_active(ldev) &&
 		dev == ldev->pf[MLX5_LAG_P1].dev;
-	spin_unlock(&lag_lock);
+	spin_unlock_irqrestore(&lag_lock, flags);
 
 	return res;
 }
@@ -1289,12 +1294,13 @@ EXPORT_SYMBOL(mlx5_lag_is_master);
 bool mlx5_lag_is_sriov(struct mlx5_core_dev *dev)
 {
 	struct mlx5_lag *ldev;
+	unsigned long flags;
 	bool res;
 
-	spin_lock(&lag_lock);
+	spin_lock_irqsave(&lag_lock, flags);
 	ldev = mlx5_lag_dev(dev);
 	res  = ldev && __mlx5_lag_is_sriov(ldev);
-	spin_unlock(&lag_lock);
+	spin_unlock_irqrestore(&lag_lock, flags);
 
 	return res;
 }
@@ -1303,13 +1309,14 @@ EXPORT_SYMBOL(mlx5_lag_is_sriov);
 bool mlx5_lag_is_shared_fdb(struct mlx5_core_dev *dev)
 {
 	struct mlx5_lag *ldev;
+	unsigned long flags;
 	bool res;
 
-	spin_lock(&lag_lock);
+	spin_lock_irqsave(&lag_lock, flags);
 	ldev = mlx5_lag_dev(dev);
 	res = ldev && __mlx5_lag_is_sriov(ldev) &&
 	      test_bit(MLX5_LAG_MODE_FLAG_SHARED_FDB, &ldev->mode_flags);
-	spin_unlock(&lag_lock);
+	spin_unlock_irqrestore(&lag_lock, flags);
 
 	return res;
 }
@@ -1352,9 +1359,10 @@ struct net_device *mlx5_lag_get_roce_netdev(struct mlx5_core_dev *dev)
 {
 	struct net_device *ndev = NULL;
 	struct mlx5_lag *ldev;
+	unsigned long flags;
 	int i;
 
-	spin_lock(&lag_lock);
+	spin_lock_irqsave(&lag_lock, flags);
 	ldev = mlx5_lag_dev(dev);
 
 	if (!(ldev && __mlx5_lag_is_roce(ldev)))
@@ -1373,7 +1381,7 @@ struct net_device *mlx5_lag_get_roce_netdev(struct mlx5_core_dev *dev)
 		dev_hold(ndev);
 
 unlock:
-	spin_unlock(&lag_lock);
+	spin_unlock_irqrestore(&lag_lock, flags);
 
 	return ndev;
 }
@@ -1383,10 +1391,11 @@ u8 mlx5_lag_get_slave_port(struct mlx5_core_dev *dev,
 			   struct net_device *slave)
 {
 	struct mlx5_lag *ldev;
+	unsigned long flags;
 	u8 port = 0;
 	int i;
 
-	spin_lock(&lag_lock);
+	spin_lock_irqsave(&lag_lock, flags);
 	ldev = mlx5_lag_dev(dev);
 	if (!(ldev && __mlx5_lag_is_roce(ldev)))
 		goto unlock;
@@ -1401,7 +1410,7 @@ u8 mlx5_lag_get_slave_port(struct mlx5_core_dev *dev,
 	port = ldev->v2p_map[port * ldev->buckets];
 
 unlock:
-	spin_unlock(&lag_lock);
+	spin_unlock_irqrestore(&lag_lock, flags);
 	return port;
 }
 EXPORT_SYMBOL(mlx5_lag_get_slave_port);
@@ -1422,8 +1431,9 @@ struct mlx5_core_dev *mlx5_lag_get_peer_mdev(struct mlx5_core_dev *dev)
 {
 	struct mlx5_core_dev *peer_dev = NULL;
 	struct mlx5_lag *ldev;
+	unsigned long flags;
 
-	spin_lock(&lag_lock);
+	spin_lock_irqsave(&lag_lock, flags);
 	ldev = mlx5_lag_dev(dev);
 	if (!ldev)
 		goto unlock;
@@ -1433,7 +1443,7 @@ struct mlx5_core_dev *mlx5_lag_get_peer_mdev(struct mlx5_core_dev *dev)
 			   ldev->pf[MLX5_LAG_P1].dev;
 
 unlock:
-	spin_unlock(&lag_lock);
+	spin_unlock_irqrestore(&lag_lock, flags);
 	return peer_dev;
 }
 EXPORT_SYMBOL(mlx5_lag_get_peer_mdev);
@@ -1446,6 +1456,7 @@ int mlx5_lag_query_cong_counters(struct mlx5_core_dev *dev,
 	int outlen = MLX5_ST_SZ_BYTES(query_cong_statistics_out);
 	struct mlx5_core_dev **mdev;
 	struct mlx5_lag *ldev;
+	unsigned long flags;
 	int num_ports;
 	int ret, i, j;
 	void *out;
@@ -1462,7 +1473,7 @@ int mlx5_lag_query_cong_counters(struct mlx5_core_dev *dev,
 
 	memset(values, 0, sizeof(*values) * num_counters);
 
-	spin_lock(&lag_lock);
+	spin_lock_irqsave(&lag_lock, flags);
 	ldev = mlx5_lag_dev(dev);
 	if (ldev && __mlx5_lag_is_active(ldev)) {
 		num_ports = ldev->ports;
@@ -1472,7 +1483,7 @@ int mlx5_lag_query_cong_counters(struct mlx5_core_dev *dev,
 		num_ports = 1;
 		mdev[MLX5_LAG_P1] = dev;
 	}
-	spin_unlock(&lag_lock);
+	spin_unlock_irqrestore(&lag_lock, flags);
 
 	for (i = 0; i < num_ports; ++i) {
 		u32 in[MLX5_ST_SZ_DW(query_cong_statistics_in)] = {};
-- 
2.35.1




^ permalink raw reply related	[flat|nested] 174+ messages in thread

* [PATCH 5.19 033/158] net/mlx5: Fix cmd error logging for manage pages cmd
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (31 preceding siblings ...)
  2022-08-29 10:58 ` [PATCH 5.19 032/158] net/mlx5: Disable irq when locking lag_lock Greg Kroah-Hartman
@ 2022-08-29 10:58 ` Greg Kroah-Hartman
  2022-08-29 10:58 ` [PATCH 5.19 034/158] net/mlx5: Avoid false positive lockdep warning by adding lock_class_key Greg Kroah-Hartman
                   ` (137 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:58 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Roy Novich, Moshe Shemesh,
	Saeed Mahameed, Sasha Levin

From: Roy Novich <royno@nvidia.com>

[ Upstream commit 090f3e4f4089ab8041ed7d632c7851c2a42fcc10 ]

When the driver unloads, give/reclaim_pages may fail as PF driver in
teardown flow, current code will lead to the following kernel log print
'failed reclaiming pages: err 0'.

Fix it to get same behavior as before the cited commits,
by calling mlx5_cmd_check before handling error state.
mlx5_cmd_check will verify if the returned error is an actual error
needed to be handled by the driver or not and will return an
appropriate value.

Fixes: 8d564292a166 ("net/mlx5: Remove redundant error on reclaim pages")
Fixes: 4dac2f10ada0 ("net/mlx5: Remove redundant notify fail on give pages")
Signed-off-by: Roy Novich <royno@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/ethernet/mellanox/mlx5/core/pagealloc.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/pagealloc.c b/drivers/net/ethernet/mellanox/mlx5/core/pagealloc.c
index ec76a8b1acc1c..60596357bfc7a 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/pagealloc.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/pagealloc.c
@@ -376,8 +376,8 @@ static int give_pages(struct mlx5_core_dev *dev, u16 func_id, int npages,
 			goto out_dropped;
 		}
 	}
+	err = mlx5_cmd_check(dev, err, in, out);
 	if (err) {
-		err = mlx5_cmd_check(dev, err, in, out);
 		mlx5_core_warn(dev, "func_id 0x%x, npages %d, err %d\n",
 			       func_id, npages, err);
 		goto out_dropped;
@@ -524,10 +524,13 @@ static int reclaim_pages(struct mlx5_core_dev *dev, u16 func_id, int npages,
 		dev->priv.reclaim_pages_discard += npages;
 	}
 	/* if triggered by FW event and failed by FW then ignore */
-	if (event && err == -EREMOTEIO)
+	if (event && err == -EREMOTEIO) {
 		err = 0;
+		goto out_free;
+	}
+
+	err = mlx5_cmd_check(dev, err, in, out);
 	if (err) {
-		err = mlx5_cmd_check(dev, err, in, out);
 		mlx5_core_err(dev, "failed reclaiming pages: err %d\n", err);
 		goto out_free;
 	}
-- 
2.35.1




^ permalink raw reply related	[flat|nested] 174+ messages in thread

* [PATCH 5.19 034/158] net/mlx5: Avoid false positive lockdep warning by adding lock_class_key
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (32 preceding siblings ...)
  2022-08-29 10:58 ` [PATCH 5.19 033/158] net/mlx5: Fix cmd error logging for manage pages cmd Greg Kroah-Hartman
@ 2022-08-29 10:58 ` Greg Kroah-Hartman
  2022-08-29 10:58 ` [PATCH 5.19 035/158] net/mlx5e: Fix wrong application of the LRO state Greg Kroah-Hartman
                   ` (136 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:58 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Moshe Shemesh, Shay Drory,
	Saeed Mahameed, Sasha Levin

From: Moshe Shemesh <moshe@nvidia.com>

[ Upstream commit d59b73a66e5e0682442b6d7b4965364e57078b80 ]

Add a lock_class_key per mlx5 device to avoid a false positive
"possible circular locking dependency" warning by lockdep, on flows
which lock more than one mlx5 device, such as adding SF.

kernel log:
 ======================================================
 WARNING: possible circular locking dependency detected
 5.19.0-rc8+ #2 Not tainted
 ------------------------------------------------------
 kworker/u20:0/8 is trying to acquire lock:
 ffff88812dfe0d98 (&dev->intf_state_mutex){+.+.}-{3:3}, at: mlx5_init_one+0x2e/0x490 [mlx5_core]

 but task is already holding lock:
 ffff888101aa7898 (&(&notifier->n_head)->rwsem){++++}-{3:3}, at: blocking_notifier_call_chain+0x5a/0x130

 which lock already depends on the new lock.

 the existing dependency chain (in reverse order) is:

 -> #1 (&(&notifier->n_head)->rwsem){++++}-{3:3}:
        down_write+0x90/0x150
        blocking_notifier_chain_register+0x53/0xa0
        mlx5_sf_table_init+0x369/0x4a0 [mlx5_core]
        mlx5_init_one+0x261/0x490 [mlx5_core]
        probe_one+0x430/0x680 [mlx5_core]
        local_pci_probe+0xd6/0x170
        work_for_cpu_fn+0x4e/0xa0
        process_one_work+0x7c2/0x1340
        worker_thread+0x6f6/0xec0
        kthread+0x28f/0x330
        ret_from_fork+0x1f/0x30

 -> #0 (&dev->intf_state_mutex){+.+.}-{3:3}:
        __lock_acquire+0x2fc7/0x6720
        lock_acquire+0x1c1/0x550
        __mutex_lock+0x12c/0x14b0
        mlx5_init_one+0x2e/0x490 [mlx5_core]
        mlx5_sf_dev_probe+0x29c/0x370 [mlx5_core]
        auxiliary_bus_probe+0x9d/0xe0
        really_probe+0x1e0/0xaa0
        __driver_probe_device+0x219/0x480
        driver_probe_device+0x49/0x130
        __device_attach_driver+0x1b8/0x280
        bus_for_each_drv+0x123/0x1a0
        __device_attach+0x1a3/0x460
        bus_probe_device+0x1a2/0x260
        device_add+0x9b1/0x1b40
        __auxiliary_device_add+0x88/0xc0
        mlx5_sf_dev_state_change_handler+0x67e/0x9d0 [mlx5_core]
        blocking_notifier_call_chain+0xd5/0x130
        mlx5_vhca_state_work_handler+0x2b0/0x3f0 [mlx5_core]
        process_one_work+0x7c2/0x1340
        worker_thread+0x59d/0xec0
        kthread+0x28f/0x330
        ret_from_fork+0x1f/0x30

  other info that might help us debug this:

  Possible unsafe locking scenario:

        CPU0                    CPU1
        ----                    ----
   lock(&(&notifier->n_head)->rwsem);
                                lock(&dev->intf_state_mutex);
                                lock(&(&notifier->n_head)->rwsem);
   lock(&dev->intf_state_mutex);

  *** DEADLOCK ***

 4 locks held by kworker/u20:0/8:
  #0: ffff888150612938 ((wq_completion)mlx5_events){+.+.}-{0:0}, at: process_one_work+0x6e2/0x1340
  #1: ffff888100cafdb8 ((work_completion)(&work->work)#3){+.+.}-{0:0}, at: process_one_work+0x70f/0x1340
  #2: ffff888101aa7898 (&(&notifier->n_head)->rwsem){++++}-{3:3}, at: blocking_notifier_call_chain+0x5a/0x130
  #3: ffff88813682d0e8 (&dev->mutex){....}-{3:3}, at:__device_attach+0x76/0x460

 stack backtrace:
 CPU: 6 PID: 8 Comm: kworker/u20:0 Not tainted 5.19.0-rc8+
 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
 Workqueue: mlx5_events mlx5_vhca_state_work_handler [mlx5_core]
 Call Trace:
  <TASK>
  dump_stack_lvl+0x57/0x7d
  check_noncircular+0x278/0x300
  ? print_circular_bug+0x460/0x460
  ? lock_chain_count+0x20/0x20
  ? register_lock_class+0x1880/0x1880
  __lock_acquire+0x2fc7/0x6720
  ? register_lock_class+0x1880/0x1880
  ? register_lock_class+0x1880/0x1880
  lock_acquire+0x1c1/0x550
  ? mlx5_init_one+0x2e/0x490 [mlx5_core]
  ? lockdep_hardirqs_on_prepare+0x400/0x400
  __mutex_lock+0x12c/0x14b0
  ? mlx5_init_one+0x2e/0x490 [mlx5_core]
  ? mlx5_init_one+0x2e/0x490 [mlx5_core]
  ? _raw_read_unlock+0x1f/0x30
  ? mutex_lock_io_nested+0x1320/0x1320
  ? __ioremap_caller.constprop.0+0x306/0x490
  ? mlx5_sf_dev_probe+0x269/0x370 [mlx5_core]
  ? iounmap+0x160/0x160
  mlx5_init_one+0x2e/0x490 [mlx5_core]
  mlx5_sf_dev_probe+0x29c/0x370 [mlx5_core]
  ? mlx5_sf_dev_remove+0x130/0x130 [mlx5_core]
  auxiliary_bus_probe+0x9d/0xe0
  really_probe+0x1e0/0xaa0
  __driver_probe_device+0x219/0x480
  ? auxiliary_match_id+0xe9/0x140
  driver_probe_device+0x49/0x130
  __device_attach_driver+0x1b8/0x280
  ? driver_allows_async_probing+0x140/0x140
  bus_for_each_drv+0x123/0x1a0
  ? bus_for_each_dev+0x1a0/0x1a0
  ? lockdep_hardirqs_on_prepare+0x286/0x400
  ? trace_hardirqs_on+0x2d/0x100
  __device_attach+0x1a3/0x460
  ? device_driver_attach+0x1e0/0x1e0
  ? kobject_uevent_env+0x22d/0xf10
  bus_probe_device+0x1a2/0x260
  device_add+0x9b1/0x1b40
  ? dev_set_name+0xab/0xe0
  ? __fw_devlink_link_to_suppliers+0x260/0x260
  ? memset+0x20/0x40
  ? lockdep_init_map_type+0x21a/0x7d0
  __auxiliary_device_add+0x88/0xc0
  ? auxiliary_device_init+0x86/0xa0
  mlx5_sf_dev_state_change_handler+0x67e/0x9d0 [mlx5_core]
  blocking_notifier_call_chain+0xd5/0x130
  mlx5_vhca_state_work_handler+0x2b0/0x3f0 [mlx5_core]
  ? mlx5_vhca_event_arm+0x100/0x100 [mlx5_core]
  ? lock_downgrade+0x6e0/0x6e0
  ? lockdep_hardirqs_on_prepare+0x286/0x400
  process_one_work+0x7c2/0x1340
  ? lockdep_hardirqs_on_prepare+0x400/0x400
  ? pwq_dec_nr_in_flight+0x230/0x230
  ? rwlock_bug.part.0+0x90/0x90
  worker_thread+0x59d/0xec0
  ? process_one_work+0x1340/0x1340
  kthread+0x28f/0x330
  ? kthread_complete_and_exit+0x20/0x20
  ret_from_fork+0x1f/0x30
  </TASK>

Fixes: 6a3273217469 ("net/mlx5: SF, Port function state change support")
Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Shay Drory <shayd@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/ethernet/mellanox/mlx5/core/main.c | 4 ++++
 include/linux/mlx5/driver.h                    | 1 +
 2 files changed, 5 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c b/drivers/net/ethernet/mellanox/mlx5/core/main.c
index ba2e5232b90be..616207c3b187a 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c
@@ -1472,7 +1472,9 @@ int mlx5_mdev_init(struct mlx5_core_dev *dev, int profile_idx)
 	memcpy(&dev->profile, &profile[profile_idx], sizeof(dev->profile));
 	INIT_LIST_HEAD(&priv->ctx_list);
 	spin_lock_init(&priv->ctx_lock);
+	lockdep_register_key(&dev->lock_key);
 	mutex_init(&dev->intf_state_mutex);
+	lockdep_set_class(&dev->intf_state_mutex, &dev->lock_key);
 
 	mutex_init(&priv->bfregs.reg_head.lock);
 	mutex_init(&priv->bfregs.wc_head.lock);
@@ -1527,6 +1529,7 @@ int mlx5_mdev_init(struct mlx5_core_dev *dev, int profile_idx)
 	mutex_destroy(&priv->bfregs.wc_head.lock);
 	mutex_destroy(&priv->bfregs.reg_head.lock);
 	mutex_destroy(&dev->intf_state_mutex);
+	lockdep_unregister_key(&dev->lock_key);
 	return err;
 }
 
@@ -1545,6 +1548,7 @@ void mlx5_mdev_uninit(struct mlx5_core_dev *dev)
 	mutex_destroy(&priv->bfregs.wc_head.lock);
 	mutex_destroy(&priv->bfregs.reg_head.lock);
 	mutex_destroy(&dev->intf_state_mutex);
+	lockdep_unregister_key(&dev->lock_key);
 }
 
 static int probe_one(struct pci_dev *pdev, const struct pci_device_id *id)
diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h
index 5040cd774c5a3..b0b4ac92354a2 100644
--- a/include/linux/mlx5/driver.h
+++ b/include/linux/mlx5/driver.h
@@ -773,6 +773,7 @@ struct mlx5_core_dev {
 	enum mlx5_device_state	state;
 	/* sync interface state */
 	struct mutex		intf_state_mutex;
+	struct lock_class_key	lock_key;
 	unsigned long		intf_state;
 	struct mlx5_priv	priv;
 	struct mlx5_profile	profile;
-- 
2.35.1




^ permalink raw reply related	[flat|nested] 174+ messages in thread

* [PATCH 5.19 035/158] net/mlx5e: Fix wrong application of the LRO state
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (33 preceding siblings ...)
  2022-08-29 10:58 ` [PATCH 5.19 034/158] net/mlx5: Avoid false positive lockdep warning by adding lock_class_key Greg Kroah-Hartman
@ 2022-08-29 10:58 ` Greg Kroah-Hartman
  2022-08-29 10:58 ` [PATCH 5.19 036/158] net/mlx5e: Fix wrong tc flag used when set hw-tc-offload off Greg Kroah-Hartman
                   ` (135 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:58 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Aya Levin, Tariq Toukan,
	Maxim Mikityanskiy, Saeed Mahameed, Sasha Levin

From: Aya Levin <ayal@nvidia.com>

[ Upstream commit 7b3707fc79044871ab8f3d5fa5e9603155bb5577 ]

Driver caches packet merge type in mlx5e_params instance which must be
in perfect sync with the netdev_feature's bit.
Prior to this patch, in certain conditions (*) LRO state was set in
mlx5e_params, while netdev_feature's bit was off. Causing the LRO to
be applied on the RQs (HW level).

(*) This can happen only on profile init (mlx5e_build_nic_params()),
when RQ expect non-linear SKB and PCI is fast enough in comparison to
link width.

Solution: remove setting of packet merge type from
mlx5e_build_nic_params() as netdev features are not updated.

Fixes: 619a8f2a42f1 ("net/mlx5e: Use linear SKB in Striding RQ")
Signed-off-by: Aya Levin <ayal@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 8 --------
 1 file changed, 8 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index 087952b84ccb0..62aab20025345 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -4733,14 +4733,6 @@ void mlx5e_build_nic_params(struct mlx5e_priv *priv, struct mlx5e_xsk *xsk, u16
 	/* RQ */
 	mlx5e_build_rq_params(mdev, params);
 
-	/* HW LRO */
-	if (MLX5_CAP_ETH(mdev, lro_cap) &&
-	    params->rq_wq_type == MLX5_WQ_TYPE_LINKED_LIST_STRIDING_RQ) {
-		/* No XSK params: checking the availability of striding RQ in general. */
-		if (!mlx5e_rx_mpwqe_is_linear_skb(mdev, params, NULL))
-			params->packet_merge.type = slow_pci_heuristic(mdev) ?
-				MLX5E_PACKET_MERGE_NONE : MLX5E_PACKET_MERGE_LRO;
-	}
 	params->packet_merge.timeout = mlx5e_choose_lro_timeout(mdev, MLX5E_DEFAULT_LRO_TIMEOUT);
 
 	/* CQ moderation params */
-- 
2.35.1




^ permalink raw reply related	[flat|nested] 174+ messages in thread

* [PATCH 5.19 036/158] net/mlx5e: Fix wrong tc flag used when set hw-tc-offload off
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (34 preceding siblings ...)
  2022-08-29 10:58 ` [PATCH 5.19 035/158] net/mlx5e: Fix wrong application of the LRO state Greg Kroah-Hartman
@ 2022-08-29 10:58 ` Greg Kroah-Hartman
  2022-08-29 10:58 ` [PATCH 5.19 037/158] net: dsa: microchip: ksz9477: cleanup the ksz9477_switch_detect Greg Kroah-Hartman
                   ` (134 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:58 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Maor Dickman, Saeed Mahameed, Sasha Levin

From: Maor Dickman <maord@nvidia.com>

[ Upstream commit 550f96432e6f6770efdaee0e65239d61431062a1 ]

The cited commit reintroduced the ability to set hw-tc-offload
in switchdev mode by reusing NIC mode calls without modifying it
to support both modes, this can cause an illegal memory access
when trying to turn hw-tc-offload off.

Fix this by using the right TC_FLAG when checking if tc rules
are installed while disabling hw-tc-offload.

Fixes: d3cbd4254df8 ("net/mlx5e: Add ndo_set_feature for uplink representor")
Signed-off-by: Maor Dickman <maord@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index 62aab20025345..9e6db779b6efa 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -3678,7 +3678,9 @@ static int set_feature_hw_tc(struct net_device *netdev, bool enable)
 	struct mlx5e_priv *priv = netdev_priv(netdev);
 
 #if IS_ENABLED(CONFIG_MLX5_CLS_ACT)
-	if (!enable && mlx5e_tc_num_filters(priv, MLX5_TC_FLAG(NIC_OFFLOAD))) {
+	int tc_flag = mlx5e_is_uplink_rep(priv) ? MLX5_TC_FLAG(ESW_OFFLOAD) :
+						  MLX5_TC_FLAG(NIC_OFFLOAD);
+	if (!enable && mlx5e_tc_num_filters(priv, tc_flag)) {
 		netdev_err(netdev,
 			   "Active offloaded tc filters, can't turn hw_tc_offload off\n");
 		return -EINVAL;
-- 
2.35.1




^ permalink raw reply related	[flat|nested] 174+ messages in thread

* [PATCH 5.19 037/158] net: dsa: microchip: ksz9477: cleanup the ksz9477_switch_detect
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (35 preceding siblings ...)
  2022-08-29 10:58 ` [PATCH 5.19 036/158] net/mlx5e: Fix wrong tc flag used when set hw-tc-offload off Greg Kroah-Hartman
@ 2022-08-29 10:58 ` Greg Kroah-Hartman
  2022-08-29 10:58 ` [PATCH 5.19 038/158] net: dsa: microchip: move switch chip_id detection to ksz_common Greg Kroah-Hartman
                   ` (133 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:58 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Arun Ramadoss, Vladimir Oltean,
	Paolo Abeni, Sasha Levin

From: Arun Ramadoss <arun.ramadoss@microchip.com>

[ Upstream commit 27faa0aa85f6696d411bbbebaed9f0f723c2a175 ]

The ksz9477_switch_detect performs the detecting the chip id from the
location 0x00 and also check gigabit compatibility check & number of
ports based on the register global_options0. To prepare the common ksz
switch detect function, routine other than chip id read is moved to
ksz9477_switch_init.

Signed-off-by: Arun Ramadoss <arun.ramadoss@microchip.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/dsa/microchip/ksz9477.c | 48 +++++++++++++----------------
 1 file changed, 22 insertions(+), 26 deletions(-)

diff --git a/drivers/net/dsa/microchip/ksz9477.c b/drivers/net/dsa/microchip/ksz9477.c
index ebad795e4e95f..876a801ac23a4 100644
--- a/drivers/net/dsa/microchip/ksz9477.c
+++ b/drivers/net/dsa/microchip/ksz9477.c
@@ -1365,12 +1365,30 @@ static u32 ksz9477_get_port_addr(int port, int offset)
 
 static int ksz9477_switch_detect(struct ksz_device *dev)
 {
-	u8 data8;
-	u8 id_hi;
-	u8 id_lo;
 	u32 id32;
 	int ret;
 
+	/* read chip id */
+	ret = ksz_read32(dev, REG_CHIP_ID0__1, &id32);
+	if (ret)
+		return ret;
+
+	dev_dbg(dev->dev, "Switch detect: ID=%08x\n", id32);
+
+	dev->chip_id = id32 & 0x00FFFF00;
+
+	return 0;
+}
+
+static int ksz9477_switch_init(struct ksz_device *dev)
+{
+	u8 data8;
+	int ret;
+
+	dev->ds->ops = &ksz9477_switch_ops;
+
+	dev->port_mask = (1 << dev->info->port_cnt) - 1;
+
 	/* turn off SPI DO Edge select */
 	ret = ksz_read8(dev, REG_SW_GLOBAL_SERIAL_CTRL_0, &data8);
 	if (ret)
@@ -1381,10 +1399,6 @@ static int ksz9477_switch_detect(struct ksz_device *dev)
 	if (ret)
 		return ret;
 
-	/* read chip id */
-	ret = ksz_read32(dev, REG_CHIP_ID0__1, &id32);
-	if (ret)
-		return ret;
 	ret = ksz_read8(dev, REG_GLOBAL_OPTIONS, &data8);
 	if (ret)
 		return ret;
@@ -1395,10 +1409,7 @@ static int ksz9477_switch_detect(struct ksz_device *dev)
 	/* Default capability is gigabit capable. */
 	dev->features = GBIT_SUPPORT;
 
-	dev_dbg(dev->dev, "Switch detect: ID=%08x%02x\n", id32, data8);
-	id_hi = (u8)(id32 >> 16);
-	id_lo = (u8)(id32 >> 8);
-	if ((id_lo & 0xf) == 3) {
+	if (dev->chip_id == KSZ9893_CHIP_ID) {
 		/* Chip is from KSZ9893 design. */
 		dev_info(dev->dev, "Found KSZ9893\n");
 		dev->features |= IS_9893;
@@ -1416,21 +1427,6 @@ static int ksz9477_switch_detect(struct ksz_device *dev)
 		if (!(data8 & SW_GIGABIT_ABLE))
 			dev->features &= ~GBIT_SUPPORT;
 	}
-
-	/* Change chip id to known ones so it can be matched against them. */
-	id32 = (id_hi << 16) | (id_lo << 8);
-
-	dev->chip_id = id32;
-
-	return 0;
-}
-
-static int ksz9477_switch_init(struct ksz_device *dev)
-{
-	dev->ds->ops = &ksz9477_switch_ops;
-
-	dev->port_mask = (1 << dev->info->port_cnt) - 1;
-
 	return 0;
 }
 
-- 
2.35.1




^ permalink raw reply related	[flat|nested] 174+ messages in thread

* [PATCH 5.19 038/158] net: dsa: microchip: move switch chip_id detection to ksz_common
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (36 preceding siblings ...)
  2022-08-29 10:58 ` [PATCH 5.19 037/158] net: dsa: microchip: ksz9477: cleanup the ksz9477_switch_detect Greg Kroah-Hartman
@ 2022-08-29 10:58 ` Greg Kroah-Hartman
  2022-08-29 10:58 ` [PATCH 5.19 039/158] net: dsa: microchip: move tag_protocol " Greg Kroah-Hartman
                   ` (132 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:58 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Arun Ramadoss, Paolo Abeni, Sasha Levin

From: Arun Ramadoss <arun.ramadoss@microchip.com>

[ Upstream commit 91a98917a8839923d404a77c21646ca5fc9e330a ]

KSZ87xx and KSZ88xx have chip_id representation at reg location 0. And
KSZ9477 compatible switch and LAN937x switch have same chip_id detection
at location 0x01 and 0x02. To have the common switch detect
functionality for ksz switches, ksz_switch_detect function is
introduced.

Signed-off-by: Arun Ramadoss <arun.ramadoss@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/dsa/microchip/ksz8795.c     | 48 +--------------
 drivers/net/dsa/microchip/ksz8795_reg.h | 16 -----
 drivers/net/dsa/microchip/ksz9477.c     | 21 -------
 drivers/net/dsa/microchip/ksz9477_reg.h |  1 -
 drivers/net/dsa/microchip/ksz_common.c  | 78 +++++++++++++++++++++++--
 drivers/net/dsa/microchip/ksz_common.h  | 19 +++++-
 6 files changed, 93 insertions(+), 90 deletions(-)

diff --git a/drivers/net/dsa/microchip/ksz8795.c b/drivers/net/dsa/microchip/ksz8795.c
index 12a599d5e61a4..3cc51ee5fb6cc 100644
--- a/drivers/net/dsa/microchip/ksz8795.c
+++ b/drivers/net/dsa/microchip/ksz8795.c
@@ -1272,7 +1272,7 @@ static void ksz8_config_cpu_port(struct dsa_switch *ds)
 			continue;
 		if (!ksz_is_ksz88x3(dev)) {
 			ksz_pread8(dev, i, regs[P_REMOTE_STATUS], &remote);
-			if (remote & PORT_FIBER_MODE)
+			if (remote & KSZ8_PORT_FIBER_MODE)
 				p->fiber = 1;
 		}
 		if (p->fiber)
@@ -1424,51 +1424,6 @@ static u32 ksz8_get_port_addr(int port, int offset)
 	return PORT_CTRL_ADDR(port, offset);
 }
 
-static int ksz8_switch_detect(struct ksz_device *dev)
-{
-	u8 id1, id2;
-	u16 id16;
-	int ret;
-
-	/* read chip id */
-	ret = ksz_read16(dev, REG_CHIP_ID0, &id16);
-	if (ret)
-		return ret;
-
-	id1 = id16 >> 8;
-	id2 = id16 & SW_CHIP_ID_M;
-
-	switch (id1) {
-	case KSZ87_FAMILY_ID:
-		if ((id2 != CHIP_ID_94 && id2 != CHIP_ID_95))
-			return -ENODEV;
-
-		if (id2 == CHIP_ID_95) {
-			u8 val;
-
-			id2 = 0x95;
-			ksz_read8(dev, REG_PORT_STATUS_0, &val);
-			if (val & PORT_FIBER_MODE)
-				id2 = 0x65;
-		} else if (id2 == CHIP_ID_94) {
-			id2 = 0x94;
-		}
-		break;
-	case KSZ88_FAMILY_ID:
-		if (id2 != CHIP_ID_63)
-			return -ENODEV;
-		break;
-	default:
-		dev_err(dev->dev, "invalid family id: %d\n", id1);
-		return -ENODEV;
-	}
-	id16 &= ~0xff;
-	id16 |= id2;
-	dev->chip_id = id16;
-
-	return 0;
-}
-
 static int ksz8_switch_init(struct ksz_device *dev)
 {
 	struct ksz8 *ksz8 = dev->priv;
@@ -1522,7 +1477,6 @@ static const struct ksz_dev_ops ksz8_dev_ops = {
 	.freeze_mib = ksz8_freeze_mib,
 	.port_init_cnt = ksz8_port_init_cnt,
 	.shutdown = ksz8_reset_switch,
-	.detect = ksz8_switch_detect,
 	.init = ksz8_switch_init,
 	.exit = ksz8_switch_exit,
 };
diff --git a/drivers/net/dsa/microchip/ksz8795_reg.h b/drivers/net/dsa/microchip/ksz8795_reg.h
index 4109433b6b6c2..b8f6ad7581bcd 100644
--- a/drivers/net/dsa/microchip/ksz8795_reg.h
+++ b/drivers/net/dsa/microchip/ksz8795_reg.h
@@ -14,23 +14,10 @@
 #define KS_PRIO_M			0x3
 #define KS_PRIO_S			2
 
-#define REG_CHIP_ID0			0x00
-
-#define KSZ87_FAMILY_ID			0x87
-#define KSZ88_FAMILY_ID			0x88
-
-#define REG_CHIP_ID1			0x01
-
-#define SW_CHIP_ID_M			0xF0
-#define SW_CHIP_ID_S			4
 #define SW_REVISION_M			0x0E
 #define SW_REVISION_S			1
 #define SW_START			0x01
 
-#define CHIP_ID_94			0x60
-#define CHIP_ID_95			0x90
-#define CHIP_ID_63			0x30
-
 #define KSZ8863_REG_SW_RESET		0x43
 
 #define KSZ8863_GLOBAL_SOFTWARE_RESET	BIT(4)
@@ -217,8 +204,6 @@
 #define REG_PORT_4_STATUS_0		0x48
 
 /* For KSZ8765. */
-#define PORT_FIBER_MODE			BIT(7)
-
 #define PORT_REMOTE_ASYM_PAUSE		BIT(5)
 #define PORT_REMOTE_SYM_PAUSE		BIT(4)
 #define PORT_REMOTE_100BTX_FD		BIT(3)
@@ -322,7 +307,6 @@
 
 #define REG_PORT_CTRL_5			0x05
 
-#define REG_PORT_STATUS_0		0x08
 #define REG_PORT_STATUS_1		0x09
 #define REG_PORT_LINK_MD_CTRL		0x0A
 #define REG_PORT_LINK_MD_RESULT		0x0B
diff --git a/drivers/net/dsa/microchip/ksz9477.c b/drivers/net/dsa/microchip/ksz9477.c
index 876a801ac23a4..bcfdd505ca79a 100644
--- a/drivers/net/dsa/microchip/ksz9477.c
+++ b/drivers/net/dsa/microchip/ksz9477.c
@@ -1363,23 +1363,6 @@ static u32 ksz9477_get_port_addr(int port, int offset)
 	return PORT_CTRL_ADDR(port, offset);
 }
 
-static int ksz9477_switch_detect(struct ksz_device *dev)
-{
-	u32 id32;
-	int ret;
-
-	/* read chip id */
-	ret = ksz_read32(dev, REG_CHIP_ID0__1, &id32);
-	if (ret)
-		return ret;
-
-	dev_dbg(dev->dev, "Switch detect: ID=%08x\n", id32);
-
-	dev->chip_id = id32 & 0x00FFFF00;
-
-	return 0;
-}
-
 static int ksz9477_switch_init(struct ksz_device *dev)
 {
 	u8 data8;
@@ -1410,8 +1393,6 @@ static int ksz9477_switch_init(struct ksz_device *dev)
 	dev->features = GBIT_SUPPORT;
 
 	if (dev->chip_id == KSZ9893_CHIP_ID) {
-		/* Chip is from KSZ9893 design. */
-		dev_info(dev->dev, "Found KSZ9893\n");
 		dev->features |= IS_9893;
 
 		/* Chip does not support gigabit. */
@@ -1419,7 +1400,6 @@ static int ksz9477_switch_init(struct ksz_device *dev)
 			dev->features &= ~GBIT_SUPPORT;
 		dev->phy_port_cnt = 2;
 	} else {
-		dev_info(dev->dev, "Found KSZ9477 or compatible\n");
 		/* Chip uses new XMII register definitions. */
 		dev->features |= NEW_XMII;
 
@@ -1446,7 +1426,6 @@ static const struct ksz_dev_ops ksz9477_dev_ops = {
 	.freeze_mib = ksz9477_freeze_mib,
 	.port_init_cnt = ksz9477_port_init_cnt,
 	.shutdown = ksz9477_reset_switch,
-	.detect = ksz9477_switch_detect,
 	.init = ksz9477_switch_init,
 	.exit = ksz9477_switch_exit,
 };
diff --git a/drivers/net/dsa/microchip/ksz9477_reg.h b/drivers/net/dsa/microchip/ksz9477_reg.h
index 7a2c8d4767aff..077e35ab11b54 100644
--- a/drivers/net/dsa/microchip/ksz9477_reg.h
+++ b/drivers/net/dsa/microchip/ksz9477_reg.h
@@ -25,7 +25,6 @@
 
 #define REG_CHIP_ID2__1			0x0002
 
-#define CHIP_ID_63			0x63
 #define CHIP_ID_66			0x66
 #define CHIP_ID_67			0x67
 #define CHIP_ID_77			0x77
diff --git a/drivers/net/dsa/microchip/ksz_common.c b/drivers/net/dsa/microchip/ksz_common.c
index 92a500e1ccd21..4511e99823f57 100644
--- a/drivers/net/dsa/microchip/ksz_common.c
+++ b/drivers/net/dsa/microchip/ksz_common.c
@@ -930,6 +930,72 @@ void ksz_port_stp_state_set(struct dsa_switch *ds, int port,
 }
 EXPORT_SYMBOL_GPL(ksz_port_stp_state_set);
 
+static int ksz_switch_detect(struct ksz_device *dev)
+{
+	u8 id1, id2;
+	u16 id16;
+	u32 id32;
+	int ret;
+
+	/* read chip id */
+	ret = ksz_read16(dev, REG_CHIP_ID0, &id16);
+	if (ret)
+		return ret;
+
+	id1 = FIELD_GET(SW_FAMILY_ID_M, id16);
+	id2 = FIELD_GET(SW_CHIP_ID_M, id16);
+
+	switch (id1) {
+	case KSZ87_FAMILY_ID:
+		if (id2 == KSZ87_CHIP_ID_95) {
+			u8 val;
+
+			dev->chip_id = KSZ8795_CHIP_ID;
+
+			ksz_read8(dev, KSZ8_PORT_STATUS_0, &val);
+			if (val & KSZ8_PORT_FIBER_MODE)
+				dev->chip_id = KSZ8765_CHIP_ID;
+		} else if (id2 == KSZ87_CHIP_ID_94) {
+			dev->chip_id = KSZ8794_CHIP_ID;
+		} else {
+			return -ENODEV;
+		}
+		break;
+	case KSZ88_FAMILY_ID:
+		if (id2 == KSZ88_CHIP_ID_63)
+			dev->chip_id = KSZ8830_CHIP_ID;
+		else
+			return -ENODEV;
+		break;
+	default:
+		ret = ksz_read32(dev, REG_CHIP_ID0, &id32);
+		if (ret)
+			return ret;
+
+		dev->chip_rev = FIELD_GET(SW_REV_ID_M, id32);
+		id32 &= ~0xFF;
+
+		switch (id32) {
+		case KSZ9477_CHIP_ID:
+		case KSZ9897_CHIP_ID:
+		case KSZ9893_CHIP_ID:
+		case KSZ9567_CHIP_ID:
+		case LAN9370_CHIP_ID:
+		case LAN9371_CHIP_ID:
+		case LAN9372_CHIP_ID:
+		case LAN9373_CHIP_ID:
+		case LAN9374_CHIP_ID:
+			dev->chip_id = id32;
+			break;
+		default:
+			dev_err(dev->dev,
+				"unsupported switch detected %x)\n", id32);
+			return -ENODEV;
+		}
+	}
+	return 0;
+}
+
 struct ksz_device *ksz_switch_alloc(struct device *base, void *priv)
 {
 	struct dsa_switch *ds;
@@ -986,10 +1052,9 @@ int ksz_switch_register(struct ksz_device *dev,
 	mutex_init(&dev->alu_mutex);
 	mutex_init(&dev->vlan_mutex);
 
-	dev->dev_ops = ops;
-
-	if (dev->dev_ops->detect(dev))
-		return -EINVAL;
+	ret = ksz_switch_detect(dev);
+	if (ret)
+		return ret;
 
 	info = ksz_lookup_info(dev->chip_id);
 	if (!info)
@@ -998,10 +1063,15 @@ int ksz_switch_register(struct ksz_device *dev,
 	/* Update the compatible info with the probed one */
 	dev->info = info;
 
+	dev_info(dev->dev, "found switch: %s, rev %i\n",
+		 dev->info->dev_name, dev->chip_rev);
+
 	ret = ksz_check_device_id(dev);
 	if (ret)
 		return ret;
 
+	dev->dev_ops = ops;
+
 	ret = dev->dev_ops->init(dev);
 	if (ret)
 		return ret;
diff --git a/drivers/net/dsa/microchip/ksz_common.h b/drivers/net/dsa/microchip/ksz_common.h
index 8500eaedad67a..e6bc5fb2b1303 100644
--- a/drivers/net/dsa/microchip/ksz_common.h
+++ b/drivers/net/dsa/microchip/ksz_common.h
@@ -90,6 +90,7 @@ struct ksz_device {
 
 	/* chip specific data */
 	u32 chip_id;
+	u8 chip_rev;
 	int cpu_port;			/* port connected to CPU */
 	int phy_port_cnt;
 	phy_interface_t compat_interface;
@@ -182,7 +183,6 @@ struct ksz_dev_ops {
 	void (*freeze_mib)(struct ksz_device *dev, int port, bool freeze);
 	void (*port_init_cnt)(struct ksz_device *dev, int port);
 	int (*shutdown)(struct ksz_device *dev);
-	int (*detect)(struct ksz_device *dev);
 	int (*init)(struct ksz_device *dev);
 	void (*exit)(struct ksz_device *dev);
 };
@@ -353,6 +353,23 @@ static inline void ksz_regmap_unlock(void *__mtx)
 #define PORT_RX_ENABLE			BIT(1)
 #define PORT_LEARN_DISABLE		BIT(0)
 
+/* Switch ID Defines */
+#define REG_CHIP_ID0			0x00
+
+#define SW_FAMILY_ID_M			GENMASK(15, 8)
+#define KSZ87_FAMILY_ID			0x87
+#define KSZ88_FAMILY_ID			0x88
+
+#define KSZ8_PORT_STATUS_0		0x08
+#define KSZ8_PORT_FIBER_MODE		BIT(7)
+
+#define SW_CHIP_ID_M			GENMASK(7, 4)
+#define KSZ87_CHIP_ID_94		0x6
+#define KSZ87_CHIP_ID_95		0x9
+#define KSZ88_CHIP_ID_63		0x3
+
+#define SW_REV_ID_M			GENMASK(7, 4)
+
 /* Regmap tables generation */
 #define KSZ_SPI_OP_RD		3
 #define KSZ_SPI_OP_WR		2
-- 
2.35.1




^ permalink raw reply related	[flat|nested] 174+ messages in thread

* [PATCH 5.19 039/158] net: dsa: microchip: move tag_protocol to ksz_common
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (37 preceding siblings ...)
  2022-08-29 10:58 ` [PATCH 5.19 038/158] net: dsa: microchip: move switch chip_id detection to ksz_common Greg Kroah-Hartman
@ 2022-08-29 10:58 ` Greg Kroah-Hartman
  2022-08-29 10:58 ` [PATCH 5.19 040/158] net: dsa: microchip: move vlan functionality " Greg Kroah-Hartman
                   ` (131 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:58 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Arun Ramadoss, Paolo Abeni, Sasha Levin

From: Arun Ramadoss <arun.ramadoss@microchip.com>

[ Upstream commit 534a0431e9e68959e2c0d71c141d5b911d66ad7c ]

This patch move the dsa hook get_tag_protocol to ksz_common file. And
the tag_protocol is returned based on the dev->chip_id.

Signed-off-by: Arun Ramadoss <arun.ramadoss@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/dsa/microchip/ksz8795.c    | 13 +------------
 drivers/net/dsa/microchip/ksz9477.c    | 14 +-------------
 drivers/net/dsa/microchip/ksz_common.c | 24 ++++++++++++++++++++++++
 drivers/net/dsa/microchip/ksz_common.h |  2 ++
 4 files changed, 28 insertions(+), 25 deletions(-)

diff --git a/drivers/net/dsa/microchip/ksz8795.c b/drivers/net/dsa/microchip/ksz8795.c
index 3cc51ee5fb6cc..041956e3c7b1a 100644
--- a/drivers/net/dsa/microchip/ksz8795.c
+++ b/drivers/net/dsa/microchip/ksz8795.c
@@ -898,17 +898,6 @@ static void ksz8_w_phy(struct ksz_device *dev, u16 phy, u16 reg, u16 val)
 	}
 }
 
-static enum dsa_tag_protocol ksz8_get_tag_protocol(struct dsa_switch *ds,
-						   int port,
-						   enum dsa_tag_protocol mp)
-{
-	struct ksz_device *dev = ds->priv;
-
-	/* ksz88x3 uses the same tag schema as KSZ9893 */
-	return ksz_is_ksz88x3(dev) ?
-		DSA_TAG_PROTO_KSZ9893 : DSA_TAG_PROTO_KSZ8795;
-}
-
 static u32 ksz8_sw_get_phy_flags(struct dsa_switch *ds, int port)
 {
 	/* Silicon Errata Sheet (DS80000830A):
@@ -1394,7 +1383,7 @@ static void ksz8_get_caps(struct dsa_switch *ds, int port,
 }
 
 static const struct dsa_switch_ops ksz8_switch_ops = {
-	.get_tag_protocol	= ksz8_get_tag_protocol,
+	.get_tag_protocol	= ksz_get_tag_protocol,
 	.get_phy_flags		= ksz8_sw_get_phy_flags,
 	.setup			= ksz8_setup,
 	.phy_read		= ksz_phy_read16,
diff --git a/drivers/net/dsa/microchip/ksz9477.c b/drivers/net/dsa/microchip/ksz9477.c
index bcfdd505ca79a..31be767027feb 100644
--- a/drivers/net/dsa/microchip/ksz9477.c
+++ b/drivers/net/dsa/microchip/ksz9477.c
@@ -276,18 +276,6 @@ static void ksz9477_port_init_cnt(struct ksz_device *dev, int port)
 	mutex_unlock(&mib->cnt_mutex);
 }
 
-static enum dsa_tag_protocol ksz9477_get_tag_protocol(struct dsa_switch *ds,
-						      int port,
-						      enum dsa_tag_protocol mp)
-{
-	enum dsa_tag_protocol proto = DSA_TAG_PROTO_KSZ9477;
-	struct ksz_device *dev = ds->priv;
-
-	if (dev->features & IS_9893)
-		proto = DSA_TAG_PROTO_KSZ9893;
-	return proto;
-}
-
 static int ksz9477_phy_read16(struct dsa_switch *ds, int addr, int reg)
 {
 	struct ksz_device *dev = ds->priv;
@@ -1329,7 +1317,7 @@ static int ksz9477_setup(struct dsa_switch *ds)
 }
 
 static const struct dsa_switch_ops ksz9477_switch_ops = {
-	.get_tag_protocol	= ksz9477_get_tag_protocol,
+	.get_tag_protocol	= ksz_get_tag_protocol,
 	.setup			= ksz9477_setup,
 	.phy_read		= ksz9477_phy_read16,
 	.phy_write		= ksz9477_phy_write16,
diff --git a/drivers/net/dsa/microchip/ksz_common.c b/drivers/net/dsa/microchip/ksz_common.c
index 4511e99823f57..0713a40685fa9 100644
--- a/drivers/net/dsa/microchip/ksz_common.c
+++ b/drivers/net/dsa/microchip/ksz_common.c
@@ -930,6 +930,30 @@ void ksz_port_stp_state_set(struct dsa_switch *ds, int port,
 }
 EXPORT_SYMBOL_GPL(ksz_port_stp_state_set);
 
+enum dsa_tag_protocol ksz_get_tag_protocol(struct dsa_switch *ds,
+					   int port, enum dsa_tag_protocol mp)
+{
+	struct ksz_device *dev = ds->priv;
+	enum dsa_tag_protocol proto = DSA_TAG_PROTO_NONE;
+
+	if (dev->chip_id == KSZ8795_CHIP_ID ||
+	    dev->chip_id == KSZ8794_CHIP_ID ||
+	    dev->chip_id == KSZ8765_CHIP_ID)
+		proto = DSA_TAG_PROTO_KSZ8795;
+
+	if (dev->chip_id == KSZ8830_CHIP_ID ||
+	    dev->chip_id == KSZ9893_CHIP_ID)
+		proto = DSA_TAG_PROTO_KSZ9893;
+
+	if (dev->chip_id == KSZ9477_CHIP_ID ||
+	    dev->chip_id == KSZ9897_CHIP_ID ||
+	    dev->chip_id == KSZ9567_CHIP_ID)
+		proto = DSA_TAG_PROTO_KSZ9477;
+
+	return proto;
+}
+EXPORT_SYMBOL_GPL(ksz_get_tag_protocol);
+
 static int ksz_switch_detect(struct ksz_device *dev)
 {
 	u8 id1, id2;
diff --git a/drivers/net/dsa/microchip/ksz_common.h b/drivers/net/dsa/microchip/ksz_common.h
index e6bc5fb2b1303..21db6f79035fa 100644
--- a/drivers/net/dsa/microchip/ksz_common.h
+++ b/drivers/net/dsa/microchip/ksz_common.h
@@ -231,6 +231,8 @@ int ksz_port_mdb_del(struct dsa_switch *ds, int port,
 int ksz_enable_port(struct dsa_switch *ds, int port, struct phy_device *phy);
 void ksz_get_strings(struct dsa_switch *ds, int port,
 		     u32 stringset, uint8_t *buf);
+enum dsa_tag_protocol ksz_get_tag_protocol(struct dsa_switch *ds,
+					   int port, enum dsa_tag_protocol mp);
 
 /* Common register access functions */
 
-- 
2.35.1




^ permalink raw reply related	[flat|nested] 174+ messages in thread

* [PATCH 5.19 040/158] net: dsa: microchip: move vlan functionality to ksz_common
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (38 preceding siblings ...)
  2022-08-29 10:58 ` [PATCH 5.19 039/158] net: dsa: microchip: move tag_protocol " Greg Kroah-Hartman
@ 2022-08-29 10:58 ` Greg Kroah-Hartman
  2022-08-29 10:58 ` [PATCH 5.19 041/158] net: dsa: microchip: move the port mirror " Greg Kroah-Hartman
                   ` (130 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:58 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Arun Ramadoss, Vladimir Oltean,
	Paolo Abeni, Sasha Levin

From: Arun Ramadoss <arun.ramadoss@microchip.com>

[ Upstream commit f0d997e31bb307c7aa046c4992c568547fd25195 ]

This patch moves the vlan dsa_switch_ops such as vlan_add, vlan_del and
vlan_filtering from the individual files ksz8795.c, ksz9477.c to
ksz_common.c file.

Signed-off-by: Arun Ramadoss <arun.ramadoss@microchip.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/dsa/microchip/ksz8795.c    | 19 +++++++------
 drivers/net/dsa/microchip/ksz9477.c    | 19 +++++++------
 drivers/net/dsa/microchip/ksz_common.c | 37 ++++++++++++++++++++++++++
 drivers/net/dsa/microchip/ksz_common.h | 14 ++++++++++
 4 files changed, 69 insertions(+), 20 deletions(-)

diff --git a/drivers/net/dsa/microchip/ksz8795.c b/drivers/net/dsa/microchip/ksz8795.c
index 041956e3c7b1a..16e946dbd9d42 100644
--- a/drivers/net/dsa/microchip/ksz8795.c
+++ b/drivers/net/dsa/microchip/ksz8795.c
@@ -958,11 +958,9 @@ static void ksz8_flush_dyn_mac_table(struct ksz_device *dev, int port)
 	}
 }
 
-static int ksz8_port_vlan_filtering(struct dsa_switch *ds, int port, bool flag,
+static int ksz8_port_vlan_filtering(struct ksz_device *dev, int port, bool flag,
 				    struct netlink_ext_ack *extack)
 {
-	struct ksz_device *dev = ds->priv;
-
 	if (ksz_is_ksz88x3(dev))
 		return -ENOTSUPP;
 
@@ -987,12 +985,11 @@ static void ksz8_port_enable_pvid(struct ksz_device *dev, int port, bool state)
 	}
 }
 
-static int ksz8_port_vlan_add(struct dsa_switch *ds, int port,
+static int ksz8_port_vlan_add(struct ksz_device *dev, int port,
 			      const struct switchdev_obj_port_vlan *vlan,
 			      struct netlink_ext_ack *extack)
 {
 	bool untagged = vlan->flags & BRIDGE_VLAN_INFO_UNTAGGED;
-	struct ksz_device *dev = ds->priv;
 	struct ksz_port *p = &dev->ports[port];
 	u16 data, new_pvid = 0;
 	u8 fid, member, valid;
@@ -1060,10 +1057,9 @@ static int ksz8_port_vlan_add(struct dsa_switch *ds, int port,
 	return 0;
 }
 
-static int ksz8_port_vlan_del(struct dsa_switch *ds, int port,
+static int ksz8_port_vlan_del(struct ksz_device *dev, int port,
 			      const struct switchdev_obj_port_vlan *vlan)
 {
-	struct ksz_device *dev = ds->priv;
 	u16 data, pvid;
 	u8 fid, member, valid;
 
@@ -1398,9 +1394,9 @@ static const struct dsa_switch_ops ksz8_switch_ops = {
 	.port_bridge_leave	= ksz_port_bridge_leave,
 	.port_stp_state_set	= ksz8_port_stp_state_set,
 	.port_fast_age		= ksz_port_fast_age,
-	.port_vlan_filtering	= ksz8_port_vlan_filtering,
-	.port_vlan_add		= ksz8_port_vlan_add,
-	.port_vlan_del		= ksz8_port_vlan_del,
+	.port_vlan_filtering	= ksz_port_vlan_filtering,
+	.port_vlan_add		= ksz_port_vlan_add,
+	.port_vlan_del		= ksz_port_vlan_del,
 	.port_fdb_dump		= ksz_port_fdb_dump,
 	.port_mdb_add           = ksz_port_mdb_add,
 	.port_mdb_del           = ksz_port_mdb_del,
@@ -1465,6 +1461,9 @@ static const struct ksz_dev_ops ksz8_dev_ops = {
 	.r_mib_pkt = ksz8_r_mib_pkt,
 	.freeze_mib = ksz8_freeze_mib,
 	.port_init_cnt = ksz8_port_init_cnt,
+	.vlan_filtering = ksz8_port_vlan_filtering,
+	.vlan_add = ksz8_port_vlan_add,
+	.vlan_del = ksz8_port_vlan_del,
 	.shutdown = ksz8_reset_switch,
 	.init = ksz8_switch_init,
 	.exit = ksz8_switch_exit,
diff --git a/drivers/net/dsa/microchip/ksz9477.c b/drivers/net/dsa/microchip/ksz9477.c
index 31be767027feb..1bb994a9109cd 100644
--- a/drivers/net/dsa/microchip/ksz9477.c
+++ b/drivers/net/dsa/microchip/ksz9477.c
@@ -377,12 +377,10 @@ static void ksz9477_flush_dyn_mac_table(struct ksz_device *dev, int port)
 	}
 }
 
-static int ksz9477_port_vlan_filtering(struct dsa_switch *ds, int port,
+static int ksz9477_port_vlan_filtering(struct ksz_device *dev, int port,
 				       bool flag,
 				       struct netlink_ext_ack *extack)
 {
-	struct ksz_device *dev = ds->priv;
-
 	if (flag) {
 		ksz_port_cfg(dev, port, REG_PORT_LUE_CTRL,
 			     PORT_VLAN_LOOKUP_VID_0, true);
@@ -396,11 +394,10 @@ static int ksz9477_port_vlan_filtering(struct dsa_switch *ds, int port,
 	return 0;
 }
 
-static int ksz9477_port_vlan_add(struct dsa_switch *ds, int port,
+static int ksz9477_port_vlan_add(struct ksz_device *dev, int port,
 				 const struct switchdev_obj_port_vlan *vlan,
 				 struct netlink_ext_ack *extack)
 {
-	struct ksz_device *dev = ds->priv;
 	u32 vlan_table[3];
 	bool untagged = vlan->flags & BRIDGE_VLAN_INFO_UNTAGGED;
 	int err;
@@ -433,10 +430,9 @@ static int ksz9477_port_vlan_add(struct dsa_switch *ds, int port,
 	return 0;
 }
 
-static int ksz9477_port_vlan_del(struct dsa_switch *ds, int port,
+static int ksz9477_port_vlan_del(struct ksz_device *dev, int port,
 				 const struct switchdev_obj_port_vlan *vlan)
 {
-	struct ksz_device *dev = ds->priv;
 	bool untagged = vlan->flags & BRIDGE_VLAN_INFO_UNTAGGED;
 	u32 vlan_table[3];
 	u16 pvid;
@@ -1331,9 +1327,9 @@ static const struct dsa_switch_ops ksz9477_switch_ops = {
 	.port_bridge_leave	= ksz_port_bridge_leave,
 	.port_stp_state_set	= ksz9477_port_stp_state_set,
 	.port_fast_age		= ksz_port_fast_age,
-	.port_vlan_filtering	= ksz9477_port_vlan_filtering,
-	.port_vlan_add		= ksz9477_port_vlan_add,
-	.port_vlan_del		= ksz9477_port_vlan_del,
+	.port_vlan_filtering	= ksz_port_vlan_filtering,
+	.port_vlan_add		= ksz_port_vlan_add,
+	.port_vlan_del		= ksz_port_vlan_del,
 	.port_fdb_dump		= ksz9477_port_fdb_dump,
 	.port_fdb_add		= ksz9477_port_fdb_add,
 	.port_fdb_del		= ksz9477_port_fdb_del,
@@ -1413,6 +1409,9 @@ static const struct ksz_dev_ops ksz9477_dev_ops = {
 	.r_mib_stat64 = ksz_r_mib_stats64,
 	.freeze_mib = ksz9477_freeze_mib,
 	.port_init_cnt = ksz9477_port_init_cnt,
+	.vlan_filtering = ksz9477_port_vlan_filtering,
+	.vlan_add = ksz9477_port_vlan_add,
+	.vlan_del = ksz9477_port_vlan_del,
 	.shutdown = ksz9477_reset_switch,
 	.init = ksz9477_switch_init,
 	.exit = ksz9477_switch_exit,
diff --git a/drivers/net/dsa/microchip/ksz_common.c b/drivers/net/dsa/microchip/ksz_common.c
index 0713a40685fa9..5db2b55152885 100644
--- a/drivers/net/dsa/microchip/ksz_common.c
+++ b/drivers/net/dsa/microchip/ksz_common.c
@@ -954,6 +954,43 @@ enum dsa_tag_protocol ksz_get_tag_protocol(struct dsa_switch *ds,
 }
 EXPORT_SYMBOL_GPL(ksz_get_tag_protocol);
 
+int ksz_port_vlan_filtering(struct dsa_switch *ds, int port,
+			    bool flag, struct netlink_ext_ack *extack)
+{
+	struct ksz_device *dev = ds->priv;
+
+	if (!dev->dev_ops->vlan_filtering)
+		return -EOPNOTSUPP;
+
+	return dev->dev_ops->vlan_filtering(dev, port, flag, extack);
+}
+EXPORT_SYMBOL_GPL(ksz_port_vlan_filtering);
+
+int ksz_port_vlan_add(struct dsa_switch *ds, int port,
+		      const struct switchdev_obj_port_vlan *vlan,
+		      struct netlink_ext_ack *extack)
+{
+	struct ksz_device *dev = ds->priv;
+
+	if (!dev->dev_ops->vlan_add)
+		return -EOPNOTSUPP;
+
+	return dev->dev_ops->vlan_add(dev, port, vlan, extack);
+}
+EXPORT_SYMBOL_GPL(ksz_port_vlan_add);
+
+int ksz_port_vlan_del(struct dsa_switch *ds, int port,
+		      const struct switchdev_obj_port_vlan *vlan)
+{
+	struct ksz_device *dev = ds->priv;
+
+	if (!dev->dev_ops->vlan_del)
+		return -EOPNOTSUPP;
+
+	return dev->dev_ops->vlan_del(dev, port, vlan);
+}
+EXPORT_SYMBOL_GPL(ksz_port_vlan_del);
+
 static int ksz_switch_detect(struct ksz_device *dev)
 {
 	u8 id1, id2;
diff --git a/drivers/net/dsa/microchip/ksz_common.h b/drivers/net/dsa/microchip/ksz_common.h
index 21db6f79035fa..1baa270859aa2 100644
--- a/drivers/net/dsa/microchip/ksz_common.h
+++ b/drivers/net/dsa/microchip/ksz_common.h
@@ -180,6 +180,13 @@ struct ksz_dev_ops {
 	void (*r_mib_pkt)(struct ksz_device *dev, int port, u16 addr,
 			  u64 *dropped, u64 *cnt);
 	void (*r_mib_stat64)(struct ksz_device *dev, int port);
+	int  (*vlan_filtering)(struct ksz_device *dev, int port,
+			       bool flag, struct netlink_ext_ack *extack);
+	int  (*vlan_add)(struct ksz_device *dev, int port,
+			 const struct switchdev_obj_port_vlan *vlan,
+			 struct netlink_ext_ack *extack);
+	int  (*vlan_del)(struct ksz_device *dev, int port,
+			 const struct switchdev_obj_port_vlan *vlan);
 	void (*freeze_mib)(struct ksz_device *dev, int port, bool freeze);
 	void (*port_init_cnt)(struct ksz_device *dev, int port);
 	int (*shutdown)(struct ksz_device *dev);
@@ -233,6 +240,13 @@ void ksz_get_strings(struct dsa_switch *ds, int port,
 		     u32 stringset, uint8_t *buf);
 enum dsa_tag_protocol ksz_get_tag_protocol(struct dsa_switch *ds,
 					   int port, enum dsa_tag_protocol mp);
+int ksz_port_vlan_filtering(struct dsa_switch *ds, int port,
+			    bool flag, struct netlink_ext_ack *extack);
+int ksz_port_vlan_add(struct dsa_switch *ds, int port,
+		      const struct switchdev_obj_port_vlan *vlan,
+		      struct netlink_ext_ack *extack);
+int ksz_port_vlan_del(struct dsa_switch *ds, int port,
+		      const struct switchdev_obj_port_vlan *vlan);
 
 /* Common register access functions */
 
-- 
2.35.1




^ permalink raw reply related	[flat|nested] 174+ messages in thread

* [PATCH 5.19 041/158] net: dsa: microchip: move the port mirror to ksz_common
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (39 preceding siblings ...)
  2022-08-29 10:58 ` [PATCH 5.19 040/158] net: dsa: microchip: move vlan functionality " Greg Kroah-Hartman
@ 2022-08-29 10:58 ` Greg Kroah-Hartman
  2022-08-29 10:58 ` [PATCH 5.19 042/158] net: dsa: microchip: update the ksz_phylink_get_caps Greg Kroah-Hartman
                   ` (129 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:58 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Arun Ramadoss, Vladimir Oltean,
	Florian Fainelli, Paolo Abeni, Sasha Levin

From: Arun Ramadoss <arun.ramadoss@microchip.com>

[ Upstream commit 00a298bbc23876288b1cd04c38752d8e7ed53ae2 ]

This patch updates the common port mirror add/del dsa_switch_ops in
ksz_common.c. The individual switches implementation is executed based
on the ksz_dev_ops function pointers.

Signed-off-by: Arun Ramadoss <arun.ramadoss@microchip.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/dsa/microchip/ksz8795.c    | 13 ++++++-------
 drivers/net/dsa/microchip/ksz9477.c    | 12 ++++++------
 drivers/net/dsa/microchip/ksz_common.c | 23 +++++++++++++++++++++++
 drivers/net/dsa/microchip/ksz_common.h | 10 ++++++++++
 4 files changed, 45 insertions(+), 13 deletions(-)

diff --git a/drivers/net/dsa/microchip/ksz8795.c b/drivers/net/dsa/microchip/ksz8795.c
index 16e946dbd9d42..2e3d24a3260e1 100644
--- a/drivers/net/dsa/microchip/ksz8795.c
+++ b/drivers/net/dsa/microchip/ksz8795.c
@@ -1089,12 +1089,10 @@ static int ksz8_port_vlan_del(struct ksz_device *dev, int port,
 	return 0;
 }
 
-static int ksz8_port_mirror_add(struct dsa_switch *ds, int port,
+static int ksz8_port_mirror_add(struct ksz_device *dev, int port,
 				struct dsa_mall_mirror_tc_entry *mirror,
 				bool ingress, struct netlink_ext_ack *extack)
 {
-	struct ksz_device *dev = ds->priv;
-
 	if (ingress) {
 		ksz_port_cfg(dev, port, P_MIRROR_CTRL, PORT_MIRROR_RX, true);
 		dev->mirror_rx |= BIT(port);
@@ -1113,10 +1111,9 @@ static int ksz8_port_mirror_add(struct dsa_switch *ds, int port,
 	return 0;
 }
 
-static void ksz8_port_mirror_del(struct dsa_switch *ds, int port,
+static void ksz8_port_mirror_del(struct ksz_device *dev, int port,
 				 struct dsa_mall_mirror_tc_entry *mirror)
 {
-	struct ksz_device *dev = ds->priv;
 	u8 data;
 
 	if (mirror->ingress) {
@@ -1400,8 +1397,8 @@ static const struct dsa_switch_ops ksz8_switch_ops = {
 	.port_fdb_dump		= ksz_port_fdb_dump,
 	.port_mdb_add           = ksz_port_mdb_add,
 	.port_mdb_del           = ksz_port_mdb_del,
-	.port_mirror_add	= ksz8_port_mirror_add,
-	.port_mirror_del	= ksz8_port_mirror_del,
+	.port_mirror_add	= ksz_port_mirror_add,
+	.port_mirror_del	= ksz_port_mirror_del,
 };
 
 static u32 ksz8_get_port_addr(int port, int offset)
@@ -1464,6 +1461,8 @@ static const struct ksz_dev_ops ksz8_dev_ops = {
 	.vlan_filtering = ksz8_port_vlan_filtering,
 	.vlan_add = ksz8_port_vlan_add,
 	.vlan_del = ksz8_port_vlan_del,
+	.mirror_add = ksz8_port_mirror_add,
+	.mirror_del = ksz8_port_mirror_del,
 	.shutdown = ksz8_reset_switch,
 	.init = ksz8_switch_init,
 	.exit = ksz8_switch_exit,
diff --git a/drivers/net/dsa/microchip/ksz9477.c b/drivers/net/dsa/microchip/ksz9477.c
index 1bb994a9109cd..cd4a3088e9473 100644
--- a/drivers/net/dsa/microchip/ksz9477.c
+++ b/drivers/net/dsa/microchip/ksz9477.c
@@ -819,11 +819,10 @@ static int ksz9477_port_mdb_del(struct dsa_switch *ds, int port,
 	return ret;
 }
 
-static int ksz9477_port_mirror_add(struct dsa_switch *ds, int port,
+static int ksz9477_port_mirror_add(struct ksz_device *dev, int port,
 				   struct dsa_mall_mirror_tc_entry *mirror,
 				   bool ingress, struct netlink_ext_ack *extack)
 {
-	struct ksz_device *dev = ds->priv;
 	u8 data;
 	int p;
 
@@ -859,10 +858,9 @@ static int ksz9477_port_mirror_add(struct dsa_switch *ds, int port,
 	return 0;
 }
 
-static void ksz9477_port_mirror_del(struct dsa_switch *ds, int port,
+static void ksz9477_port_mirror_del(struct ksz_device *dev, int port,
 				    struct dsa_mall_mirror_tc_entry *mirror)
 {
-	struct ksz_device *dev = ds->priv;
 	bool in_use = false;
 	u8 data;
 	int p;
@@ -1335,8 +1333,8 @@ static const struct dsa_switch_ops ksz9477_switch_ops = {
 	.port_fdb_del		= ksz9477_port_fdb_del,
 	.port_mdb_add           = ksz9477_port_mdb_add,
 	.port_mdb_del           = ksz9477_port_mdb_del,
-	.port_mirror_add	= ksz9477_port_mirror_add,
-	.port_mirror_del	= ksz9477_port_mirror_del,
+	.port_mirror_add	= ksz_port_mirror_add,
+	.port_mirror_del	= ksz_port_mirror_del,
 	.get_stats64		= ksz_get_stats64,
 	.port_change_mtu	= ksz9477_change_mtu,
 	.port_max_mtu		= ksz9477_max_mtu,
@@ -1412,6 +1410,8 @@ static const struct ksz_dev_ops ksz9477_dev_ops = {
 	.vlan_filtering = ksz9477_port_vlan_filtering,
 	.vlan_add = ksz9477_port_vlan_add,
 	.vlan_del = ksz9477_port_vlan_del,
+	.mirror_add = ksz9477_port_mirror_add,
+	.mirror_del = ksz9477_port_mirror_del,
 	.shutdown = ksz9477_reset_switch,
 	.init = ksz9477_switch_init,
 	.exit = ksz9477_switch_exit,
diff --git a/drivers/net/dsa/microchip/ksz_common.c b/drivers/net/dsa/microchip/ksz_common.c
index 5db2b55152885..676669d353ea6 100644
--- a/drivers/net/dsa/microchip/ksz_common.c
+++ b/drivers/net/dsa/microchip/ksz_common.c
@@ -991,6 +991,29 @@ int ksz_port_vlan_del(struct dsa_switch *ds, int port,
 }
 EXPORT_SYMBOL_GPL(ksz_port_vlan_del);
 
+int ksz_port_mirror_add(struct dsa_switch *ds, int port,
+			struct dsa_mall_mirror_tc_entry *mirror,
+			bool ingress, struct netlink_ext_ack *extack)
+{
+	struct ksz_device *dev = ds->priv;
+
+	if (!dev->dev_ops->mirror_add)
+		return -EOPNOTSUPP;
+
+	return dev->dev_ops->mirror_add(dev, port, mirror, ingress, extack);
+}
+EXPORT_SYMBOL_GPL(ksz_port_mirror_add);
+
+void ksz_port_mirror_del(struct dsa_switch *ds, int port,
+			 struct dsa_mall_mirror_tc_entry *mirror)
+{
+	struct ksz_device *dev = ds->priv;
+
+	if (dev->dev_ops->mirror_del)
+		dev->dev_ops->mirror_del(dev, port, mirror);
+}
+EXPORT_SYMBOL_GPL(ksz_port_mirror_del);
+
 static int ksz_switch_detect(struct ksz_device *dev)
 {
 	u8 id1, id2;
diff --git a/drivers/net/dsa/microchip/ksz_common.h b/drivers/net/dsa/microchip/ksz_common.h
index 1baa270859aa2..c724cbb437e29 100644
--- a/drivers/net/dsa/microchip/ksz_common.h
+++ b/drivers/net/dsa/microchip/ksz_common.h
@@ -187,6 +187,11 @@ struct ksz_dev_ops {
 			 struct netlink_ext_ack *extack);
 	int  (*vlan_del)(struct ksz_device *dev, int port,
 			 const struct switchdev_obj_port_vlan *vlan);
+	int (*mirror_add)(struct ksz_device *dev, int port,
+			  struct dsa_mall_mirror_tc_entry *mirror,
+			  bool ingress, struct netlink_ext_ack *extack);
+	void (*mirror_del)(struct ksz_device *dev, int port,
+			   struct dsa_mall_mirror_tc_entry *mirror);
 	void (*freeze_mib)(struct ksz_device *dev, int port, bool freeze);
 	void (*port_init_cnt)(struct ksz_device *dev, int port);
 	int (*shutdown)(struct ksz_device *dev);
@@ -247,6 +252,11 @@ int ksz_port_vlan_add(struct dsa_switch *ds, int port,
 		      struct netlink_ext_ack *extack);
 int ksz_port_vlan_del(struct dsa_switch *ds, int port,
 		      const struct switchdev_obj_port_vlan *vlan);
+int ksz_port_mirror_add(struct dsa_switch *ds, int port,
+			struct dsa_mall_mirror_tc_entry *mirror,
+			bool ingress, struct netlink_ext_ack *extack);
+void ksz_port_mirror_del(struct dsa_switch *ds, int port,
+			 struct dsa_mall_mirror_tc_entry *mirror);
 
 /* Common register access functions */
 
-- 
2.35.1




^ permalink raw reply related	[flat|nested] 174+ messages in thread

* [PATCH 5.19 042/158] net: dsa: microchip: update the ksz_phylink_get_caps
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (40 preceding siblings ...)
  2022-08-29 10:58 ` [PATCH 5.19 041/158] net: dsa: microchip: move the port mirror " Greg Kroah-Hartman
@ 2022-08-29 10:58 ` Greg Kroah-Hartman
  2022-08-29 10:58 ` [PATCH 5.19 043/158] net: dsa: microchip: keep compatibility with device tree blobs with no phy-mode Greg Kroah-Hartman
                   ` (128 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:58 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Arun Ramadoss, Vladimir Oltean,
	Paolo Abeni, Sasha Levin

From: Arun Ramadoss <arun.ramadoss@microchip.com>

[ Upstream commit 7012033ce10e0968e6cb82709aa0ed7f2080b61e ]

This patch assigns the phylink_get_caps in ksz8795 and ksz9477 to
ksz_phylink_get_caps. And update their mac_capabilities in the
respective ksz_dev_ops.

Signed-off-by: Arun Ramadoss <arun.ramadoss@microchip.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/dsa/microchip/ksz8795.c    | 9 +++------
 drivers/net/dsa/microchip/ksz9477.c    | 7 +++----
 drivers/net/dsa/microchip/ksz_common.c | 3 +++
 drivers/net/dsa/microchip/ksz_common.h | 2 ++
 4 files changed, 11 insertions(+), 10 deletions(-)

diff --git a/drivers/net/dsa/microchip/ksz8795.c b/drivers/net/dsa/microchip/ksz8795.c
index 2e3d24a3260e1..c771797fd902f 100644
--- a/drivers/net/dsa/microchip/ksz8795.c
+++ b/drivers/net/dsa/microchip/ksz8795.c
@@ -1353,13 +1353,9 @@ static int ksz8_setup(struct dsa_switch *ds)
 	return ksz8_handle_global_errata(ds);
 }
 
-static void ksz8_get_caps(struct dsa_switch *ds, int port,
+static void ksz8_get_caps(struct ksz_device *dev, int port,
 			  struct phylink_config *config)
 {
-	struct ksz_device *dev = ds->priv;
-
-	ksz_phylink_get_caps(ds, port, config);
-
 	config->mac_capabilities = MAC_10 | MAC_100;
 
 	/* Silicon Errata Sheet (DS80000830A):
@@ -1381,7 +1377,7 @@ static const struct dsa_switch_ops ksz8_switch_ops = {
 	.setup			= ksz8_setup,
 	.phy_read		= ksz_phy_read16,
 	.phy_write		= ksz_phy_write16,
-	.phylink_get_caps	= ksz8_get_caps,
+	.phylink_get_caps	= ksz_phylink_get_caps,
 	.phylink_mac_link_down	= ksz_mac_link_down,
 	.port_enable		= ksz_enable_port,
 	.get_strings		= ksz_get_strings,
@@ -1463,6 +1459,7 @@ static const struct ksz_dev_ops ksz8_dev_ops = {
 	.vlan_del = ksz8_port_vlan_del,
 	.mirror_add = ksz8_port_mirror_add,
 	.mirror_del = ksz8_port_mirror_del,
+	.get_caps = ksz8_get_caps,
 	.shutdown = ksz8_reset_switch,
 	.init = ksz8_switch_init,
 	.exit = ksz8_switch_exit,
diff --git a/drivers/net/dsa/microchip/ksz9477.c b/drivers/net/dsa/microchip/ksz9477.c
index cd4a3088e9473..125124fdefbf4 100644
--- a/drivers/net/dsa/microchip/ksz9477.c
+++ b/drivers/net/dsa/microchip/ksz9477.c
@@ -1082,11 +1082,9 @@ static void ksz9477_phy_errata_setup(struct ksz_device *dev, int port)
 	ksz9477_port_mmd_write(dev, port, 0x1c, 0x20, 0xeeee);
 }
 
-static void ksz9477_get_caps(struct dsa_switch *ds, int port,
+static void ksz9477_get_caps(struct ksz_device *dev, int port,
 			     struct phylink_config *config)
 {
-	ksz_phylink_get_caps(ds, port, config);
-
 	config->mac_capabilities = MAC_10 | MAC_100 | MAC_1000FD |
 				   MAC_ASYM_PAUSE | MAC_SYM_PAUSE;
 }
@@ -1316,7 +1314,7 @@ static const struct dsa_switch_ops ksz9477_switch_ops = {
 	.phy_read		= ksz9477_phy_read16,
 	.phy_write		= ksz9477_phy_write16,
 	.phylink_mac_link_down	= ksz_mac_link_down,
-	.phylink_get_caps	= ksz9477_get_caps,
+	.phylink_get_caps	= ksz_phylink_get_caps,
 	.port_enable		= ksz_enable_port,
 	.get_strings		= ksz_get_strings,
 	.get_ethtool_stats	= ksz_get_ethtool_stats,
@@ -1412,6 +1410,7 @@ static const struct ksz_dev_ops ksz9477_dev_ops = {
 	.vlan_del = ksz9477_port_vlan_del,
 	.mirror_add = ksz9477_port_mirror_add,
 	.mirror_del = ksz9477_port_mirror_del,
+	.get_caps = ksz9477_get_caps,
 	.shutdown = ksz9477_reset_switch,
 	.init = ksz9477_switch_init,
 	.exit = ksz9477_switch_exit,
diff --git a/drivers/net/dsa/microchip/ksz_common.c b/drivers/net/dsa/microchip/ksz_common.c
index 676669d353ea6..0f02c62b02685 100644
--- a/drivers/net/dsa/microchip/ksz_common.c
+++ b/drivers/net/dsa/microchip/ksz_common.c
@@ -456,6 +456,9 @@ void ksz_phylink_get_caps(struct dsa_switch *ds, int port,
 	if (dev->info->internal_phy[port])
 		__set_bit(PHY_INTERFACE_MODE_INTERNAL,
 			  config->supported_interfaces);
+
+	if (dev->dev_ops->get_caps)
+		dev->dev_ops->get_caps(dev, port, config);
 }
 EXPORT_SYMBOL_GPL(ksz_phylink_get_caps);
 
diff --git a/drivers/net/dsa/microchip/ksz_common.h b/drivers/net/dsa/microchip/ksz_common.h
index c724cbb437e29..10f9ef2dbf1ca 100644
--- a/drivers/net/dsa/microchip/ksz_common.h
+++ b/drivers/net/dsa/microchip/ksz_common.h
@@ -192,6 +192,8 @@ struct ksz_dev_ops {
 			  bool ingress, struct netlink_ext_ack *extack);
 	void (*mirror_del)(struct ksz_device *dev, int port,
 			   struct dsa_mall_mirror_tc_entry *mirror);
+	void (*get_caps)(struct ksz_device *dev, int port,
+			 struct phylink_config *config);
 	void (*freeze_mib)(struct ksz_device *dev, int port, bool freeze);
 	void (*port_init_cnt)(struct ksz_device *dev, int port);
 	int (*shutdown)(struct ksz_device *dev);
-- 
2.35.1




^ permalink raw reply related	[flat|nested] 174+ messages in thread

* [PATCH 5.19 043/158] net: dsa: microchip: keep compatibility with device tree blobs with no phy-mode
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (41 preceding siblings ...)
  2022-08-29 10:58 ` [PATCH 5.19 042/158] net: dsa: microchip: update the ksz_phylink_get_caps Greg Kroah-Hartman
@ 2022-08-29 10:58 ` Greg Kroah-Hartman
  2022-08-29 10:58 ` [PATCH 5.19 044/158] net: ipa: dont assume SMEM is page-aligned Greg Kroah-Hartman
                   ` (127 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:58 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Craig McQueen, Vladimir Oltean,
	Alvin Šipraga, Rasmus Villemoes, Jakub Kicinski,
	Sasha Levin

From: Vladimir Oltean <vladimir.oltean@nxp.com>

[ Upstream commit 5fbb08eb7f945c7e8896ea39f03143ce66dfa4c7 ]

DSA has multiple ways of specifying a MAC connection to an internal PHY.
One requires a DT description like this:

	port@0 {
		reg = <0>;
		phy-handle = <&internal_phy>;
		phy-mode = "internal";
	};

(which is IMO the recommended approach, as it is the clearest
description)

but it is also possible to leave the specification as just:

	port@0 {
		reg = <0>;
	}

and if the driver implements ds->ops->phy_read and ds->ops->phy_write,
the DSA framework "knows" it should create a ds->slave_mii_bus, and it
should connect to a non-OF-based internal PHY on this MDIO bus, at an
MDIO address equal to the port address.

There is also an intermediary way of describing things:

	port@0 {
		reg = <0>;
		phy-handle = <&internal_phy>;
	};

In case 2, DSA calls phylink_connect_phy() and in case 3, it calls
phylink_of_phy_connect(). In both cases, phylink_create() has been
called with a phy_interface_t of PHY_INTERFACE_MODE_NA, and in both
cases, PHY_INTERFACE_MODE_NA is translated into phy->interface.

It is important to note that phy_device_create() initializes
dev->interface = PHY_INTERFACE_MODE_GMII, and so, when we use
phylink_create(PHY_INTERFACE_MODE_NA), no one will override this, and we
will end up with a PHY_INTERFACE_MODE_GMII interface inherited from the
PHY.

All this means that in order to maintain compatibility with device tree
blobs where the phy-mode property is missing, we need to allow the
"gmii" phy-mode and treat it as "internal".

Fixes: 2c709e0bdad4 ("net: dsa: microchip: ksz8795: add phylink support")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=216320
Reported-by: Craig McQueen <craig@mcqueen.id.au>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Alvin Šipraga <alsi@bang-olufsen.dk>
Tested-by: Rasmus Villemoes <rasmus.villemoes@prevas.dk>
Link: https://lore.kernel.org/r/20220818143250.2797111-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/dsa/microchip/ksz_common.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/net/dsa/microchip/ksz_common.c b/drivers/net/dsa/microchip/ksz_common.c
index 0f02c62b02685..c9389880ad1fa 100644
--- a/drivers/net/dsa/microchip/ksz_common.c
+++ b/drivers/net/dsa/microchip/ksz_common.c
@@ -453,9 +453,15 @@ void ksz_phylink_get_caps(struct dsa_switch *ds, int port,
 	if (dev->info->supports_rgmii[port])
 		phy_interface_set_rgmii(config->supported_interfaces);
 
-	if (dev->info->internal_phy[port])
+	if (dev->info->internal_phy[port]) {
 		__set_bit(PHY_INTERFACE_MODE_INTERNAL,
 			  config->supported_interfaces);
+		/* Compatibility for phylib's default interface type when the
+		 * phy-mode property is absent
+		 */
+		__set_bit(PHY_INTERFACE_MODE_GMII,
+			  config->supported_interfaces);
+	}
 
 	if (dev->dev_ops->get_caps)
 		dev->dev_ops->get_caps(dev, port, config);
-- 
2.35.1




^ permalink raw reply related	[flat|nested] 174+ messages in thread

* [PATCH 5.19 044/158] net: ipa: dont assume SMEM is page-aligned
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (42 preceding siblings ...)
  2022-08-29 10:58 ` [PATCH 5.19 043/158] net: dsa: microchip: keep compatibility with device tree blobs with no phy-mode Greg Kroah-Hartman
@ 2022-08-29 10:58 ` Greg Kroah-Hartman
  2022-08-29 10:58 ` [PATCH 5.19 045/158] net: phy: Dont WARN for PHY_READY state in mdio_bus_phy_resume() Greg Kroah-Hartman
                   ` (126 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:58 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Alex Elder, Jakub Kicinski, Sasha Levin

From: Alex Elder <elder@linaro.org>

[ Upstream commit b8d4380365c515d8e0351f2f46d371738dd19be1 ]

In ipa_smem_init(), a Qualcomm SMEM region is allocated (if needed)
and then its virtual address is fetched using qcom_smem_get().  The
physical address associated with that region is also fetched.

The physical address is adjusted so that it is page-aligned, and an
attempt is made to update the size of the region to compensate for
any non-zero adjustment.

But that adjustment isn't done properly.  The physical address is
aligned twice, and as a result the size is never actually adjusted.

Fix this by *not* aligning the "addr" local variable, and instead
making the "phys" local variable be the adjusted "addr" value.

Fixes: a0036bb413d5b ("net: ipa: define SMEM memory region for IPA")
Signed-off-by: Alex Elder <elder@linaro.org>
Link: https://lore.kernel.org/r/20220818134206.567618-1-elder@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/ipa/ipa_mem.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ipa/ipa_mem.c b/drivers/net/ipa/ipa_mem.c
index 1e9eae208e44f..53a1dbeaffa6d 100644
--- a/drivers/net/ipa/ipa_mem.c
+++ b/drivers/net/ipa/ipa_mem.c
@@ -568,7 +568,7 @@ static int ipa_smem_init(struct ipa *ipa, u32 item, size_t size)
 	}
 
 	/* Align the address down and the size up to a page boundary */
-	addr = qcom_smem_virt_to_phys(virt) & PAGE_MASK;
+	addr = qcom_smem_virt_to_phys(virt);
 	phys = addr & PAGE_MASK;
 	size = PAGE_ALIGN(size + addr - phys);
 	iova = phys;	/* We just want a direct mapping */
-- 
2.35.1




^ permalink raw reply related	[flat|nested] 174+ messages in thread

* [PATCH 5.19 045/158] net: phy: Dont WARN for PHY_READY state in mdio_bus_phy_resume()
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (43 preceding siblings ...)
  2022-08-29 10:58 ` [PATCH 5.19 044/158] net: ipa: dont assume SMEM is page-aligned Greg Kroah-Hartman
@ 2022-08-29 10:58 ` Greg Kroah-Hartman
  2022-08-29 10:58 ` [PATCH 5.19 046/158] net: moxa: get rid of asymmetry in DMA mapping/unmapping Greg Kroah-Hartman
                   ` (125 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:58 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Xiaolei Wang, Florian Fainelli,
	Jakub Kicinski, Sasha Levin

From: Xiaolei Wang <xiaolei.wang@windriver.com>

[ Upstream commit 6dbe852c379ff032a70a6b13a91914918c82cb07 ]

For some MAC drivers, they set the mac_managed_pm to true in its
->ndo_open() callback. So before the mac_managed_pm is set to true,
we still want to leverage the mdio_bus_phy_suspend()/resume() for
the phy device suspend and resume. In this case, the phy device is
in PHY_READY, and we shouldn't warn about this. It also seems that
the check of mac_managed_pm in WARN_ON is redundant since we already
check this in the entry of mdio_bus_phy_resume(), so drop it.

Fixes: 744d23c71af3 ("net: phy: Warn about incorrect mdio_bus_phy_resume() state")
Signed-off-by: Xiaolei Wang <xiaolei.wang@windriver.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20220819082451.1992102-1-xiaolei.wang@windriver.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/phy/phy_device.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index 608de5a94165f..f90a21781d8d6 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -316,11 +316,11 @@ static __maybe_unused int mdio_bus_phy_resume(struct device *dev)
 
 	phydev->suspended_by_mdio_bus = 0;
 
-	/* If we managed to get here with the PHY state machine in a state other
-	 * than PHY_HALTED this is an indication that something went wrong and
-	 * we should most likely be using MAC managed PM and we are not.
+	/* If we manged to get here with the PHY state machine in a state neither
+	 * PHY_HALTED nor PHY_READY this is an indication that something went wrong
+	 * and we should most likely be using MAC managed PM and we are not.
 	 */
-	WARN_ON(phydev->state != PHY_HALTED && !phydev->mac_managed_pm);
+	WARN_ON(phydev->state != PHY_HALTED && phydev->state != PHY_READY);
 
 	ret = phy_init_hw(phydev);
 	if (ret < 0)
-- 
2.35.1




^ permalink raw reply related	[flat|nested] 174+ messages in thread

* [PATCH 5.19 046/158] net: moxa: get rid of asymmetry in DMA mapping/unmapping
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (44 preceding siblings ...)
  2022-08-29 10:58 ` [PATCH 5.19 045/158] net: phy: Dont WARN for PHY_READY state in mdio_bus_phy_resume() Greg Kroah-Hartman
@ 2022-08-29 10:58 ` Greg Kroah-Hartman
  2022-08-29 10:58 ` [PATCH 5.19 047/158] bonding: 802.3ad: fix no transmission of LACPDUs Greg Kroah-Hartman
                   ` (124 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:58 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Sergei Antonov, Andrew Lunn,
	Jakub Kicinski, Sasha Levin

From: Sergei Antonov <saproj@gmail.com>

[ Upstream commit 0ee7828dfc56e97d71e51e6374dc7b4eb2b6e081 ]

Since priv->rx_mapping[i] is maped in moxart_mac_open(), we
should unmap it from moxart_mac_stop(). Fixes 2 warnings.

1. During error unwinding in moxart_mac_probe(): "goto init_fail;",
then moxart_mac_free_memory() calls dma_unmap_single() with
priv->rx_mapping[i] pointers zeroed.

WARNING: CPU: 0 PID: 1 at kernel/dma/debug.c:963 check_unmap+0x704/0x980
DMA-API: moxart-ethernet 92000000.mac: device driver tries to free DMA memory it has not allocated [device address=0x0000000000000000] [size=1600 bytes]
CPU: 0 PID: 1 Comm: swapper Not tainted 5.19.0+ #60
Hardware name: Generic DT based system
 unwind_backtrace from show_stack+0x10/0x14
 show_stack from dump_stack_lvl+0x34/0x44
 dump_stack_lvl from __warn+0xbc/0x1f0
 __warn from warn_slowpath_fmt+0x94/0xc8
 warn_slowpath_fmt from check_unmap+0x704/0x980
 check_unmap from debug_dma_unmap_page+0x8c/0x9c
 debug_dma_unmap_page from moxart_mac_free_memory+0x3c/0xa8
 moxart_mac_free_memory from moxart_mac_probe+0x190/0x218
 moxart_mac_probe from platform_probe+0x48/0x88
 platform_probe from really_probe+0xc0/0x2e4

2. After commands:
 ip link set dev eth0 down
 ip link set dev eth0 up

WARNING: CPU: 0 PID: 55 at kernel/dma/debug.c:570 add_dma_entry+0x204/0x2ec
DMA-API: moxart-ethernet 92000000.mac: cacheline tracking EEXIST, overlapping mappings aren't supported
CPU: 0 PID: 55 Comm: ip Not tainted 5.19.0+ #57
Hardware name: Generic DT based system
 unwind_backtrace from show_stack+0x10/0x14
 show_stack from dump_stack_lvl+0x34/0x44
 dump_stack_lvl from __warn+0xbc/0x1f0
 __warn from warn_slowpath_fmt+0x94/0xc8
 warn_slowpath_fmt from add_dma_entry+0x204/0x2ec
 add_dma_entry from dma_map_page_attrs+0x110/0x328
 dma_map_page_attrs from moxart_mac_open+0x134/0x320
 moxart_mac_open from __dev_open+0x11c/0x1ec
 __dev_open from __dev_change_flags+0x194/0x22c
 __dev_change_flags from dev_change_flags+0x14/0x44
 dev_change_flags from devinet_ioctl+0x6d4/0x93c
 devinet_ioctl from inet_ioctl+0x1ac/0x25c

v1 -> v2:
Extraneous change removed.

Fixes: 6c821bd9edc9 ("net: Add MOXA ART SoCs ethernet driver")
Signed-off-by: Sergei Antonov <saproj@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20220819110519.1230877-1-saproj@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/ethernet/moxa/moxart_ether.c | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/moxa/moxart_ether.c b/drivers/net/ethernet/moxa/moxart_ether.c
index f11f1cb92025f..3b6beb96ca856 100644
--- a/drivers/net/ethernet/moxa/moxart_ether.c
+++ b/drivers/net/ethernet/moxa/moxart_ether.c
@@ -74,11 +74,6 @@ static int moxart_set_mac_address(struct net_device *ndev, void *addr)
 static void moxart_mac_free_memory(struct net_device *ndev)
 {
 	struct moxart_mac_priv_t *priv = netdev_priv(ndev);
-	int i;
-
-	for (i = 0; i < RX_DESC_NUM; i++)
-		dma_unmap_single(&priv->pdev->dev, priv->rx_mapping[i],
-				 priv->rx_buf_size, DMA_FROM_DEVICE);
 
 	if (priv->tx_desc_base)
 		dma_free_coherent(&priv->pdev->dev,
@@ -193,6 +188,7 @@ static int moxart_mac_open(struct net_device *ndev)
 static int moxart_mac_stop(struct net_device *ndev)
 {
 	struct moxart_mac_priv_t *priv = netdev_priv(ndev);
+	int i;
 
 	napi_disable(&priv->napi);
 
@@ -204,6 +200,11 @@ static int moxart_mac_stop(struct net_device *ndev)
 	/* disable all functions */
 	writel(0, priv->base + REG_MAC_CTRL);
 
+	/* unmap areas mapped in moxart_mac_setup_desc_ring() */
+	for (i = 0; i < RX_DESC_NUM; i++)
+		dma_unmap_single(&priv->pdev->dev, priv->rx_mapping[i],
+				 priv->rx_buf_size, DMA_FROM_DEVICE);
+
 	return 0;
 }
 
-- 
2.35.1




^ permalink raw reply related	[flat|nested] 174+ messages in thread

* [PATCH 5.19 047/158] bonding: 802.3ad: fix no transmission of LACPDUs
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (45 preceding siblings ...)
  2022-08-29 10:58 ` [PATCH 5.19 046/158] net: moxa: get rid of asymmetry in DMA mapping/unmapping Greg Kroah-Hartman
@ 2022-08-29 10:58 ` Greg Kroah-Hartman
  2022-08-29 10:58 ` [PATCH 5.19 048/158] net: ipvtap - add __init/__exit annotations to module init/exit funcs Greg Kroah-Hartman
                   ` (123 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:58 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Jonathan Toppins, Jay Vosburgh,
	Jakub Kicinski, Sasha Levin

From: Jonathan Toppins <jtoppins@redhat.com>

[ Upstream commit d745b5062ad2b5da90a5e728d7ca884fc07315fd ]

This is caused by the global variable ad_ticks_per_sec being zero as
demonstrated by the reproducer script discussed below. This causes
all timer values in __ad_timer_to_ticks to be zero, resulting
in the periodic timer to never fire.

To reproduce:
Run the script in
`tools/testing/selftests/drivers/net/bonding/bond-break-lacpdu-tx.sh` which
puts bonding into a state where it never transmits LACPDUs.

line 44: ip link add fbond type bond mode 4 miimon 200 \
            xmit_hash_policy 1 ad_actor_sys_prio 65535 lacp_rate fast
setting bond param: ad_actor_sys_prio
given:
    params.ad_actor_system = 0
call stack:
    bond_option_ad_actor_sys_prio()
    -> bond_3ad_update_ad_actor_settings()
       -> set ad.system.sys_priority = bond->params.ad_actor_sys_prio
       -> ad.system.sys_mac_addr = bond->dev->dev_addr; because
            params.ad_actor_system == 0
results:
     ad.system.sys_mac_addr = bond->dev->dev_addr

line 48: ip link set fbond address 52:54:00:3B:7C:A6
setting bond MAC addr
call stack:
    bond->dev->dev_addr = new_mac

line 52: ip link set fbond type bond ad_actor_sys_prio 65535
setting bond param: ad_actor_sys_prio
given:
    params.ad_actor_system = 0
call stack:
    bond_option_ad_actor_sys_prio()
    -> bond_3ad_update_ad_actor_settings()
       -> set ad.system.sys_priority = bond->params.ad_actor_sys_prio
       -> ad.system.sys_mac_addr = bond->dev->dev_addr; because
            params.ad_actor_system == 0
results:
     ad.system.sys_mac_addr = bond->dev->dev_addr

line 60: ip link set veth1-bond down master fbond
given:
    params.ad_actor_system = 0
    params.mode = BOND_MODE_8023AD
    ad.system.sys_mac_addr == bond->dev->dev_addr
call stack:
    bond_enslave
    -> bond_3ad_initialize(); because first slave
       -> if ad.system.sys_mac_addr != bond->dev->dev_addr
          return
results:
     Nothing is run in bond_3ad_initialize() because dev_addr equals
     sys_mac_addr leaving the global ad_ticks_per_sec zero as it is
     never initialized anywhere else.

The if check around the contents of bond_3ad_initialize() is no longer
needed due to commit 5ee14e6d336f ("bonding: 3ad: apply ad_actor settings
changes immediately") which sets ad.system.sys_mac_addr if any one of
the bonding parameters whos set function calls
bond_3ad_update_ad_actor_settings(). This is because if
ad.system.sys_mac_addr is zero it will be set to the current bond mac
address, this causes the if check to never be true.

Fixes: 5ee14e6d336f ("bonding: 3ad: apply ad_actor settings changes immediately")
Signed-off-by: Jonathan Toppins <jtoppins@redhat.com>
Acked-by: Jay Vosburgh <jay.vosburgh@canonical.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/bonding/bond_3ad.c | 38 ++++++++++++++--------------------
 1 file changed, 16 insertions(+), 22 deletions(-)

diff --git a/drivers/net/bonding/bond_3ad.c b/drivers/net/bonding/bond_3ad.c
index d7fb33c078e81..1f0120cbe9e80 100644
--- a/drivers/net/bonding/bond_3ad.c
+++ b/drivers/net/bonding/bond_3ad.c
@@ -2007,30 +2007,24 @@ void bond_3ad_initiate_agg_selection(struct bonding *bond, int timeout)
  */
 void bond_3ad_initialize(struct bonding *bond, u16 tick_resolution)
 {
-	/* check that the bond is not initialized yet */
-	if (!MAC_ADDRESS_EQUAL(&(BOND_AD_INFO(bond).system.sys_mac_addr),
-				bond->dev->dev_addr)) {
-
-		BOND_AD_INFO(bond).aggregator_identifier = 0;
-
-		BOND_AD_INFO(bond).system.sys_priority =
-			bond->params.ad_actor_sys_prio;
-		if (is_zero_ether_addr(bond->params.ad_actor_system))
-			BOND_AD_INFO(bond).system.sys_mac_addr =
-			    *((struct mac_addr *)bond->dev->dev_addr);
-		else
-			BOND_AD_INFO(bond).system.sys_mac_addr =
-			    *((struct mac_addr *)bond->params.ad_actor_system);
+	BOND_AD_INFO(bond).aggregator_identifier = 0;
+	BOND_AD_INFO(bond).system.sys_priority =
+		bond->params.ad_actor_sys_prio;
+	if (is_zero_ether_addr(bond->params.ad_actor_system))
+		BOND_AD_INFO(bond).system.sys_mac_addr =
+		    *((struct mac_addr *)bond->dev->dev_addr);
+	else
+		BOND_AD_INFO(bond).system.sys_mac_addr =
+		    *((struct mac_addr *)bond->params.ad_actor_system);
 
-		/* initialize how many times this module is called in one
-		 * second (should be about every 100ms)
-		 */
-		ad_ticks_per_sec = tick_resolution;
+	/* initialize how many times this module is called in one
+	 * second (should be about every 100ms)
+	 */
+	ad_ticks_per_sec = tick_resolution;
 
-		bond_3ad_initiate_agg_selection(bond,
-						AD_AGGREGATOR_SELECTION_TIMER *
-						ad_ticks_per_sec);
-	}
+	bond_3ad_initiate_agg_selection(bond,
+					AD_AGGREGATOR_SELECTION_TIMER *
+					ad_ticks_per_sec);
 }
 
 /**
-- 
2.35.1




^ permalink raw reply related	[flat|nested] 174+ messages in thread

* [PATCH 5.19 048/158] net: ipvtap - add __init/__exit annotations to module init/exit funcs
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (46 preceding siblings ...)
  2022-08-29 10:58 ` [PATCH 5.19 047/158] bonding: 802.3ad: fix no transmission of LACPDUs Greg Kroah-Hartman
@ 2022-08-29 10:58 ` Greg Kroah-Hartman
  2022-08-29 10:58 ` [PATCH 5.19 049/158] netfilter: ebtables: reject blobs that dont provide all entry points Greg Kroah-Hartman
                   ` (122 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:58 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Mahesh Bandewar, Sainath Grandhi,
	Maciej Żenczykowski, Paolo Abeni, Sasha Levin

From: Maciej Żenczykowski <maze@google.com>

[ Upstream commit 4b2e3a17e9f279325712b79fb01d1493f9e3e005 ]

Looks to have been left out in an oversight.

Cc: Mahesh Bandewar <maheshb@google.com>
Cc: Sainath Grandhi <sainath.grandhi@intel.com>
Fixes: 235a9d89da97 ('ipvtap: IP-VLAN based tap driver')
Signed-off-by: Maciej Żenczykowski <maze@google.com>
Link: https://lore.kernel.org/r/20220821130808.12143-1-zenczykowski@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/ipvlan/ipvtap.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ipvlan/ipvtap.c b/drivers/net/ipvlan/ipvtap.c
index ef02f2cf5ce13..cbabca167a078 100644
--- a/drivers/net/ipvlan/ipvtap.c
+++ b/drivers/net/ipvlan/ipvtap.c
@@ -194,7 +194,7 @@ static struct notifier_block ipvtap_notifier_block __read_mostly = {
 	.notifier_call	= ipvtap_device_event,
 };
 
-static int ipvtap_init(void)
+static int __init ipvtap_init(void)
 {
 	int err;
 
@@ -228,7 +228,7 @@ static int ipvtap_init(void)
 }
 module_init(ipvtap_init);
 
-static void ipvtap_exit(void)
+static void __exit ipvtap_exit(void)
 {
 	rtnl_link_unregister(&ipvtap_link_ops);
 	unregister_netdevice_notifier(&ipvtap_notifier_block);
-- 
2.35.1




^ permalink raw reply related	[flat|nested] 174+ messages in thread

* [PATCH 5.19 049/158] netfilter: ebtables: reject blobs that dont provide all entry points
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (47 preceding siblings ...)
  2022-08-29 10:58 ` [PATCH 5.19 048/158] net: ipvtap - add __init/__exit annotations to module init/exit funcs Greg Kroah-Hartman
@ 2022-08-29 10:58 ` Greg Kroah-Hartman
  2022-08-29 10:58 ` [PATCH 5.19 050/158] netfilter: nft_tproxy: restrict to prerouting hook Greg Kroah-Hartman
                   ` (121 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:58 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Harshit Mogalapalli, syzkaller,
	Florian Westphal, Sasha Levin

From: Florian Westphal <fw@strlen.de>

[ Upstream commit 7997eff82828304b780dc0a39707e1946d6f1ebf ]

Harshit Mogalapalli says:
 In ebt_do_table() function dereferencing 'private->hook_entry[hook]'
 can lead to NULL pointer dereference. [..] Kernel panic:

general protection fault, probably for non-canonical address 0xdffffc0000000005: 0000 [#1] PREEMPT SMP KASAN
KASAN: null-ptr-deref in range [0x0000000000000028-0x000000000000002f]
[..]
RIP: 0010:ebt_do_table+0x1dc/0x1ce0
Code: 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 5c 16 00 00 48 b8 00 00 00 00 00 fc ff df 49 8b 6c df 08 48 8d 7d 2c 48 89 fa 48 c1 ea 03 <0f> b6 14 02 48 89 f8 83 e0 07 83 c0 03 38 d0 7c 08 84 d2 0f 85 88
[..]
Call Trace:
 nf_hook_slow+0xb1/0x170
 __br_forward+0x289/0x730
 maybe_deliver+0x24b/0x380
 br_flood+0xc6/0x390
 br_dev_xmit+0xa2e/0x12c0

For some reason ebtables rejects blobs that provide entry points that are
not supported by the table, but what it should instead reject is the
opposite: blobs that DO NOT provide an entry point supported by the table.

t->valid_hooks is the bitmask of hooks (input, forward ...) that will see
packets.  Providing an entry point that is not support is harmless
(never called/used), but the inverse isn't: it results in a crash
because the ebtables traverser doesn't expect a NULL blob for a location
its receiving packets for.

Instead of fixing all the individual checks, do what iptables is doing and
reject all blobs that differ from the expected hooks.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Reported-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com>
Reported-by: syzkaller <syzkaller@googlegroups.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 include/linux/netfilter_bridge/ebtables.h | 4 ----
 net/bridge/netfilter/ebtable_broute.c     | 8 --------
 net/bridge/netfilter/ebtable_filter.c     | 8 --------
 net/bridge/netfilter/ebtable_nat.c        | 8 --------
 net/bridge/netfilter/ebtables.c           | 8 +-------
 5 files changed, 1 insertion(+), 35 deletions(-)

diff --git a/include/linux/netfilter_bridge/ebtables.h b/include/linux/netfilter_bridge/ebtables.h
index a13296d6c7ceb..fd533552a062c 100644
--- a/include/linux/netfilter_bridge/ebtables.h
+++ b/include/linux/netfilter_bridge/ebtables.h
@@ -94,10 +94,6 @@ struct ebt_table {
 	struct ebt_replace_kernel *table;
 	unsigned int valid_hooks;
 	rwlock_t lock;
-	/* e.g. could be the table explicitly only allows certain
-	 * matches, targets, ... 0 == let it in */
-	int (*check)(const struct ebt_table_info *info,
-	   unsigned int valid_hooks);
 	/* the data used by the kernel */
 	struct ebt_table_info *private;
 	struct nf_hook_ops *ops;
diff --git a/net/bridge/netfilter/ebtable_broute.c b/net/bridge/netfilter/ebtable_broute.c
index 1a11064f99907..8f19253024b0a 100644
--- a/net/bridge/netfilter/ebtable_broute.c
+++ b/net/bridge/netfilter/ebtable_broute.c
@@ -36,18 +36,10 @@ static struct ebt_replace_kernel initial_table = {
 	.entries	= (char *)&initial_chain,
 };
 
-static int check(const struct ebt_table_info *info, unsigned int valid_hooks)
-{
-	if (valid_hooks & ~(1 << NF_BR_BROUTING))
-		return -EINVAL;
-	return 0;
-}
-
 static const struct ebt_table broute_table = {
 	.name		= "broute",
 	.table		= &initial_table,
 	.valid_hooks	= 1 << NF_BR_BROUTING,
-	.check		= check,
 	.me		= THIS_MODULE,
 };
 
diff --git a/net/bridge/netfilter/ebtable_filter.c b/net/bridge/netfilter/ebtable_filter.c
index cb949436bc0e3..278f324e67524 100644
--- a/net/bridge/netfilter/ebtable_filter.c
+++ b/net/bridge/netfilter/ebtable_filter.c
@@ -43,18 +43,10 @@ static struct ebt_replace_kernel initial_table = {
 	.entries	= (char *)initial_chains,
 };
 
-static int check(const struct ebt_table_info *info, unsigned int valid_hooks)
-{
-	if (valid_hooks & ~FILTER_VALID_HOOKS)
-		return -EINVAL;
-	return 0;
-}
-
 static const struct ebt_table frame_filter = {
 	.name		= "filter",
 	.table		= &initial_table,
 	.valid_hooks	= FILTER_VALID_HOOKS,
-	.check		= check,
 	.me		= THIS_MODULE,
 };
 
diff --git a/net/bridge/netfilter/ebtable_nat.c b/net/bridge/netfilter/ebtable_nat.c
index 5ee0531ae5061..9066f7f376d57 100644
--- a/net/bridge/netfilter/ebtable_nat.c
+++ b/net/bridge/netfilter/ebtable_nat.c
@@ -43,18 +43,10 @@ static struct ebt_replace_kernel initial_table = {
 	.entries	= (char *)initial_chains,
 };
 
-static int check(const struct ebt_table_info *info, unsigned int valid_hooks)
-{
-	if (valid_hooks & ~NAT_VALID_HOOKS)
-		return -EINVAL;
-	return 0;
-}
-
 static const struct ebt_table frame_nat = {
 	.name		= "nat",
 	.table		= &initial_table,
 	.valid_hooks	= NAT_VALID_HOOKS,
-	.check		= check,
 	.me		= THIS_MODULE,
 };
 
diff --git a/net/bridge/netfilter/ebtables.c b/net/bridge/netfilter/ebtables.c
index f2dbefb61ce83..9a0ae59cdc500 100644
--- a/net/bridge/netfilter/ebtables.c
+++ b/net/bridge/netfilter/ebtables.c
@@ -1040,8 +1040,7 @@ static int do_replace_finish(struct net *net, struct ebt_replace *repl,
 		goto free_iterate;
 	}
 
-	/* the table doesn't like it */
-	if (t->check && (ret = t->check(newinfo, repl->valid_hooks)))
+	if (repl->valid_hooks != t->valid_hooks)
 		goto free_unlock;
 
 	if (repl->num_counters && repl->num_counters != t->private->nentries) {
@@ -1231,11 +1230,6 @@ int ebt_register_table(struct net *net, const struct ebt_table *input_table,
 	if (ret != 0)
 		goto free_chainstack;
 
-	if (table->check && table->check(newinfo, table->valid_hooks)) {
-		ret = -EINVAL;
-		goto free_chainstack;
-	}
-
 	table->private = newinfo;
 	rwlock_init(&table->lock);
 	mutex_lock(&ebt_mutex);
-- 
2.35.1




^ permalink raw reply related	[flat|nested] 174+ messages in thread

* [PATCH 5.19 050/158] netfilter: nft_tproxy: restrict to prerouting hook
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (48 preceding siblings ...)
  2022-08-29 10:58 ` [PATCH 5.19 049/158] netfilter: ebtables: reject blobs that dont provide all entry points Greg Kroah-Hartman
@ 2022-08-29 10:58 ` Greg Kroah-Hartman
  2022-08-29 10:58 ` [PATCH 5.19 051/158] bnxt_en: Use PAGE_SIZE to init buffer when multi buffer XDP is not in use Greg Kroah-Hartman
                   ` (120 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:58 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Shell Chen, Florian Westphal, Sasha Levin

From: Florian Westphal <fw@strlen.de>

[ Upstream commit 18bbc3213383a82b05383827f4b1b882e3f0a5a5 ]

TPROXY is only allowed from prerouting, but nft_tproxy doesn't check this.
This fixes a crash (null dereference) when using tproxy from e.g. output.

Fixes: 4ed8eb6570a4 ("netfilter: nf_tables: Add native tproxy support")
Reported-by: Shell Chen <xierch@gmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 net/netfilter/nft_tproxy.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/net/netfilter/nft_tproxy.c b/net/netfilter/nft_tproxy.c
index 801f013971dfa..a701ad64f10af 100644
--- a/net/netfilter/nft_tproxy.c
+++ b/net/netfilter/nft_tproxy.c
@@ -312,6 +312,13 @@ static int nft_tproxy_dump(struct sk_buff *skb,
 	return 0;
 }
 
+static int nft_tproxy_validate(const struct nft_ctx *ctx,
+			       const struct nft_expr *expr,
+			       const struct nft_data **data)
+{
+	return nft_chain_validate_hooks(ctx->chain, 1 << NF_INET_PRE_ROUTING);
+}
+
 static struct nft_expr_type nft_tproxy_type;
 static const struct nft_expr_ops nft_tproxy_ops = {
 	.type		= &nft_tproxy_type,
@@ -321,6 +328,7 @@ static const struct nft_expr_ops nft_tproxy_ops = {
 	.destroy	= nft_tproxy_destroy,
 	.dump		= nft_tproxy_dump,
 	.reduce		= NFT_REDUCE_READONLY,
+	.validate	= nft_tproxy_validate,
 };
 
 static struct nft_expr_type nft_tproxy_type __read_mostly = {
-- 
2.35.1




^ permalink raw reply related	[flat|nested] 174+ messages in thread

* [PATCH 5.19 051/158] bnxt_en: Use PAGE_SIZE to init buffer when multi buffer XDP is not in use
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (49 preceding siblings ...)
  2022-08-29 10:58 ` [PATCH 5.19 050/158] netfilter: nft_tproxy: restrict to prerouting hook Greg Kroah-Hartman
@ 2022-08-29 10:58 ` Greg Kroah-Hartman
  2022-08-29 10:58 ` [PATCH 5.19 052/158] bnxt_en: set missing reload flag in devlink features Greg Kroah-Hartman
                   ` (119 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:58 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Somnath Kotur, Pavan Chebbi,
	Michael Chan, Jakub Kicinski, Sasha Levin

From: Pavan Chebbi <pavan.chebbi@broadcom.com>

[ Upstream commit 7dd3de7cb1d657a918c6b2bc673c71e318aa0c05 ]

Using BNXT_PAGE_MODE_BUF_SIZE + offset as buffer length value is not
sufficient when running single buffer XDP programs doing redirect
operations. The stack will complain on missing skb tail room. Fix it
by using PAGE_SIZE when calling xdp_init_buff() for single buffer
programs.

Fixes: b231c3f3414c ("bnxt: refactor bnxt_rx_xdp to separate xdp_init_buff/xdp_prepare_buff")
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.h     |  1 +
 drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c | 10 ++++++++--
 2 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
index 075c6206325ce..b1b17f9113006 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
@@ -2130,6 +2130,7 @@ struct bnxt {
 #define BNXT_DUMP_CRASH		1
 
 	struct bpf_prog		*xdp_prog;
+	u8			xdp_has_frags;
 
 	struct bnxt_ptp_cfg	*ptp_cfg;
 	u8			ptp_all_rx_tstamp;
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c
index f53387ed0167b..c3065ec0a4798 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c
@@ -181,6 +181,7 @@ void bnxt_xdp_buff_init(struct bnxt *bp, struct bnxt_rx_ring_info *rxr,
 			struct xdp_buff *xdp)
 {
 	struct bnxt_sw_rx_bd *rx_buf;
+	u32 buflen = PAGE_SIZE;
 	struct pci_dev *pdev;
 	dma_addr_t mapping;
 	u32 offset;
@@ -192,7 +193,10 @@ void bnxt_xdp_buff_init(struct bnxt *bp, struct bnxt_rx_ring_info *rxr,
 	mapping = rx_buf->mapping - bp->rx_dma_offset;
 	dma_sync_single_for_cpu(&pdev->dev, mapping + offset, *len, bp->rx_dir);
 
-	xdp_init_buff(xdp, BNXT_PAGE_MODE_BUF_SIZE + offset, &rxr->xdp_rxq);
+	if (bp->xdp_has_frags)
+		buflen = BNXT_PAGE_MODE_BUF_SIZE + offset;
+
+	xdp_init_buff(xdp, buflen, &rxr->xdp_rxq);
 	xdp_prepare_buff(xdp, *data_ptr - offset, offset, *len, false);
 }
 
@@ -397,8 +401,10 @@ static int bnxt_xdp_set(struct bnxt *bp, struct bpf_prog *prog)
 		netdev_warn(dev, "ethtool rx/tx channels must be combined to support XDP.\n");
 		return -EOPNOTSUPP;
 	}
-	if (prog)
+	if (prog) {
 		tx_xdp = bp->rx_nr_rings;
+		bp->xdp_has_frags = prog->aux->xdp_has_frags;
+	}
 
 	tc = netdev_get_num_tc(dev);
 	if (!tc)
-- 
2.35.1




^ permalink raw reply related	[flat|nested] 174+ messages in thread

* [PATCH 5.19 052/158] bnxt_en: set missing reload flag in devlink features
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (50 preceding siblings ...)
  2022-08-29 10:58 ` [PATCH 5.19 051/158] bnxt_en: Use PAGE_SIZE to init buffer when multi buffer XDP is not in use Greg Kroah-Hartman
@ 2022-08-29 10:58 ` Greg Kroah-Hartman
  2022-08-29 10:58 ` [PATCH 5.19 053/158] bnxt_en: fix NQ resource accounting during vf creation on 57500 chips Greg Kroah-Hartman
                   ` (118 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:58 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Somnath Kotur, Vikas Gupta,
	Michael Chan, Jakub Kicinski, Sasha Levin

From: Vikas Gupta <vikas.gupta@broadcom.com>

[ Upstream commit 574b2bb9692fd3d45ed631ac447176d4679f3010 ]

Add missing devlink_set_features() API for callbacks reload_down
and reload_up to function.

Fixes: 228ea8c187d8 ("bnxt_en: implement devlink dev reload driver_reinit")
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Vikas Gupta <vikas.gupta@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c
index 6b3d4f4c2a75f..d83be40785b89 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c
@@ -1246,6 +1246,7 @@ int bnxt_dl_register(struct bnxt *bp)
 	if (rc)
 		goto err_dl_port_unreg;
 
+	devlink_set_features(dl, DEVLINK_F_RELOAD);
 out:
 	devlink_register(dl);
 	return 0;
-- 
2.35.1




^ permalink raw reply related	[flat|nested] 174+ messages in thread

* [PATCH 5.19 053/158] bnxt_en: fix NQ resource accounting during vf creation on 57500 chips
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (51 preceding siblings ...)
  2022-08-29 10:58 ` [PATCH 5.19 052/158] bnxt_en: set missing reload flag in devlink features Greg Kroah-Hartman
@ 2022-08-29 10:58 ` Greg Kroah-Hartman
  2022-08-29 10:58 ` [PATCH 5.19 054/158] bnxt_en: fix LRO/GRO_HW features in ndo_fix_features callback Greg Kroah-Hartman
                   ` (117 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:58 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Vikas Gupta, Michael Chan,
	Jakub Kicinski, Sasha Levin

From: Vikas Gupta <vikas.gupta@broadcom.com>

[ Upstream commit 09a89cc59ad67794a11e1d3dd13c5b3172adcc51 ]

There are 2 issues:

1. We should decrement hw_resc->max_nqs instead of hw_resc->max_irqs
   with the number of NQs assigned to the VFs.  The IRQs are fixed
   on each function and cannot be re-assigned.  Only the NQs are being
   assigned to the VFs.

2. vf_msix is the total number of NQs to be assigned to the VFs.  So
   we should decrement vf_msix from hw_resc->max_nqs.

Fixes: b16b68918674 ("bnxt_en: Add SR-IOV support for 57500 chips.")
Signed-off-by: Vikas Gupta <vikas.gupta@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c
index a1a2c7a64fd58..c9cf0569451a2 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c
@@ -623,7 +623,7 @@ static int bnxt_hwrm_func_vf_resc_cfg(struct bnxt *bp, int num_vfs, bool reset)
 		hw_resc->max_stat_ctxs -= le16_to_cpu(req->min_stat_ctx) * n;
 		hw_resc->max_vnics -= le16_to_cpu(req->min_vnics) * n;
 		if (bp->flags & BNXT_FLAG_CHIP_P5)
-			hw_resc->max_irqs -= vf_msix * n;
+			hw_resc->max_nqs -= vf_msix;
 
 		rc = pf->active_vfs;
 	}
-- 
2.35.1




^ permalink raw reply related	[flat|nested] 174+ messages in thread

* [PATCH 5.19 054/158] bnxt_en: fix LRO/GRO_HW features in ndo_fix_features callback
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (52 preceding siblings ...)
  2022-08-29 10:58 ` [PATCH 5.19 053/158] bnxt_en: fix NQ resource accounting during vf creation on 57500 chips Greg Kroah-Hartman
@ 2022-08-29 10:58 ` Greg Kroah-Hartman
  2022-08-29 10:58 ` [PATCH 5.19 055/158] netfilter: nf_tables: disallow updates of implicit chain Greg Kroah-Hartman
                   ` (116 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:58 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Vikas Gupta, Michael Chan,
	Jakub Kicinski, Sasha Levin

From: Vikas Gupta <vikas.gupta@broadcom.com>

[ Upstream commit 366c304741729e64d778c80555d9eb422cf5cc89 ]

LRO/GRO_HW should be disabled if there is an attached XDP program.
BNXT_FLAG_TPA is the current setting of the LRO/GRO_HW.  Using
BNXT_FLAG_TPA to disable LRO/GRO_HW will cause these features to be
permanently disabled once they are disabled.

Fixes: 1dc4c557bfed ("bnxt: adding bnxt_xdp_build_skb to build skb from multibuffer xdp_buff")
Signed-off-by: Vikas Gupta <vikas.gupta@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index cf9b00576ed36..964354536f9ce 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -11183,10 +11183,7 @@ static netdev_features_t bnxt_fix_features(struct net_device *dev,
 	if ((features & NETIF_F_NTUPLE) && !bnxt_rfs_capable(bp))
 		features &= ~NETIF_F_NTUPLE;
 
-	if (bp->flags & BNXT_FLAG_NO_AGG_RINGS)
-		features &= ~(NETIF_F_LRO | NETIF_F_GRO_HW);
-
-	if (!(bp->flags & BNXT_FLAG_TPA))
+	if ((bp->flags & BNXT_FLAG_NO_AGG_RINGS) || bp->xdp_prog)
 		features &= ~(NETIF_F_LRO | NETIF_F_GRO_HW);
 
 	if (!(features & NETIF_F_GRO))
-- 
2.35.1




^ permalink raw reply related	[flat|nested] 174+ messages in thread

* [PATCH 5.19 055/158] netfilter: nf_tables: disallow updates of implicit chain
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (53 preceding siblings ...)
  2022-08-29 10:58 ` [PATCH 5.19 054/158] bnxt_en: fix LRO/GRO_HW features in ndo_fix_features callback Greg Kroah-Hartman
@ 2022-08-29 10:58 ` Greg Kroah-Hartman
  2022-08-29 10:58 ` [PATCH 5.19 056/158] netfilter: nf_tables: make table handle allocation per-netns friendly Greg Kroah-Hartman
                   ` (115 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:58 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Pablo Neira Ayuso, Sasha Levin

From: Pablo Neira Ayuso <pablo@netfilter.org>

[ Upstream commit 5dc52d83baac30decf5f3b371d5eb41dfa1d1412 ]

Updates on existing implicit chain make no sense, disallow this.

Fixes: d0e2c7de92c7 ("netfilter: nf_tables: add NFT_CHAIN_BINDING")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 net/netfilter/nf_tables_api.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index 4bd6e9427c918..8b6ee9df817fb 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -2574,6 +2574,9 @@ static int nf_tables_newchain(struct sk_buff *skb, const struct nfnl_info *info,
 	nft_ctx_init(&ctx, net, skb, info->nlh, family, table, chain, nla);
 
 	if (chain != NULL) {
+		if (chain->flags & NFT_CHAIN_BINDING)
+			return -EINVAL;
+
 		if (info->nlh->nlmsg_flags & NLM_F_EXCL) {
 			NL_SET_BAD_ATTR(extack, attr);
 			return -EEXIST;
-- 
2.35.1




^ permalink raw reply related	[flat|nested] 174+ messages in thread

* [PATCH 5.19 056/158] netfilter: nf_tables: make table handle allocation per-netns friendly
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (54 preceding siblings ...)
  2022-08-29 10:58 ` [PATCH 5.19 055/158] netfilter: nf_tables: disallow updates of implicit chain Greg Kroah-Hartman
@ 2022-08-29 10:58 ` Greg Kroah-Hartman
  2022-08-29 10:58 ` [PATCH 5.19 057/158] netfilter: nft_payload: report ERANGE for too long offset and length Greg Kroah-Hartman
                   ` (114 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:58 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Abhishek Shah, Florian Westphal,
	Pablo Neira Ayuso, Sasha Levin

From: Pablo Neira Ayuso <pablo@netfilter.org>

[ Upstream commit ab482c6b66a4a8c0a8c0b0f577a785cf9ff1c2e2 ]

mutex is per-netns, move table_netns to the pernet area.

*read-write* to 0xffffffff883a01e8 of 8 bytes by task 6542 on cpu 0:
 nf_tables_newtable+0x6dc/0xc00 net/netfilter/nf_tables_api.c:1221
 nfnetlink_rcv_batch net/netfilter/nfnetlink.c:513 [inline]
 nfnetlink_rcv_skb_batch net/netfilter/nfnetlink.c:634 [inline]
 nfnetlink_rcv+0xa6a/0x13a0 net/netfilter/nfnetlink.c:652
 netlink_unicast_kernel net/netlink/af_netlink.c:1319 [inline]
 netlink_unicast+0x652/0x730 net/netlink/af_netlink.c:1345
 netlink_sendmsg+0x643/0x740 net/netlink/af_netlink.c:1921

Fixes: f102d66b335a ("netfilter: nf_tables: use dedicated mutex to guard transactions")
Reported-by: Abhishek Shah <abhishek.shah@columbia.edu>
Reviewed-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 include/net/netfilter/nf_tables.h | 1 +
 net/netfilter/nf_tables_api.c     | 3 +--
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/net/netfilter/nf_tables.h b/include/net/netfilter/nf_tables.h
index b8890ace0f879..0daad6e63ccb2 100644
--- a/include/net/netfilter/nf_tables.h
+++ b/include/net/netfilter/nf_tables.h
@@ -1635,6 +1635,7 @@ struct nftables_pernet {
 	struct list_head	module_list;
 	struct list_head	notify_list;
 	struct mutex		commit_mutex;
+	u64			table_handle;
 	unsigned int		base_seq;
 	u8			validate_state;
 };
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index 8b6ee9df817fb..e171257739c2f 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -32,7 +32,6 @@ static LIST_HEAD(nf_tables_objects);
 static LIST_HEAD(nf_tables_flowtables);
 static LIST_HEAD(nf_tables_destroy_list);
 static DEFINE_SPINLOCK(nf_tables_destroy_list_lock);
-static u64 table_handle;
 
 enum {
 	NFT_VALIDATE_SKIP	= 0,
@@ -1235,7 +1234,7 @@ static int nf_tables_newtable(struct sk_buff *skb, const struct nfnl_info *info,
 	INIT_LIST_HEAD(&table->flowtables);
 	table->family = family;
 	table->flags = flags;
-	table->handle = ++table_handle;
+	table->handle = ++nft_net->table_handle;
 	if (table->flags & NFT_TABLE_F_OWNER)
 		table->nlpid = NETLINK_CB(skb).portid;
 
-- 
2.35.1




^ permalink raw reply related	[flat|nested] 174+ messages in thread

* [PATCH 5.19 057/158] netfilter: nft_payload: report ERANGE for too long offset and length
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (55 preceding siblings ...)
  2022-08-29 10:58 ` [PATCH 5.19 056/158] netfilter: nf_tables: make table handle allocation per-netns friendly Greg Kroah-Hartman
@ 2022-08-29 10:58 ` Greg Kroah-Hartman
  2022-08-29 10:58 ` [PATCH 5.19 058/158] netfilter: nft_payload: do not truncate csum_offset and csum_type Greg Kroah-Hartman
                   ` (113 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:58 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Pablo Neira Ayuso, Sasha Levin

From: Pablo Neira Ayuso <pablo@netfilter.org>

[ Upstream commit 94254f990c07e9ddf1634e0b727fab821c3b5bf9 ]

Instead of offset and length are truncation to u8, report ERANGE.

Fixes: 96518518cc41 ("netfilter: add nftables")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 net/netfilter/nft_payload.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/net/netfilter/nft_payload.c b/net/netfilter/nft_payload.c
index 2e7ac007cb30f..4fee67abfe2c5 100644
--- a/net/netfilter/nft_payload.c
+++ b/net/netfilter/nft_payload.c
@@ -833,6 +833,7 @@ nft_payload_select_ops(const struct nft_ctx *ctx,
 {
 	enum nft_payload_bases base;
 	unsigned int offset, len;
+	int err;
 
 	if (tb[NFTA_PAYLOAD_BASE] == NULL ||
 	    tb[NFTA_PAYLOAD_OFFSET] == NULL ||
@@ -859,8 +860,13 @@ nft_payload_select_ops(const struct nft_ctx *ctx,
 	if (tb[NFTA_PAYLOAD_DREG] == NULL)
 		return ERR_PTR(-EINVAL);
 
-	offset = ntohl(nla_get_be32(tb[NFTA_PAYLOAD_OFFSET]));
-	len    = ntohl(nla_get_be32(tb[NFTA_PAYLOAD_LEN]));
+	err = nft_parse_u32_check(tb[NFTA_PAYLOAD_OFFSET], U8_MAX, &offset);
+	if (err < 0)
+		return ERR_PTR(err);
+
+	err = nft_parse_u32_check(tb[NFTA_PAYLOAD_LEN], U8_MAX, &len);
+	if (err < 0)
+		return ERR_PTR(err);
 
 	if (len <= 4 && is_power_of_2(len) && IS_ALIGNED(offset, len) &&
 	    base != NFT_PAYLOAD_LL_HEADER && base != NFT_PAYLOAD_INNER_HEADER)
-- 
2.35.1




^ permalink raw reply related	[flat|nested] 174+ messages in thread

* [PATCH 5.19 058/158] netfilter: nft_payload: do not truncate csum_offset and csum_type
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (56 preceding siblings ...)
  2022-08-29 10:58 ` [PATCH 5.19 057/158] netfilter: nft_payload: report ERANGE for too long offset and length Greg Kroah-Hartman
@ 2022-08-29 10:58 ` Greg Kroah-Hartman
  2022-08-29 10:58 ` [PATCH 5.19 059/158] netfilter: nf_tables: do not leave chain stats enabled on error Greg Kroah-Hartman
                   ` (112 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:58 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Pablo Neira Ayuso, Sasha Levin

From: Pablo Neira Ayuso <pablo@netfilter.org>

[ Upstream commit 7044ab281febae9e2fa9b0b247693d6026166293 ]

Instead report ERANGE if csum_offset is too long, and EOPNOTSUPP if type
is not support.

Fixes: 7ec3f7b47b8d ("netfilter: nft_payload: add packet mangling support")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 net/netfilter/nft_payload.c | 19 +++++++++++++------
 1 file changed, 13 insertions(+), 6 deletions(-)

diff --git a/net/netfilter/nft_payload.c b/net/netfilter/nft_payload.c
index 4fee67abfe2c5..eb0e40c297121 100644
--- a/net/netfilter/nft_payload.c
+++ b/net/netfilter/nft_payload.c
@@ -740,17 +740,23 @@ static int nft_payload_set_init(const struct nft_ctx *ctx,
 				const struct nlattr * const tb[])
 {
 	struct nft_payload_set *priv = nft_expr_priv(expr);
+	u32 csum_offset, csum_type = NFT_PAYLOAD_CSUM_NONE;
+	int err;
 
 	priv->base        = ntohl(nla_get_be32(tb[NFTA_PAYLOAD_BASE]));
 	priv->offset      = ntohl(nla_get_be32(tb[NFTA_PAYLOAD_OFFSET]));
 	priv->len         = ntohl(nla_get_be32(tb[NFTA_PAYLOAD_LEN]));
 
 	if (tb[NFTA_PAYLOAD_CSUM_TYPE])
-		priv->csum_type =
-			ntohl(nla_get_be32(tb[NFTA_PAYLOAD_CSUM_TYPE]));
-	if (tb[NFTA_PAYLOAD_CSUM_OFFSET])
-		priv->csum_offset =
-			ntohl(nla_get_be32(tb[NFTA_PAYLOAD_CSUM_OFFSET]));
+		csum_type = ntohl(nla_get_be32(tb[NFTA_PAYLOAD_CSUM_TYPE]));
+	if (tb[NFTA_PAYLOAD_CSUM_OFFSET]) {
+		err = nft_parse_u32_check(tb[NFTA_PAYLOAD_CSUM_OFFSET], U8_MAX,
+					  &csum_offset);
+		if (err < 0)
+			return err;
+
+		priv->csum_offset = csum_offset;
+	}
 	if (tb[NFTA_PAYLOAD_CSUM_FLAGS]) {
 		u32 flags;
 
@@ -761,7 +767,7 @@ static int nft_payload_set_init(const struct nft_ctx *ctx,
 		priv->csum_flags = flags;
 	}
 
-	switch (priv->csum_type) {
+	switch (csum_type) {
 	case NFT_PAYLOAD_CSUM_NONE:
 	case NFT_PAYLOAD_CSUM_INET:
 		break;
@@ -775,6 +781,7 @@ static int nft_payload_set_init(const struct nft_ctx *ctx,
 	default:
 		return -EOPNOTSUPP;
 	}
+	priv->csum_type = csum_type;
 
 	return nft_parse_register_load(tb[NFTA_PAYLOAD_SREG], &priv->sreg,
 				       priv->len);
-- 
2.35.1




^ permalink raw reply related	[flat|nested] 174+ messages in thread

* [PATCH 5.19 059/158] netfilter: nf_tables: do not leave chain stats enabled on error
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (57 preceding siblings ...)
  2022-08-29 10:58 ` [PATCH 5.19 058/158] netfilter: nft_payload: do not truncate csum_offset and csum_type Greg Kroah-Hartman
@ 2022-08-29 10:58 ` Greg Kroah-Hartman
  2022-08-29 10:58 ` [PATCH 5.19 060/158] netfilter: nft_osf: restrict osf to ipv4, ipv6 and inet families Greg Kroah-Hartman
                   ` (111 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:58 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Pablo Neira Ayuso, Sasha Levin

From: Pablo Neira Ayuso <pablo@netfilter.org>

[ Upstream commit 43eb8949cfdffa764b92bc6c54b87cbe5b0003fe ]

Error might occur later in the nf_tables_addchain() codepath, enable
static key only after transaction has been created.

Fixes: 9f08ea848117 ("netfilter: nf_tables: keep chain counters away from hot path")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 net/netfilter/nf_tables_api.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index e171257739c2f..b2c89e8c2a655 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -2195,9 +2195,9 @@ static int nf_tables_addchain(struct nft_ctx *ctx, u8 family, u8 genmask,
 			      struct netlink_ext_ack *extack)
 {
 	const struct nlattr * const *nla = ctx->nla;
+	struct nft_stats __percpu *stats = NULL;
 	struct nft_table *table = ctx->table;
 	struct nft_base_chain *basechain;
-	struct nft_stats __percpu *stats;
 	struct net *net = ctx->net;
 	char name[NFT_NAME_MAXLEN];
 	struct nft_rule_blob *blob;
@@ -2235,7 +2235,6 @@ static int nf_tables_addchain(struct nft_ctx *ctx, u8 family, u8 genmask,
 				return PTR_ERR(stats);
 			}
 			rcu_assign_pointer(basechain->stats, stats);
-			static_branch_inc(&nft_counters_enabled);
 		}
 
 		err = nft_basechain_init(basechain, family, &hook, flags);
@@ -2318,6 +2317,9 @@ static int nf_tables_addchain(struct nft_ctx *ctx, u8 family, u8 genmask,
 		goto err_unregister_hook;
 	}
 
+	if (stats)
+		static_branch_inc(&nft_counters_enabled);
+
 	table->use++;
 
 	return 0;
-- 
2.35.1




^ permalink raw reply related	[flat|nested] 174+ messages in thread

* [PATCH 5.19 060/158] netfilter: nft_osf: restrict osf to ipv4, ipv6 and inet families
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (58 preceding siblings ...)
  2022-08-29 10:58 ` [PATCH 5.19 059/158] netfilter: nf_tables: do not leave chain stats enabled on error Greg Kroah-Hartman
@ 2022-08-29 10:58 ` Greg Kroah-Hartman
  2022-08-29 10:58 ` [PATCH 5.19 061/158] netfilter: nft_tunnel: restrict it to netdev family Greg Kroah-Hartman
                   ` (110 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:58 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Pablo Neira Ayuso, Sasha Levin

From: Pablo Neira Ayuso <pablo@netfilter.org>

[ Upstream commit 5f3b7aae14a706d0d7da9f9e39def52ff5fc3d39 ]

As it was originally intended, restrict extension to supported families.

Fixes: b96af92d6eaf ("netfilter: nf_tables: implement Passive OS fingerprint module in nft_osf")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 net/netfilter/nft_osf.c | 18 +++++++++++++++---
 1 file changed, 15 insertions(+), 3 deletions(-)

diff --git a/net/netfilter/nft_osf.c b/net/netfilter/nft_osf.c
index 5eed18f90b020..175d666c8d87e 100644
--- a/net/netfilter/nft_osf.c
+++ b/net/netfilter/nft_osf.c
@@ -115,9 +115,21 @@ static int nft_osf_validate(const struct nft_ctx *ctx,
 			    const struct nft_expr *expr,
 			    const struct nft_data **data)
 {
-	return nft_chain_validate_hooks(ctx->chain, (1 << NF_INET_LOCAL_IN) |
-						    (1 << NF_INET_PRE_ROUTING) |
-						    (1 << NF_INET_FORWARD));
+	unsigned int hooks;
+
+	switch (ctx->family) {
+	case NFPROTO_IPV4:
+	case NFPROTO_IPV6:
+	case NFPROTO_INET:
+		hooks = (1 << NF_INET_LOCAL_IN) |
+			(1 << NF_INET_PRE_ROUTING) |
+			(1 << NF_INET_FORWARD);
+		break;
+	default:
+		return -EOPNOTSUPP;
+	}
+
+	return nft_chain_validate_hooks(ctx->chain, hooks);
 }
 
 static bool nft_osf_reduce(struct nft_regs_track *track,
-- 
2.35.1




^ permalink raw reply related	[flat|nested] 174+ messages in thread

* [PATCH 5.19 061/158] netfilter: nft_tunnel: restrict it to netdev family
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (59 preceding siblings ...)
  2022-08-29 10:58 ` [PATCH 5.19 060/158] netfilter: nft_osf: restrict osf to ipv4, ipv6 and inet families Greg Kroah-Hartman
@ 2022-08-29 10:58 ` Greg Kroah-Hartman
  2022-08-29 10:58 ` [PATCH 5.19 062/158] netfilter: nf_tables: disallow binding to already bound chain Greg Kroah-Hartman
                   ` (109 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:58 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Pablo Neira Ayuso, Sasha Levin

From: Pablo Neira Ayuso <pablo@netfilter.org>

[ Upstream commit 01e4092d53bc4fe122a6e4b6d664adbd57528ca3 ]

Only allow to use this expression from NFPROTO_NETDEV family.

Fixes: af308b94a2a4 ("netfilter: nf_tables: add tunnel support")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 net/netfilter/nft_tunnel.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/net/netfilter/nft_tunnel.c b/net/netfilter/nft_tunnel.c
index d0f9b1d51b0e9..96b03e0bf74ff 100644
--- a/net/netfilter/nft_tunnel.c
+++ b/net/netfilter/nft_tunnel.c
@@ -161,6 +161,7 @@ static const struct nft_expr_ops nft_tunnel_get_ops = {
 
 static struct nft_expr_type nft_tunnel_type __read_mostly = {
 	.name		= "tunnel",
+	.family		= NFPROTO_NETDEV,
 	.ops		= &nft_tunnel_get_ops,
 	.policy		= nft_tunnel_policy,
 	.maxattr	= NFTA_TUNNEL_MAX,
-- 
2.35.1




^ permalink raw reply related	[flat|nested] 174+ messages in thread

* [PATCH 5.19 062/158] netfilter: nf_tables: disallow binding to already bound chain
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (60 preceding siblings ...)
  2022-08-29 10:58 ` [PATCH 5.19 061/158] netfilter: nft_tunnel: restrict it to netdev family Greg Kroah-Hartman
@ 2022-08-29 10:58 ` Greg Kroah-Hartman
  2022-08-29 10:58 ` [PATCH 5.19 063/158] netfilter: flowtable: add function to invoke garbage collection immediately Greg Kroah-Hartman
                   ` (108 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:58 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Gwangun Jung, Pablo Neira Ayuso, Sasha Levin

From: Pablo Neira Ayuso <pablo@netfilter.org>

[ Upstream commit e02f0d3970404bfea385b6edb86f2d936db0ea2b ]

Update nft_data_init() to report EINVAL if chain is already bound.

Fixes: d0e2c7de92c7 ("netfilter: nf_tables: add NFT_CHAIN_BINDING")
Reported-by: Gwangun Jung <exsociety@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 net/netfilter/nf_tables_api.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index b2c89e8c2a655..bc690238a3c56 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -9657,6 +9657,8 @@ static int nft_verdict_init(const struct nft_ctx *ctx, struct nft_data *data,
 			return PTR_ERR(chain);
 		if (nft_is_base_chain(chain))
 			return -EOPNOTSUPP;
+		if (nft_chain_is_bound(chain))
+			return -EINVAL;
 		if (desc->flags & NFT_DATA_DESC_SETELEM &&
 		    chain->flags & NFT_CHAIN_BINDING)
 			return -EINVAL;
-- 
2.35.1




^ permalink raw reply related	[flat|nested] 174+ messages in thread

* [PATCH 5.19 063/158] netfilter: flowtable: add function to invoke garbage collection immediately
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (61 preceding siblings ...)
  2022-08-29 10:58 ` [PATCH 5.19 062/158] netfilter: nf_tables: disallow binding to already bound chain Greg Kroah-Hartman
@ 2022-08-29 10:58 ` Greg Kroah-Hartman
  2022-08-29 10:58 ` [PATCH 5.19 064/158] netfilter: flowtable: fix stuck flows on cleanup due to pending work Greg Kroah-Hartman
                   ` (107 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:58 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Pablo Neira Ayuso, Sasha Levin

From: Pablo Neira Ayuso <pablo@netfilter.org>

[ Upstream commit 759eebbcfafcefa23b59e912396306543764bd3c ]

Expose nf_flow_table_gc_run() to force a garbage collector run from the
offload infrastructure.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 include/net/netfilter/nf_flow_table.h |  1 +
 net/netfilter/nf_flow_table_core.c    | 12 +++++++++---
 2 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/include/net/netfilter/nf_flow_table.h b/include/net/netfilter/nf_flow_table.h
index 64daafd1fc41c..32c25122ab184 100644
--- a/include/net/netfilter/nf_flow_table.h
+++ b/include/net/netfilter/nf_flow_table.h
@@ -270,6 +270,7 @@ void flow_offload_refresh(struct nf_flowtable *flow_table,
 
 struct flow_offload_tuple_rhash *flow_offload_lookup(struct nf_flowtable *flow_table,
 						     struct flow_offload_tuple *tuple);
+void nf_flow_table_gc_run(struct nf_flowtable *flow_table);
 void nf_flow_table_gc_cleanup(struct nf_flowtable *flowtable,
 			      struct net_device *dev);
 void nf_flow_table_cleanup(struct net_device *dev);
diff --git a/net/netfilter/nf_flow_table_core.c b/net/netfilter/nf_flow_table_core.c
index f2def06d10709..18453fa25199c 100644
--- a/net/netfilter/nf_flow_table_core.c
+++ b/net/netfilter/nf_flow_table_core.c
@@ -442,12 +442,17 @@ static void nf_flow_offload_gc_step(struct nf_flowtable *flow_table,
 	}
 }
 
+void nf_flow_table_gc_run(struct nf_flowtable *flow_table)
+{
+	nf_flow_table_iterate(flow_table, nf_flow_offload_gc_step, NULL);
+}
+
 static void nf_flow_offload_work_gc(struct work_struct *work)
 {
 	struct nf_flowtable *flow_table;
 
 	flow_table = container_of(work, struct nf_flowtable, gc_work.work);
-	nf_flow_table_iterate(flow_table, nf_flow_offload_gc_step, NULL);
+	nf_flow_table_gc_run(flow_table);
 	queue_delayed_work(system_power_efficient_wq, &flow_table->gc_work, HZ);
 }
 
@@ -606,10 +611,11 @@ void nf_flow_table_free(struct nf_flowtable *flow_table)
 
 	cancel_delayed_work_sync(&flow_table->gc_work);
 	nf_flow_table_iterate(flow_table, nf_flow_table_do_cleanup, NULL);
-	nf_flow_table_iterate(flow_table, nf_flow_offload_gc_step, NULL);
+	nf_flow_table_gc_run(flow_table);
 	nf_flow_table_offload_flush(flow_table);
 	if (nf_flowtable_hw_offload(flow_table))
-		nf_flow_table_iterate(flow_table, nf_flow_offload_gc_step, NULL);
+		nf_flow_table_gc_run(flow_table);
+
 	rhashtable_destroy(&flow_table->rhashtable);
 }
 EXPORT_SYMBOL_GPL(nf_flow_table_free);
-- 
2.35.1




^ permalink raw reply related	[flat|nested] 174+ messages in thread

* [PATCH 5.19 064/158] netfilter: flowtable: fix stuck flows on cleanup due to pending work
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (62 preceding siblings ...)
  2022-08-29 10:58 ` [PATCH 5.19 063/158] netfilter: flowtable: add function to invoke garbage collection immediately Greg Kroah-Hartman
@ 2022-08-29 10:58 ` Greg Kroah-Hartman
  2022-08-29 10:58 ` [PATCH 5.19 065/158] net: Fix data-races around sysctl_[rw]mem_(max|default) Greg Kroah-Hartman
                   ` (106 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:58 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Paul Blakey, Pablo Neira Ayuso, Sasha Levin

From: Pablo Neira Ayuso <pablo@netfilter.org>

[ Upstream commit 9afb4b27349a499483ae0134282cefd0c90f480f ]

To clear the flow table on flow table free, the following sequence
normally happens in order:

  1) gc_step work is stopped to disable any further stats/del requests.
  2) All flow table entries are set to teardown state.
  3) Run gc_step which will queue HW del work for each flow table entry.
  4) Waiting for the above del work to finish (flush).
  5) Run gc_step again, deleting all entries from the flow table.
  6) Flow table is freed.

But if a flow table entry already has pending HW stats or HW add work
step 3 will not queue HW del work (it will be skipped), step 4 will wait
for the pending add/stats to finish, and step 5 will queue HW del work
which might execute after freeing of the flow table.

To fix the above, this patch flushes the pending work, then it sets the
teardown flag to all flows in the flowtable and it forces a garbage
collector run to queue work to remove the flows from hardware, then it
flushes this new pending work and (finally) it forces another garbage
collector run to remove the entry from the software flowtable.

Stack trace:
[47773.882335] BUG: KASAN: use-after-free in down_read+0x99/0x460
[47773.883634] Write of size 8 at addr ffff888103b45aa8 by task kworker/u20:6/543704
[47773.885634] CPU: 3 PID: 543704 Comm: kworker/u20:6 Not tainted 5.12.0-rc7+ #2
[47773.886745] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009)
[47773.888438] Workqueue: nf_ft_offload_del flow_offload_work_handler [nf_flow_table]
[47773.889727] Call Trace:
[47773.890214]  dump_stack+0xbb/0x107
[47773.890818]  print_address_description.constprop.0+0x18/0x140
[47773.892990]  kasan_report.cold+0x7c/0xd8
[47773.894459]  kasan_check_range+0x145/0x1a0
[47773.895174]  down_read+0x99/0x460
[47773.899706]  nf_flow_offload_tuple+0x24f/0x3c0 [nf_flow_table]
[47773.907137]  flow_offload_work_handler+0x72d/0xbe0 [nf_flow_table]
[47773.913372]  process_one_work+0x8ac/0x14e0
[47773.921325]
[47773.921325] Allocated by task 592159:
[47773.922031]  kasan_save_stack+0x1b/0x40
[47773.922730]  __kasan_kmalloc+0x7a/0x90
[47773.923411]  tcf_ct_flow_table_get+0x3cb/0x1230 [act_ct]
[47773.924363]  tcf_ct_init+0x71c/0x1156 [act_ct]
[47773.925207]  tcf_action_init_1+0x45b/0x700
[47773.925987]  tcf_action_init+0x453/0x6b0
[47773.926692]  tcf_exts_validate+0x3d0/0x600
[47773.927419]  fl_change+0x757/0x4a51 [cls_flower]
[47773.928227]  tc_new_tfilter+0x89a/0x2070
[47773.936652]
[47773.936652] Freed by task 543704:
[47773.937303]  kasan_save_stack+0x1b/0x40
[47773.938039]  kasan_set_track+0x1c/0x30
[47773.938731]  kasan_set_free_info+0x20/0x30
[47773.939467]  __kasan_slab_free+0xe7/0x120
[47773.940194]  slab_free_freelist_hook+0x86/0x190
[47773.941038]  kfree+0xce/0x3a0
[47773.941644]  tcf_ct_flow_table_cleanup_work

Original patch description and stack trace by Paul Blakey.

Fixes: c29f74e0df7a ("netfilter: nf_flow_table: hardware offload support")
Reported-by: Paul Blakey <paulb@nvidia.com>
Tested-by: Paul Blakey <paulb@nvidia.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 include/net/netfilter/nf_flow_table.h | 2 ++
 net/netfilter/nf_flow_table_core.c    | 7 +++----
 net/netfilter/nf_flow_table_offload.c | 8 ++++++++
 3 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/include/net/netfilter/nf_flow_table.h b/include/net/netfilter/nf_flow_table.h
index 32c25122ab184..9c93e4981b680 100644
--- a/include/net/netfilter/nf_flow_table.h
+++ b/include/net/netfilter/nf_flow_table.h
@@ -307,6 +307,8 @@ void nf_flow_offload_stats(struct nf_flowtable *flowtable,
 			   struct flow_offload *flow);
 
 void nf_flow_table_offload_flush(struct nf_flowtable *flowtable);
+void nf_flow_table_offload_flush_cleanup(struct nf_flowtable *flowtable);
+
 int nf_flow_table_offload_setup(struct nf_flowtable *flowtable,
 				struct net_device *dev,
 				enum flow_block_command cmd);
diff --git a/net/netfilter/nf_flow_table_core.c b/net/netfilter/nf_flow_table_core.c
index 18453fa25199c..483b18d35cade 100644
--- a/net/netfilter/nf_flow_table_core.c
+++ b/net/netfilter/nf_flow_table_core.c
@@ -610,12 +610,11 @@ void nf_flow_table_free(struct nf_flowtable *flow_table)
 	mutex_unlock(&flowtable_lock);
 
 	cancel_delayed_work_sync(&flow_table->gc_work);
+	nf_flow_table_offload_flush(flow_table);
+	/* ... no more pending work after this stage ... */
 	nf_flow_table_iterate(flow_table, nf_flow_table_do_cleanup, NULL);
 	nf_flow_table_gc_run(flow_table);
-	nf_flow_table_offload_flush(flow_table);
-	if (nf_flowtable_hw_offload(flow_table))
-		nf_flow_table_gc_run(flow_table);
-
+	nf_flow_table_offload_flush_cleanup(flow_table);
 	rhashtable_destroy(&flow_table->rhashtable);
 }
 EXPORT_SYMBOL_GPL(nf_flow_table_free);
diff --git a/net/netfilter/nf_flow_table_offload.c b/net/netfilter/nf_flow_table_offload.c
index 11b6e19420920..4d1169b634c5f 100644
--- a/net/netfilter/nf_flow_table_offload.c
+++ b/net/netfilter/nf_flow_table_offload.c
@@ -1063,6 +1063,14 @@ void nf_flow_offload_stats(struct nf_flowtable *flowtable,
 	flow_offload_queue_work(offload);
 }
 
+void nf_flow_table_offload_flush_cleanup(struct nf_flowtable *flowtable)
+{
+	if (nf_flowtable_hw_offload(flowtable)) {
+		flush_workqueue(nf_flow_offload_del_wq);
+		nf_flow_table_gc_run(flowtable);
+	}
+}
+
 void nf_flow_table_offload_flush(struct nf_flowtable *flowtable)
 {
 	if (nf_flowtable_hw_offload(flowtable)) {
-- 
2.35.1




^ permalink raw reply related	[flat|nested] 174+ messages in thread

* [PATCH 5.19 065/158] net: Fix data-races around sysctl_[rw]mem_(max|default).
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (63 preceding siblings ...)
  2022-08-29 10:58 ` [PATCH 5.19 064/158] netfilter: flowtable: fix stuck flows on cleanup due to pending work Greg Kroah-Hartman
@ 2022-08-29 10:58 ` Greg Kroah-Hartman
  2022-08-29 10:58 ` [PATCH 5.19 066/158] net: Fix data-races around weight_p and dev_weight_[rt]x_bias Greg Kroah-Hartman
                   ` (105 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:58 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Kuniyuki Iwashima, David S. Miller,
	Sasha Levin

From: Kuniyuki Iwashima <kuniyu@amazon.com>

[ Upstream commit 1227c1771dd2ad44318aa3ab9e3a293b3f34ff2a ]

While reading sysctl_[rw]mem_(max|default), they can be changed
concurrently.  Thus, we need to add READ_ONCE() to its readers.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 net/core/filter.c               | 4 ++--
 net/core/sock.c                 | 8 ++++----
 net/ipv4/ip_output.c            | 2 +-
 net/ipv4/tcp_output.c           | 2 +-
 net/netfilter/ipvs/ip_vs_sync.c | 4 ++--
 5 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/net/core/filter.c b/net/core/filter.c
index 74f05ed6aff29..60c854e7d98ba 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -5036,14 +5036,14 @@ static int _bpf_setsockopt(struct sock *sk, int level, int optname,
 		/* Only some socketops are supported */
 		switch (optname) {
 		case SO_RCVBUF:
-			val = min_t(u32, val, sysctl_rmem_max);
+			val = min_t(u32, val, READ_ONCE(sysctl_rmem_max));
 			val = min_t(int, val, INT_MAX / 2);
 			sk->sk_userlocks |= SOCK_RCVBUF_LOCK;
 			WRITE_ONCE(sk->sk_rcvbuf,
 				   max_t(int, val * 2, SOCK_MIN_RCVBUF));
 			break;
 		case SO_SNDBUF:
-			val = min_t(u32, val, sysctl_wmem_max);
+			val = min_t(u32, val, READ_ONCE(sysctl_wmem_max));
 			val = min_t(int, val, INT_MAX / 2);
 			sk->sk_userlocks |= SOCK_SNDBUF_LOCK;
 			WRITE_ONCE(sk->sk_sndbuf,
diff --git a/net/core/sock.c b/net/core/sock.c
index 2ff40dd0a7a65..62f69bc3a0e6e 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -1100,7 +1100,7 @@ int sock_setsockopt(struct socket *sock, int level, int optname,
 		 * play 'guess the biggest size' games. RCVBUF/SNDBUF
 		 * are treated in BSD as hints
 		 */
-		val = min_t(u32, val, sysctl_wmem_max);
+		val = min_t(u32, val, READ_ONCE(sysctl_wmem_max));
 set_sndbuf:
 		/* Ensure val * 2 fits into an int, to prevent max_t()
 		 * from treating it as a negative value.
@@ -1132,7 +1132,7 @@ int sock_setsockopt(struct socket *sock, int level, int optname,
 		 * play 'guess the biggest size' games. RCVBUF/SNDBUF
 		 * are treated in BSD as hints
 		 */
-		__sock_set_rcvbuf(sk, min_t(u32, val, sysctl_rmem_max));
+		__sock_set_rcvbuf(sk, min_t(u32, val, READ_ONCE(sysctl_rmem_max)));
 		break;
 
 	case SO_RCVBUFFORCE:
@@ -3307,8 +3307,8 @@ void sock_init_data(struct socket *sock, struct sock *sk)
 	timer_setup(&sk->sk_timer, NULL, 0);
 
 	sk->sk_allocation	=	GFP_KERNEL;
-	sk->sk_rcvbuf		=	sysctl_rmem_default;
-	sk->sk_sndbuf		=	sysctl_wmem_default;
+	sk->sk_rcvbuf		=	READ_ONCE(sysctl_rmem_default);
+	sk->sk_sndbuf		=	READ_ONCE(sysctl_wmem_default);
 	sk->sk_state		=	TCP_CLOSE;
 	sk_set_socket(sk, sock);
 
diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
index 00b4bf26fd932..da8b3cc67234d 100644
--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -1712,7 +1712,7 @@ void ip_send_unicast_reply(struct sock *sk, struct sk_buff *skb,
 
 	sk->sk_protocol = ip_hdr(skb)->protocol;
 	sk->sk_bound_dev_if = arg->bound_dev_if;
-	sk->sk_sndbuf = sysctl_wmem_default;
+	sk->sk_sndbuf = READ_ONCE(sysctl_wmem_default);
 	ipc.sockc.mark = fl4.flowi4_mark;
 	err = ip_append_data(sk, &fl4, ip_reply_glue_bits, arg->iov->iov_base,
 			     len, 0, &ipc, &rt, MSG_DONTWAIT);
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index aed0c5f828bef..84314de754f87 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -239,7 +239,7 @@ void tcp_select_initial_window(const struct sock *sk, int __space, __u32 mss,
 	if (wscale_ok) {
 		/* Set window scaling on max possible window */
 		space = max_t(u32, space, READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_rmem[2]));
-		space = max_t(u32, space, sysctl_rmem_max);
+		space = max_t(u32, space, READ_ONCE(sysctl_rmem_max));
 		space = min_t(u32, space, *window_clamp);
 		*rcv_wscale = clamp_t(int, ilog2(space) - 15,
 				      0, TCP_MAX_WSCALE);
diff --git a/net/netfilter/ipvs/ip_vs_sync.c b/net/netfilter/ipvs/ip_vs_sync.c
index 9d43277b8b4fe..a56fd0b5a430a 100644
--- a/net/netfilter/ipvs/ip_vs_sync.c
+++ b/net/netfilter/ipvs/ip_vs_sync.c
@@ -1280,12 +1280,12 @@ static void set_sock_size(struct sock *sk, int mode, int val)
 	lock_sock(sk);
 	if (mode) {
 		val = clamp_t(int, val, (SOCK_MIN_SNDBUF + 1) / 2,
-			      sysctl_wmem_max);
+			      READ_ONCE(sysctl_wmem_max));
 		sk->sk_sndbuf = val * 2;
 		sk->sk_userlocks |= SOCK_SNDBUF_LOCK;
 	} else {
 		val = clamp_t(int, val, (SOCK_MIN_RCVBUF + 1) / 2,
-			      sysctl_rmem_max);
+			      READ_ONCE(sysctl_rmem_max));
 		sk->sk_rcvbuf = val * 2;
 		sk->sk_userlocks |= SOCK_RCVBUF_LOCK;
 	}
-- 
2.35.1




^ permalink raw reply related	[flat|nested] 174+ messages in thread

* [PATCH 5.19 066/158] net: Fix data-races around weight_p and dev_weight_[rt]x_bias.
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (64 preceding siblings ...)
  2022-08-29 10:58 ` [PATCH 5.19 065/158] net: Fix data-races around sysctl_[rw]mem_(max|default) Greg Kroah-Hartman
@ 2022-08-29 10:58 ` Greg Kroah-Hartman
  2022-08-29 10:58 ` [PATCH 5.19 067/158] net: Fix data-races around netdev_max_backlog Greg Kroah-Hartman
                   ` (104 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:58 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Kuniyuki Iwashima, David S. Miller,
	Sasha Levin

From: Kuniyuki Iwashima <kuniyu@amazon.com>

[ Upstream commit bf955b5ab8f6f7b0632cdef8e36b14e4f6e77829 ]

While reading weight_p, it can be changed concurrently.  Thus, we need
to add READ_ONCE() to its reader.

Also, dev_[rt]x_weight can be read/written at the same time.  So, we
need to use READ_ONCE() and WRITE_ONCE() for its access.  Moreover, to
use the same weight_p while changing dev_[rt]x_weight, we add a mutex
in proc_do_dev_weight().

Fixes: 3d48b53fb2ae ("net: dev_weight: TX/RX orthogonality")
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 net/core/dev.c             |  2 +-
 net/core/sysctl_net_core.c | 15 +++++++++------
 net/sched/sch_generic.c    |  2 +-
 3 files changed, 11 insertions(+), 8 deletions(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index 30a1603a7225c..2a7d81cd9e2ea 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -5917,7 +5917,7 @@ static int process_backlog(struct napi_struct *napi, int quota)
 		net_rps_action_and_irq_enable(sd);
 	}
 
-	napi->weight = dev_rx_weight;
+	napi->weight = READ_ONCE(dev_rx_weight);
 	while (again) {
 		struct sk_buff *skb;
 
diff --git a/net/core/sysctl_net_core.c b/net/core/sysctl_net_core.c
index 71a13596ea2bf..725891527814c 100644
--- a/net/core/sysctl_net_core.c
+++ b/net/core/sysctl_net_core.c
@@ -234,14 +234,17 @@ static int set_default_qdisc(struct ctl_table *table, int write,
 static int proc_do_dev_weight(struct ctl_table *table, int write,
 			   void *buffer, size_t *lenp, loff_t *ppos)
 {
-	int ret;
+	static DEFINE_MUTEX(dev_weight_mutex);
+	int ret, weight;
 
+	mutex_lock(&dev_weight_mutex);
 	ret = proc_dointvec(table, write, buffer, lenp, ppos);
-	if (ret != 0)
-		return ret;
-
-	dev_rx_weight = weight_p * dev_weight_rx_bias;
-	dev_tx_weight = weight_p * dev_weight_tx_bias;
+	if (!ret && write) {
+		weight = READ_ONCE(weight_p);
+		WRITE_ONCE(dev_rx_weight, weight * dev_weight_rx_bias);
+		WRITE_ONCE(dev_tx_weight, weight * dev_weight_tx_bias);
+	}
+	mutex_unlock(&dev_weight_mutex);
 
 	return ret;
 }
diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index dba0b3e24af5e..a64c3c1541118 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -409,7 +409,7 @@ static inline bool qdisc_restart(struct Qdisc *q, int *packets)
 
 void __qdisc_run(struct Qdisc *q)
 {
-	int quota = dev_tx_weight;
+	int quota = READ_ONCE(dev_tx_weight);
 	int packets;
 
 	while (qdisc_restart(q, &packets)) {
-- 
2.35.1




^ permalink raw reply related	[flat|nested] 174+ messages in thread

* [PATCH 5.19 067/158] net: Fix data-races around netdev_max_backlog.
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (65 preceding siblings ...)
  2022-08-29 10:58 ` [PATCH 5.19 066/158] net: Fix data-races around weight_p and dev_weight_[rt]x_bias Greg Kroah-Hartman
@ 2022-08-29 10:58 ` Greg Kroah-Hartman
  2022-08-29 10:58 ` [PATCH 5.19 068/158] net: Fix data-races around netdev_tstamp_prequeue Greg Kroah-Hartman
                   ` (103 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:58 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Kuniyuki Iwashima, David S. Miller,
	Sasha Levin

From: Kuniyuki Iwashima <kuniyu@amazon.com>

[ Upstream commit 5dcd08cd19912892586c6082d56718333e2d19db ]

While reading netdev_max_backlog, it can be changed concurrently.
Thus, we need to add READ_ONCE() to its readers.

While at it, we remove the unnecessary spaces in the doc.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 Documentation/admin-guide/sysctl/net.rst | 2 +-
 net/core/dev.c                           | 4 ++--
 net/core/gro_cells.c                     | 2 +-
 net/xfrm/espintcp.c                      | 2 +-
 net/xfrm/xfrm_input.c                    | 2 +-
 5 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/Documentation/admin-guide/sysctl/net.rst b/Documentation/admin-guide/sysctl/net.rst
index fcd650bdbc7e2..01d9858197832 100644
--- a/Documentation/admin-guide/sysctl/net.rst
+++ b/Documentation/admin-guide/sysctl/net.rst
@@ -271,7 +271,7 @@ poll cycle or the number of packets processed reaches netdev_budget.
 netdev_max_backlog
 ------------------
 
-Maximum number  of  packets,  queued  on  the  INPUT  side, when the interface
+Maximum number of packets, queued on the INPUT side, when the interface
 receives packets faster than kernel can process them.
 
 netdev_rss_key
diff --git a/net/core/dev.c b/net/core/dev.c
index 2a7d81cd9e2ea..e1496e626a532 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4623,7 +4623,7 @@ static bool skb_flow_limit(struct sk_buff *skb, unsigned int qlen)
 	struct softnet_data *sd;
 	unsigned int old_flow, new_flow;
 
-	if (qlen < (netdev_max_backlog >> 1))
+	if (qlen < (READ_ONCE(netdev_max_backlog) >> 1))
 		return false;
 
 	sd = this_cpu_ptr(&softnet_data);
@@ -4671,7 +4671,7 @@ static int enqueue_to_backlog(struct sk_buff *skb, int cpu,
 	if (!netif_running(skb->dev))
 		goto drop;
 	qlen = skb_queue_len(&sd->input_pkt_queue);
-	if (qlen <= netdev_max_backlog && !skb_flow_limit(skb, qlen)) {
+	if (qlen <= READ_ONCE(netdev_max_backlog) && !skb_flow_limit(skb, qlen)) {
 		if (qlen) {
 enqueue:
 			__skb_queue_tail(&sd->input_pkt_queue, skb);
diff --git a/net/core/gro_cells.c b/net/core/gro_cells.c
index 541c7a72a28a4..21619c70a82b7 100644
--- a/net/core/gro_cells.c
+++ b/net/core/gro_cells.c
@@ -26,7 +26,7 @@ int gro_cells_receive(struct gro_cells *gcells, struct sk_buff *skb)
 
 	cell = this_cpu_ptr(gcells->cells);
 
-	if (skb_queue_len(&cell->napi_skbs) > netdev_max_backlog) {
+	if (skb_queue_len(&cell->napi_skbs) > READ_ONCE(netdev_max_backlog)) {
 drop:
 		dev_core_stats_rx_dropped_inc(dev);
 		kfree_skb(skb);
diff --git a/net/xfrm/espintcp.c b/net/xfrm/espintcp.c
index 82d14eea1b5ad..974eb97b77d22 100644
--- a/net/xfrm/espintcp.c
+++ b/net/xfrm/espintcp.c
@@ -168,7 +168,7 @@ int espintcp_queue_out(struct sock *sk, struct sk_buff *skb)
 {
 	struct espintcp_ctx *ctx = espintcp_getctx(sk);
 
-	if (skb_queue_len(&ctx->out_queue) >= netdev_max_backlog)
+	if (skb_queue_len(&ctx->out_queue) >= READ_ONCE(netdev_max_backlog))
 		return -ENOBUFS;
 
 	__skb_queue_tail(&ctx->out_queue, skb);
diff --git a/net/xfrm/xfrm_input.c b/net/xfrm/xfrm_input.c
index 70a8c36f0ba6e..b2f4ec9c537f0 100644
--- a/net/xfrm/xfrm_input.c
+++ b/net/xfrm/xfrm_input.c
@@ -782,7 +782,7 @@ int xfrm_trans_queue_net(struct net *net, struct sk_buff *skb,
 
 	trans = this_cpu_ptr(&xfrm_trans_tasklet);
 
-	if (skb_queue_len(&trans->queue) >= netdev_max_backlog)
+	if (skb_queue_len(&trans->queue) >= READ_ONCE(netdev_max_backlog))
 		return -ENOBUFS;
 
 	BUILD_BUG_ON(sizeof(struct xfrm_trans_cb) > sizeof(skb->cb));
-- 
2.35.1




^ permalink raw reply related	[flat|nested] 174+ messages in thread

* [PATCH 5.19 068/158] net: Fix data-races around netdev_tstamp_prequeue.
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (66 preceding siblings ...)
  2022-08-29 10:58 ` [PATCH 5.19 067/158] net: Fix data-races around netdev_max_backlog Greg Kroah-Hartman
@ 2022-08-29 10:58 ` Greg Kroah-Hartman
  2022-08-29 10:58 ` [PATCH 5.19 069/158] ratelimit: Fix data-races in ___ratelimit() Greg Kroah-Hartman
                   ` (102 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:58 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Kuniyuki Iwashima, David S. Miller,
	Sasha Levin

From: Kuniyuki Iwashima <kuniyu@amazon.com>

[ Upstream commit 61adf447e38664447526698872e21c04623afb8e ]

While reading netdev_tstamp_prequeue, it can be changed concurrently.
Thus, we need to add READ_ONCE() to its readers.

Fixes: 3b098e2d7c69 ("net: Consistent skb timestamping")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 net/core/dev.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index e1496e626a532..34282b93c3f60 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4927,7 +4927,7 @@ static int netif_rx_internal(struct sk_buff *skb)
 {
 	int ret;
 
-	net_timestamp_check(netdev_tstamp_prequeue, skb);
+	net_timestamp_check(READ_ONCE(netdev_tstamp_prequeue), skb);
 
 	trace_netif_rx(skb);
 
@@ -5280,7 +5280,7 @@ static int __netif_receive_skb_core(struct sk_buff **pskb, bool pfmemalloc,
 	int ret = NET_RX_DROP;
 	__be16 type;
 
-	net_timestamp_check(!netdev_tstamp_prequeue, skb);
+	net_timestamp_check(!READ_ONCE(netdev_tstamp_prequeue), skb);
 
 	trace_netif_receive_skb(skb);
 
@@ -5663,7 +5663,7 @@ static int netif_receive_skb_internal(struct sk_buff *skb)
 {
 	int ret;
 
-	net_timestamp_check(netdev_tstamp_prequeue, skb);
+	net_timestamp_check(READ_ONCE(netdev_tstamp_prequeue), skb);
 
 	if (skb_defer_rx_timestamp(skb))
 		return NET_RX_SUCCESS;
@@ -5693,7 +5693,7 @@ void netif_receive_skb_list_internal(struct list_head *head)
 
 	INIT_LIST_HEAD(&sublist);
 	list_for_each_entry_safe(skb, next, head, list) {
-		net_timestamp_check(netdev_tstamp_prequeue, skb);
+		net_timestamp_check(READ_ONCE(netdev_tstamp_prequeue), skb);
 		skb_list_del_init(skb);
 		if (!skb_defer_rx_timestamp(skb))
 			list_add_tail(&skb->list, &sublist);
-- 
2.35.1




^ permalink raw reply related	[flat|nested] 174+ messages in thread

* [PATCH 5.19 069/158] ratelimit: Fix data-races in ___ratelimit().
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (67 preceding siblings ...)
  2022-08-29 10:58 ` [PATCH 5.19 068/158] net: Fix data-races around netdev_tstamp_prequeue Greg Kroah-Hartman
@ 2022-08-29 10:58 ` Greg Kroah-Hartman
  2022-08-29 10:58 ` [PATCH 5.19 070/158] net: Fix data-races around sysctl_optmem_max Greg Kroah-Hartman
                   ` (101 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:58 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Kuniyuki Iwashima, David S. Miller,
	Sasha Levin

From: Kuniyuki Iwashima <kuniyu@amazon.com>

[ Upstream commit 6bae8ceb90ba76cdba39496db936164fa672b9be ]

While reading rs->interval and rs->burst, they can be changed
concurrently via sysctl (e.g. net_ratelimit_state).  Thus, we
need to add READ_ONCE() to their readers.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 lib/ratelimit.c | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/lib/ratelimit.c b/lib/ratelimit.c
index e01a93f46f833..ce945c17980b9 100644
--- a/lib/ratelimit.c
+++ b/lib/ratelimit.c
@@ -26,10 +26,16 @@
  */
 int ___ratelimit(struct ratelimit_state *rs, const char *func)
 {
+	/* Paired with WRITE_ONCE() in .proc_handler().
+	 * Changing two values seperately could be inconsistent
+	 * and some message could be lost.  (See: net_ratelimit_state).
+	 */
+	int interval = READ_ONCE(rs->interval);
+	int burst = READ_ONCE(rs->burst);
 	unsigned long flags;
 	int ret;
 
-	if (!rs->interval)
+	if (!interval)
 		return 1;
 
 	/*
@@ -44,7 +50,7 @@ int ___ratelimit(struct ratelimit_state *rs, const char *func)
 	if (!rs->begin)
 		rs->begin = jiffies;
 
-	if (time_is_before_jiffies(rs->begin + rs->interval)) {
+	if (time_is_before_jiffies(rs->begin + interval)) {
 		if (rs->missed) {
 			if (!(rs->flags & RATELIMIT_MSG_ON_RELEASE)) {
 				printk_deferred(KERN_WARNING
@@ -56,7 +62,7 @@ int ___ratelimit(struct ratelimit_state *rs, const char *func)
 		rs->begin   = jiffies;
 		rs->printed = 0;
 	}
-	if (rs->burst && rs->burst > rs->printed) {
+	if (burst && burst > rs->printed) {
 		rs->printed++;
 		ret = 1;
 	} else {
-- 
2.35.1




^ permalink raw reply related	[flat|nested] 174+ messages in thread

* [PATCH 5.19 070/158] net: Fix data-races around sysctl_optmem_max.
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (68 preceding siblings ...)
  2022-08-29 10:58 ` [PATCH 5.19 069/158] ratelimit: Fix data-races in ___ratelimit() Greg Kroah-Hartman
@ 2022-08-29 10:58 ` Greg Kroah-Hartman
  2022-08-29 10:58 ` [PATCH 5.19 071/158] net: Fix a data-race around sysctl_tstamp_allow_data Greg Kroah-Hartman
                   ` (100 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:58 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Kuniyuki Iwashima, David S. Miller,
	Sasha Levin

From: Kuniyuki Iwashima <kuniyu@amazon.com>

[ Upstream commit 7de6d09f51917c829af2b835aba8bb5040f8e86a ]

While reading sysctl_optmem_max, it can be changed concurrently.
Thus, we need to add READ_ONCE() to its readers.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 net/core/bpf_sk_storage.c | 5 +++--
 net/core/filter.c         | 9 +++++----
 net/core/sock.c           | 8 +++++---
 net/ipv4/ip_sockglue.c    | 6 +++---
 net/ipv6/ipv6_sockglue.c  | 4 ++--
 5 files changed, 18 insertions(+), 14 deletions(-)

diff --git a/net/core/bpf_sk_storage.c b/net/core/bpf_sk_storage.c
index 1b7f385643b4c..94374d529ea42 100644
--- a/net/core/bpf_sk_storage.c
+++ b/net/core/bpf_sk_storage.c
@@ -310,11 +310,12 @@ BPF_CALL_2(bpf_sk_storage_delete, struct bpf_map *, map, struct sock *, sk)
 static int bpf_sk_storage_charge(struct bpf_local_storage_map *smap,
 				 void *owner, u32 size)
 {
+	int optmem_max = READ_ONCE(sysctl_optmem_max);
 	struct sock *sk = (struct sock *)owner;
 
 	/* same check as in sock_kmalloc() */
-	if (size <= sysctl_optmem_max &&
-	    atomic_read(&sk->sk_omem_alloc) + size < sysctl_optmem_max) {
+	if (size <= optmem_max &&
+	    atomic_read(&sk->sk_omem_alloc) + size < optmem_max) {
 		atomic_add(size, &sk->sk_omem_alloc);
 		return 0;
 	}
diff --git a/net/core/filter.c b/net/core/filter.c
index 60c854e7d98ba..063176428086b 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -1214,10 +1214,11 @@ void sk_filter_uncharge(struct sock *sk, struct sk_filter *fp)
 static bool __sk_filter_charge(struct sock *sk, struct sk_filter *fp)
 {
 	u32 filter_size = bpf_prog_size(fp->prog->len);
+	int optmem_max = READ_ONCE(sysctl_optmem_max);
 
 	/* same check as in sock_kmalloc() */
-	if (filter_size <= sysctl_optmem_max &&
-	    atomic_read(&sk->sk_omem_alloc) + filter_size < sysctl_optmem_max) {
+	if (filter_size <= optmem_max &&
+	    atomic_read(&sk->sk_omem_alloc) + filter_size < optmem_max) {
 		atomic_add(filter_size, &sk->sk_omem_alloc);
 		return true;
 	}
@@ -1548,7 +1549,7 @@ int sk_reuseport_attach_filter(struct sock_fprog *fprog, struct sock *sk)
 	if (IS_ERR(prog))
 		return PTR_ERR(prog);
 
-	if (bpf_prog_size(prog->len) > sysctl_optmem_max)
+	if (bpf_prog_size(prog->len) > READ_ONCE(sysctl_optmem_max))
 		err = -ENOMEM;
 	else
 		err = reuseport_attach_prog(sk, prog);
@@ -1615,7 +1616,7 @@ int sk_reuseport_attach_bpf(u32 ufd, struct sock *sk)
 		}
 	} else {
 		/* BPF_PROG_TYPE_SOCKET_FILTER */
-		if (bpf_prog_size(prog->len) > sysctl_optmem_max) {
+		if (bpf_prog_size(prog->len) > READ_ONCE(sysctl_optmem_max)) {
 			err = -ENOMEM;
 			goto err_prog_put;
 		}
diff --git a/net/core/sock.c b/net/core/sock.c
index 62f69bc3a0e6e..d672e63a5c2d4 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -2535,7 +2535,7 @@ struct sk_buff *sock_omalloc(struct sock *sk, unsigned long size,
 
 	/* small safe race: SKB_TRUESIZE may differ from final skb->truesize */
 	if (atomic_read(&sk->sk_omem_alloc) + SKB_TRUESIZE(size) >
-	    sysctl_optmem_max)
+	    READ_ONCE(sysctl_optmem_max))
 		return NULL;
 
 	skb = alloc_skb(size, priority);
@@ -2553,8 +2553,10 @@ struct sk_buff *sock_omalloc(struct sock *sk, unsigned long size,
  */
 void *sock_kmalloc(struct sock *sk, int size, gfp_t priority)
 {
-	if ((unsigned int)size <= sysctl_optmem_max &&
-	    atomic_read(&sk->sk_omem_alloc) + size < sysctl_optmem_max) {
+	int optmem_max = READ_ONCE(sysctl_optmem_max);
+
+	if ((unsigned int)size <= optmem_max &&
+	    atomic_read(&sk->sk_omem_alloc) + size < optmem_max) {
 		void *mem;
 		/* First do the add, to avoid the race if kmalloc
 		 * might sleep.
diff --git a/net/ipv4/ip_sockglue.c b/net/ipv4/ip_sockglue.c
index a8a323ecbb54b..e49a61a053a68 100644
--- a/net/ipv4/ip_sockglue.c
+++ b/net/ipv4/ip_sockglue.c
@@ -772,7 +772,7 @@ static int ip_set_mcast_msfilter(struct sock *sk, sockptr_t optval, int optlen)
 
 	if (optlen < GROUP_FILTER_SIZE(0))
 		return -EINVAL;
-	if (optlen > sysctl_optmem_max)
+	if (optlen > READ_ONCE(sysctl_optmem_max))
 		return -ENOBUFS;
 
 	gsf = memdup_sockptr(optval, optlen);
@@ -808,7 +808,7 @@ static int compat_ip_set_mcast_msfilter(struct sock *sk, sockptr_t optval,
 
 	if (optlen < size0)
 		return -EINVAL;
-	if (optlen > sysctl_optmem_max - 4)
+	if (optlen > READ_ONCE(sysctl_optmem_max) - 4)
 		return -ENOBUFS;
 
 	p = kmalloc(optlen + 4, GFP_KERNEL);
@@ -1233,7 +1233,7 @@ static int do_ip_setsockopt(struct sock *sk, int level, int optname,
 
 		if (optlen < IP_MSFILTER_SIZE(0))
 			goto e_inval;
-		if (optlen > sysctl_optmem_max) {
+		if (optlen > READ_ONCE(sysctl_optmem_max)) {
 			err = -ENOBUFS;
 			break;
 		}
diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c
index 222f6bf220ba0..e0dcc7a193df2 100644
--- a/net/ipv6/ipv6_sockglue.c
+++ b/net/ipv6/ipv6_sockglue.c
@@ -210,7 +210,7 @@ static int ipv6_set_mcast_msfilter(struct sock *sk, sockptr_t optval,
 
 	if (optlen < GROUP_FILTER_SIZE(0))
 		return -EINVAL;
-	if (optlen > sysctl_optmem_max)
+	if (optlen > READ_ONCE(sysctl_optmem_max))
 		return -ENOBUFS;
 
 	gsf = memdup_sockptr(optval, optlen);
@@ -244,7 +244,7 @@ static int compat_ipv6_set_mcast_msfilter(struct sock *sk, sockptr_t optval,
 
 	if (optlen < size0)
 		return -EINVAL;
-	if (optlen > sysctl_optmem_max - 4)
+	if (optlen > READ_ONCE(sysctl_optmem_max) - 4)
 		return -ENOBUFS;
 
 	p = kmalloc(optlen + 4, GFP_KERNEL);
-- 
2.35.1




^ permalink raw reply related	[flat|nested] 174+ messages in thread

* [PATCH 5.19 071/158] net: Fix a data-race around sysctl_tstamp_allow_data.
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (69 preceding siblings ...)
  2022-08-29 10:58 ` [PATCH 5.19 070/158] net: Fix data-races around sysctl_optmem_max Greg Kroah-Hartman
@ 2022-08-29 10:58 ` Greg Kroah-Hartman
  2022-08-29 10:58 ` [PATCH 5.19 072/158] net: Fix a data-race around sysctl_net_busy_poll Greg Kroah-Hartman
                   ` (99 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:58 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Kuniyuki Iwashima, David S. Miller,
	Sasha Levin

From: Kuniyuki Iwashima <kuniyu@amazon.com>

[ Upstream commit d2154b0afa73c0159b2856f875c6b4fe7cf6a95e ]

While reading sysctl_tstamp_allow_data, it can be changed
concurrently.  Thus, we need to add READ_ONCE() to its reader.

Fixes: b245be1f4db1 ("net-timestamp: no-payload only sysctl")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 net/core/skbuff.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 5b3559cb1d827..bebf58464d667 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -4772,7 +4772,7 @@ static bool skb_may_tx_timestamp(struct sock *sk, bool tsonly)
 {
 	bool ret;
 
-	if (likely(sysctl_tstamp_allow_data || tsonly))
+	if (likely(READ_ONCE(sysctl_tstamp_allow_data) || tsonly))
 		return true;
 
 	read_lock_bh(&sk->sk_callback_lock);
-- 
2.35.1




^ permalink raw reply related	[flat|nested] 174+ messages in thread

* [PATCH 5.19 072/158] net: Fix a data-race around sysctl_net_busy_poll.
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (70 preceding siblings ...)
  2022-08-29 10:58 ` [PATCH 5.19 071/158] net: Fix a data-race around sysctl_tstamp_allow_data Greg Kroah-Hartman
@ 2022-08-29 10:58 ` Greg Kroah-Hartman
  2022-08-29 10:58 ` [PATCH 5.19 073/158] net: Fix a data-race around sysctl_net_busy_read Greg Kroah-Hartman
                   ` (98 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:58 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Kuniyuki Iwashima, David S. Miller,
	Sasha Levin

From: Kuniyuki Iwashima <kuniyu@amazon.com>

[ Upstream commit c42b7cddea47503411bfb5f2f93a4154aaffa2d9 ]

While reading sysctl_net_busy_poll, it can be changed concurrently.
Thus, we need to add READ_ONCE() to its reader.

Fixes: 060212928670 ("net: add low latency socket poll")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 include/net/busy_poll.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/net/busy_poll.h b/include/net/busy_poll.h
index c4898fcbf923b..f90f0021f5f2d 100644
--- a/include/net/busy_poll.h
+++ b/include/net/busy_poll.h
@@ -33,7 +33,7 @@ extern unsigned int sysctl_net_busy_poll __read_mostly;
 
 static inline bool net_busy_loop_on(void)
 {
-	return sysctl_net_busy_poll;
+	return READ_ONCE(sysctl_net_busy_poll);
 }
 
 static inline bool sk_can_busy_loop(const struct sock *sk)
-- 
2.35.1




^ permalink raw reply related	[flat|nested] 174+ messages in thread

* [PATCH 5.19 073/158] net: Fix a data-race around sysctl_net_busy_read.
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (71 preceding siblings ...)
  2022-08-29 10:58 ` [PATCH 5.19 072/158] net: Fix a data-race around sysctl_net_busy_poll Greg Kroah-Hartman
@ 2022-08-29 10:58 ` Greg Kroah-Hartman
  2022-08-29 10:58 ` [PATCH 5.19 074/158] net: Fix a data-race around netdev_budget Greg Kroah-Hartman
                   ` (97 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:58 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Kuniyuki Iwashima, David S. Miller,
	Sasha Levin

From: Kuniyuki Iwashima <kuniyu@amazon.com>

[ Upstream commit e59ef36f0795696ab229569c153936bfd068d21c ]

While reading sysctl_net_busy_read, it can be changed concurrently.
Thus, we need to add READ_ONCE() to its reader.

Fixes: 2d48d67fa8cd ("net: poll/select low latency socket support")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 net/core/sock.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/core/sock.c b/net/core/sock.c
index d672e63a5c2d4..16ab5ef749c60 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -3365,7 +3365,7 @@ void sock_init_data(struct socket *sock, struct sock *sk)
 
 #ifdef CONFIG_NET_RX_BUSY_POLL
 	sk->sk_napi_id		=	0;
-	sk->sk_ll_usec		=	sysctl_net_busy_read;
+	sk->sk_ll_usec		=	READ_ONCE(sysctl_net_busy_read);
 #endif
 
 	sk->sk_max_pacing_rate = ~0UL;
-- 
2.35.1




^ permalink raw reply related	[flat|nested] 174+ messages in thread

* [PATCH 5.19 074/158] net: Fix a data-race around netdev_budget.
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (72 preceding siblings ...)
  2022-08-29 10:58 ` [PATCH 5.19 073/158] net: Fix a data-race around sysctl_net_busy_read Greg Kroah-Hartman
@ 2022-08-29 10:58 ` Greg Kroah-Hartman
  2022-08-29 10:58 ` [PATCH 5.19 075/158] net: Fix data-races around sysctl_max_skb_frags Greg Kroah-Hartman
                   ` (96 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:58 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Kuniyuki Iwashima, David S. Miller,
	Sasha Levin

From: Kuniyuki Iwashima <kuniyu@amazon.com>

[ Upstream commit 2e0c42374ee32e72948559d2ae2f7ba3dc6b977c ]

While reading netdev_budget, it can be changed concurrently.
Thus, we need to add READ_ONCE() to its reader.

Fixes: 51b0bdedb8e7 ("[NET]: Separate two usages of netdev_max_backlog.")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 net/core/dev.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index 34282b93c3f60..a330f93629314 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -6647,7 +6647,7 @@ static __latent_entropy void net_rx_action(struct softirq_action *h)
 	struct softnet_data *sd = this_cpu_ptr(&softnet_data);
 	unsigned long time_limit = jiffies +
 		usecs_to_jiffies(netdev_budget_usecs);
-	int budget = netdev_budget;
+	int budget = READ_ONCE(netdev_budget);
 	LIST_HEAD(list);
 	LIST_HEAD(repoll);
 
-- 
2.35.1




^ permalink raw reply related	[flat|nested] 174+ messages in thread

* [PATCH 5.19 075/158] net: Fix data-races around sysctl_max_skb_frags.
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (73 preceding siblings ...)
  2022-08-29 10:58 ` [PATCH 5.19 074/158] net: Fix a data-race around netdev_budget Greg Kroah-Hartman
@ 2022-08-29 10:58 ` Greg Kroah-Hartman
  2022-08-29 10:58 ` [PATCH 5.19 076/158] net: Fix a data-race around netdev_budget_usecs Greg Kroah-Hartman
                   ` (95 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:58 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Kuniyuki Iwashima, David S. Miller,
	Sasha Levin

From: Kuniyuki Iwashima <kuniyu@amazon.com>

[ Upstream commit 657b991afb89d25fe6c4783b1b75a8ad4563670d ]

While reading sysctl_max_skb_frags, it can be changed concurrently.
Thus, we need to add READ_ONCE() to its readers.

Fixes: 5f74f82ea34c ("net:Add sysctl_max_skb_frags")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 net/ipv4/tcp.c       | 4 ++--
 net/mptcp/protocol.c | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 3ae2ea0488838..3d446773ff2a5 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -1000,7 +1000,7 @@ static struct sk_buff *tcp_build_frag(struct sock *sk, int size_goal, int flags,
 
 	i = skb_shinfo(skb)->nr_frags;
 	can_coalesce = skb_can_coalesce(skb, i, page, offset);
-	if (!can_coalesce && i >= sysctl_max_skb_frags) {
+	if (!can_coalesce && i >= READ_ONCE(sysctl_max_skb_frags)) {
 		tcp_mark_push(tp, skb);
 		goto new_segment;
 	}
@@ -1348,7 +1348,7 @@ int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size)
 
 			if (!skb_can_coalesce(skb, i, pfrag->page,
 					      pfrag->offset)) {
-				if (i >= sysctl_max_skb_frags) {
+				if (i >= READ_ONCE(sysctl_max_skb_frags)) {
 					tcp_mark_push(tp, skb);
 					goto new_segment;
 				}
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 3d90fa9653ef3..513f571a082ba 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -1299,7 +1299,7 @@ static int mptcp_sendmsg_frag(struct sock *sk, struct sock *ssk,
 
 		i = skb_shinfo(skb)->nr_frags;
 		can_coalesce = skb_can_coalesce(skb, i, dfrag->page, offset);
-		if (!can_coalesce && i >= sysctl_max_skb_frags) {
+		if (!can_coalesce && i >= READ_ONCE(sysctl_max_skb_frags)) {
 			tcp_mark_push(tcp_sk(ssk), skb);
 			goto alloc_skb;
 		}
-- 
2.35.1




^ permalink raw reply related	[flat|nested] 174+ messages in thread

* [PATCH 5.19 076/158] net: Fix a data-race around netdev_budget_usecs.
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (74 preceding siblings ...)
  2022-08-29 10:58 ` [PATCH 5.19 075/158] net: Fix data-races around sysctl_max_skb_frags Greg Kroah-Hartman
@ 2022-08-29 10:58 ` Greg Kroah-Hartman
  2022-08-29 10:58 ` [PATCH 5.19 077/158] net: Fix data-races around sysctl_fb_tunnels_only_for_init_net Greg Kroah-Hartman
                   ` (94 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:58 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Kuniyuki Iwashima, David S. Miller,
	Sasha Levin

From: Kuniyuki Iwashima <kuniyu@amazon.com>

[ Upstream commit fa45d484c52c73f79db2c23b0cdfc6c6455093ad ]

While reading netdev_budget_usecs, it can be changed concurrently.
Thus, we need to add READ_ONCE() to its reader.

Fixes: 7acf8a1e8a28 ("Replace 2 jiffies with sysctl netdev_budget_usecs to enable softirq tuning")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 net/core/dev.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index a330f93629314..19baeaf65a646 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -6646,7 +6646,7 @@ static __latent_entropy void net_rx_action(struct softirq_action *h)
 {
 	struct softnet_data *sd = this_cpu_ptr(&softnet_data);
 	unsigned long time_limit = jiffies +
-		usecs_to_jiffies(netdev_budget_usecs);
+		usecs_to_jiffies(READ_ONCE(netdev_budget_usecs));
 	int budget = READ_ONCE(netdev_budget);
 	LIST_HEAD(list);
 	LIST_HEAD(repoll);
-- 
2.35.1




^ permalink raw reply related	[flat|nested] 174+ messages in thread

* [PATCH 5.19 077/158] net: Fix data-races around sysctl_fb_tunnels_only_for_init_net.
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (75 preceding siblings ...)
  2022-08-29 10:58 ` [PATCH 5.19 076/158] net: Fix a data-race around netdev_budget_usecs Greg Kroah-Hartman
@ 2022-08-29 10:58 ` Greg Kroah-Hartman
  2022-08-29 10:58 ` [PATCH 5.19 078/158] net: Fix data-races around sysctl_devconf_inherit_init_net Greg Kroah-Hartman
                   ` (93 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:58 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Kuniyuki Iwashima, David S. Miller,
	Sasha Levin

From: Kuniyuki Iwashima <kuniyu@amazon.com>

[ Upstream commit af67508ea6cbf0e4ea27f8120056fa2efce127dd ]

While reading sysctl_fb_tunnels_only_for_init_net, it can be changed
concurrently.  Thus, we need to add READ_ONCE() to its readers.

Fixes: 79134e6ce2c9 ("net: do not create fallback tunnels for non-default namespaces")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 include/linux/netdevice.h | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 2563d30736e9a..78dd63a5c7c80 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -640,9 +640,14 @@ extern int sysctl_devconf_inherit_init_net;
  */
 static inline bool net_has_fallback_tunnels(const struct net *net)
 {
-	return !IS_ENABLED(CONFIG_SYSCTL) ||
-	       !sysctl_fb_tunnels_only_for_init_net ||
-	       (net == &init_net && sysctl_fb_tunnels_only_for_init_net == 1);
+#if IS_ENABLED(CONFIG_SYSCTL)
+	int fb_tunnels_only_for_init_net = READ_ONCE(sysctl_fb_tunnels_only_for_init_net);
+
+	return !fb_tunnels_only_for_init_net ||
+		(net_eq(net, &init_net) && fb_tunnels_only_for_init_net == 1);
+#else
+	return true;
+#endif
 }
 
 static inline int netdev_queue_numa_node_read(const struct netdev_queue *q)
-- 
2.35.1




^ permalink raw reply related	[flat|nested] 174+ messages in thread

* [PATCH 5.19 078/158] net: Fix data-races around sysctl_devconf_inherit_init_net.
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (76 preceding siblings ...)
  2022-08-29 10:58 ` [PATCH 5.19 077/158] net: Fix data-races around sysctl_fb_tunnels_only_for_init_net Greg Kroah-Hartman
@ 2022-08-29 10:58 ` Greg Kroah-Hartman
  2022-08-29 10:58 ` [PATCH 5.19 079/158] net: Fix a data-race around gro_normal_batch Greg Kroah-Hartman
                   ` (92 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:58 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Kuniyuki Iwashima, David S. Miller,
	Sasha Levin

From: Kuniyuki Iwashima <kuniyu@amazon.com>

[ Upstream commit a5612ca10d1aa05624ebe72633e0c8c792970833 ]

While reading sysctl_devconf_inherit_init_net, it can be changed
concurrently.  Thus, we need to add READ_ONCE() to its readers.

Fixes: 856c395cfa63 ("net: introduce a knob to control whether to inherit devconf config")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 include/linux/netdevice.h |  9 +++++++++
 net/ipv4/devinet.c        | 16 ++++++++++------
 net/ipv6/addrconf.c       |  5 ++---
 3 files changed, 21 insertions(+), 9 deletions(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 78dd63a5c7c80..db40bc62213bd 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -650,6 +650,15 @@ static inline bool net_has_fallback_tunnels(const struct net *net)
 #endif
 }
 
+static inline int net_inherit_devconf(void)
+{
+#if IS_ENABLED(CONFIG_SYSCTL)
+	return READ_ONCE(sysctl_devconf_inherit_init_net);
+#else
+	return 0;
+#endif
+}
+
 static inline int netdev_queue_numa_node_read(const struct netdev_queue *q)
 {
 #if defined(CONFIG_XPS) && defined(CONFIG_NUMA)
diff --git a/net/ipv4/devinet.c b/net/ipv4/devinet.c
index b2366ad540e62..787a44e3222db 100644
--- a/net/ipv4/devinet.c
+++ b/net/ipv4/devinet.c
@@ -2682,23 +2682,27 @@ static __net_init int devinet_init_net(struct net *net)
 #endif
 
 	if (!net_eq(net, &init_net)) {
-		if (IS_ENABLED(CONFIG_SYSCTL) &&
-		    sysctl_devconf_inherit_init_net == 3) {
+		switch (net_inherit_devconf()) {
+		case 3:
 			/* copy from the current netns */
 			memcpy(all, current->nsproxy->net_ns->ipv4.devconf_all,
 			       sizeof(ipv4_devconf));
 			memcpy(dflt,
 			       current->nsproxy->net_ns->ipv4.devconf_dflt,
 			       sizeof(ipv4_devconf_dflt));
-		} else if (!IS_ENABLED(CONFIG_SYSCTL) ||
-			   sysctl_devconf_inherit_init_net != 2) {
-			/* inherit == 0 or 1: copy from init_net */
+			break;
+		case 0:
+		case 1:
+			/* copy from init_net */
 			memcpy(all, init_net.ipv4.devconf_all,
 			       sizeof(ipv4_devconf));
 			memcpy(dflt, init_net.ipv4.devconf_dflt,
 			       sizeof(ipv4_devconf_dflt));
+			break;
+		case 2:
+			/* use compiled values */
+			break;
 		}
-		/* else inherit == 2: use compiled values */
 	}
 
 #ifdef CONFIG_SYSCTL
diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index 49cc6587dd771..b738eb7e1cae8 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -7158,9 +7158,8 @@ static int __net_init addrconf_init_net(struct net *net)
 	if (!dflt)
 		goto err_alloc_dflt;
 
-	if (IS_ENABLED(CONFIG_SYSCTL) &&
-	    !net_eq(net, &init_net)) {
-		switch (sysctl_devconf_inherit_init_net) {
+	if (!net_eq(net, &init_net)) {
+		switch (net_inherit_devconf()) {
 		case 1:  /* copy from init_net */
 			memcpy(all, init_net.ipv6.devconf_all,
 			       sizeof(ipv6_devconf));
-- 
2.35.1




^ permalink raw reply related	[flat|nested] 174+ messages in thread

* [PATCH 5.19 079/158] net: Fix a data-race around gro_normal_batch.
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (77 preceding siblings ...)
  2022-08-29 10:58 ` [PATCH 5.19 078/158] net: Fix data-races around sysctl_devconf_inherit_init_net Greg Kroah-Hartman
@ 2022-08-29 10:58 ` Greg Kroah-Hartman
  2022-08-29 10:58 ` [PATCH 5.19 080/158] net: Fix a data-race around netdev_unregister_timeout_secs Greg Kroah-Hartman
                   ` (91 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:58 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Kuniyuki Iwashima, Edward Cree,
	David S. Miller, Sasha Levin

From: Kuniyuki Iwashima <kuniyu@amazon.com>

[ Upstream commit 8db24af3f02ebdbf302196006ebb270c4c3a2706 ]

While reading gro_normal_batch, it can be changed concurrently.
Thus, we need to add READ_ONCE() to its reader.

Fixes: 323ebb61e32b ("net: use listified RX for handling GRO_NORMAL skbs")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Acked-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 include/net/gro.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/net/gro.h b/include/net/gro.h
index 867656b0739c0..24003dea8fa4d 100644
--- a/include/net/gro.h
+++ b/include/net/gro.h
@@ -439,7 +439,7 @@ static inline void gro_normal_one(struct napi_struct *napi, struct sk_buff *skb,
 {
 	list_add_tail(&skb->list, &napi->rx_list);
 	napi->rx_count += segs;
-	if (napi->rx_count >= gro_normal_batch)
+	if (napi->rx_count >= READ_ONCE(gro_normal_batch))
 		gro_normal_list(napi);
 }
 
-- 
2.35.1




^ permalink raw reply related	[flat|nested] 174+ messages in thread

* [PATCH 5.19 080/158] net: Fix a data-race around netdev_unregister_timeout_secs.
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (78 preceding siblings ...)
  2022-08-29 10:58 ` [PATCH 5.19 079/158] net: Fix a data-race around gro_normal_batch Greg Kroah-Hartman
@ 2022-08-29 10:58 ` Greg Kroah-Hartman
  2022-08-29 10:58 ` [PATCH 5.19 081/158] net: Fix a data-race around sysctl_somaxconn Greg Kroah-Hartman
                   ` (90 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:58 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Kuniyuki Iwashima, Dmitry Vyukov,
	David S. Miller, Sasha Levin

From: Kuniyuki Iwashima <kuniyu@amazon.com>

[ Upstream commit 05e49cfc89e4f325eebbc62d24dd122e55f94c23 ]

While reading netdev_unregister_timeout_secs, it can be changed
concurrently.  Thus, we need to add READ_ONCE() to its reader.

Fixes: 5aa3afe107d9 ("net: make unregister netdev warning timeout configurable")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Acked-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 net/core/dev.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index 19baeaf65a646..a77a979a4bf75 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -10265,7 +10265,7 @@ static struct net_device *netdev_wait_allrefs_any(struct list_head *list)
 				return dev;
 
 		if (time_after(jiffies, warning_time +
-			       netdev_unregister_timeout_secs * HZ)) {
+			       READ_ONCE(netdev_unregister_timeout_secs) * HZ)) {
 			list_for_each_entry(dev, list, todo_list) {
 				pr_emerg("unregister_netdevice: waiting for %s to become free. Usage count = %d\n",
 					 dev->name, netdev_refcnt_read(dev));
-- 
2.35.1




^ permalink raw reply related	[flat|nested] 174+ messages in thread

* [PATCH 5.19 081/158] net: Fix a data-race around sysctl_somaxconn.
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (79 preceding siblings ...)
  2022-08-29 10:58 ` [PATCH 5.19 080/158] net: Fix a data-race around netdev_unregister_timeout_secs Greg Kroah-Hartman
@ 2022-08-29 10:58 ` Greg Kroah-Hartman
  2022-08-29 10:58 ` [PATCH 5.19 082/158] ixgbe: stop resetting SYSTIME in ixgbe_ptp_start_cyclecounter Greg Kroah-Hartman
                   ` (89 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:58 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Kuniyuki Iwashima, David S. Miller,
	Sasha Levin

From: Kuniyuki Iwashima <kuniyu@amazon.com>

[ Upstream commit 3c9ba81d72047f2e81bb535d42856517b613aba7 ]

While reading sysctl_somaxconn, it can be changed concurrently.
Thus, we need to add READ_ONCE() to its reader.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 net/socket.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/socket.c b/net/socket.c
index 96300cdc06251..34102aa4ab0a6 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -1801,7 +1801,7 @@ int __sys_listen(int fd, int backlog)
 
 	sock = sockfd_lookup_light(fd, &err, &fput_needed);
 	if (sock) {
-		somaxconn = sock_net(sock->sk)->core.sysctl_somaxconn;
+		somaxconn = READ_ONCE(sock_net(sock->sk)->core.sysctl_somaxconn);
 		if ((unsigned int)backlog > somaxconn)
 			backlog = somaxconn;
 
-- 
2.35.1




^ permalink raw reply related	[flat|nested] 174+ messages in thread

* [PATCH 5.19 082/158] ixgbe: stop resetting SYSTIME in ixgbe_ptp_start_cyclecounter
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (80 preceding siblings ...)
  2022-08-29 10:58 ` [PATCH 5.19 081/158] net: Fix a data-race around sysctl_somaxconn Greg Kroah-Hartman
@ 2022-08-29 10:58 ` Greg Kroah-Hartman
  2022-08-29 10:58 ` [PATCH 5.19 083/158] i40e: Fix incorrect address type for IPv6 flow rules Greg Kroah-Hartman
                   ` (88 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:58 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Steve Payne, Ilya Evenbach,
	Jacob Keller, Tony Nguyen, Sasha Levin, Gurucharan

From: Jacob Keller <jacob.e.keller@intel.com>

[ Upstream commit 25d7a5f5a6bb15a2dae0a3f39ea5dda215024726 ]

The ixgbe_ptp_start_cyclecounter is intended to be called whenever the
cyclecounter parameters need to be changed.

Since commit a9763f3cb54c ("ixgbe: Update PTP to support X550EM_x
devices"), this function has cleared the SYSTIME registers and reset the
TSAUXC DISABLE_SYSTIME bit.

While these need to be cleared during ixgbe_ptp_reset, it is wrong to clear
them during ixgbe_ptp_start_cyclecounter. This function may be called
during both reset and link status change. When link changes, the SYSTIME
counter is still operating normally, but the cyclecounter should be updated
to account for the possibly changed parameters.

Clearing SYSTIME when link changes causes the timecounter to jump because
the cycle counter now reads zero.

Extract the SYSTIME initialization out to a new function and call this
during ixgbe_ptp_reset. This prevents the timecounter adjustment and avoids
an unnecessary reset of the current time.

This also restores the original SYSTIME clearing that occurred during
ixgbe_ptp_reset before the commit above.

Reported-by: Steve Payne <spayne@aurora.tech>
Reported-by: Ilya Evenbach <ievenbach@aurora.tech>
Fixes: a9763f3cb54c ("ixgbe: Update PTP to support X550EM_x devices")
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/ethernet/intel/ixgbe/ixgbe_ptp.c | 59 +++++++++++++++-----
 1 file changed, 46 insertions(+), 13 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ptp.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_ptp.c
index 336426a67ac1b..38cda659f65f4 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ptp.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ptp.c
@@ -1208,7 +1208,6 @@ void ixgbe_ptp_start_cyclecounter(struct ixgbe_adapter *adapter)
 	struct cyclecounter cc;
 	unsigned long flags;
 	u32 incval = 0;
-	u32 tsauxc = 0;
 	u32 fuse0 = 0;
 
 	/* For some of the boards below this mask is technically incorrect.
@@ -1243,18 +1242,6 @@ void ixgbe_ptp_start_cyclecounter(struct ixgbe_adapter *adapter)
 	case ixgbe_mac_x550em_a:
 	case ixgbe_mac_X550:
 		cc.read = ixgbe_ptp_read_X550;
-
-		/* enable SYSTIME counter */
-		IXGBE_WRITE_REG(hw, IXGBE_SYSTIMR, 0);
-		IXGBE_WRITE_REG(hw, IXGBE_SYSTIML, 0);
-		IXGBE_WRITE_REG(hw, IXGBE_SYSTIMH, 0);
-		tsauxc = IXGBE_READ_REG(hw, IXGBE_TSAUXC);
-		IXGBE_WRITE_REG(hw, IXGBE_TSAUXC,
-				tsauxc & ~IXGBE_TSAUXC_DISABLE_SYSTIME);
-		IXGBE_WRITE_REG(hw, IXGBE_TSIM, IXGBE_TSIM_TXTS);
-		IXGBE_WRITE_REG(hw, IXGBE_EIMS, IXGBE_EIMS_TIMESYNC);
-
-		IXGBE_WRITE_FLUSH(hw);
 		break;
 	case ixgbe_mac_X540:
 		cc.read = ixgbe_ptp_read_82599;
@@ -1286,6 +1273,50 @@ void ixgbe_ptp_start_cyclecounter(struct ixgbe_adapter *adapter)
 	spin_unlock_irqrestore(&adapter->tmreg_lock, flags);
 }
 
+/**
+ * ixgbe_ptp_init_systime - Initialize SYSTIME registers
+ * @adapter: the ixgbe private board structure
+ *
+ * Initialize and start the SYSTIME registers.
+ */
+static void ixgbe_ptp_init_systime(struct ixgbe_adapter *adapter)
+{
+	struct ixgbe_hw *hw = &adapter->hw;
+	u32 tsauxc;
+
+	switch (hw->mac.type) {
+	case ixgbe_mac_X550EM_x:
+	case ixgbe_mac_x550em_a:
+	case ixgbe_mac_X550:
+		tsauxc = IXGBE_READ_REG(hw, IXGBE_TSAUXC);
+
+		/* Reset SYSTIME registers to 0 */
+		IXGBE_WRITE_REG(hw, IXGBE_SYSTIMR, 0);
+		IXGBE_WRITE_REG(hw, IXGBE_SYSTIML, 0);
+		IXGBE_WRITE_REG(hw, IXGBE_SYSTIMH, 0);
+
+		/* Reset interrupt settings */
+		IXGBE_WRITE_REG(hw, IXGBE_TSIM, IXGBE_TSIM_TXTS);
+		IXGBE_WRITE_REG(hw, IXGBE_EIMS, IXGBE_EIMS_TIMESYNC);
+
+		/* Activate the SYSTIME counter */
+		IXGBE_WRITE_REG(hw, IXGBE_TSAUXC,
+				tsauxc & ~IXGBE_TSAUXC_DISABLE_SYSTIME);
+		break;
+	case ixgbe_mac_X540:
+	case ixgbe_mac_82599EB:
+		/* Reset SYSTIME registers to 0 */
+		IXGBE_WRITE_REG(hw, IXGBE_SYSTIML, 0);
+		IXGBE_WRITE_REG(hw, IXGBE_SYSTIMH, 0);
+		break;
+	default:
+		/* Other devices aren't supported */
+		return;
+	};
+
+	IXGBE_WRITE_FLUSH(hw);
+}
+
 /**
  * ixgbe_ptp_reset
  * @adapter: the ixgbe private board structure
@@ -1312,6 +1343,8 @@ void ixgbe_ptp_reset(struct ixgbe_adapter *adapter)
 
 	ixgbe_ptp_start_cyclecounter(adapter);
 
+	ixgbe_ptp_init_systime(adapter);
+
 	spin_lock_irqsave(&adapter->tmreg_lock, flags);
 	timecounter_init(&adapter->hw_tc, &adapter->hw_cc,
 			 ktime_to_ns(ktime_get_real()));
-- 
2.35.1




^ permalink raw reply related	[flat|nested] 174+ messages in thread

* [PATCH 5.19 083/158] i40e: Fix incorrect address type for IPv6 flow rules
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (81 preceding siblings ...)
  2022-08-29 10:58 ` [PATCH 5.19 082/158] ixgbe: stop resetting SYSTIME in ixgbe_ptp_start_cyclecounter Greg Kroah-Hartman
@ 2022-08-29 10:58 ` Greg Kroah-Hartman
  2022-08-29 10:58 ` [PATCH 5.19 084/158] net: ethernet: mtk_eth_soc: enable rx cksum offload for MTK_NETSYS_V2 Greg Kroah-Hartman
                   ` (87 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:58 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Sylwester Dziedziuch, Tony Nguyen,
	Sasha Levin, Gurucharan

From: Sylwester Dziedziuch <sylwesterx.dziedziuch@intel.com>

[ Upstream commit bcf3a156429306070afbfda5544f2b492d25e75b ]

It was not possible to create 1-tuple flow director
rule for IPv6 flow type. It was caused by incorrectly
checking for source IP address when validating user provided
destination IP address.

Fix this by changing ip6src to correct ip6dst address
in destination IP address validation for IPv6 flow type.

Fixes: efca91e89b67 ("i40e: Add flow director support for IPv6")
Signed-off-by: Sylwester Dziedziuch <sylwesterx.dziedziuch@intel.com>
Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/ethernet/intel/i40e/i40e_ethtool.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_ethtool.c b/drivers/net/ethernet/intel/i40e/i40e_ethtool.c
index 19704f5c8291c..22a61802a4027 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_ethtool.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_ethtool.c
@@ -4395,7 +4395,7 @@ static int i40e_check_fdir_input_set(struct i40e_vsi *vsi,
 				    (struct in6_addr *)&ipv6_full_mask))
 			new_mask |= I40E_L3_V6_DST_MASK;
 		else if (ipv6_addr_any((struct in6_addr *)
-				       &usr_ip6_spec->ip6src))
+				       &usr_ip6_spec->ip6dst))
 			new_mask &= ~I40E_L3_V6_DST_MASK;
 		else
 			return -EOPNOTSUPP;
-- 
2.35.1




^ permalink raw reply related	[flat|nested] 174+ messages in thread

* [PATCH 5.19 084/158] net: ethernet: mtk_eth_soc: enable rx cksum offload for MTK_NETSYS_V2
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (82 preceding siblings ...)
  2022-08-29 10:58 ` [PATCH 5.19 083/158] i40e: Fix incorrect address type for IPv6 flow rules Greg Kroah-Hartman
@ 2022-08-29 10:58 ` Greg Kroah-Hartman
  2022-08-29 10:58 ` [PATCH 5.19 085/158] net: ethernet: mtk_eth_soc: fix hw hash reporting " Greg Kroah-Hartman
                   ` (86 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:58 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Lorenzo Bianconi, Jakub Kicinski,
	Sasha Levin

From: Lorenzo Bianconi <lorenzo@kernel.org>

[ Upstream commit da6e113ff010815fdd21ee1e9af2e8d179a2680f ]

Enable rx checksum offload for mt7986 chipset.

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://lore.kernel.org/r/c8699805c18f7fd38315fcb8da2787676d83a32c.1654544585.git.lorenzo@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/ethernet/mediatek/mtk_eth_soc.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.c b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
index 59c9a10f83ba5..6beb3d4873a37 100644
--- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c
+++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
@@ -1444,8 +1444,8 @@ static int mtk_poll_rx(struct napi_struct *napi, int budget,
 	int done = 0, bytes = 0;
 
 	while (done < budget) {
+		unsigned int pktlen, *rxdcsum;
 		struct net_device *netdev;
-		unsigned int pktlen;
 		dma_addr_t dma_addr;
 		u32 hash, reason;
 		int mac = 0;
@@ -1512,7 +1512,13 @@ static int mtk_poll_rx(struct napi_struct *napi, int budget,
 		pktlen = RX_DMA_GET_PLEN0(trxd.rxd2);
 		skb->dev = netdev;
 		skb_put(skb, pktlen);
-		if (trxd.rxd4 & eth->soc->txrx.rx_dma_l4_valid)
+
+		if (MTK_HAS_CAPS(eth->soc->caps, MTK_NETSYS_V2))
+			rxdcsum = &trxd.rxd3;
+		else
+			rxdcsum = &trxd.rxd4;
+
+		if (*rxdcsum & eth->soc->txrx.rx_dma_l4_valid)
 			skb->ip_summed = CHECKSUM_UNNECESSARY;
 		else
 			skb_checksum_none_assert(skb);
@@ -3761,6 +3767,7 @@ static const struct mtk_soc_data mt7986_data = {
 		.txd_size = sizeof(struct mtk_tx_dma_v2),
 		.rxd_size = sizeof(struct mtk_rx_dma_v2),
 		.rx_irq_done_mask = MTK_RX_DONE_INT_V2,
+		.rx_dma_l4_valid = RX_DMA_L4_VALID_V2,
 		.dma_max_len = MTK_TX_DMA_BUF_LEN_V2,
 		.dma_len_offset = 8,
 	},
-- 
2.35.1




^ permalink raw reply related	[flat|nested] 174+ messages in thread

* [PATCH 5.19 085/158] net: ethernet: mtk_eth_soc: fix hw hash reporting for MTK_NETSYS_V2
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (83 preceding siblings ...)
  2022-08-29 10:58 ` [PATCH 5.19 084/158] net: ethernet: mtk_eth_soc: enable rx cksum offload for MTK_NETSYS_V2 Greg Kroah-Hartman
@ 2022-08-29 10:58 ` Greg Kroah-Hartman
  2022-08-29 10:58 ` [PATCH 5.19 086/158] rxrpc: Fix locking in rxrpcs sendmsg Greg Kroah-Hartman
                   ` (85 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:58 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Lorenzo Bianconi, Paolo Abeni, Sasha Levin

From: Lorenzo Bianconi <lorenzo@kernel.org>

[ Upstream commit 0cf731f9ebb5bf6f252055bebf4463a5c0bd490b ]

Properly report hw rx hash for mt7986 chipset accroding to the new dma
descriptor layout.

Fixes: 197c9e9b17b11 ("net: ethernet: mtk_eth_soc: introduce support for mt7986 chipset")
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://lore.kernel.org/r/091394ea4e705fbb35f828011d98d0ba33808f69.1661257293.git.lorenzo@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/ethernet/mediatek/mtk_eth_soc.c | 22 +++++++++++----------
 drivers/net/ethernet/mediatek/mtk_eth_soc.h |  5 +++++
 2 files changed, 17 insertions(+), 10 deletions(-)

diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.c b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
index 6beb3d4873a37..dcf0aac0aa65d 100644
--- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c
+++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
@@ -1513,10 +1513,19 @@ static int mtk_poll_rx(struct napi_struct *napi, int budget,
 		skb->dev = netdev;
 		skb_put(skb, pktlen);
 
-		if (MTK_HAS_CAPS(eth->soc->caps, MTK_NETSYS_V2))
+		if (MTK_HAS_CAPS(eth->soc->caps, MTK_NETSYS_V2)) {
+			hash = trxd.rxd5 & MTK_RXD5_FOE_ENTRY;
+			if (hash != MTK_RXD5_FOE_ENTRY)
+				skb_set_hash(skb, jhash_1word(hash, 0),
+					     PKT_HASH_TYPE_L4);
 			rxdcsum = &trxd.rxd3;
-		else
+		} else {
+			hash = trxd.rxd4 & MTK_RXD4_FOE_ENTRY;
+			if (hash != MTK_RXD4_FOE_ENTRY)
+				skb_set_hash(skb, jhash_1word(hash, 0),
+					     PKT_HASH_TYPE_L4);
 			rxdcsum = &trxd.rxd4;
+		}
 
 		if (*rxdcsum & eth->soc->txrx.rx_dma_l4_valid)
 			skb->ip_summed = CHECKSUM_UNNECESSARY;
@@ -1525,16 +1534,9 @@ static int mtk_poll_rx(struct napi_struct *napi, int budget,
 		skb->protocol = eth_type_trans(skb, netdev);
 		bytes += pktlen;
 
-		hash = trxd.rxd4 & MTK_RXD4_FOE_ENTRY;
-		if (hash != MTK_RXD4_FOE_ENTRY) {
-			hash = jhash_1word(hash, 0);
-			skb_set_hash(skb, hash, PKT_HASH_TYPE_L4);
-		}
-
 		reason = FIELD_GET(MTK_RXD4_PPE_CPU_REASON, trxd.rxd4);
 		if (reason == MTK_PPE_CPU_REASON_HIT_UNBIND_RATE_REACHED)
-			mtk_ppe_check_skb(eth->ppe, skb,
-					  trxd.rxd4 & MTK_RXD4_FOE_ENTRY);
+			mtk_ppe_check_skb(eth->ppe, skb, hash);
 
 		if (netdev->features & NETIF_F_HW_VLAN_CTAG_RX) {
 			if (MTK_HAS_CAPS(eth->soc->caps, MTK_NETSYS_V2)) {
diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.h b/drivers/net/ethernet/mediatek/mtk_eth_soc.h
index 0a632896451a4..98d6a6d047e32 100644
--- a/drivers/net/ethernet/mediatek/mtk_eth_soc.h
+++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.h
@@ -307,6 +307,11 @@
 #define RX_DMA_L4_VALID_PDMA	BIT(30)		/* when PDMA is used */
 #define RX_DMA_SPECIAL_TAG	BIT(22)
 
+/* PDMA descriptor rxd5 */
+#define MTK_RXD5_FOE_ENTRY	GENMASK(14, 0)
+#define MTK_RXD5_PPE_CPU_REASON	GENMASK(22, 18)
+#define MTK_RXD5_SRC_PORT	GENMASK(29, 26)
+
 #define RX_DMA_GET_SPORT(x)	(((x) >> 19) & 0xf)
 #define RX_DMA_GET_SPORT_V2(x)	(((x) >> 26) & 0x7)
 
-- 
2.35.1




^ permalink raw reply related	[flat|nested] 174+ messages in thread

* [PATCH 5.19 086/158] rxrpc: Fix locking in rxrpcs sendmsg
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (84 preceding siblings ...)
  2022-08-29 10:58 ` [PATCH 5.19 085/158] net: ethernet: mtk_eth_soc: fix hw hash reporting " Greg Kroah-Hartman
@ 2022-08-29 10:58 ` Greg Kroah-Hartman
  2022-08-29 10:58 ` [PATCH 5.19 087/158] ionic: clear broken state on generation change Greg Kroah-Hartman
                   ` (84 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:58 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, syzbot+7f0483225d0c94cb3441,
	David Howells, Marc Dionne, Hawkins Jiawei, Khalid Masum,
	Dan Carpenter, linux-afs, Jakub Kicinski, Sasha Levin

From: David Howells <dhowells@redhat.com>

[ Upstream commit b0f571ecd7943423c25947439045f0d352ca3dbf ]

Fix three bugs in the rxrpc's sendmsg implementation:

 (1) rxrpc_new_client_call() should release the socket lock when returning
     an error from rxrpc_get_call_slot().

 (2) rxrpc_wait_for_tx_window_intr() will return without the call mutex
     held in the event that we're interrupted by a signal whilst waiting
     for tx space on the socket or relocking the call mutex afterwards.

     Fix this by: (a) moving the unlock/lock of the call mutex up to
     rxrpc_send_data() such that the lock is not held around all of
     rxrpc_wait_for_tx_window*() and (b) indicating to higher callers
     whether we're return with the lock dropped.  Note that this means
     recvmsg() will not block on this call whilst we're waiting.

 (3) After dropping and regaining the call mutex, rxrpc_send_data() needs
     to go and recheck the state of the tx_pending buffer and the
     tx_total_len check in case we raced with another sendmsg() on the same
     call.

Thinking on this some more, it might make sense to have different locks for
sendmsg() and recvmsg().  There's probably no need to make recvmsg() wait
for sendmsg().  It does mean that recvmsg() can return MSG_EOR indicating
that a call is dead before a sendmsg() to that call returns - but that can
currently happen anyway.

Without fix (2), something like the following can be induced:

	WARNING: bad unlock balance detected!
	5.16.0-rc6-syzkaller #0 Not tainted
	-------------------------------------
	syz-executor011/3597 is trying to release lock (&call->user_mutex) at:
	[<ffffffff885163a3>] rxrpc_do_sendmsg+0xc13/0x1350 net/rxrpc/sendmsg.c:748
	but there are no more locks to release!

	other info that might help us debug this:
	no locks held by syz-executor011/3597.
	...
	Call Trace:
	 <TASK>
	 __dump_stack lib/dump_stack.c:88 [inline]
	 dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
	 print_unlock_imbalance_bug include/trace/events/lock.h:58 [inline]
	 __lock_release kernel/locking/lockdep.c:5306 [inline]
	 lock_release.cold+0x49/0x4e kernel/locking/lockdep.c:5657
	 __mutex_unlock_slowpath+0x99/0x5e0 kernel/locking/mutex.c:900
	 rxrpc_do_sendmsg+0xc13/0x1350 net/rxrpc/sendmsg.c:748
	 rxrpc_sendmsg+0x420/0x630 net/rxrpc/af_rxrpc.c:561
	 sock_sendmsg_nosec net/socket.c:704 [inline]
	 sock_sendmsg+0xcf/0x120 net/socket.c:724
	 ____sys_sendmsg+0x6e8/0x810 net/socket.c:2409
	 ___sys_sendmsg+0xf3/0x170 net/socket.c:2463
	 __sys_sendmsg+0xe5/0x1b0 net/socket.c:2492
	 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
	 do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
	 entry_SYSCALL_64_after_hwframe+0x44/0xae

[Thanks to Hawkins Jiawei and Khalid Masum for their attempts to fix this]

Fixes: bc5e3a546d55 ("rxrpc: Use MSG_WAITALL to tell sendmsg() to temporarily ignore signals")
Reported-by: syzbot+7f0483225d0c94cb3441@syzkaller.appspotmail.com
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Marc Dionne <marc.dionne@auristor.com>
Tested-by: syzbot+7f0483225d0c94cb3441@syzkaller.appspotmail.com
cc: Hawkins Jiawei <yin31149@gmail.com>
cc: Khalid Masum <khalid.masum.92@gmail.com>
cc: Dan Carpenter <dan.carpenter@oracle.com>
cc: linux-afs@lists.infradead.org
Link: https://lore.kernel.org/r/166135894583.600315.7170979436768124075.stgit@warthog.procyon.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 net/rxrpc/call_object.c |  4 +-
 net/rxrpc/sendmsg.c     | 92 ++++++++++++++++++++++++-----------------
 2 files changed, 57 insertions(+), 39 deletions(-)

diff --git a/net/rxrpc/call_object.c b/net/rxrpc/call_object.c
index 84d0a41096450..6401cdf7a6246 100644
--- a/net/rxrpc/call_object.c
+++ b/net/rxrpc/call_object.c
@@ -285,8 +285,10 @@ struct rxrpc_call *rxrpc_new_client_call(struct rxrpc_sock *rx,
 	_enter("%p,%lx", rx, p->user_call_ID);
 
 	limiter = rxrpc_get_call_slot(p, gfp);
-	if (!limiter)
+	if (!limiter) {
+		release_sock(&rx->sk);
 		return ERR_PTR(-ERESTARTSYS);
+	}
 
 	call = rxrpc_alloc_client_call(rx, srx, gfp, debug_id);
 	if (IS_ERR(call)) {
diff --git a/net/rxrpc/sendmsg.c b/net/rxrpc/sendmsg.c
index 1d38e279e2efa..3c3a626459deb 100644
--- a/net/rxrpc/sendmsg.c
+++ b/net/rxrpc/sendmsg.c
@@ -51,10 +51,7 @@ static int rxrpc_wait_for_tx_window_intr(struct rxrpc_sock *rx,
 			return sock_intr_errno(*timeo);
 
 		trace_rxrpc_transmit(call, rxrpc_transmit_wait);
-		mutex_unlock(&call->user_mutex);
 		*timeo = schedule_timeout(*timeo);
-		if (mutex_lock_interruptible(&call->user_mutex) < 0)
-			return sock_intr_errno(*timeo);
 	}
 }
 
@@ -290,37 +287,48 @@ static int rxrpc_queue_packet(struct rxrpc_sock *rx, struct rxrpc_call *call,
 static int rxrpc_send_data(struct rxrpc_sock *rx,
 			   struct rxrpc_call *call,
 			   struct msghdr *msg, size_t len,
-			   rxrpc_notify_end_tx_t notify_end_tx)
+			   rxrpc_notify_end_tx_t notify_end_tx,
+			   bool *_dropped_lock)
 {
 	struct rxrpc_skb_priv *sp;
 	struct sk_buff *skb;
 	struct sock *sk = &rx->sk;
+	enum rxrpc_call_state state;
 	long timeo;
-	bool more;
-	int ret, copied;
+	bool more = msg->msg_flags & MSG_MORE;
+	int ret, copied = 0;
 
 	timeo = sock_sndtimeo(sk, msg->msg_flags & MSG_DONTWAIT);
 
 	/* this should be in poll */
 	sk_clear_bit(SOCKWQ_ASYNC_NOSPACE, sk);
 
+reload:
+	ret = -EPIPE;
 	if (sk->sk_shutdown & SEND_SHUTDOWN)
-		return -EPIPE;
-
-	more = msg->msg_flags & MSG_MORE;
-
+		goto maybe_error;
+	state = READ_ONCE(call->state);
+	ret = -ESHUTDOWN;
+	if (state >= RXRPC_CALL_COMPLETE)
+		goto maybe_error;
+	ret = -EPROTO;
+	if (state != RXRPC_CALL_CLIENT_SEND_REQUEST &&
+	    state != RXRPC_CALL_SERVER_ACK_REQUEST &&
+	    state != RXRPC_CALL_SERVER_SEND_REPLY)
+		goto maybe_error;
+
+	ret = -EMSGSIZE;
 	if (call->tx_total_len != -1) {
-		if (len > call->tx_total_len)
-			return -EMSGSIZE;
-		if (!more && len != call->tx_total_len)
-			return -EMSGSIZE;
+		if (len - copied > call->tx_total_len)
+			goto maybe_error;
+		if (!more && len - copied != call->tx_total_len)
+			goto maybe_error;
 	}
 
 	skb = call->tx_pending;
 	call->tx_pending = NULL;
 	rxrpc_see_skb(skb, rxrpc_skb_seen);
 
-	copied = 0;
 	do {
 		/* Check to see if there's a ping ACK to reply to. */
 		if (call->ackr_reason == RXRPC_ACK_PING_RESPONSE)
@@ -331,16 +339,8 @@ static int rxrpc_send_data(struct rxrpc_sock *rx,
 
 			_debug("alloc");
 
-			if (!rxrpc_check_tx_space(call, NULL)) {
-				ret = -EAGAIN;
-				if (msg->msg_flags & MSG_DONTWAIT)
-					goto maybe_error;
-				ret = rxrpc_wait_for_tx_window(rx, call,
-							       &timeo,
-							       msg->msg_flags & MSG_WAITALL);
-				if (ret < 0)
-					goto maybe_error;
-			}
+			if (!rxrpc_check_tx_space(call, NULL))
+				goto wait_for_space;
 
 			/* Work out the maximum size of a packet.  Assume that
 			 * the security header is going to be in the padded
@@ -468,6 +468,27 @@ static int rxrpc_send_data(struct rxrpc_sock *rx,
 efault:
 	ret = -EFAULT;
 	goto out;
+
+wait_for_space:
+	ret = -EAGAIN;
+	if (msg->msg_flags & MSG_DONTWAIT)
+		goto maybe_error;
+	mutex_unlock(&call->user_mutex);
+	*_dropped_lock = true;
+	ret = rxrpc_wait_for_tx_window(rx, call, &timeo,
+				       msg->msg_flags & MSG_WAITALL);
+	if (ret < 0)
+		goto maybe_error;
+	if (call->interruptibility == RXRPC_INTERRUPTIBLE) {
+		if (mutex_lock_interruptible(&call->user_mutex) < 0) {
+			ret = sock_intr_errno(timeo);
+			goto maybe_error;
+		}
+	} else {
+		mutex_lock(&call->user_mutex);
+	}
+	*_dropped_lock = false;
+	goto reload;
 }
 
 /*
@@ -629,6 +650,7 @@ int rxrpc_do_sendmsg(struct rxrpc_sock *rx, struct msghdr *msg, size_t len)
 	enum rxrpc_call_state state;
 	struct rxrpc_call *call;
 	unsigned long now, j;
+	bool dropped_lock = false;
 	int ret;
 
 	struct rxrpc_send_params p = {
@@ -737,21 +759,13 @@ int rxrpc_do_sendmsg(struct rxrpc_sock *rx, struct msghdr *msg, size_t len)
 			ret = rxrpc_send_abort_packet(call);
 	} else if (p.command != RXRPC_CMD_SEND_DATA) {
 		ret = -EINVAL;
-	} else if (rxrpc_is_client_call(call) &&
-		   state != RXRPC_CALL_CLIENT_SEND_REQUEST) {
-		/* request phase complete for this client call */
-		ret = -EPROTO;
-	} else if (rxrpc_is_service_call(call) &&
-		   state != RXRPC_CALL_SERVER_ACK_REQUEST &&
-		   state != RXRPC_CALL_SERVER_SEND_REPLY) {
-		/* Reply phase not begun or not complete for service call. */
-		ret = -EPROTO;
 	} else {
-		ret = rxrpc_send_data(rx, call, msg, len, NULL);
+		ret = rxrpc_send_data(rx, call, msg, len, NULL, &dropped_lock);
 	}
 
 out_put_unlock:
-	mutex_unlock(&call->user_mutex);
+	if (!dropped_lock)
+		mutex_unlock(&call->user_mutex);
 error_put:
 	rxrpc_put_call(call, rxrpc_call_put);
 	_leave(" = %d", ret);
@@ -779,6 +793,7 @@ int rxrpc_kernel_send_data(struct socket *sock, struct rxrpc_call *call,
 			   struct msghdr *msg, size_t len,
 			   rxrpc_notify_end_tx_t notify_end_tx)
 {
+	bool dropped_lock = false;
 	int ret;
 
 	_enter("{%d,%s},", call->debug_id, rxrpc_call_states[call->state]);
@@ -796,7 +811,7 @@ int rxrpc_kernel_send_data(struct socket *sock, struct rxrpc_call *call,
 	case RXRPC_CALL_SERVER_ACK_REQUEST:
 	case RXRPC_CALL_SERVER_SEND_REPLY:
 		ret = rxrpc_send_data(rxrpc_sk(sock->sk), call, msg, len,
-				      notify_end_tx);
+				      notify_end_tx, &dropped_lock);
 		break;
 	case RXRPC_CALL_COMPLETE:
 		read_lock_bh(&call->state_lock);
@@ -810,7 +825,8 @@ int rxrpc_kernel_send_data(struct socket *sock, struct rxrpc_call *call,
 		break;
 	}
 
-	mutex_unlock(&call->user_mutex);
+	if (!dropped_lock)
+		mutex_unlock(&call->user_mutex);
 	_leave(" = %d", ret);
 	return ret;
 }
-- 
2.35.1




^ permalink raw reply related	[flat|nested] 174+ messages in thread

* [PATCH 5.19 087/158] ionic: clear broken state on generation change
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (85 preceding siblings ...)
  2022-08-29 10:58 ` [PATCH 5.19 086/158] rxrpc: Fix locking in rxrpcs sendmsg Greg Kroah-Hartman
@ 2022-08-29 10:58 ` Greg Kroah-Hartman
  2022-08-29 10:58 ` [PATCH 5.19 088/158] ionic: fix up issues with handling EAGAIN on FW cmds Greg Kroah-Hartman
                   ` (83 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:58 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Shannon Nelson, Jakub Kicinski, Sasha Levin

From: Shannon Nelson <snelson@pensando.io>

[ Upstream commit 9cb9dadb8f45c67e4310e002c2f221b70312b293 ]

There is a case found in heavy testing where a link flap happens just
before a firmware Recovery event and the driver gets stuck in the
BROKEN state.  This comes from the driver getting interrupted by a FW
generation change when coming back up from the link flap, and the call
to ionic_start_queues() in ionic_link_status_check() fails.  This can be
addressed by having the fw_up code clear the BROKEN bit if seen, rather
than waiting for a user to manually force the interface down and then
back up.

Fixes: 9e8eaf8427b6 ("ionic: stop watchdog when in broken state")
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/ethernet/pensando/ionic/ionic_lif.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/net/ethernet/pensando/ionic/ionic_lif.c b/drivers/net/ethernet/pensando/ionic/ionic_lif.c
index 1443f788ee37c..d4226999547e8 100644
--- a/drivers/net/ethernet/pensando/ionic/ionic_lif.c
+++ b/drivers/net/ethernet/pensando/ionic/ionic_lif.c
@@ -2963,6 +2963,9 @@ static void ionic_lif_handle_fw_up(struct ionic_lif *lif)
 
 	mutex_lock(&lif->queue_lock);
 
+	if (test_and_clear_bit(IONIC_LIF_F_BROKEN, lif->state))
+		dev_info(ionic->dev, "FW Up: clearing broken state\n");
+
 	err = ionic_qcqs_alloc(lif);
 	if (err)
 		goto err_unlock;
-- 
2.35.1




^ permalink raw reply related	[flat|nested] 174+ messages in thread

* [PATCH 5.19 088/158] ionic: fix up issues with handling EAGAIN on FW cmds
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (86 preceding siblings ...)
  2022-08-29 10:58 ` [PATCH 5.19 087/158] ionic: clear broken state on generation change Greg Kroah-Hartman
@ 2022-08-29 10:58 ` Greg Kroah-Hartman
  2022-08-29 10:58 ` [PATCH 5.19 089/158] ionic: VF initial random MAC address if no assigned mac Greg Kroah-Hartman
                   ` (82 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:58 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Shannon Nelson, Jakub Kicinski, Sasha Levin

From: Shannon Nelson <snelson@pensando.io>

[ Upstream commit 0fc4dd452d6c14828eed6369155c75c0ac15bab3 ]

In looping on FW update tests we occasionally see the
FW_ACTIVATE_STATUS command fail while it is in its EAGAIN loop
waiting for the FW activate step to finsh inside the FW.  The
firmware is complaining that the done bit is set when a new
dev_cmd is going to be processed.

Doing a clean on the cmd registers and doorbell before exiting
the wait-for-done and cleaning the done bit before the sleep
prevents this from occurring.

Fixes: fbfb8031533c ("ionic: Add hardware init and device commands")
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/ethernet/pensando/ionic/ionic_main.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/pensando/ionic/ionic_main.c b/drivers/net/ethernet/pensando/ionic/ionic_main.c
index 4029b4e021f86..56f93b0305519 100644
--- a/drivers/net/ethernet/pensando/ionic/ionic_main.c
+++ b/drivers/net/ethernet/pensando/ionic/ionic_main.c
@@ -474,8 +474,8 @@ static int __ionic_dev_cmd_wait(struct ionic *ionic, unsigned long max_seconds,
 				ionic_opcode_to_str(opcode), opcode,
 				ionic_error_to_str(err), err);
 
-			msleep(1000);
 			iowrite32(0, &idev->dev_cmd_regs->done);
+			msleep(1000);
 			iowrite32(1, &idev->dev_cmd_regs->doorbell);
 			goto try_again;
 		}
@@ -488,6 +488,8 @@ static int __ionic_dev_cmd_wait(struct ionic *ionic, unsigned long max_seconds,
 		return ionic_error_to_errno(err);
 	}
 
+	ionic_dev_cmd_clean(ionic);
+
 	return 0;
 }
 
-- 
2.35.1




^ permalink raw reply related	[flat|nested] 174+ messages in thread

* [PATCH 5.19 089/158] ionic: VF initial random MAC address if no assigned mac
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (87 preceding siblings ...)
  2022-08-29 10:58 ` [PATCH 5.19 088/158] ionic: fix up issues with handling EAGAIN on FW cmds Greg Kroah-Hartman
@ 2022-08-29 10:58 ` Greg Kroah-Hartman
  2022-08-29 10:59 ` [PATCH 5.19 090/158] net: stmmac: work around sporadic tx issue on link-up Greg Kroah-Hartman
                   ` (81 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:58 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, R Mohamed Shah, Shannon Nelson,
	Jakub Kicinski, Sasha Levin

From: R Mohamed Shah <mohamed@pensando.io>

[ Upstream commit 19058be7c48ceb3e60fa3948e24da1059bd68ee4 ]

Assign a random mac address to the VF interface station
address if it boots with a zero mac address in order to match
similar behavior seen in other VF drivers.  Handle the errors
where the older firmware does not allow the VF to set its own
station address.

Newer firmware will allow the VF to set the station mac address
if it hasn't already been set administratively through the PF.
Setting it will also be allowed if the VF has trust.

Fixes: fbb39807e9ae ("ionic: support sr-iov operations")
Signed-off-by: R Mohamed Shah <mohamed@pensando.io>
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 .../net/ethernet/pensando/ionic/ionic_lif.c   | 92 ++++++++++++++++++-
 1 file changed, 87 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/pensando/ionic/ionic_lif.c b/drivers/net/ethernet/pensando/ionic/ionic_lif.c
index d4226999547e8..0be79c5167813 100644
--- a/drivers/net/ethernet/pensando/ionic/ionic_lif.c
+++ b/drivers/net/ethernet/pensando/ionic/ionic_lif.c
@@ -1564,8 +1564,67 @@ static int ionic_set_features(struct net_device *netdev,
 	return err;
 }
 
+static int ionic_set_attr_mac(struct ionic_lif *lif, u8 *mac)
+{
+	struct ionic_admin_ctx ctx = {
+		.work = COMPLETION_INITIALIZER_ONSTACK(ctx.work),
+		.cmd.lif_setattr = {
+			.opcode = IONIC_CMD_LIF_SETATTR,
+			.index = cpu_to_le16(lif->index),
+			.attr = IONIC_LIF_ATTR_MAC,
+		},
+	};
+
+	ether_addr_copy(ctx.cmd.lif_setattr.mac, mac);
+	return ionic_adminq_post_wait(lif, &ctx);
+}
+
+static int ionic_get_attr_mac(struct ionic_lif *lif, u8 *mac_addr)
+{
+	struct ionic_admin_ctx ctx = {
+		.work = COMPLETION_INITIALIZER_ONSTACK(ctx.work),
+		.cmd.lif_getattr = {
+			.opcode = IONIC_CMD_LIF_GETATTR,
+			.index = cpu_to_le16(lif->index),
+			.attr = IONIC_LIF_ATTR_MAC,
+		},
+	};
+	int err;
+
+	err = ionic_adminq_post_wait(lif, &ctx);
+	if (err)
+		return err;
+
+	ether_addr_copy(mac_addr, ctx.comp.lif_getattr.mac);
+	return 0;
+}
+
+static int ionic_program_mac(struct ionic_lif *lif, u8 *mac)
+{
+	u8  get_mac[ETH_ALEN];
+	int err;
+
+	err = ionic_set_attr_mac(lif, mac);
+	if (err)
+		return err;
+
+	err = ionic_get_attr_mac(lif, get_mac);
+	if (err)
+		return err;
+
+	/* To deal with older firmware that silently ignores the set attr mac:
+	 * doesn't actually change the mac and doesn't return an error, so we
+	 * do the get attr to verify whether or not the set actually happened
+	 */
+	if (!ether_addr_equal(get_mac, mac))
+		return 1;
+
+	return 0;
+}
+
 static int ionic_set_mac_address(struct net_device *netdev, void *sa)
 {
+	struct ionic_lif *lif = netdev_priv(netdev);
 	struct sockaddr *addr = sa;
 	u8 *mac;
 	int err;
@@ -1574,6 +1633,14 @@ static int ionic_set_mac_address(struct net_device *netdev, void *sa)
 	if (ether_addr_equal(netdev->dev_addr, mac))
 		return 0;
 
+	err = ionic_program_mac(lif, mac);
+	if (err < 0)
+		return err;
+
+	if (err > 0)
+		netdev_dbg(netdev, "%s: SET and GET ATTR Mac are not equal-due to old FW running\n",
+			   __func__);
+
 	err = eth_prepare_mac_addr_change(netdev, addr);
 	if (err)
 		return err;
@@ -3172,6 +3239,7 @@ static int ionic_station_set(struct ionic_lif *lif)
 			.attr = IONIC_LIF_ATTR_MAC,
 		},
 	};
+	u8 mac_address[ETH_ALEN];
 	struct sockaddr addr;
 	int err;
 
@@ -3180,8 +3248,23 @@ static int ionic_station_set(struct ionic_lif *lif)
 		return err;
 	netdev_dbg(lif->netdev, "found initial MAC addr %pM\n",
 		   ctx.comp.lif_getattr.mac);
-	if (is_zero_ether_addr(ctx.comp.lif_getattr.mac))
-		return 0;
+	ether_addr_copy(mac_address, ctx.comp.lif_getattr.mac);
+
+	if (is_zero_ether_addr(mac_address)) {
+		eth_hw_addr_random(netdev);
+		netdev_dbg(netdev, "Random Mac generated: %pM\n", netdev->dev_addr);
+		ether_addr_copy(mac_address, netdev->dev_addr);
+
+		err = ionic_program_mac(lif, mac_address);
+		if (err < 0)
+			return err;
+
+		if (err > 0) {
+			netdev_dbg(netdev, "%s:SET/GET ATTR Mac are not same-due to old FW running\n",
+				   __func__);
+			return 0;
+		}
+	}
 
 	if (!is_zero_ether_addr(netdev->dev_addr)) {
 		/* If the netdev mac is non-zero and doesn't match the default
@@ -3189,12 +3272,11 @@ static int ionic_station_set(struct ionic_lif *lif)
 		 * likely here again after a fw-upgrade reset.  We need to be
 		 * sure the netdev mac is in our filter list.
 		 */
-		if (!ether_addr_equal(ctx.comp.lif_getattr.mac,
-				      netdev->dev_addr))
+		if (!ether_addr_equal(mac_address, netdev->dev_addr))
 			ionic_lif_addr_add(lif, netdev->dev_addr);
 	} else {
 		/* Update the netdev mac with the device's mac */
-		memcpy(addr.sa_data, ctx.comp.lif_getattr.mac, netdev->addr_len);
+		ether_addr_copy(addr.sa_data, mac_address);
 		addr.sa_family = AF_INET;
 		err = eth_prepare_mac_addr_change(netdev, &addr);
 		if (err) {
-- 
2.35.1




^ permalink raw reply related	[flat|nested] 174+ messages in thread

* [PATCH 5.19 090/158] net: stmmac: work around sporadic tx issue on link-up
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (88 preceding siblings ...)
  2022-08-29 10:58 ` [PATCH 5.19 089/158] ionic: VF initial random MAC address if no assigned mac Greg Kroah-Hartman
@ 2022-08-29 10:59 ` Greg Kroah-Hartman
  2022-08-29 10:59 ` [PATCH 5.19 091/158] net: lantiq_xrx200: confirm skb is allocated before using Greg Kroah-Hartman
                   ` (80 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:59 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Qi Duan, Jerome Brunet,
	Heiner Kallweit, Jakub Kicinski, Sasha Levin

From: Heiner Kallweit <hkallweit1@gmail.com>

[ Upstream commit a3a57bf07de23fe1ff779e0fdf710aa581c3ff73 ]

This is a follow-up to the discussion in [0]. It seems to me that
at least the IP version used on Amlogic SoC's sometimes has a problem
if register MAC_CTRL_REG is written whilst the chip is still processing
a previous write. But that's just a guess.
Adding a delay between two writes to this register helps, but we can
also simply omit the offending second write. This patch uses the second
approach and is based on a suggestion from Qi Duan.
Benefit of this approach is that we can save few register writes, also
on not affected chip versions.

[0] https://www.spinics.net/lists/netdev/msg831526.html

Fixes: bfab27a146ed ("stmmac: add the experimental PCI support")
Suggested-by: Qi Duan <qi.duan@amlogic.com>
Suggested-by: Jerome Brunet <jbrunet@baylibre.com>
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Link: https://lore.kernel.org/r/e99857ce-bd90-5093-ca8c-8cd480b5a0a2@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/ethernet/stmicro/stmmac/dwmac_lib.c   | 8 ++++++--
 drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 9 +++++----
 2 files changed, 11 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac_lib.c b/drivers/net/ethernet/stmicro/stmmac/dwmac_lib.c
index caa4bfc4c1d62..9b6138b117766 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac_lib.c
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac_lib.c
@@ -258,14 +258,18 @@ EXPORT_SYMBOL_GPL(stmmac_set_mac_addr);
 /* Enable disable MAC RX/TX */
 void stmmac_set_mac(void __iomem *ioaddr, bool enable)
 {
-	u32 value = readl(ioaddr + MAC_CTRL_REG);
+	u32 old_val, value;
+
+	old_val = readl(ioaddr + MAC_CTRL_REG);
+	value = old_val;
 
 	if (enable)
 		value |= MAC_ENABLE_RX | MAC_ENABLE_TX;
 	else
 		value &= ~(MAC_ENABLE_TX | MAC_ENABLE_RX);
 
-	writel(value, ioaddr + MAC_CTRL_REG);
+	if (value != old_val)
+		writel(value, ioaddr + MAC_CTRL_REG);
 }
 
 void stmmac_get_mac_addr(void __iomem *ioaddr, unsigned char *addr,
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index c5f33630e7718..78f11dabca056 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -983,10 +983,10 @@ static void stmmac_mac_link_up(struct phylink_config *config,
 			       bool tx_pause, bool rx_pause)
 {
 	struct stmmac_priv *priv = netdev_priv(to_net_dev(config->dev));
-	u32 ctrl;
+	u32 old_ctrl, ctrl;
 
-	ctrl = readl(priv->ioaddr + MAC_CTRL_REG);
-	ctrl &= ~priv->hw->link.speed_mask;
+	old_ctrl = readl(priv->ioaddr + MAC_CTRL_REG);
+	ctrl = old_ctrl & ~priv->hw->link.speed_mask;
 
 	if (interface == PHY_INTERFACE_MODE_USXGMII) {
 		switch (speed) {
@@ -1061,7 +1061,8 @@ static void stmmac_mac_link_up(struct phylink_config *config,
 	if (tx_pause && rx_pause)
 		stmmac_mac_flow_ctrl(priv, duplex);
 
-	writel(ctrl, priv->ioaddr + MAC_CTRL_REG);
+	if (ctrl != old_ctrl)
+		writel(ctrl, priv->ioaddr + MAC_CTRL_REG);
 
 	stmmac_mac_set(priv, priv->ioaddr, true);
 	if (phy && priv->dma_cap.eee) {
-- 
2.35.1




^ permalink raw reply related	[flat|nested] 174+ messages in thread

* [PATCH 5.19 091/158] net: lantiq_xrx200: confirm skb is allocated before using
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (89 preceding siblings ...)
  2022-08-29 10:59 ` [PATCH 5.19 090/158] net: stmmac: work around sporadic tx issue on link-up Greg Kroah-Hartman
@ 2022-08-29 10:59 ` Greg Kroah-Hartman
  2022-08-29 10:59 ` [PATCH 5.19 092/158] net: lantiq_xrx200: fix lock under memory pressure Greg Kroah-Hartman
                   ` (79 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:59 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Eric Dumazet,
	Aleksander Jan Bajkowski, Jakub Kicinski, Sasha Levin

From: Aleksander Jan Bajkowski <olek2@wp.pl>

[ Upstream commit c8b043702dc0894c07721c5b019096cebc8c798f ]

xrx200_hw_receive() assumes build_skb() always works and goes straight
to skb_reserve(). However, build_skb() can fail under memory pressure.

Add a check in case build_skb() failed to allocate and return NULL.

Fixes: e015593573b3 ("net: lantiq_xrx200: convert to build_skb")
Reported-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Aleksander Jan Bajkowski <olek2@wp.pl>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/ethernet/lantiq_xrx200.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/net/ethernet/lantiq_xrx200.c b/drivers/net/ethernet/lantiq_xrx200.c
index 5edb68a8aab1e..89314b645c822 100644
--- a/drivers/net/ethernet/lantiq_xrx200.c
+++ b/drivers/net/ethernet/lantiq_xrx200.c
@@ -239,6 +239,12 @@ static int xrx200_hw_receive(struct xrx200_chan *ch)
 	}
 
 	skb = build_skb(buf, priv->rx_skb_size);
+	if (!skb) {
+		skb_free_frag(buf);
+		net_dev->stats.rx_dropped++;
+		return -ENOMEM;
+	}
+
 	skb_reserve(skb, NET_SKB_PAD);
 	skb_put(skb, len);
 
-- 
2.35.1




^ permalink raw reply related	[flat|nested] 174+ messages in thread

* [PATCH 5.19 092/158] net: lantiq_xrx200: fix lock under memory pressure
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (90 preceding siblings ...)
  2022-08-29 10:59 ` [PATCH 5.19 091/158] net: lantiq_xrx200: confirm skb is allocated before using Greg Kroah-Hartman
@ 2022-08-29 10:59 ` Greg Kroah-Hartman
  2022-08-29 10:59 ` [PATCH 5.19 093/158] net: lantiq_xrx200: restore buffer if memory allocation failed Greg Kroah-Hartman
                   ` (78 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:59 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Aleksander Jan Bajkowski,
	Jakub Kicinski, Sasha Levin

From: Aleksander Jan Bajkowski <olek2@wp.pl>

[ Upstream commit c4b6e9341f930e4dd089231c0414758f5f1f9dbd ]

When the xrx200_hw_receive() function returns -ENOMEM, the NAPI poll
function immediately returns an error.
This is incorrect for two reasons:
* the function terminates without enabling interrupts or scheduling NAPI,
* the error code (-ENOMEM) is returned instead of the number of received
packets.

After the first memory allocation failure occurs, packet reception is
locked due to disabled interrupts from DMA..

Fixes: fe1a56420cf2 ("net: lantiq: Add Lantiq / Intel VRX200 Ethernet driver")
Signed-off-by: Aleksander Jan Bajkowski <olek2@wp.pl>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/ethernet/lantiq_xrx200.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/lantiq_xrx200.c b/drivers/net/ethernet/lantiq_xrx200.c
index 89314b645c822..25adce7f0c7c0 100644
--- a/drivers/net/ethernet/lantiq_xrx200.c
+++ b/drivers/net/ethernet/lantiq_xrx200.c
@@ -294,7 +294,7 @@ static int xrx200_poll_rx(struct napi_struct *napi, int budget)
 			if (ret == XRX200_DMA_PACKET_IN_PROGRESS)
 				continue;
 			if (ret != XRX200_DMA_PACKET_COMPLETE)
-				return ret;
+				break;
 			rx++;
 		} else {
 			break;
-- 
2.35.1




^ permalink raw reply related	[flat|nested] 174+ messages in thread

* [PATCH 5.19 093/158] net: lantiq_xrx200: restore buffer if memory allocation failed
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (91 preceding siblings ...)
  2022-08-29 10:59 ` [PATCH 5.19 092/158] net: lantiq_xrx200: fix lock under memory pressure Greg Kroah-Hartman
@ 2022-08-29 10:59 ` Greg Kroah-Hartman
  2022-08-29 10:59 ` [PATCH 5.19 094/158] btrfs: fix silent failure when deleting root reference Greg Kroah-Hartman
                   ` (77 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:59 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Aleksander Jan Bajkowski,
	Jakub Kicinski, Sasha Levin

From: Aleksander Jan Bajkowski <olek2@wp.pl>

[ Upstream commit c9c3b1775f80fa21f5bff874027d2ccb10f5d90c ]

In a situation where memory allocation fails, an invalid buffer address
is stored. When this descriptor is used again, the system panics in the
build_skb() function when accessing memory.

Fixes: 7ea6cd16f159 ("lantiq: net: fix duplicated skb in rx descriptor ring")
Signed-off-by: Aleksander Jan Bajkowski <olek2@wp.pl>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/ethernet/lantiq_xrx200.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/ethernet/lantiq_xrx200.c b/drivers/net/ethernet/lantiq_xrx200.c
index 25adce7f0c7c0..57f27cc7724e7 100644
--- a/drivers/net/ethernet/lantiq_xrx200.c
+++ b/drivers/net/ethernet/lantiq_xrx200.c
@@ -193,6 +193,7 @@ static int xrx200_alloc_buf(struct xrx200_chan *ch, void *(*alloc)(unsigned int
 
 	ch->rx_buff[ch->dma.desc] = alloc(priv->rx_skb_size);
 	if (!ch->rx_buff[ch->dma.desc]) {
+		ch->rx_buff[ch->dma.desc] = buf;
 		ret = -ENOMEM;
 		goto skip;
 	}
-- 
2.35.1




^ permalink raw reply related	[flat|nested] 174+ messages in thread

* [PATCH 5.19 094/158] btrfs: fix silent failure when deleting root reference
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (92 preceding siblings ...)
  2022-08-29 10:59 ` [PATCH 5.19 093/158] net: lantiq_xrx200: restore buffer if memory allocation failed Greg Kroah-Hartman
@ 2022-08-29 10:59 ` Greg Kroah-Hartman
  2022-08-29 10:59 ` [PATCH 5.19 095/158] btrfs: replace: drop assert for suspended replace Greg Kroah-Hartman
                   ` (76 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:59 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Qu Wenruo, Filipe Manana, David Sterba

From: Filipe Manana <fdmanana@suse.com>

commit 47bf225a8d2cccb15f7e8d4a1ed9b757dd86afd7 upstream.

At btrfs_del_root_ref(), if btrfs_search_slot() returns an error, we end
up returning from the function with a value of 0 (success). This happens
because the function returns the value stored in the variable 'err',
which is 0, while the error value we got from btrfs_search_slot() is
stored in the 'ret' variable.

So fix it by setting 'err' with the error value.

Fixes: 8289ed9f93bef2 ("btrfs: replace the BUG_ON in btrfs_del_root_ref with proper error handling")
CC: stable@vger.kernel.org # 5.16+
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 fs/btrfs/root-tree.c |    5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

--- a/fs/btrfs/root-tree.c
+++ b/fs/btrfs/root-tree.c
@@ -349,9 +349,10 @@ int btrfs_del_root_ref(struct btrfs_tran
 	key.offset = ref_id;
 again:
 	ret = btrfs_search_slot(trans, tree_root, &key, path, -1, 1);
-	if (ret < 0)
+	if (ret < 0) {
+		err = ret;
 		goto out;
-	if (ret == 0) {
+	} else if (ret == 0) {
 		leaf = path->nodes[0];
 		ref = btrfs_item_ptr(leaf, path->slots[0],
 				     struct btrfs_root_ref);



^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH 5.19 095/158] btrfs: replace: drop assert for suspended replace
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (93 preceding siblings ...)
  2022-08-29 10:59 ` [PATCH 5.19 094/158] btrfs: fix silent failure when deleting root reference Greg Kroah-Hartman
@ 2022-08-29 10:59 ` Greg Kroah-Hartman
  2022-08-29 10:59 ` [PATCH 5.19 096/158] btrfs: add info when mount fails due to stale replace target Greg Kroah-Hartman
                   ` (75 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:59 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Anand Jain, David Sterba

From: Anand Jain <anand.jain@oracle.com>

commit 59a3991984dbc1fc47e5651a265c5200bd85464e upstream.

If the filesystem mounts with the replace-operation in a suspended state
and try to cancel the suspended replace-operation, we hit the assert. The
assert came from the commit fe97e2e173af ("btrfs: dev-replace: replace's
scrub must not be running in suspended state") that was actually not
required. So just remove it.

 $ mount /dev/sda5 /btrfs

    BTRFS info (device sda5): cannot continue dev_replace, tgtdev is missing
    BTRFS info (device sda5): you may cancel the operation after 'mount -o degraded'

 $ mount -o degraded /dev/sda5 /btrfs <-- success.

 $ btrfs replace cancel /btrfs

    kernel: assertion failed: ret != -ENOTCONN, in fs/btrfs/dev-replace.c:1131
    kernel: ------------[ cut here ]------------
    kernel: kernel BUG at fs/btrfs/ctree.h:3750!

After the patch:

 $ btrfs replace cancel /btrfs

    BTRFS info (device sda5): suspended dev_replace from /dev/sda5 (devid 1) to <missing disk> canceled

Fixes: fe97e2e173af ("btrfs: dev-replace: replace's scrub must not be running in suspended state")
CC: stable@vger.kernel.org # 5.0+
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 fs/btrfs/dev-replace.c |    3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

--- a/fs/btrfs/dev-replace.c
+++ b/fs/btrfs/dev-replace.c
@@ -1128,8 +1128,7 @@ int btrfs_dev_replace_cancel(struct btrf
 		up_write(&dev_replace->rwsem);
 
 		/* Scrub for replace must not be running in suspended state */
-		ret = btrfs_scrub_cancel(fs_info);
-		ASSERT(ret != -ENOTCONN);
+		btrfs_scrub_cancel(fs_info);
 
 		trans = btrfs_start_transaction(root, 0);
 		if (IS_ERR(trans)) {



^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH 5.19 096/158] btrfs: add info when mount fails due to stale replace target
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (94 preceding siblings ...)
  2022-08-29 10:59 ` [PATCH 5.19 095/158] btrfs: replace: drop assert for suspended replace Greg Kroah-Hartman
@ 2022-08-29 10:59 ` Greg Kroah-Hartman
  2022-08-29 10:59 ` [PATCH 5.19 097/158] btrfs: fix space cache corruption and potential double allocations Greg Kroah-Hartman
                   ` (74 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:59 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Samuel Greiner, Anand Jain, David Sterba

From: Anand Jain <anand.jain@oracle.com>

commit f2c3bec215694fb8bc0ef5010f2a758d1906fc2d upstream.

If the replace target device reappears after the suspended replace is
cancelled, it blocks the mount operation as it can't find the matching
replace-item in the metadata. As shown below,

   BTRFS error (device sda5): replace devid present without an active replace item

To overcome this situation, the user can run the command

   btrfs device scan --forget <replace target device>

and try the mount command again. And also, to avoid repeating the issue,
superblock on the devid=0 must be wiped.

   wipefs -a device-path-to-devid=0.

This patch adds some info when this situation occurs.

Reported-by: Samuel Greiner <samuel@balkonien.org>
Link: https://lore.kernel.org/linux-btrfs/b4f62b10-b295-26ea-71f9-9a5c9299d42c@balkonien.org/T/
CC: stable@vger.kernel.org # 5.0+
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 fs/btrfs/dev-replace.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/fs/btrfs/dev-replace.c
+++ b/fs/btrfs/dev-replace.c
@@ -165,7 +165,7 @@ no_valid_dev_replace_entry_found:
 		 */
 		if (btrfs_find_device(fs_info->fs_devices, &args)) {
 			btrfs_err(fs_info,
-			"replace devid present without an active replace item");
+"replace without active item, run 'device scan --forget' on the target device");
 			ret = -EUCLEAN;
 		} else {
 			dev_replace->srcdev = NULL;



^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH 5.19 097/158] btrfs: fix space cache corruption and potential double allocations
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (95 preceding siblings ...)
  2022-08-29 10:59 ` [PATCH 5.19 096/158] btrfs: add info when mount fails due to stale replace target Greg Kroah-Hartman
@ 2022-08-29 10:59 ` Greg Kroah-Hartman
  2022-08-29 10:59 ` [PATCH 5.19 098/158] btrfs: check if root is readonly while setting security xattr Greg Kroah-Hartman
                   ` (73 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:59 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Filipe Manana, Omar Sandoval, David Sterba

From: Omar Sandoval <osandov@fb.com>

commit ced8ecf026fd8084cf175530ff85c76d6085d715 upstream.

When testing space_cache v2 on a large set of machines, we encountered a
few symptoms:

1. "unable to add free space :-17" (EEXIST) errors.
2. Missing free space info items, sometimes caught with a "missing free
   space info for X" error.
3. Double-accounted space: ranges that were allocated in the extent tree
   and also marked as free in the free space tree, ranges that were
   marked as allocated twice in the extent tree, or ranges that were
   marked as free twice in the free space tree. If the latter made it
   onto disk, the next reboot would hit the BUG_ON() in
   add_new_free_space().
4. On some hosts with no on-disk corruption or error messages, the
   in-memory space cache (dumped with drgn) disagreed with the free
   space tree.

All of these symptoms have the same underlying cause: a race between
caching the free space for a block group and returning free space to the
in-memory space cache for pinned extents causes us to double-add a free
range to the space cache. This race exists when free space is cached
from the free space tree (space_cache=v2) or the extent tree
(nospace_cache, or space_cache=v1 if the cache needs to be regenerated).
struct btrfs_block_group::last_byte_to_unpin and struct
btrfs_block_group::progress are supposed to protect against this race,
but commit d0c2f4fa555e ("btrfs: make concurrent fsyncs wait less when
waiting for a transaction commit") subtly broke this by allowing
multiple transactions to be unpinning extents at the same time.

Specifically, the race is as follows:

1. An extent is deleted from an uncached block group in transaction A.
2. btrfs_commit_transaction() is called for transaction A.
3. btrfs_run_delayed_refs() -> __btrfs_free_extent() runs the delayed
   ref for the deleted extent.
4. __btrfs_free_extent() -> do_free_extent_accounting() ->
   add_to_free_space_tree() adds the deleted extent back to the free
   space tree.
5. do_free_extent_accounting() -> btrfs_update_block_group() ->
   btrfs_cache_block_group() queues up the block group to get cached.
   block_group->progress is set to block_group->start.
6. btrfs_commit_transaction() for transaction A calls
   switch_commit_roots(). It sets block_group->last_byte_to_unpin to
   block_group->progress, which is block_group->start because the block
   group hasn't been cached yet.
7. The caching thread gets to our block group. Since the commit roots
   were already switched, load_free_space_tree() sees the deleted extent
   as free and adds it to the space cache. It finishes caching and sets
   block_group->progress to U64_MAX.
8. btrfs_commit_transaction() advances transaction A to
   TRANS_STATE_SUPER_COMMITTED.
9. fsync calls btrfs_commit_transaction() for transaction B. Since
   transaction A is already in TRANS_STATE_SUPER_COMMITTED and the
   commit is for fsync, it advances.
10. btrfs_commit_transaction() for transaction B calls
    switch_commit_roots(). This time, the block group has already been
    cached, so it sets block_group->last_byte_to_unpin to U64_MAX.
11. btrfs_commit_transaction() for transaction A calls
    btrfs_finish_extent_commit(), which calls unpin_extent_range() for
    the deleted extent. It sees last_byte_to_unpin set to U64_MAX (by
    transaction B!), so it adds the deleted extent to the space cache
    again!

This explains all of our symptoms above:

* If the sequence of events is exactly as described above, when the free
  space is re-added in step 11, it will fail with EEXIST.
* If another thread reallocates the deleted extent in between steps 7
  and 11, then step 11 will silently re-add that space to the space
  cache as free even though it is actually allocated. Then, if that
  space is allocated *again*, the free space tree will be corrupted
  (namely, the wrong item will be deleted).
* If we don't catch this free space tree corruption, it will continue
  to get worse as extents are deleted and reallocated.

The v1 space_cache is synchronously loaded when an extent is deleted
(btrfs_update_block_group() with alloc=0 calls btrfs_cache_block_group()
with load_cache_only=1), so it is not normally affected by this bug.
However, as noted above, if we fail to load the space cache, we will
fall back to caching from the extent tree and may hit this bug.

The easiest fix for this race is to also make caching from the free
space tree or extent tree synchronous. Josef tested this and found no
performance regressions.

A few extra changes fall out of this change. Namely, this fix does the
following, with step 2 being the crucial fix:

1. Factor btrfs_caching_ctl_wait_done() out of
   btrfs_wait_block_group_cache_done() to allow waiting on a caching_ctl
   that we already hold a reference to.
2. Change the call in btrfs_cache_block_group() of
   btrfs_wait_space_cache_v1_finished() to
   btrfs_caching_ctl_wait_done(), which makes us wait regardless of the
   space_cache option.
3. Delete the now unused btrfs_wait_space_cache_v1_finished() and
   space_cache_v1_done().
4. Change btrfs_cache_block_group()'s `int load_cache_only` parameter to
   `bool wait` to more accurately describe its new meaning.
5. Change a few callers which had a separate call to
   btrfs_wait_block_group_cache_done() to use wait = true instead.
6. Make btrfs_wait_block_group_cache_done() static now that it's not
   used outside of block-group.c anymore.

Fixes: d0c2f4fa555e ("btrfs: make concurrent fsyncs wait less when waiting for a transaction commit")
CC: stable@vger.kernel.org # 5.12+
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 fs/btrfs/block-group.c |   47 +++++++++++++++--------------------------------
 fs/btrfs/block-group.h |    4 +---
 fs/btrfs/ctree.h       |    1 -
 fs/btrfs/extent-tree.c |   30 ++++++------------------------
 4 files changed, 22 insertions(+), 60 deletions(-)

--- a/fs/btrfs/block-group.c
+++ b/fs/btrfs/block-group.c
@@ -440,39 +440,26 @@ void btrfs_wait_block_group_cache_progre
 	btrfs_put_caching_control(caching_ctl);
 }
 
-int btrfs_wait_block_group_cache_done(struct btrfs_block_group *cache)
+static int btrfs_caching_ctl_wait_done(struct btrfs_block_group *cache,
+				       struct btrfs_caching_control *caching_ctl)
+{
+	wait_event(caching_ctl->wait, btrfs_block_group_done(cache));
+	return cache->cached == BTRFS_CACHE_ERROR ? -EIO : 0;
+}
+
+static int btrfs_wait_block_group_cache_done(struct btrfs_block_group *cache)
 {
 	struct btrfs_caching_control *caching_ctl;
-	int ret = 0;
+	int ret;
 
 	caching_ctl = btrfs_get_caching_control(cache);
 	if (!caching_ctl)
 		return (cache->cached == BTRFS_CACHE_ERROR) ? -EIO : 0;
-
-	wait_event(caching_ctl->wait, btrfs_block_group_done(cache));
-	if (cache->cached == BTRFS_CACHE_ERROR)
-		ret = -EIO;
+	ret = btrfs_caching_ctl_wait_done(cache, caching_ctl);
 	btrfs_put_caching_control(caching_ctl);
 	return ret;
 }
 
-static bool space_cache_v1_done(struct btrfs_block_group *cache)
-{
-	bool ret;
-
-	spin_lock(&cache->lock);
-	ret = cache->cached != BTRFS_CACHE_FAST;
-	spin_unlock(&cache->lock);
-
-	return ret;
-}
-
-void btrfs_wait_space_cache_v1_finished(struct btrfs_block_group *cache,
-				struct btrfs_caching_control *caching_ctl)
-{
-	wait_event(caching_ctl->wait, space_cache_v1_done(cache));
-}
-
 #ifdef CONFIG_BTRFS_DEBUG
 static void fragment_free_space(struct btrfs_block_group *block_group)
 {
@@ -750,9 +737,8 @@ done:
 	btrfs_put_block_group(block_group);
 }
 
-int btrfs_cache_block_group(struct btrfs_block_group *cache, int load_cache_only)
+int btrfs_cache_block_group(struct btrfs_block_group *cache, bool wait)
 {
-	DEFINE_WAIT(wait);
 	struct btrfs_fs_info *fs_info = cache->fs_info;
 	struct btrfs_caching_control *caching_ctl = NULL;
 	int ret = 0;
@@ -785,10 +771,7 @@ int btrfs_cache_block_group(struct btrfs
 	}
 	WARN_ON(cache->caching_ctl);
 	cache->caching_ctl = caching_ctl;
-	if (btrfs_test_opt(fs_info, SPACE_CACHE))
-		cache->cached = BTRFS_CACHE_FAST;
-	else
-		cache->cached = BTRFS_CACHE_STARTED;
+	cache->cached = BTRFS_CACHE_STARTED;
 	cache->has_caching_ctl = 1;
 	spin_unlock(&cache->lock);
 
@@ -801,8 +784,8 @@ int btrfs_cache_block_group(struct btrfs
 
 	btrfs_queue_work(fs_info->caching_workers, &caching_ctl->work);
 out:
-	if (load_cache_only && caching_ctl)
-		btrfs_wait_space_cache_v1_finished(cache, caching_ctl);
+	if (wait && caching_ctl)
+		ret = btrfs_caching_ctl_wait_done(cache, caching_ctl);
 	if (caching_ctl)
 		btrfs_put_caching_control(caching_ctl);
 
@@ -3313,7 +3296,7 @@ int btrfs_update_block_group(struct btrf
 		 * space back to the block group, otherwise we will leak space.
 		 */
 		if (!alloc && !btrfs_block_group_done(cache))
-			btrfs_cache_block_group(cache, 1);
+			btrfs_cache_block_group(cache, true);
 
 		byte_in_group = bytenr - cache->start;
 		WARN_ON(byte_in_group > cache->length);
--- a/fs/btrfs/block-group.h
+++ b/fs/btrfs/block-group.h
@@ -263,9 +263,7 @@ void btrfs_dec_nocow_writers(struct btrf
 void btrfs_wait_nocow_writers(struct btrfs_block_group *bg);
 void btrfs_wait_block_group_cache_progress(struct btrfs_block_group *cache,
 				           u64 num_bytes);
-int btrfs_wait_block_group_cache_done(struct btrfs_block_group *cache);
-int btrfs_cache_block_group(struct btrfs_block_group *cache,
-			    int load_cache_only);
+int btrfs_cache_block_group(struct btrfs_block_group *cache, bool wait);
 void btrfs_put_caching_control(struct btrfs_caching_control *ctl);
 struct btrfs_caching_control *btrfs_get_caching_control(
 		struct btrfs_block_group *cache);
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -494,7 +494,6 @@ struct btrfs_free_cluster {
 enum btrfs_caching_type {
 	BTRFS_CACHE_NO,
 	BTRFS_CACHE_STARTED,
-	BTRFS_CACHE_FAST,
 	BTRFS_CACHE_FINISHED,
 	BTRFS_CACHE_ERROR,
 };
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -2567,17 +2567,10 @@ int btrfs_pin_extent_for_log_replay(stru
 		return -EINVAL;
 
 	/*
-	 * pull in the free space cache (if any) so that our pin
-	 * removes the free space from the cache.  We have load_only set
-	 * to one because the slow code to read in the free extents does check
-	 * the pinned extents.
+	 * Fully cache the free space first so that our pin removes the free space
+	 * from the cache.
 	 */
-	btrfs_cache_block_group(cache, 1);
-	/*
-	 * Make sure we wait until the cache is completely built in case it is
-	 * missing or is invalid and therefore needs to be rebuilt.
-	 */
-	ret = btrfs_wait_block_group_cache_done(cache);
+	ret = btrfs_cache_block_group(cache, true);
 	if (ret)
 		goto out;
 
@@ -2600,12 +2593,7 @@ static int __exclude_logged_extent(struc
 	if (!block_group)
 		return -EINVAL;
 
-	btrfs_cache_block_group(block_group, 1);
-	/*
-	 * Make sure we wait until the cache is completely built in case it is
-	 * missing or is invalid and therefore needs to be rebuilt.
-	 */
-	ret = btrfs_wait_block_group_cache_done(block_group);
+	ret = btrfs_cache_block_group(block_group, true);
 	if (ret)
 		goto out;
 
@@ -4415,7 +4403,7 @@ have_block_group:
 		ffe_ctl->cached = btrfs_block_group_done(block_group);
 		if (unlikely(!ffe_ctl->cached)) {
 			ffe_ctl->have_caching_bg = true;
-			ret = btrfs_cache_block_group(block_group, 0);
+			ret = btrfs_cache_block_group(block_group, false);
 
 			/*
 			 * If we get ENOMEM here or something else we want to
@@ -6169,13 +6157,7 @@ int btrfs_trim_fs(struct btrfs_fs_info *
 
 		if (end - start >= range->minlen) {
 			if (!btrfs_block_group_done(cache)) {
-				ret = btrfs_cache_block_group(cache, 0);
-				if (ret) {
-					bg_failed++;
-					bg_ret = ret;
-					continue;
-				}
-				ret = btrfs_wait_block_group_cache_done(cache);
+				ret = btrfs_cache_block_group(cache, true);
 				if (ret) {
 					bg_failed++;
 					bg_ret = ret;



^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH 5.19 098/158] btrfs: check if root is readonly while setting security xattr
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (96 preceding siblings ...)
  2022-08-29 10:59 ` [PATCH 5.19 097/158] btrfs: fix space cache corruption and potential double allocations Greg Kroah-Hartman
@ 2022-08-29 10:59 ` Greg Kroah-Hartman
  2022-08-29 10:59 ` [PATCH 5.19 099/158] btrfs: fix possible memory leak in btrfs_get_dev_args_from_path() Greg Kroah-Hartman
                   ` (72 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:59 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Qu Wenruo, Filipe Manana,
	Goldwyn Rodrigues, David Sterba

From: Goldwyn Rodrigues <rgoldwyn@suse.de>

commit b51111271b0352aa596c5ae8faf06939e91b3b68 upstream.

For a filesystem which has btrfs read-only property set to true, all
write operations including xattr should be denied. However, security
xattr can still be changed even if btrfs ro property is true.

This happens because xattr_permission() does not have any restrictions
on security.*, system.*  and in some cases trusted.* from VFS and
the decision is left to the underlying filesystem. See comments in
xattr_permission() for more details.

This patch checks if the root is read-only before performing the set
xattr operation.

Testcase:

  DEV=/dev/vdb
  MNT=/mnt

  mkfs.btrfs -f $DEV
  mount $DEV $MNT
  echo "file one" > $MNT/f1

  setfattr -n "security.one" -v 2 $MNT/f1
  btrfs property set /mnt ro true

  setfattr -n "security.one" -v 1 $MNT/f1

  umount $MNT

CC: stable@vger.kernel.org # 4.9+
Reviewed-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 fs/btrfs/xattr.c |    3 +++
 1 file changed, 3 insertions(+)

--- a/fs/btrfs/xattr.c
+++ b/fs/btrfs/xattr.c
@@ -371,6 +371,9 @@ static int btrfs_xattr_handler_set(const
 				   const char *name, const void *buffer,
 				   size_t size, int flags)
 {
+	if (btrfs_root_readonly(BTRFS_I(inode)->root))
+		return -EROFS;
+
 	name = xattr_full_name(handler, name);
 	return btrfs_setxattr_trans(inode, name, buffer, size, flags);
 }



^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH 5.19 099/158] btrfs: fix possible memory leak in btrfs_get_dev_args_from_path()
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (97 preceding siblings ...)
  2022-08-29 10:59 ` [PATCH 5.19 098/158] btrfs: check if root is readonly while setting security xattr Greg Kroah-Hartman
@ 2022-08-29 10:59 ` Greg Kroah-Hartman
  2022-08-29 10:59 ` [PATCH 5.19 100/158] btrfs: update generation of hole file extent item when merging holes Greg Kroah-Hartman
                   ` (71 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:59 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, TOTE Robot, Boris Burkov, Zixuan Fu,
	David Sterba

From: Zixuan Fu <r33s3n6@gmail.com>

commit 9ea0106a7a3d8116860712e3f17cd52ce99f6707 upstream.

In btrfs_get_dev_args_from_path(), btrfs_get_bdev_and_sb() can fail if
the path is invalid. In this case, btrfs_get_dev_args_from_path()
returns directly without freeing args->uuid and args->fsid allocated
before, which causes memory leak.

To fix these possible leaks, when btrfs_get_bdev_and_sb() fails,
btrfs_put_dev_args_from_path() is called to clean up the memory.

Reported-by: TOTE Robot <oslab@tsinghua.edu.cn>
Fixes: faa775c41d655 ("btrfs: add a btrfs_get_dev_args_from_path helper")
CC: stable@vger.kernel.org # 5.16
Reviewed-by: Boris Burkov <boris@bur.io>
Signed-off-by: Zixuan Fu <r33s3n6@gmail.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 fs/btrfs/volumes.c |    5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -2344,8 +2344,11 @@ int btrfs_get_dev_args_from_path(struct
 
 	ret = btrfs_get_bdev_and_sb(path, FMODE_READ, fs_info->bdev_holder, 0,
 				    &bdev, &disk_super);
-	if (ret)
+	if (ret) {
+		btrfs_put_dev_args_from_path(args);
 		return ret;
+	}
+
 	args->devid = btrfs_stack_device_id(&disk_super->dev_item);
 	memcpy(args->uuid, disk_super->dev_item.uuid, BTRFS_UUID_SIZE);
 	if (btrfs_fs_incompat(fs_info, METADATA_UUID))



^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH 5.19 100/158] btrfs: update generation of hole file extent item when merging holes
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (98 preceding siblings ...)
  2022-08-29 10:59 ` [PATCH 5.19 099/158] btrfs: fix possible memory leak in btrfs_get_dev_args_from_path() Greg Kroah-Hartman
@ 2022-08-29 10:59 ` Greg Kroah-Hartman
  2022-08-29 10:59 ` [PATCH 5.19 101/158] x86/boot: Dont propagate uninitialized boot_params->cc_blob_address Greg Kroah-Hartman
                   ` (70 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:59 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Josef Bacik, Filipe Manana, David Sterba

From: Filipe Manana <fdmanana@suse.com>

commit e6e3dec6c3c288d556b991a85d5d8e3ee71e9046 upstream.

When punching a hole into a file range that is adjacent with a hole and we
are not using the no-holes feature, we expand the range of the adjacent
file extent item that represents a hole, to save metadata space.

However we don't update the generation of hole file extent item, which
means a full fsync will not log that file extent item if the fsync happens
in a later transaction (since commit 7f30c07288bb9e ("btrfs: stop copying
old file extents when doing a full fsync")).

For example, if we do this:

    $ mkfs.btrfs -f -O ^no-holes /dev/sdb
    $ mount /dev/sdb /mnt
    $ xfs_io -f -c "pwrite -S 0xab 2M 2M" /mnt/foobar
    $ sync

We end up with 2 file extent items in our file:

1) One that represents the hole for the file range [0, 2M), with a
   generation of 7;

2) Another one that represents an extent covering the range [2M, 4M).

After that if we do the following:

    $ xfs_io -c "fpunch 2M 2M" /mnt/foobar

We end up with a single file extent item in the file, which represents a
hole for the range [0, 4M) and with a generation of 7 - because we end
dropping the data extent for range [2M, 4M) and then update the file
extent item that represented the hole at [0, 2M), by increasing
length from 2M to 4M.

Then doing a full fsync and power failing:

    $ xfs_io -c "fsync" /mnt/foobar
    <power failure>

will result in the full fsync not logging the file extent item that
represents the hole for the range [0, 4M), because its generation is 7,
which is lower than the generation of the current transaction (8).
As a consequence, after mounting again the filesystem (after log replay),
the region [2M, 4M) does not have a hole, it still points to the
previous data extent.

So fix this by always updating the generation of existing file extent
items representing holes when we merge/expand them. This solves the
problem and it's the same approach as when we merge prealloc extents that
got written (at btrfs_mark_extent_written()). Setting the generation to
the current transaction's generation is also what we do when merging
the new hole extent map with the previous one or the next one.

A test case for fstests, covering both cases of hole file extent item
merging (to the left and to the right), will be sent soon.

Fixes: 7f30c07288bb9e ("btrfs: stop copying old file extents when doing a full fsync")
CC: stable@vger.kernel.org # 5.18+
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 fs/btrfs/file.c |    2 ++
 1 file changed, 2 insertions(+)

--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -2483,6 +2483,7 @@ static int fill_holes(struct btrfs_trans
 		btrfs_set_file_extent_num_bytes(leaf, fi, num_bytes);
 		btrfs_set_file_extent_ram_bytes(leaf, fi, num_bytes);
 		btrfs_set_file_extent_offset(leaf, fi, 0);
+		btrfs_set_file_extent_generation(leaf, fi, trans->transid);
 		btrfs_mark_buffer_dirty(leaf);
 		goto out;
 	}
@@ -2499,6 +2500,7 @@ static int fill_holes(struct btrfs_trans
 		btrfs_set_file_extent_num_bytes(leaf, fi, num_bytes);
 		btrfs_set_file_extent_ram_bytes(leaf, fi, num_bytes);
 		btrfs_set_file_extent_offset(leaf, fi, 0);
+		btrfs_set_file_extent_generation(leaf, fi, trans->transid);
 		btrfs_mark_buffer_dirty(leaf);
 		goto out;
 	}



^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH 5.19 101/158] x86/boot: Dont propagate uninitialized boot_params->cc_blob_address
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (99 preceding siblings ...)
  2022-08-29 10:59 ` [PATCH 5.19 100/158] btrfs: update generation of hole file extent item when merging holes Greg Kroah-Hartman
@ 2022-08-29 10:59 ` Greg Kroah-Hartman
  2022-08-29 10:59 ` [PATCH 5.19 102/158] perf/x86/intel: Fix pebs event constraints for ADL Greg Kroah-Hartman
                   ` (69 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:59 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Jeremi Piotrowski, watnuss,
	Michael Roth, Borislav Petkov

From: Michael Roth <michael.roth@amd.com>

commit 4b1c742407571eff58b6de9881889f7ca7c4b4dc upstream.

In some cases, bootloaders will leave boot_params->cc_blob_address
uninitialized rather than zeroing it out. This field is only meant to be
set by the boot/compressed kernel in order to pass information to the
uncompressed kernel when SEV-SNP support is enabled.

Therefore, there are no cases where the bootloader-provided values
should be treated as anything other than garbage. Otherwise, the
uncompressed kernel may attempt to access this bogus address, leading to
a crash during early boot.

Normally, sanitize_boot_params() would be used to clear out such fields
but that happens too late: sev_enable() may have already initialized
it to a valid value that should not be zeroed out. Instead, have
sev_enable() zero it out unconditionally beforehand.

Also ensure this happens for !CONFIG_AMD_MEM_ENCRYPT as well by also
including this handling in the sev_enable() stub function.

  [ bp: Massage commit message and comments. ]

Fixes: b190a043c49a ("x86/sev: Add SEV-SNP feature detection/setup")
Reported-by: Jeremi Piotrowski <jpiotrowski@linux.microsoft.com>
Reported-by: watnuss@gmx.de
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: stable@vger.kernel.org
Link: https://bugzilla.kernel.org/show_bug.cgi?id=216387
Link: https://lore.kernel.org/r/20220823160734.89036-1-michael.roth@amd.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/x86/boot/compressed/misc.h | 12 +++++++++++-
 arch/x86/boot/compressed/sev.c  |  8 ++++++++
 2 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/arch/x86/boot/compressed/misc.h b/arch/x86/boot/compressed/misc.h
index 4910bf230d7b..62208ec04ca4 100644
--- a/arch/x86/boot/compressed/misc.h
+++ b/arch/x86/boot/compressed/misc.h
@@ -132,7 +132,17 @@ void snp_set_page_private(unsigned long paddr);
 void snp_set_page_shared(unsigned long paddr);
 void sev_prep_identity_maps(unsigned long top_level_pgt);
 #else
-static inline void sev_enable(struct boot_params *bp) { }
+static inline void sev_enable(struct boot_params *bp)
+{
+	/*
+	 * bp->cc_blob_address should only be set by boot/compressed kernel.
+	 * Initialize it to 0 unconditionally (thus here in this stub too) to
+	 * ensure that uninitialized values from buggy bootloaders aren't
+	 * propagated.
+	 */
+	if (bp)
+		bp->cc_blob_address = 0;
+}
 static inline void sev_es_shutdown_ghcb(void) { }
 static inline bool sev_es_check_ghcb_fault(unsigned long address)
 {
diff --git a/arch/x86/boot/compressed/sev.c b/arch/x86/boot/compressed/sev.c
index 52f989f6acc2..c93930d5ccbd 100644
--- a/arch/x86/boot/compressed/sev.c
+++ b/arch/x86/boot/compressed/sev.c
@@ -276,6 +276,14 @@ void sev_enable(struct boot_params *bp)
 	struct msr m;
 	bool snp;
 
+	/*
+	 * bp->cc_blob_address should only be set by boot/compressed kernel.
+	 * Initialize it to 0 to ensure that uninitialized values from
+	 * buggy bootloaders aren't propagated.
+	 */
+	if (bp)
+		bp->cc_blob_address = 0;
+
 	/*
 	 * Setup/preliminary detection of SNP. This will be sanity-checked
 	 * against CPUID/MSR values later.
-- 
2.37.2




^ permalink raw reply related	[flat|nested] 174+ messages in thread

* [PATCH 5.19 102/158] perf/x86/intel: Fix pebs event constraints for ADL
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (100 preceding siblings ...)
  2022-08-29 10:59 ` [PATCH 5.19 101/158] x86/boot: Dont propagate uninitialized boot_params->cc_blob_address Greg Kroah-Hartman
@ 2022-08-29 10:59 ` Greg Kroah-Hartman
  2022-08-29 10:59 ` [PATCH 5.19 103/158] perf/x86/lbr: Enable the branch type for the Arch LBR by default Greg Kroah-Hartman
                   ` (68 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:59 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Ammy Yi, Kan Liang, Peter Zijlstra (Intel)

From: Kan Liang <kan.liang@linux.intel.com>

commit cde643ff75bc20c538dfae787ca3b587bab16b50 upstream.

According to the latest event list, the LOAD_LATENCY PEBS event only
works on the GP counter 0 and 1 for ADL and RPL.

Update the pebs event constraints table.

Fixes: f83d2f91d259 ("perf/x86/intel: Add Alder Lake Hybrid support")
Reported-by: Ammy Yi <ammy.yi@intel.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20220818184429.2355857-1-kan.liang@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/x86/events/intel/ds.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -822,7 +822,7 @@ struct event_constraint intel_glm_pebs_e
 
 struct event_constraint intel_grt_pebs_event_constraints[] = {
 	/* Allow all events as PEBS with no flags */
-	INTEL_HYBRID_LAT_CONSTRAINT(0x5d0, 0xf),
+	INTEL_HYBRID_LAT_CONSTRAINT(0x5d0, 0x3),
 	INTEL_HYBRID_LAT_CONSTRAINT(0x6d0, 0xf),
 	EVENT_CONSTRAINT_END
 };



^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH 5.19 103/158] perf/x86/lbr: Enable the branch type for the Arch LBR by default
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (101 preceding siblings ...)
  2022-08-29 10:59 ` [PATCH 5.19 102/158] perf/x86/intel: Fix pebs event constraints for ADL Greg Kroah-Hartman
@ 2022-08-29 10:59 ` Greg Kroah-Hartman
  2022-08-29 10:59 ` [PATCH 5.19 104/158] x86/entry: Fix entry_INT80_compat for Xen PV guests Greg Kroah-Hartman
                   ` (67 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:59 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Stephane Eranian, Kan Liang,
	Peter Zijlstra (Intel)

From: Kan Liang <kan.liang@linux.intel.com>

commit 32ba156df1b1c8804a4e5be5339616945eafea22 upstream.

On the platform with Arch LBR, the HW raw branch type encoding may leak
to the perf tool when the SAVE_TYPE option is not set.

In the intel_pmu_store_lbr(), the HW raw branch type is stored in
lbr_entries[].type. If the SAVE_TYPE option is set, the
lbr_entries[].type will be converted into the generic PERF_BR_* type
in the intel_pmu_lbr_filter() and exposed to the user tools.
But if the SAVE_TYPE option is NOT set by the user, the current perf
kernel doesn't clear the field. The HW raw branch type leaks.

There are two solutions to fix the issue for the Arch LBR.
One is to clear the field if the SAVE_TYPE option is NOT set.
The other solution is to unconditionally convert the branch type and
expose the generic type to the user tools.

The latter is implemented here, because
- The branch type is valuable information. I don't see a case where
  you would not benefit from the branch type. (Stephane Eranian)
- Not having the branch type DOES NOT save any space in the
  branch record (Stephane Eranian)
- The Arch LBR HW can retrieve the common branch types from the
  LBR_INFO. It doesn't require the high overhead SW disassemble.

Fixes: 47125db27e47 ("perf/x86/intel/lbr: Support Architectural LBR")
Reported-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20220816125612.2042397-1-kan.liang@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/x86/events/intel/lbr.c |    8 ++++++++
 1 file changed, 8 insertions(+)

--- a/arch/x86/events/intel/lbr.c
+++ b/arch/x86/events/intel/lbr.c
@@ -1097,6 +1097,14 @@ static int intel_pmu_setup_hw_lbr_filter
 
 	if (static_cpu_has(X86_FEATURE_ARCH_LBR)) {
 		reg->config = mask;
+
+		/*
+		 * The Arch LBR HW can retrieve the common branch types
+		 * from the LBR_INFO. It doesn't require the high overhead
+		 * SW disassemble.
+		 * Enable the branch type by default for the Arch LBR.
+		 */
+		reg->reg |= X86_BR_TYPE_SAVE;
 		return 0;
 	}
 



^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH 5.19 104/158] x86/entry: Fix entry_INT80_compat for Xen PV guests
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (102 preceding siblings ...)
  2022-08-29 10:59 ` [PATCH 5.19 103/158] perf/x86/lbr: Enable the branch type for the Arch LBR by default Greg Kroah-Hartman
@ 2022-08-29 10:59 ` Greg Kroah-Hartman
  2022-08-29 10:59 ` [PATCH 5.19 105/158] x86/unwind/orc: Unwind ftrace trampolines with correct ORC entry Greg Kroah-Hartman
                   ` (66 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:59 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Juergen Gross, Borislav Petkov, Jan Beulich

From: Juergen Gross <jgross@suse.com>

commit 5b9f0c4df1c1152403c738373fb063e9ffdac0a1 upstream.

Commit

  c89191ce67ef ("x86/entry: Convert SWAPGS to swapgs and remove the definition of SWAPGS")

missed one use case of SWAPGS in entry_INT80_compat(). Removing of
the SWAPGS macro led to asm just using "swapgs", as it is accepting
instructions in capital letters, too.

This in turn leads to splats in Xen PV guests like:

  [   36.145223] general protection fault, maybe for address 0x2d: 0000 [#1] PREEMPT SMP NOPTI
  [   36.145794] CPU: 2 PID: 1847 Comm: ld-linux.so.2 Not tainted 5.19.1-1-default #1 \
	  openSUSE Tumbleweed f3b44bfb672cdb9f235aff53b57724eba8b9411b
  [   36.146608] Hardware name: HP ProLiant ML350p Gen8, BIOS P72 11/14/2013
  [   36.148126] RIP: e030:entry_INT80_compat+0x3/0xa3

Fix that by open coding this single instance of the SWAPGS macro.

Fixes: c89191ce67ef ("x86/entry: Convert SWAPGS to swapgs and remove the definition of SWAPGS")
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Cc: <stable@vger.kernel.org> # 5.19
Link: https://lore.kernel.org/r/20220816071137.4893-1-jgross@suse.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/x86/entry/entry_64_compat.S |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/arch/x86/entry/entry_64_compat.S
+++ b/arch/x86/entry/entry_64_compat.S
@@ -311,7 +311,7 @@ SYM_CODE_START(entry_INT80_compat)
 	 * Interrupts are off on entry.
 	 */
 	ASM_CLAC			/* Do this early to minimize exposure */
-	SWAPGS
+	ALTERNATIVE "swapgs", "", X86_FEATURE_XENPV
 
 	/*
 	 * User tracing code (ptrace or signal handlers) might assume that



^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH 5.19 105/158] x86/unwind/orc: Unwind ftrace trampolines with correct ORC entry
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (103 preceding siblings ...)
  2022-08-29 10:59 ` [PATCH 5.19 104/158] x86/entry: Fix entry_INT80_compat for Xen PV guests Greg Kroah-Hartman
@ 2022-08-29 10:59 ` Greg Kroah-Hartman
  2022-08-29 10:59 ` [PATCH 5.19 106/158] x86/sev: Dont use cc_platform_has() for early SEV-SNP calls Greg Kroah-Hartman
                   ` (65 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:59 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Chen Zhongjin, Ingo Molnar,
	Steven Rostedt (Google)

From: Chen Zhongjin <chenzhongjin@huawei.com>

commit fc2e426b1161761561624ebd43ce8c8d2fa058da upstream.

When meeting ftrace trampolines in ORC unwinding, unwinder uses address
of ftrace_{regs_}call address to find the ORC entry, which gets next frame at
sp+176.

If there is an IRQ hitting at sub $0xa8,%rsp, the next frame should be
sp+8 instead of 176. It makes unwinder skip correct frame and throw
warnings such as "wrong direction" or "can't access registers", etc,
depending on the content of the incorrect frame address.

By adding the base address ftrace_{regs_}caller with the offset
*ip - ops->trampoline*, we can get the correct address to find the ORC entry.

Also change "caller" to "tramp_addr" to make variable name conform to
its content.

[ mingo: Clarified the changelog a bit. ]

Fixes: 6be7fa3c74d1 ("ftrace, orc, x86: Handle ftrace dynamically allocated trampolines")
Signed-off-by: Chen Zhongjin <chenzhongjin@huawei.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20220819084334.244016-1-chenzhongjin@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/x86/kernel/unwind_orc.c |   15 ++++++++++-----
 1 file changed, 10 insertions(+), 5 deletions(-)

--- a/arch/x86/kernel/unwind_orc.c
+++ b/arch/x86/kernel/unwind_orc.c
@@ -93,22 +93,27 @@ static struct orc_entry *orc_find(unsign
 static struct orc_entry *orc_ftrace_find(unsigned long ip)
 {
 	struct ftrace_ops *ops;
-	unsigned long caller;
+	unsigned long tramp_addr, offset;
 
 	ops = ftrace_ops_trampoline(ip);
 	if (!ops)
 		return NULL;
 
+	/* Set tramp_addr to the start of the code copied by the trampoline */
 	if (ops->flags & FTRACE_OPS_FL_SAVE_REGS)
-		caller = (unsigned long)ftrace_regs_call;
+		tramp_addr = (unsigned long)ftrace_regs_caller;
 	else
-		caller = (unsigned long)ftrace_call;
+		tramp_addr = (unsigned long)ftrace_caller;
+
+	/* Now place tramp_addr to the location within the trampoline ip is at */
+	offset = ip - ops->trampoline;
+	tramp_addr += offset;
 
 	/* Prevent unlikely recursion */
-	if (ip == caller)
+	if (ip == tramp_addr)
 		return NULL;
 
-	return orc_find(caller);
+	return orc_find(tramp_addr);
 }
 #else
 static struct orc_entry *orc_ftrace_find(unsigned long ip)



^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH 5.19 106/158] x86/sev: Dont use cc_platform_has() for early SEV-SNP calls
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (104 preceding siblings ...)
  2022-08-29 10:59 ` [PATCH 5.19 105/158] x86/unwind/orc: Unwind ftrace trampolines with correct ORC entry Greg Kroah-Hartman
@ 2022-08-29 10:59 ` Greg Kroah-Hartman
  2022-08-29 10:59 ` [PATCH 5.19 107/158] x86/bugs: Add "unknown" reporting for MMIO Stale Data Greg Kroah-Hartman
                   ` (64 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:59 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Sean Christopherson, Tom Lendacky,
	Borislav Petkov

From: Tom Lendacky <thomas.lendacky@amd.com>

commit cdaa0a407f1acd3a44861e3aea6e3c7349e668f1 upstream.

When running identity-mapped and depending on the kernel configuration,
it is possible that the compiler uses jump tables when generating code
for cc_platform_has().

This causes a boot failure because the jump table uses un-mapped kernel
virtual addresses, not identity-mapped addresses. This has been seen
with CONFIG_RETPOLINE=n.

Similar to sme_encrypt_kernel(), use an open-coded direct check for the
status of SNP rather than trying to eliminate the jump table. This
preserves any code optimization in cc_platform_has() that can be useful
post boot. It also limits the changes to SEV-specific files so that
future compiler features won't necessarily require possible build changes
just because they are not compatible with running identity-mapped.

  [ bp: Massage commit message. ]

Fixes: 5e5ccff60a29 ("x86/sev: Add helper for validating pages in early enc attribute changes")
Reported-by: Sean Christopherson <seanjc@google.com>
Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: <stable@vger.kernel.org> # 5.19.x
Link: https://lore.kernel.org/all/YqfabnTRxFSM+LoX@google.com/
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/x86/kernel/sev.c |   16 ++++++++++++++--
 1 file changed, 14 insertions(+), 2 deletions(-)

--- a/arch/x86/kernel/sev.c
+++ b/arch/x86/kernel/sev.c
@@ -701,7 +701,13 @@ e_term:
 void __init early_snp_set_memory_private(unsigned long vaddr, unsigned long paddr,
 					 unsigned int npages)
 {
-	if (!cc_platform_has(CC_ATTR_GUEST_SEV_SNP))
+	/*
+	 * This can be invoked in early boot while running identity mapped, so
+	 * use an open coded check for SNP instead of using cc_platform_has().
+	 * This eliminates worries about jump tables or checking boot_cpu_data
+	 * in the cc_platform_has() function.
+	 */
+	if (!(sev_status & MSR_AMD64_SEV_SNP_ENABLED))
 		return;
 
 	 /*
@@ -717,7 +723,13 @@ void __init early_snp_set_memory_private
 void __init early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr,
 					unsigned int npages)
 {
-	if (!cc_platform_has(CC_ATTR_GUEST_SEV_SNP))
+	/*
+	 * This can be invoked in early boot while running identity mapped, so
+	 * use an open coded check for SNP instead of using cc_platform_has().
+	 * This eliminates worries about jump tables or checking boot_cpu_data
+	 * in the cc_platform_has() function.
+	 */
+	if (!(sev_status & MSR_AMD64_SEV_SNP_ENABLED))
 		return;
 
 	/* Invalidate the memory pages before they are marked shared in the RMP table. */



^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH 5.19 107/158] x86/bugs: Add "unknown" reporting for MMIO Stale Data
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (105 preceding siblings ...)
  2022-08-29 10:59 ` [PATCH 5.19 106/158] x86/sev: Dont use cc_platform_has() for early SEV-SNP calls Greg Kroah-Hartman
@ 2022-08-29 10:59 ` Greg Kroah-Hartman
  2022-08-29 10:59 ` [PATCH 5.19 108/158] x86/nospec: Unwreck the RSB stuffing Greg Kroah-Hartman
                   ` (63 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:59 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Andrew Cooper, Tony Luck,
	Pawan Gupta, Borislav Petkov

From: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>

commit 7df548840c496b0141fb2404b889c346380c2b22 upstream.

Older Intel CPUs that are not in the affected processor list for MMIO
Stale Data vulnerabilities currently report "Not affected" in sysfs,
which may not be correct. Vulnerability status for these older CPUs is
unknown.

Add known-not-affected CPUs to the whitelist. Report "unknown"
mitigation status for CPUs that are not in blacklist, whitelist and also
don't enumerate MSR ARCH_CAPABILITIES bits that reflect hardware
immunity to MMIO Stale Data vulnerabilities.

Mitigation is not deployed when the status is unknown.

  [ bp: Massage, fixup. ]

Fixes: 8d50cdf8b834 ("x86/speculation/mmio: Add sysfs reporting for Processor MMIO Stale Data")
Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Suggested-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/a932c154772f2121794a5f2eded1a11013114711.1657846269.git.pawan.kumar.gupta@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 Documentation/admin-guide/hw-vuln/processor_mmio_stale_data.rst |   14 +++
 arch/x86/include/asm/cpufeatures.h                              |    5 -
 arch/x86/kernel/cpu/bugs.c                                      |   14 ++-
 arch/x86/kernel/cpu/common.c                                    |   42 ++++++----
 4 files changed, 56 insertions(+), 19 deletions(-)

--- a/Documentation/admin-guide/hw-vuln/processor_mmio_stale_data.rst
+++ b/Documentation/admin-guide/hw-vuln/processor_mmio_stale_data.rst
@@ -230,6 +230,20 @@ The possible values in this file are:
      * - 'Mitigation: Clear CPU buffers'
        - The processor is vulnerable and the CPU buffer clearing mitigation is
          enabled.
+     * - 'Unknown: No mitigations'
+       - The processor vulnerability status is unknown because it is
+	 out of Servicing period. Mitigation is not attempted.
+
+Definitions:
+------------
+
+Servicing period: The process of providing functional and security updates to
+Intel processors or platforms, utilizing the Intel Platform Update (IPU)
+process or other similar mechanisms.
+
+End of Servicing Updates (ESU): ESU is the date at which Intel will no
+longer provide Servicing, such as through IPU or other similar update
+processes. ESU dates will typically be aligned to end of quarter.
 
 If the processor is vulnerable then the following information is appended to
 the above information:
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -456,7 +456,8 @@
 #define X86_BUG_ITLB_MULTIHIT		X86_BUG(23) /* CPU may incur MCE during certain page attribute changes */
 #define X86_BUG_SRBDS			X86_BUG(24) /* CPU may leak RNG bits if not mitigated */
 #define X86_BUG_MMIO_STALE_DATA		X86_BUG(25) /* CPU is affected by Processor MMIO Stale Data vulnerabilities */
-#define X86_BUG_RETBLEED		X86_BUG(26) /* CPU is affected by RETBleed */
-#define X86_BUG_EIBRS_PBRSB		X86_BUG(27) /* EIBRS is vulnerable to Post Barrier RSB Predictions */
+#define X86_BUG_MMIO_UNKNOWN		X86_BUG(26) /* CPU is too old and its MMIO Stale Data status is unknown */
+#define X86_BUG_RETBLEED		X86_BUG(27) /* CPU is affected by RETBleed */
+#define X86_BUG_EIBRS_PBRSB		X86_BUG(28) /* EIBRS is vulnerable to Post Barrier RSB Predictions */
 
 #endif /* _ASM_X86_CPUFEATURES_H */
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -433,7 +433,8 @@ static void __init mmio_select_mitigatio
 	u64 ia32_cap;
 
 	if (!boot_cpu_has_bug(X86_BUG_MMIO_STALE_DATA) ||
-	    cpu_mitigations_off()) {
+	     boot_cpu_has_bug(X86_BUG_MMIO_UNKNOWN) ||
+	     cpu_mitigations_off()) {
 		mmio_mitigation = MMIO_MITIGATION_OFF;
 		return;
 	}
@@ -538,6 +539,8 @@ out:
 		pr_info("TAA: %s\n", taa_strings[taa_mitigation]);
 	if (boot_cpu_has_bug(X86_BUG_MMIO_STALE_DATA))
 		pr_info("MMIO Stale Data: %s\n", mmio_strings[mmio_mitigation]);
+	else if (boot_cpu_has_bug(X86_BUG_MMIO_UNKNOWN))
+		pr_info("MMIO Stale Data: Unknown: No mitigations\n");
 }
 
 static void __init md_clear_select_mitigation(void)
@@ -2275,6 +2278,9 @@ static ssize_t tsx_async_abort_show_stat
 
 static ssize_t mmio_stale_data_show_state(char *buf)
 {
+	if (boot_cpu_has_bug(X86_BUG_MMIO_UNKNOWN))
+		return sysfs_emit(buf, "Unknown: No mitigations\n");
+
 	if (mmio_mitigation == MMIO_MITIGATION_OFF)
 		return sysfs_emit(buf, "%s\n", mmio_strings[mmio_mitigation]);
 
@@ -2421,6 +2427,7 @@ static ssize_t cpu_show_common(struct de
 		return srbds_show_state(buf);
 
 	case X86_BUG_MMIO_STALE_DATA:
+	case X86_BUG_MMIO_UNKNOWN:
 		return mmio_stale_data_show_state(buf);
 
 	case X86_BUG_RETBLEED:
@@ -2480,7 +2487,10 @@ ssize_t cpu_show_srbds(struct device *de
 
 ssize_t cpu_show_mmio_stale_data(struct device *dev, struct device_attribute *attr, char *buf)
 {
-	return cpu_show_common(dev, attr, buf, X86_BUG_MMIO_STALE_DATA);
+	if (boot_cpu_has_bug(X86_BUG_MMIO_UNKNOWN))
+		return cpu_show_common(dev, attr, buf, X86_BUG_MMIO_UNKNOWN);
+	else
+		return cpu_show_common(dev, attr, buf, X86_BUG_MMIO_STALE_DATA);
 }
 
 ssize_t cpu_show_retbleed(struct device *dev, struct device_attribute *attr, char *buf)
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1135,7 +1135,8 @@ static void identify_cpu_without_cpuid(s
 #define NO_SWAPGS		BIT(6)
 #define NO_ITLB_MULTIHIT	BIT(7)
 #define NO_SPECTRE_V2		BIT(8)
-#define NO_EIBRS_PBRSB		BIT(9)
+#define NO_MMIO			BIT(9)
+#define NO_EIBRS_PBRSB		BIT(10)
 
 #define VULNWL(vendor, family, model, whitelist)	\
 	X86_MATCH_VENDOR_FAM_MODEL(vendor, family, model, whitelist)
@@ -1158,6 +1159,11 @@ static const __initconst struct x86_cpu_
 	VULNWL(VORTEX,	6, X86_MODEL_ANY,	NO_SPECULATION),
 
 	/* Intel Family 6 */
+	VULNWL_INTEL(TIGERLAKE,			NO_MMIO),
+	VULNWL_INTEL(TIGERLAKE_L,		NO_MMIO),
+	VULNWL_INTEL(ALDERLAKE,			NO_MMIO),
+	VULNWL_INTEL(ALDERLAKE_L,		NO_MMIO),
+
 	VULNWL_INTEL(ATOM_SALTWELL,		NO_SPECULATION | NO_ITLB_MULTIHIT),
 	VULNWL_INTEL(ATOM_SALTWELL_TABLET,	NO_SPECULATION | NO_ITLB_MULTIHIT),
 	VULNWL_INTEL(ATOM_SALTWELL_MID,		NO_SPECULATION | NO_ITLB_MULTIHIT),
@@ -1176,9 +1182,9 @@ static const __initconst struct x86_cpu_
 	VULNWL_INTEL(ATOM_AIRMONT_MID,		NO_L1TF | MSBDS_ONLY | NO_SWAPGS | NO_ITLB_MULTIHIT),
 	VULNWL_INTEL(ATOM_AIRMONT_NP,		NO_L1TF | NO_SWAPGS | NO_ITLB_MULTIHIT),
 
-	VULNWL_INTEL(ATOM_GOLDMONT,		NO_MDS | NO_L1TF | NO_SWAPGS | NO_ITLB_MULTIHIT),
-	VULNWL_INTEL(ATOM_GOLDMONT_D,		NO_MDS | NO_L1TF | NO_SWAPGS | NO_ITLB_MULTIHIT),
-	VULNWL_INTEL(ATOM_GOLDMONT_PLUS,	NO_MDS | NO_L1TF | NO_SWAPGS | NO_ITLB_MULTIHIT | NO_EIBRS_PBRSB),
+	VULNWL_INTEL(ATOM_GOLDMONT,		NO_MDS | NO_L1TF | NO_SWAPGS | NO_ITLB_MULTIHIT | NO_MMIO),
+	VULNWL_INTEL(ATOM_GOLDMONT_D,		NO_MDS | NO_L1TF | NO_SWAPGS | NO_ITLB_MULTIHIT | NO_MMIO),
+	VULNWL_INTEL(ATOM_GOLDMONT_PLUS,	NO_MDS | NO_L1TF | NO_SWAPGS | NO_ITLB_MULTIHIT | NO_MMIO | NO_EIBRS_PBRSB),
 
 	/*
 	 * Technically, swapgs isn't serializing on AMD (despite it previously
@@ -1193,18 +1199,18 @@ static const __initconst struct x86_cpu_
 	VULNWL_INTEL(ATOM_TREMONT_D,		NO_ITLB_MULTIHIT | NO_EIBRS_PBRSB),
 
 	/* AMD Family 0xf - 0x12 */
-	VULNWL_AMD(0x0f,	NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS | NO_SWAPGS | NO_ITLB_MULTIHIT),
-	VULNWL_AMD(0x10,	NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS | NO_SWAPGS | NO_ITLB_MULTIHIT),
-	VULNWL_AMD(0x11,	NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS | NO_SWAPGS | NO_ITLB_MULTIHIT),
-	VULNWL_AMD(0x12,	NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS | NO_SWAPGS | NO_ITLB_MULTIHIT),
+	VULNWL_AMD(0x0f,	NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS | NO_SWAPGS | NO_ITLB_MULTIHIT | NO_MMIO),
+	VULNWL_AMD(0x10,	NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS | NO_SWAPGS | NO_ITLB_MULTIHIT | NO_MMIO),
+	VULNWL_AMD(0x11,	NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS | NO_SWAPGS | NO_ITLB_MULTIHIT | NO_MMIO),
+	VULNWL_AMD(0x12,	NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS | NO_SWAPGS | NO_ITLB_MULTIHIT | NO_MMIO),
 
 	/* FAMILY_ANY must be last, otherwise 0x0f - 0x12 matches won't work */
-	VULNWL_AMD(X86_FAMILY_ANY,	NO_MELTDOWN | NO_L1TF | NO_MDS | NO_SWAPGS | NO_ITLB_MULTIHIT),
-	VULNWL_HYGON(X86_FAMILY_ANY,	NO_MELTDOWN | NO_L1TF | NO_MDS | NO_SWAPGS | NO_ITLB_MULTIHIT),
+	VULNWL_AMD(X86_FAMILY_ANY,	NO_MELTDOWN | NO_L1TF | NO_MDS | NO_SWAPGS | NO_ITLB_MULTIHIT | NO_MMIO),
+	VULNWL_HYGON(X86_FAMILY_ANY,	NO_MELTDOWN | NO_L1TF | NO_MDS | NO_SWAPGS | NO_ITLB_MULTIHIT | NO_MMIO),
 
 	/* Zhaoxin Family 7 */
-	VULNWL(CENTAUR,	7, X86_MODEL_ANY,	NO_SPECTRE_V2 | NO_SWAPGS),
-	VULNWL(ZHAOXIN,	7, X86_MODEL_ANY,	NO_SPECTRE_V2 | NO_SWAPGS),
+	VULNWL(CENTAUR,	7, X86_MODEL_ANY,	NO_SPECTRE_V2 | NO_SWAPGS | NO_MMIO),
+	VULNWL(ZHAOXIN,	7, X86_MODEL_ANY,	NO_SPECTRE_V2 | NO_SWAPGS | NO_MMIO),
 	{}
 };
 
@@ -1358,10 +1364,16 @@ static void __init cpu_set_bug_bits(stru
 	 * Affected CPU list is generally enough to enumerate the vulnerability,
 	 * but for virtualization case check for ARCH_CAP MSR bits also, VMM may
 	 * not want the guest to enumerate the bug.
+	 *
+	 * Set X86_BUG_MMIO_UNKNOWN for CPUs that are neither in the blacklist,
+	 * nor in the whitelist and also don't enumerate MSR ARCH_CAP MMIO bits.
 	 */
-	if (cpu_matches(cpu_vuln_blacklist, MMIO) &&
-	    !arch_cap_mmio_immune(ia32_cap))
-		setup_force_cpu_bug(X86_BUG_MMIO_STALE_DATA);
+	if (!arch_cap_mmio_immune(ia32_cap)) {
+		if (cpu_matches(cpu_vuln_blacklist, MMIO))
+			setup_force_cpu_bug(X86_BUG_MMIO_STALE_DATA);
+		else if (!cpu_matches(cpu_vuln_whitelist, NO_MMIO))
+			setup_force_cpu_bug(X86_BUG_MMIO_UNKNOWN);
+	}
 
 	if (!cpu_has(c, X86_FEATURE_BTC_NO)) {
 		if (cpu_matches(cpu_vuln_blacklist, RETBLEED) || (ia32_cap & ARCH_CAP_RSBA))



^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH 5.19 108/158] x86/nospec: Unwreck the RSB stuffing
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (106 preceding siblings ...)
  2022-08-29 10:59 ` [PATCH 5.19 107/158] x86/bugs: Add "unknown" reporting for MMIO Stale Data Greg Kroah-Hartman
@ 2022-08-29 10:59 ` Greg Kroah-Hartman
  2022-08-29 10:59 ` [PATCH 5.19 109/158] x86/PAT: Have pat_enabled() properly reflect state when running on Xen Greg Kroah-Hartman
                   ` (62 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:59 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Peter Zijlstra (Intel)

From: Peter Zijlstra <peterz@infradead.org>

commit 4e3aa9238277597c6c7624f302d81a7b568b6f2d upstream.

Commit 2b1299322016 ("x86/speculation: Add RSB VM Exit protections")
made a right mess of the RSB stuffing, rewrite the whole thing to not
suck.

Thanks to Andrew for the enlightening comment about Post-Barrier RSB
things so we can make this code less magical.

Cc: stable@vger.kernel.org
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/YvuNdDWoUZSBjYcm@worktop.programming.kicks-ass.net
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/x86/include/asm/nospec-branch.h |   80 +++++++++++++++++------------------
 1 file changed, 39 insertions(+), 41 deletions(-)

--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -35,33 +35,44 @@
 #define RSB_CLEAR_LOOPS		32	/* To forcibly overwrite all entries */
 
 /*
+ * Common helper for __FILL_RETURN_BUFFER and __FILL_ONE_RETURN.
+ */
+#define __FILL_RETURN_SLOT			\
+	ANNOTATE_INTRA_FUNCTION_CALL;		\
+	call	772f;				\
+	int3;					\
+772:
+
+/*
+ * Stuff the entire RSB.
+ *
  * Google experimented with loop-unrolling and this turned out to be
  * the optimal version - two calls, each with their own speculation
  * trap should their return address end up getting used, in a loop.
  */
-#define __FILL_RETURN_BUFFER(reg, nr, sp)	\
-	mov	$(nr/2), reg;			\
-771:						\
-	ANNOTATE_INTRA_FUNCTION_CALL;		\
-	call	772f;				\
-773:	/* speculation trap */			\
-	UNWIND_HINT_EMPTY;			\
-	pause;					\
-	lfence;					\
-	jmp	773b;				\
-772:						\
-	ANNOTATE_INTRA_FUNCTION_CALL;		\
-	call	774f;				\
-775:	/* speculation trap */			\
-	UNWIND_HINT_EMPTY;			\
-	pause;					\
-	lfence;					\
-	jmp	775b;				\
-774:						\
-	add	$(BITS_PER_LONG/8) * 2, sp;	\
-	dec	reg;				\
-	jnz	771b;				\
-	/* barrier for jnz misprediction */	\
+#define __FILL_RETURN_BUFFER(reg, nr)			\
+	mov	$(nr/2), reg;				\
+771:							\
+	__FILL_RETURN_SLOT				\
+	__FILL_RETURN_SLOT				\
+	add	$(BITS_PER_LONG/8) * 2, %_ASM_SP;	\
+	dec	reg;					\
+	jnz	771b;					\
+	/* barrier for jnz misprediction */		\
+	lfence;
+
+/*
+ * Stuff a single RSB slot.
+ *
+ * To mitigate Post-Barrier RSB speculation, one CALL instruction must be
+ * forced to retire before letting a RET instruction execute.
+ *
+ * On PBRSB-vulnerable CPUs, it is not safe for a RET to be executed
+ * before this point.
+ */
+#define __FILL_ONE_RETURN				\
+	__FILL_RETURN_SLOT				\
+	add	$(BITS_PER_LONG/8), %_ASM_SP;		\
 	lfence;
 
 #ifdef __ASSEMBLY__
@@ -120,28 +131,15 @@
 #endif
 .endm
 
-.macro ISSUE_UNBALANCED_RET_GUARD
-	ANNOTATE_INTRA_FUNCTION_CALL
-	call .Lunbalanced_ret_guard_\@
-	int3
-.Lunbalanced_ret_guard_\@:
-	add $(BITS_PER_LONG/8), %_ASM_SP
-	lfence
-.endm
-
  /*
   * A simpler FILL_RETURN_BUFFER macro. Don't make people use the CPP
   * monstrosity above, manually.
   */
-.macro FILL_RETURN_BUFFER reg:req nr:req ftr:req ftr2
-.ifb \ftr2
-	ALTERNATIVE "jmp .Lskip_rsb_\@", "", \ftr
-.else
-	ALTERNATIVE_2 "jmp .Lskip_rsb_\@", "", \ftr, "jmp .Lunbalanced_\@", \ftr2
-.endif
-	__FILL_RETURN_BUFFER(\reg,\nr,%_ASM_SP)
-.Lunbalanced_\@:
-	ISSUE_UNBALANCED_RET_GUARD
+.macro FILL_RETURN_BUFFER reg:req nr:req ftr:req ftr2=ALT_NOT(X86_FEATURE_ALWAYS)
+	ALTERNATIVE_2 "jmp .Lskip_rsb_\@", \
+		__stringify(__FILL_RETURN_BUFFER(\reg,\nr)), \ftr, \
+		__stringify(__FILL_ONE_RETURN), \ftr2
+
 .Lskip_rsb_\@:
 .endm
 



^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH 5.19 109/158] x86/PAT: Have pat_enabled() properly reflect state when running on Xen
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (107 preceding siblings ...)
  2022-08-29 10:59 ` [PATCH 5.19 108/158] x86/nospec: Unwreck the RSB stuffing Greg Kroah-Hartman
@ 2022-08-29 10:59 ` Greg Kroah-Hartman
  2022-09-02 18:16   ` Chuck Zmudzinski
  2022-08-29 10:59 ` [PATCH 5.19 110/158] loop: Check for overflow while configuring loop Greg Kroah-Hartman
                   ` (61 subsequent siblings)
  170 siblings, 1 reply; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:59 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Jan Beulich, Borislav Petkov,
	Ingo Molnar, Juergen Gross, Lucas De Marchi

From: Jan Beulich <jbeulich@suse.com>

commit 72cbc8f04fe2fa93443c0fcccb7ad91dfea3d9ce upstream.

After commit ID in the Fixes: tag, pat_enabled() returns false (because
of PAT initialization being suppressed in the absence of MTRRs being
announced to be available).

This has become a problem: the i915 driver now fails to initialize when
running PV on Xen (i915_gem_object_pin_map() is where I located the
induced failure), and its error handling is flaky enough to (at least
sometimes) result in a hung system.

Yet even beyond that problem the keying of the use of WC mappings to
pat_enabled() (see arch_can_pci_mmap_wc()) means that in particular
graphics frame buffer accesses would have been quite a bit less optimal
than possible.

Arrange for the function to return true in such environments, without
undermining the rest of PAT MSR management logic considering PAT to be
disabled: specifically, no writes to the PAT MSR should occur.

For the new boolean to live in .init.data, init_cache_modes() also needs
moving to .init.text (where it could/should have lived already before).

  [ bp: This is the "small fix" variant for stable. It'll get replaced
    with a proper PAT and MTRR detection split upstream but that is too
    involved for a stable backport.
    - additional touchups to commit msg. Use cpu_feature_enabled(). ]

Fixes: bdd8b6c98239 ("drm/i915: replace X86_FEATURE_PAT with pat_enabled()")
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: <stable@vger.kernel.org>
Cc: Juergen Gross <jgross@suse.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/9385fa60-fa5d-f559-a137-6608408f88b0@suse.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/x86/mm/pat/memtype.c |   10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

--- a/arch/x86/mm/pat/memtype.c
+++ b/arch/x86/mm/pat/memtype.c
@@ -62,6 +62,7 @@
 
 static bool __read_mostly pat_bp_initialized;
 static bool __read_mostly pat_disabled = !IS_ENABLED(CONFIG_X86_PAT);
+static bool __initdata pat_force_disabled = !IS_ENABLED(CONFIG_X86_PAT);
 static bool __read_mostly pat_bp_enabled;
 static bool __read_mostly pat_cm_initialized;
 
@@ -86,6 +87,7 @@ void pat_disable(const char *msg_reason)
 static int __init nopat(char *str)
 {
 	pat_disable("PAT support disabled via boot option.");
+	pat_force_disabled = true;
 	return 0;
 }
 early_param("nopat", nopat);
@@ -272,7 +274,7 @@ static void pat_ap_init(u64 pat)
 	wrmsrl(MSR_IA32_CR_PAT, pat);
 }
 
-void init_cache_modes(void)
+void __init init_cache_modes(void)
 {
 	u64 pat = 0;
 
@@ -313,6 +315,12 @@ void init_cache_modes(void)
 		 */
 		pat = PAT(0, WB) | PAT(1, WT) | PAT(2, UC_MINUS) | PAT(3, UC) |
 		      PAT(4, WB) | PAT(5, WT) | PAT(6, UC_MINUS) | PAT(7, UC);
+	} else if (!pat_force_disabled && cpu_feature_enabled(X86_FEATURE_HYPERVISOR)) {
+		/*
+		 * Clearly PAT is enabled underneath. Allow pat_enabled() to
+		 * reflect this.
+		 */
+		pat_bp_enabled = true;
 	}
 
 	__init_cache_modes(pat);



^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH 5.19 110/158] loop: Check for overflow while configuring loop
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (108 preceding siblings ...)
  2022-08-29 10:59 ` [PATCH 5.19 109/158] x86/PAT: Have pat_enabled() properly reflect state when running on Xen Greg Kroah-Hartman
@ 2022-08-29 10:59 ` Greg Kroah-Hartman
  2022-08-29 10:59 ` [PATCH 5.19 111/158] writeback: avoid use-after-free after removing device Greg Kroah-Hartman
                   ` (60 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:59 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Matthew Wilcox (Oracle),
	Siddh Raman Pant, Christoph Hellwig, Jens Axboe,
	syzbot+a8e049cd3abd342936b6

From: Siddh Raman Pant <code@siddh.me>

commit c490a0b5a4f36da3918181a8acdc6991d967c5f3 upstream.

The userspace can configure a loop using an ioctl call, wherein
a configuration of type loop_config is passed (see lo_ioctl()'s
case on line 1550 of drivers/block/loop.c). This proceeds to call
loop_configure() which in turn calls loop_set_status_from_info()
(see line 1050 of loop.c), passing &config->info which is of type
loop_info64*. This function then sets the appropriate values, like
the offset.

loop_device has lo_offset of type loff_t (see line 52 of loop.c),
which is typdef-chained to long long, whereas loop_info64 has
lo_offset of type __u64 (see line 56 of include/uapi/linux/loop.h).

The function directly copies offset from info to the device as
follows (See line 980 of loop.c):
	lo->lo_offset = info->lo_offset;

This results in an overflow, which triggers a warning in iomap_iter()
due to a call to iomap_iter_done() which has:
	WARN_ON_ONCE(iter->iomap.offset > iter->pos);

Thus, check for negative value during loop_set_status_from_info().

Bug report: https://syzkaller.appspot.com/bug?id=c620fe14aac810396d3c3edc9ad73848bf69a29e

Reported-and-tested-by: syzbot+a8e049cd3abd342936b6@syzkaller.appspotmail.com
Cc: stable@vger.kernel.org
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Siddh Raman Pant <code@siddh.me>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20220823160810.181275-1-code@siddh.me
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/block/loop.c |    5 +++++
 1 file changed, 5 insertions(+)

--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -979,6 +979,11 @@ loop_set_status_from_info(struct loop_de
 
 	lo->lo_offset = info->lo_offset;
 	lo->lo_sizelimit = info->lo_sizelimit;
+
+	/* loff_t vars have been assigned __u64 */
+	if (lo->lo_offset < 0 || lo->lo_sizelimit < 0)
+		return -EOVERFLOW;
+
 	memcpy(lo->lo_file_name, info->lo_file_name, LO_NAME_SIZE);
 	lo->lo_file_name[LO_NAME_SIZE-1] = 0;
 	lo->lo_flags = info->lo_flags;



^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH 5.19 111/158] writeback: avoid use-after-free after removing device
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (109 preceding siblings ...)
  2022-08-29 10:59 ` [PATCH 5.19 110/158] loop: Check for overflow while configuring loop Greg Kroah-Hartman
@ 2022-08-29 10:59 ` Greg Kroah-Hartman
  2022-08-29 10:59 ` [PATCH 5.19 112/158] audit: move audit_return_fixup before the filters Greg Kroah-Hartman
                   ` (59 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:59 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Khazhismel Kumykov, Jan Kara,
	Michael Stapelberg, Wu Fengguang, Alexander Viro, Andrew Morton

From: Khazhismel Kumykov <khazhy@chromium.org>

commit f87904c075515f3e1d8f4a7115869d3b914674fd upstream.

When a disk is removed, bdi_unregister gets called to stop further
writeback and wait for associated delayed work to complete.  However,
wb_inode_writeback_end() may schedule bandwidth estimation dwork after
this has completed, which can result in the timer attempting to access the
just freed bdi_writeback.

Fix this by checking if the bdi_writeback is alive, similar to when
scheduling writeback work.

Since this requires wb->work_lock, and wb_inode_writeback_end() may get
called from interrupt, switch wb->work_lock to an irqsafe lock.

Link: https://lkml.kernel.org/r/20220801155034.3772543-1-khazhy@google.com
Fixes: 45a2966fd641 ("writeback: fix bandwidth estimate for spiky workload")
Signed-off-by: Khazhismel Kumykov <khazhy@google.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Michael Stapelberg <stapelberg+linux@google.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 fs/fs-writeback.c   |   12 ++++++------
 mm/backing-dev.c    |   10 +++++-----
 mm/page-writeback.c |    6 +++++-
 3 files changed, 16 insertions(+), 12 deletions(-)

--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -134,10 +134,10 @@ static bool inode_io_list_move_locked(st
 
 static void wb_wakeup(struct bdi_writeback *wb)
 {
-	spin_lock_bh(&wb->work_lock);
+	spin_lock_irq(&wb->work_lock);
 	if (test_bit(WB_registered, &wb->state))
 		mod_delayed_work(bdi_wq, &wb->dwork, 0);
-	spin_unlock_bh(&wb->work_lock);
+	spin_unlock_irq(&wb->work_lock);
 }
 
 static void finish_writeback_work(struct bdi_writeback *wb,
@@ -164,7 +164,7 @@ static void wb_queue_work(struct bdi_wri
 	if (work->done)
 		atomic_inc(&work->done->cnt);
 
-	spin_lock_bh(&wb->work_lock);
+	spin_lock_irq(&wb->work_lock);
 
 	if (test_bit(WB_registered, &wb->state)) {
 		list_add_tail(&work->list, &wb->work_list);
@@ -172,7 +172,7 @@ static void wb_queue_work(struct bdi_wri
 	} else
 		finish_writeback_work(wb, work);
 
-	spin_unlock_bh(&wb->work_lock);
+	spin_unlock_irq(&wb->work_lock);
 }
 
 /**
@@ -2082,13 +2082,13 @@ static struct wb_writeback_work *get_nex
 {
 	struct wb_writeback_work *work = NULL;
 
-	spin_lock_bh(&wb->work_lock);
+	spin_lock_irq(&wb->work_lock);
 	if (!list_empty(&wb->work_list)) {
 		work = list_entry(wb->work_list.next,
 				  struct wb_writeback_work, list);
 		list_del_init(&work->list);
 	}
-	spin_unlock_bh(&wb->work_lock);
+	spin_unlock_irq(&wb->work_lock);
 	return work;
 }
 
--- a/mm/backing-dev.c
+++ b/mm/backing-dev.c
@@ -260,10 +260,10 @@ void wb_wakeup_delayed(struct bdi_writeb
 	unsigned long timeout;
 
 	timeout = msecs_to_jiffies(dirty_writeback_interval * 10);
-	spin_lock_bh(&wb->work_lock);
+	spin_lock_irq(&wb->work_lock);
 	if (test_bit(WB_registered, &wb->state))
 		queue_delayed_work(bdi_wq, &wb->dwork, timeout);
-	spin_unlock_bh(&wb->work_lock);
+	spin_unlock_irq(&wb->work_lock);
 }
 
 static void wb_update_bandwidth_workfn(struct work_struct *work)
@@ -334,12 +334,12 @@ static void cgwb_remove_from_bdi_list(st
 static void wb_shutdown(struct bdi_writeback *wb)
 {
 	/* Make sure nobody queues further work */
-	spin_lock_bh(&wb->work_lock);
+	spin_lock_irq(&wb->work_lock);
 	if (!test_and_clear_bit(WB_registered, &wb->state)) {
-		spin_unlock_bh(&wb->work_lock);
+		spin_unlock_irq(&wb->work_lock);
 		return;
 	}
-	spin_unlock_bh(&wb->work_lock);
+	spin_unlock_irq(&wb->work_lock);
 
 	cgwb_remove_from_bdi_list(wb);
 	/*
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -2867,6 +2867,7 @@ static void wb_inode_writeback_start(str
 
 static void wb_inode_writeback_end(struct bdi_writeback *wb)
 {
+	unsigned long flags;
 	atomic_dec(&wb->writeback_inodes);
 	/*
 	 * Make sure estimate of writeback throughput gets updated after
@@ -2875,7 +2876,10 @@ static void wb_inode_writeback_end(struc
 	 * that if multiple inodes end writeback at a similar time, they get
 	 * batched into one bandwidth update.
 	 */
-	queue_delayed_work(bdi_wq, &wb->bw_dwork, BANDWIDTH_INTERVAL);
+	spin_lock_irqsave(&wb->work_lock, flags);
+	if (test_bit(WB_registered, &wb->state))
+		queue_delayed_work(bdi_wq, &wb->bw_dwork, BANDWIDTH_INTERVAL);
+	spin_unlock_irqrestore(&wb->work_lock, flags);
 }
 
 bool __folio_end_writeback(struct folio *folio)



^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH 5.19 112/158] audit: move audit_return_fixup before the filters
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (110 preceding siblings ...)
  2022-08-29 10:59 ` [PATCH 5.19 111/158] writeback: avoid use-after-free after removing device Greg Kroah-Hartman
@ 2022-08-29 10:59 ` Greg Kroah-Hartman
  2022-08-29 10:59 ` [PATCH 5.19 113/158] asm-generic: sections: refactor memory_intersects Greg Kroah-Hartman
                   ` (58 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:59 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Richard Guy Briggs, Paul Moore

From: Richard Guy Briggs <rgb@redhat.com>

commit d4fefa4801a1c2f9c0c7a48fbb0fdf384e89a4ab upstream.

The success and return_code are needed by the filters.  Move
audit_return_fixup() before the filters.  This was causing syscall
auditing events to be missed.

Link: https://github.com/linux-audit/audit-kernel/issues/138
Cc: stable@vger.kernel.org
Fixes: 12c5e81d3fd0 ("audit: prepare audit_context for use in calling contexts beyond syscalls")
Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
[PM: manual merge required]
Signed-off-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 kernel/auditsc.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/kernel/auditsc.c
+++ b/kernel/auditsc.c
@@ -1965,6 +1965,7 @@ void __audit_uring_exit(int success, lon
 		goto out;
 	}
 
+	audit_return_fixup(ctx, success, code);
 	if (ctx->context == AUDIT_CTX_SYSCALL) {
 		/*
 		 * NOTE: See the note in __audit_uring_entry() about the case
@@ -2006,7 +2007,6 @@ void __audit_uring_exit(int success, lon
 	audit_filter_inodes(current, ctx);
 	if (ctx->current_state != AUDIT_STATE_RECORD)
 		goto out;
-	audit_return_fixup(ctx, success, code);
 	audit_log_exit();
 
 out:
@@ -2090,13 +2090,13 @@ void __audit_syscall_exit(int success, l
 	if (!list_empty(&context->killed_trees))
 		audit_kill_trees(context);
 
+	audit_return_fixup(context, success, return_code);
 	/* run through both filters to ensure we set the filterkey properly */
 	audit_filter_syscall(current, context);
 	audit_filter_inodes(current, context);
 	if (context->current_state < AUDIT_STATE_RECORD)
 		goto out;
 
-	audit_return_fixup(context, success, return_code);
 	audit_log_exit();
 
 out:



^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH 5.19 113/158] asm-generic: sections: refactor memory_intersects
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (111 preceding siblings ...)
  2022-08-29 10:59 ` [PATCH 5.19 112/158] audit: move audit_return_fixup before the filters Greg Kroah-Hartman
@ 2022-08-29 10:59 ` Greg Kroah-Hartman
  2022-08-29 10:59 ` [PATCH 5.19 114/158] mm/damon/dbgfs: avoid duplicate context directory creation Greg Kroah-Hartman
                   ` (57 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:59 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Quanyang Wang, Ard Biesheuvel,
	Arnd Bergmann, Thierry Reding, Andrew Morton

From: Quanyang Wang <quanyang.wang@windriver.com>

commit 0c7d7cc2b4fe2e74ef8728f030f0f1674f9f6aee upstream.

There are two problems with the current code of memory_intersects:

First, it doesn't check whether the region (begin, end) falls inside the
region (virt, vend), that is (virt < begin && vend > end).

The second problem is if vend is equal to begin, it will return true but
this is wrong since vend (virt + size) is not the last address of the
memory region but (virt + size -1) is.  The wrong determination will
trigger the misreporting when the function check_for_illegal_area calls
memory_intersects to check if the dma region intersects with stext region.

The misreporting is as below (stext is at 0x80100000):
 WARNING: CPU: 0 PID: 77 at kernel/dma/debug.c:1073 check_for_illegal_area+0x130/0x168
 DMA-API: chipidea-usb2 e0002000.usb: device driver maps memory from kernel text or rodata [addr=800f0000] [len=65536]
 Modules linked in:
 CPU: 1 PID: 77 Comm: usb-storage Not tainted 5.19.0-yocto-standard #5
 Hardware name: Xilinx Zynq Platform
  unwind_backtrace from show_stack+0x18/0x1c
  show_stack from dump_stack_lvl+0x58/0x70
  dump_stack_lvl from __warn+0xb0/0x198
  __warn from warn_slowpath_fmt+0x80/0xb4
  warn_slowpath_fmt from check_for_illegal_area+0x130/0x168
  check_for_illegal_area from debug_dma_map_sg+0x94/0x368
  debug_dma_map_sg from __dma_map_sg_attrs+0x114/0x128
  __dma_map_sg_attrs from dma_map_sg_attrs+0x18/0x24
  dma_map_sg_attrs from usb_hcd_map_urb_for_dma+0x250/0x3b4
  usb_hcd_map_urb_for_dma from usb_hcd_submit_urb+0x194/0x214
  usb_hcd_submit_urb from usb_sg_wait+0xa4/0x118
  usb_sg_wait from usb_stor_bulk_transfer_sglist+0xa0/0xec
  usb_stor_bulk_transfer_sglist from usb_stor_bulk_srb+0x38/0x70
  usb_stor_bulk_srb from usb_stor_Bulk_transport+0x150/0x360
  usb_stor_Bulk_transport from usb_stor_invoke_transport+0x38/0x440
  usb_stor_invoke_transport from usb_stor_control_thread+0x1e0/0x238
  usb_stor_control_thread from kthread+0xf8/0x104
  kthread from ret_from_fork+0x14/0x2c

Refactor memory_intersects to fix the two problems above.

Before the 1d7db834a027e ("dma-debug: use memory_intersects()
directly"), memory_intersects is called only by printk_late_init:

printk_late_init -> init_section_intersects ->memory_intersects.

There were few places where memory_intersects was called.

When commit 1d7db834a027e ("dma-debug: use memory_intersects()
directly") was merged and CONFIG_DMA_API_DEBUG is enabled, the DMA
subsystem uses it to check for an illegal area and the calltrace above
is triggered.

[akpm@linux-foundation.org: fix nearby comment typo]
Link: https://lkml.kernel.org/r/20220819081145.948016-1-quanyang.wang@windriver.com
Fixes: 979559362516 ("asm/sections: add helpers to check for section data")
Signed-off-by: Quanyang Wang <quanyang.wang@windriver.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Thierry Reding <treding@nvidia.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 include/asm-generic/sections.h |    7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

--- a/include/asm-generic/sections.h
+++ b/include/asm-generic/sections.h
@@ -97,7 +97,7 @@ static inline bool memory_contains(void
 /**
  * memory_intersects - checks if the region occupied by an object intersects
  *                     with another memory region
- * @begin: virtual address of the beginning of the memory regien
+ * @begin: virtual address of the beginning of the memory region
  * @end: virtual address of the end of the memory region
  * @virt: virtual address of the memory object
  * @size: size of the memory object
@@ -110,7 +110,10 @@ static inline bool memory_intersects(voi
 {
 	void *vend = virt + size;
 
-	return (virt >= begin && virt < end) || (vend >= begin && vend < end);
+	if (virt < end && vend > begin)
+		return true;
+
+	return false;
 }
 
 /**



^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH 5.19 114/158] mm/damon/dbgfs: avoid duplicate context directory creation
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (112 preceding siblings ...)
  2022-08-29 10:59 ` [PATCH 5.19 113/158] asm-generic: sections: refactor memory_intersects Greg Kroah-Hartman
@ 2022-08-29 10:59 ` Greg Kroah-Hartman
  2022-08-29 10:59 ` [PATCH 5.19 115/158] s390/mm: do not trigger write fault when vma does not allow VM_WRITE Greg Kroah-Hartman
                   ` (56 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:59 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Badari Pulavarty, SeongJae Park,
	Andrew Morton

From: Badari Pulavarty <badari.pulavarty@intel.com>

commit d26f60703606ab425eee9882b32a1781a8bed74d upstream.

When user tries to create a DAMON context via the DAMON debugfs interface
with a name of an already existing context, the context directory creation
fails but a new context is created and added in the internal data
structure, due to absence of the directory creation success check.  As a
result, memory could leak and DAMON cannot be turned on.  An example test
case is as below:

    # cd /sys/kernel/debug/damon/
    # echo "off" >  monitor_on
    # echo paddr > target_ids
    # echo "abc" > mk_context
    # echo "abc" > mk_context
    # echo $$ > abc/target_ids
    # echo "on" > monitor_on  <<< fails

Return value of 'debugfs_create_dir()' is expected to be ignored in
general, but this is an exceptional case as DAMON feature is depending
on the debugfs functionality and it has the potential duplicate name
issue.  This commit therefore fixes the issue by checking the directory
creation failure and immediately return the error in the case.

Link: https://lkml.kernel.org/r/20220821180853.2400-1-sj@kernel.org
Fixes: 75c1c2b53c78 ("mm/damon/dbgfs: support multiple contexts")
Signed-off-by: Badari Pulavarty <badari.pulavarty@intel.com>
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: <stable@vger.kernel.org>	[ 5.15.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 mm/damon/dbgfs.c |    3 +++
 1 file changed, 3 insertions(+)

--- a/mm/damon/dbgfs.c
+++ b/mm/damon/dbgfs.c
@@ -787,6 +787,9 @@ static int dbgfs_mk_context(char *name)
 		return -ENOENT;
 
 	new_dir = debugfs_create_dir(name, root);
+	/* Below check is required for a potential duplicated name case */
+	if (IS_ERR(new_dir))
+		return PTR_ERR(new_dir);
 	dbgfs_dirs[dbgfs_nr_ctxs] = new_dir;
 
 	new_ctx = dbgfs_new_ctx();



^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH 5.19 115/158] s390/mm: do not trigger write fault when vma does not allow VM_WRITE
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (113 preceding siblings ...)
  2022-08-29 10:59 ` [PATCH 5.19 114/158] mm/damon/dbgfs: avoid duplicate context directory creation Greg Kroah-Hartman
@ 2022-08-29 10:59 ` Greg Kroah-Hartman
  2022-08-29 10:59 ` [PATCH 5.19 116/158] bootmem: remove the vmemmap pages from kmemleak in put_page_bootmem Greg Kroah-Hartman
                   ` (55 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:59 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, David Hildenbrand, Heiko Carstens,
	Gerald Schaefer, Vasily Gorbik

From: Gerald Schaefer <gerald.schaefer@linux.ibm.com>

commit 41ac42f137080bc230b5882e3c88c392ab7f2d32 upstream.

For non-protection pXd_none() page faults in do_dat_exception(), we
call do_exception() with access == (VM_READ | VM_WRITE | VM_EXEC).
In do_exception(), vma->vm_flags is checked against that before
calling handle_mm_fault().

Since commit 92f842eac7ee3 ("[S390] store indication fault optimization"),
we call handle_mm_fault() with FAULT_FLAG_WRITE, when recognizing that
it was a write access. However, the vma flags check is still only
checking against (VM_READ | VM_WRITE | VM_EXEC), and therefore also
calling handle_mm_fault() with FAULT_FLAG_WRITE in cases where the vma
does not allow VM_WRITE.

Fix this by changing access check in do_exception() to VM_WRITE only,
when recognizing write access.

Link: https://lkml.kernel.org/r/20220811103435.188481-3-david@redhat.com
Fixes: 92f842eac7ee3 ("[S390] store indication fault optimization")
Cc: <stable@vger.kernel.org>
Reported-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/s390/mm/fault.c |    4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

--- a/arch/s390/mm/fault.c
+++ b/arch/s390/mm/fault.c
@@ -379,7 +379,9 @@ static inline vm_fault_t do_exception(st
 	flags = FAULT_FLAG_DEFAULT;
 	if (user_mode(regs))
 		flags |= FAULT_FLAG_USER;
-	if (access == VM_WRITE || is_write)
+	if (is_write)
+		access = VM_WRITE;
+	if (access == VM_WRITE)
 		flags |= FAULT_FLAG_WRITE;
 	mmap_read_lock(mm);
 



^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH 5.19 116/158] bootmem: remove the vmemmap pages from kmemleak in put_page_bootmem
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (114 preceding siblings ...)
  2022-08-29 10:59 ` [PATCH 5.19 115/158] s390/mm: do not trigger write fault when vma does not allow VM_WRITE Greg Kroah-Hartman
@ 2022-08-29 10:59 ` Greg Kroah-Hartman
  2022-08-29 10:59 ` [PATCH 5.19 117/158] mm/hugetlb: avoid corrupting page->mapping in hugetlb_mcopy_atomic_pte Greg Kroah-Hartman
                   ` (54 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:59 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Liu Shixin, Muchun Song,
	Matthew Wilcox, Mike Kravetz, Oscar Salvador, Andrew Morton

From: Liu Shixin <liushixin2@huawei.com>

commit dd0ff4d12dd284c334f7e9b07f8f335af856ac78 upstream.

The vmemmap pages is marked by kmemleak when allocated from memblock.
Remove it from kmemleak when freeing the page.  Otherwise, when we reuse
the page, kmemleak may report such an error and then stop working.

 kmemleak: Cannot insert 0xffff98fb6eab3d40 into the object search tree (overlaps existing)
 kmemleak: Kernel memory leak detector disabled
 kmemleak: Object 0xffff98fb6be00000 (size 335544320):
 kmemleak:   comm "swapper", pid 0, jiffies 4294892296
 kmemleak:   min_count = 0
 kmemleak:   count = 0
 kmemleak:   flags = 0x1
 kmemleak:   checksum = 0
 kmemleak:   backtrace:

Link: https://lkml.kernel.org/r/20220819094005.2928241-1-liushixin2@huawei.com
Fixes: f41f2ed43ca5 (mm: hugetlb: free the vmemmap pages associated with each HugeTLB page)
Signed-off-by: Liu Shixin <liushixin2@huawei.com>
Reviewed-by: Muchun Song <songmuchun@bytedance.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 mm/bootmem_info.c |    2 ++
 1 file changed, 2 insertions(+)

--- a/mm/bootmem_info.c
+++ b/mm/bootmem_info.c
@@ -12,6 +12,7 @@
 #include <linux/memblock.h>
 #include <linux/bootmem_info.h>
 #include <linux/memory_hotplug.h>
+#include <linux/kmemleak.h>
 
 void get_page_bootmem(unsigned long info, struct page *page, unsigned long type)
 {
@@ -33,6 +34,7 @@ void put_page_bootmem(struct page *page)
 		ClearPagePrivate(page);
 		set_page_private(page, 0);
 		INIT_LIST_HEAD(&page->lru);
+		kmemleak_free_part(page_to_virt(page), PAGE_SIZE);
 		free_reserved_page(page);
 	}
 }



^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH 5.19 117/158] mm/hugetlb: avoid corrupting page->mapping in hugetlb_mcopy_atomic_pte
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (115 preceding siblings ...)
  2022-08-29 10:59 ` [PATCH 5.19 116/158] bootmem: remove the vmemmap pages from kmemleak in put_page_bootmem Greg Kroah-Hartman
@ 2022-08-29 10:59 ` Greg Kroah-Hartman
  2022-08-29 10:59 ` [PATCH 5.19 118/158] mm/mprotect: only reference swap pfn page if type match Greg Kroah-Hartman
                   ` (53 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:59 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Miaohe Lin, Mike Kravetz,
	Axel Rasmussen, Peter Xu, Andrew Morton

From: Miaohe Lin <linmiaohe@huawei.com>

commit ab74ef708dc51df7cf2b8a890b9c6990fac5c0c6 upstream.

In MCOPY_ATOMIC_CONTINUE case with a non-shared VMA, pages in the page
cache are installed in the ptes.  But hugepage_add_new_anon_rmap is called
for them mistakenly because they're not vm_shared.  This will corrupt the
page->mapping used by page cache code.

Link: https://lkml.kernel.org/r/20220712130542.18836-1-linmiaohe@huawei.com
Fixes: f619147104c8 ("userfaultfd: add UFFDIO_CONTINUE ioctl")
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 mm/hugetlb.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -6026,7 +6026,7 @@ int hugetlb_mcopy_atomic_pte(struct mm_s
 	if (!huge_pte_none_mostly(huge_ptep_get(dst_pte)))
 		goto out_release_unlock;
 
-	if (vm_shared) {
+	if (page_in_pagecache) {
 		page_dup_file_rmap(page, true);
 	} else {
 		ClearHPageRestoreReserve(page);



^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH 5.19 118/158] mm/mprotect: only reference swap pfn page if type match
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (116 preceding siblings ...)
  2022-08-29 10:59 ` [PATCH 5.19 117/158] mm/hugetlb: avoid corrupting page->mapping in hugetlb_mcopy_atomic_pte Greg Kroah-Hartman
@ 2022-08-29 10:59 ` Greg Kroah-Hartman
  2022-08-29 10:59 ` [PATCH 5.19 119/158] cifs: skip extra NULL byte in filenames Greg Kroah-Hartman
                   ` (52 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:59 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Peter Xu, Yu Zhao, David Hildenbrand,
	Huang, Ying, Andrew Morton

From: Peter Xu <peterx@redhat.com>

commit 3d2f78f08cd8388035ac375e731ec1ac1b79b09d upstream.

Yu Zhao reported a bug after the commit "mm/swap: Add swp_offset_pfn() to
fetch PFN from swap entry" added a check in swp_offset_pfn() for swap type [1]:

  kernel BUG at include/linux/swapops.h:117!
  CPU: 46 PID: 5245 Comm: EventManager_De Tainted: G S         O L 6.0.0-dbg-DEV #2
  RIP: 0010:pfn_swap_entry_to_page+0x72/0xf0
  Code: c6 48 8b 36 48 83 fe ff 74 53 48 01 d1 48 83 c1 08 48 8b 09 f6
  c1 01 75 7b 66 90 48 89 c1 48 8b 09 f6 c1 01 74 74 5d c3 eb 9e <0f> 0b
  48 ba ff ff ff ff 03 00 00 00 eb ae a9 ff 0f 00 00 75 13 48
  RSP: 0018:ffffa59e73fabb80 EFLAGS: 00010282
  RAX: 00000000ffffffe8 RBX: 0c00000000000000 RCX: ffffcd5440000000
  RDX: 1ffffffffff7a80a RSI: 0000000000000000 RDI: 0c0000000000042b
  RBP: ffffa59e73fabb80 R08: ffff9965ca6e8bb8 R09: 0000000000000000
  R10: ffffffffa5a2f62d R11: 0000030b372e9fff R12: ffff997b79db5738
  R13: 000000000000042b R14: 0c0000000000042b R15: 1ffffffffff7a80a
  FS:  00007f549d1bb700(0000) GS:ffff99d3cf680000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 0000440d035b3180 CR3: 0000002243176004 CR4: 00000000003706e0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  Call Trace:
   <TASK>
   change_pte_range+0x36e/0x880
   change_p4d_range+0x2e8/0x670
   change_protection_range+0x14e/0x2c0
   mprotect_fixup+0x1ee/0x330
   do_mprotect_pkey+0x34c/0x440
   __x64_sys_mprotect+0x1d/0x30

It triggers because pfn_swap_entry_to_page() could be called upon e.g. a
genuine swap entry.

Fix it by only calling it when it's a write migration entry where the page*
is used.

[1] https://lore.kernel.org/lkml/CAOUHufaVC2Za-p8m0aiHw6YkheDcrO-C3wRGixwDS32VTS+k1w@mail.gmail.com/

Link: https://lkml.kernel.org/r/20220823221138.45602-1-peterx@redhat.com
Fixes: 6c287605fd56 ("mm: remember exclusively mapped anonymous pages with PG_anon_exclusive")
Signed-off-by: Peter Xu <peterx@redhat.com>
Reported-by: Yu Zhao <yuzhao@google.com>
Tested-by: Yu Zhao <yuzhao@google.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: "Huang, Ying" <ying.huang@intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 mm/mprotect.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -158,10 +158,11 @@ static unsigned long change_pte_range(st
 			pages++;
 		} else if (is_swap_pte(oldpte)) {
 			swp_entry_t entry = pte_to_swp_entry(oldpte);
-			struct page *page = pfn_swap_entry_to_page(entry);
 			pte_t newpte;
 
 			if (is_writable_migration_entry(entry)) {
+				struct page *page = pfn_swap_entry_to_page(entry);
+
 				/*
 				 * A protection check is difficult so
 				 * just be safe and disable write



^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH 5.19 119/158] cifs: skip extra NULL byte in filenames
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (117 preceding siblings ...)
  2022-08-29 10:59 ` [PATCH 5.19 118/158] mm/mprotect: only reference swap pfn page if type match Greg Kroah-Hartman
@ 2022-08-29 10:59 ` Greg Kroah-Hartman
  2022-08-29 10:59 ` [PATCH 5.19 120/158] s390: fix double free of GS and RI CBs on fork() failure Greg Kroah-Hartman
                   ` (51 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:59 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Paulo Alcantara (SUSE), Steve French

From: Paulo Alcantara <pc@cjr.nz>

commit a1d2eb51f0a33c28f5399a1610e66b3fbd24e884 upstream.

Since commit:
 cifs: alloc_path_with_tree_prefix: do not append sep. if the path is empty
alloc_path_with_tree_prefix() function was no longer including the
trailing separator when @path is empty, although @out_len was still
assuming a path separator thus adding an extra byte to the final
filename.

This has caused mount issues in some Synology servers due to the extra
NULL byte in filenames when sending SMB2_CREATE requests with
SMB2_FLAGS_DFS_OPERATIONS set.

Fix this by checking if @path is not empty and then add extra byte for
separator.  Also, do not include any trailing NULL bytes in filename
as MS-SMB2 requires it to be 8-byte aligned and not NULL terminated.

Cc: stable@vger.kernel.org
Fixes: 7eacba3b00a3 ("cifs: alloc_path_with_tree_prefix: do not append sep. if the path is empty")
Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 fs/cifs/smb2pdu.c |   16 ++++++----------
 1 file changed, 6 insertions(+), 10 deletions(-)

--- a/fs/cifs/smb2pdu.c
+++ b/fs/cifs/smb2pdu.c
@@ -2571,19 +2571,15 @@ alloc_path_with_tree_prefix(__le16 **out
 
 	path_len = UniStrnlen((wchar_t *)path, PATH_MAX);
 
-	/*
-	 * make room for one path separator between the treename and
-	 * path
-	 */
-	*out_len = treename_len + 1 + path_len;
+	/* make room for one path separator only if @path isn't empty */
+	*out_len = treename_len + (path[0] ? 1 : 0) + path_len;
 
 	/*
-	 * final path needs to be null-terminated UTF16 with a
-	 * size aligned to 8
+	 * final path needs to be 8-byte aligned as specified in
+	 * MS-SMB2 2.2.13 SMB2 CREATE Request.
 	 */
-
-	*out_size = roundup((*out_len+1)*2, 8);
-	*out_path = kzalloc(*out_size, GFP_KERNEL);
+	*out_size = roundup(*out_len * sizeof(__le16), 8);
+	*out_path = kzalloc(*out_size + sizeof(__le16) /* null */, GFP_KERNEL);
 	if (!*out_path)
 		return -ENOMEM;
 



^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH 5.19 120/158] s390: fix double free of GS and RI CBs on fork() failure
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (118 preceding siblings ...)
  2022-08-29 10:59 ` [PATCH 5.19 119/158] cifs: skip extra NULL byte in filenames Greg Kroah-Hartman
@ 2022-08-29 10:59 ` Greg Kroah-Hartman
  2022-08-29 10:59 ` [PATCH 5.19 121/158] fbdev: fbcon: Properly revert changes when vc_resize() failed Greg Kroah-Hartman
                   ` (50 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:59 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Brian Foster, Gerald Schaefer,
	Heiko Carstens, Vasily Gorbik

From: Brian Foster <bfoster@redhat.com>

commit 13cccafe0edcd03bf1c841de8ab8a1c8e34f77d9 upstream.

The pointers for guarded storage and runtime instrumentation control
blocks are stored in the thread_struct of the associated task. These
pointers are initially copied on fork() via arch_dup_task_struct()
and then cleared via copy_thread() before fork() returns. If fork()
happens to fail after the initial task dup and before copy_thread(),
the newly allocated task and associated thread_struct memory are
freed via free_task() -> arch_release_task_struct(). This results in
a double free of the guarded storage and runtime info structs
because the fields in the failed task still refer to memory
associated with the source task.

This problem can manifest as a BUG_ON() in set_freepointer() (with
CONFIG_SLAB_FREELIST_HARDENED enabled) or KASAN splat (if enabled)
when running trinity syscall fuzz tests on s390x. To avoid this
problem, clear the associated pointer fields in
arch_dup_task_struct() immediately after the new task is copied.
Note that the RI flag is still cleared in copy_thread() because it
resides in thread stack memory and that is where stack info is
copied.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Fixes: 8d9047f8b967c ("s390/runtime instrumentation: simplify task exit handling")
Fixes: 7b83c6297d2fc ("s390/guarded storage: simplify task exit handling")
Cc: <stable@vger.kernel.org> # 4.15
Reviewed-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Link: https://lore.kernel.org/r/20220816155407.537372-1-bfoster@redhat.com
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/s390/kernel/process.c |   22 ++++++++++++++++------
 1 file changed, 16 insertions(+), 6 deletions(-)

--- a/arch/s390/kernel/process.c
+++ b/arch/s390/kernel/process.c
@@ -91,6 +91,18 @@ int arch_dup_task_struct(struct task_str
 
 	memcpy(dst, src, arch_task_struct_size);
 	dst->thread.fpu.regs = dst->thread.fpu.fprs;
+
+	/*
+	 * Don't transfer over the runtime instrumentation or the guarded
+	 * storage control block pointers. These fields are cleared here instead
+	 * of in copy_thread() to avoid premature freeing of associated memory
+	 * on fork() failure. Wait to clear the RI flag because ->stack still
+	 * refers to the source thread.
+	 */
+	dst->thread.ri_cb = NULL;
+	dst->thread.gs_cb = NULL;
+	dst->thread.gs_bc_cb = NULL;
+
 	return 0;
 }
 
@@ -150,13 +162,11 @@ int copy_thread(struct task_struct *p, c
 	frame->childregs.flags = 0;
 	if (new_stackp)
 		frame->childregs.gprs[15] = new_stackp;
-
-	/* Don't copy runtime instrumentation info */
-	p->thread.ri_cb = NULL;
+	/*
+	 * Clear the runtime instrumentation flag after the above childregs
+	 * copy. The CB pointer was already cleared in arch_dup_task_struct().
+	 */
 	frame->childregs.psw.mask &= ~PSW_MASK_RI;
-	/* Don't copy guarded storage control block */
-	p->thread.gs_cb = NULL;
-	p->thread.gs_bc_cb = NULL;
 
 	/* Set a new TLS ?  */
 	if (clone_flags & CLONE_SETTLS) {



^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH 5.19 121/158] fbdev: fbcon: Properly revert changes when vc_resize() failed
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (119 preceding siblings ...)
  2022-08-29 10:59 ` [PATCH 5.19 120/158] s390: fix double free of GS and RI CBs on fork() failure Greg Kroah-Hartman
@ 2022-08-29 10:59 ` Greg Kroah-Hartman
  2022-08-29 10:59 ` [PATCH 5.19 122/158] Revert "memcg: cleanup racy sum avoidance code" Greg Kroah-Hartman
                   ` (49 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:59 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, syzbot+a168dbeaaa7778273c1b,
	Shigeru Yoshida, Helge Deller

From: Shigeru Yoshida <syoshida@redhat.com>

commit a5a923038d70d2d4a86cb4e3f32625a5ee6e7e24 upstream.

fbcon_do_set_font() calls vc_resize() when font size is changed.
However, if if vc_resize() failed, current implementation doesn't
revert changes for font size, and this causes inconsistent state.

syzbot reported unable to handle page fault due to this issue [1].
syzbot's repro uses fault injection which cause failure for memory
allocation, so vc_resize() failed.

This patch fixes this issue by properly revert changes for font
related date when vc_resize() failed.

Link: https://syzkaller.appspot.com/bug?id=3443d3a1fa6d964dd7310a0cb1696d165a3e07c4 [1]
Reported-by: syzbot+a168dbeaaa7778273c1b@syzkaller.appspotmail.com
Signed-off-by: Shigeru Yoshida <syoshida@redhat.com>
Signed-off-by: Helge Deller <deller@gmx.de>
CC: stable@vger.kernel.org # 5.15+
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/video/fbdev/core/fbcon.c |   27 +++++++++++++++++++++++++--
 1 file changed, 25 insertions(+), 2 deletions(-)

--- a/drivers/video/fbdev/core/fbcon.c
+++ b/drivers/video/fbdev/core/fbcon.c
@@ -2402,15 +2402,21 @@ static int fbcon_do_set_font(struct vc_d
 	struct fb_info *info = fbcon_info_from_console(vc->vc_num);
 	struct fbcon_ops *ops = info->fbcon_par;
 	struct fbcon_display *p = &fb_display[vc->vc_num];
-	int resize;
+	int resize, ret, old_userfont, old_width, old_height, old_charcount;
 	char *old_data = NULL;
 
 	resize = (w != vc->vc_font.width) || (h != vc->vc_font.height);
 	if (p->userfont)
 		old_data = vc->vc_font.data;
 	vc->vc_font.data = (void *)(p->fontdata = data);
+	old_userfont = p->userfont;
 	if ((p->userfont = userfont))
 		REFCOUNT(data)++;
+
+	old_width = vc->vc_font.width;
+	old_height = vc->vc_font.height;
+	old_charcount = vc->vc_font.charcount;
+
 	vc->vc_font.width = w;
 	vc->vc_font.height = h;
 	vc->vc_font.charcount = charcount;
@@ -2426,7 +2432,9 @@ static int fbcon_do_set_font(struct vc_d
 		rows = FBCON_SWAP(ops->rotate, info->var.yres, info->var.xres);
 		cols /= w;
 		rows /= h;
-		vc_resize(vc, cols, rows);
+		ret = vc_resize(vc, cols, rows);
+		if (ret)
+			goto err_out;
 	} else if (con_is_visible(vc)
 		   && vc->vc_mode == KD_TEXT) {
 		fbcon_clear_margins(vc, 0);
@@ -2436,6 +2444,21 @@ static int fbcon_do_set_font(struct vc_d
 	if (old_data && (--REFCOUNT(old_data) == 0))
 		kfree(old_data - FONT_EXTRA_WORDS * sizeof(int));
 	return 0;
+
+err_out:
+	p->fontdata = old_data;
+	vc->vc_font.data = (void *)old_data;
+
+	if (userfont) {
+		p->userfont = old_userfont;
+		REFCOUNT(data)--;
+	}
+
+	vc->vc_font.width = old_width;
+	vc->vc_font.height = old_height;
+	vc->vc_font.charcount = old_charcount;
+
+	return ret;
 }
 
 /*



^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH 5.19 122/158] Revert "memcg: cleanup racy sum avoidance code"
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (120 preceding siblings ...)
  2022-08-29 10:59 ` [PATCH 5.19 121/158] fbdev: fbcon: Properly revert changes when vc_resize() failed Greg Kroah-Hartman
@ 2022-08-29 10:59 ` Greg Kroah-Hartman
  2022-08-29 10:59 ` [PATCH 5.19 123/158] shmem: update folio if shmem_replace_page() updates the page Greg Kroah-Hartman
                   ` (48 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:59 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Shakeel Butt, Michal Koutný,
	Johannes Weiner, Michal Hocko, Roman Gushchin, Muchun Song,
	David Hildenbrand, Yosry Ahmed, Greg Thelen, Andrew Morton

From: Shakeel Butt <shakeelb@google.com>

commit dbb16df6443c59e8a1ef21c2272fcf387d600ddf upstream.

This reverts commit 96e51ccf1af33e82f429a0d6baebba29c6448d0f.

Recently we started running the kernel with rstat infrastructure on
production traffic and begin to see negative memcg stats values.
Particularly the 'sock' stat is the one which we observed having negative
value.

$ grep "sock " /mnt/memory/job/memory.stat
sock 253952
total_sock 18446744073708724224

Re-run after couple of seconds

$ grep "sock " /mnt/memory/job/memory.stat
sock 253952
total_sock 53248

For now we are only seeing this issue on large machines (256 CPUs) and
only with 'sock' stat.  I think the networking stack increase the stat on
one cpu and decrease it on another cpu much more often.  So, this negative
sock is due to rstat flusher flushing the stats on the CPU that has seen
the decrement of sock but missed the CPU that has increments.  A typical
race condition.

For easy stable backport, revert is the most simple solution.  For long
term solution, I am thinking of two directions.  First is just reduce the
race window by optimizing the rstat flusher.  Second is if the reader sees
a negative stat value, force flush and restart the stat collection.
Basically retry but limited.

Link: https://lkml.kernel.org/r/20220817172139.3141101-1-shakeelb@google.com
Fixes: 96e51ccf1af33e8 ("memcg: cleanup racy sum avoidance code")
Signed-off-by: Shakeel Butt <shakeelb@google.com>
Cc: "Michal Koutný" <mkoutny@suse.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Yosry Ahmed <yosryahmed@google.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: <stable@vger.kernel.org>	[5.15]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 include/linux/memcontrol.h |   15 +++++++++++++--
 1 file changed, 13 insertions(+), 2 deletions(-)

--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -978,19 +978,30 @@ static inline void mod_memcg_page_state(
 
 static inline unsigned long memcg_page_state(struct mem_cgroup *memcg, int idx)
 {
-	return READ_ONCE(memcg->vmstats.state[idx]);
+	long x = READ_ONCE(memcg->vmstats.state[idx]);
+#ifdef CONFIG_SMP
+	if (x < 0)
+		x = 0;
+#endif
+	return x;
 }
 
 static inline unsigned long lruvec_page_state(struct lruvec *lruvec,
 					      enum node_stat_item idx)
 {
 	struct mem_cgroup_per_node *pn;
+	long x;
 
 	if (mem_cgroup_disabled())
 		return node_page_state(lruvec_pgdat(lruvec), idx);
 
 	pn = container_of(lruvec, struct mem_cgroup_per_node, lruvec);
-	return READ_ONCE(pn->lruvec_stats.state[idx]);
+	x = READ_ONCE(pn->lruvec_stats.state[idx]);
+#ifdef CONFIG_SMP
+	if (x < 0)
+		x = 0;
+#endif
+	return x;
 }
 
 static inline unsigned long lruvec_page_state_local(struct lruvec *lruvec,



^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH 5.19 123/158] shmem: update folio if shmem_replace_page() updates the page
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (121 preceding siblings ...)
  2022-08-29 10:59 ` [PATCH 5.19 122/158] Revert "memcg: cleanup racy sum avoidance code" Greg Kroah-Hartman
@ 2022-08-29 10:59 ` Greg Kroah-Hartman
  2022-08-29 10:59 ` [PATCH 5.19 124/158] ACPI: processor: Remove freq Qos request for all CPUs Greg Kroah-Hartman
                   ` (47 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:59 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Matthew Wilcox (Oracle),
	William Kucharski, Hugh Dickins, Andrew Morton

From: Matthew Wilcox (Oracle) <willy@infradead.org>

commit 9dfb3b8d655022760ca68af11821f1c63aa547c3 upstream.

If we allocate a new page, we need to make sure that our folio matches
that new page.

If we do end up in this code path, we store the wrong page in the shmem
inode's page cache, and I would rather imagine that data corruption
ensues.

This will be solved by changing shmem_replace_page() to
shmem_replace_folio(), but this is the minimal fix.

Link: https://lkml.kernel.org/r/20220730042518.1264767-1-willy@infradead.org
Fixes: da08e9b79323 ("mm/shmem: convert shmem_swapin_page() to shmem_swapin_folio()")
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: William Kucharski <william.kucharski@oracle.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 mm/shmem.c |    1 +
 1 file changed, 1 insertion(+)

--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -1771,6 +1771,7 @@ static int shmem_swapin_folio(struct ino
 
 	if (shmem_should_replace_folio(folio, gfp)) {
 		error = shmem_replace_page(&page, gfp, info, index);
+		folio = page_folio(page);
 		if (error)
 			goto failed;
 	}



^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH 5.19 124/158] ACPI: processor: Remove freq Qos request for all CPUs
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (122 preceding siblings ...)
  2022-08-29 10:59 ` [PATCH 5.19 123/158] shmem: update folio if shmem_replace_page() updates the page Greg Kroah-Hartman
@ 2022-08-29 10:59 ` Greg Kroah-Hartman
  2022-08-29 10:59 ` [PATCH 5.19 125/158] nouveau: explicitly wait on the fence in nouveau_bo_move_m2mf Greg Kroah-Hartman
                   ` (46 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:59 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Jeremy Linton, Jeremy Linton,
	Riwen Lu, Rafael J. Wysocki

From: Riwen Lu <luriwen@kylinos.cn>

commit 36527b9d882362567ceb4eea8666813280f30e6f upstream.

The freq Qos request would be removed repeatedly if the cpufreq policy
relates to more than one CPU. Then, it would cause the "called for unknown
object" warning.

Remove the freq Qos request for each CPU relates to the cpufreq policy,
instead of removing repeatedly for the last CPU of it.

Fixes: a1bb46c36ce3 ("ACPI: processor: Add QoS requests for all CPUs")
Reported-by: Jeremy Linton <Jeremy.Linton@arm.com>
Tested-by: Jeremy Linton <jeremy.linton@arm.com>
Signed-off-by: Riwen Lu <luriwen@kylinos.cn>
Cc: 5.4+ <stable@vger.kernel.org> # 5.4+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/acpi/processor_thermal.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/acpi/processor_thermal.c
+++ b/drivers/acpi/processor_thermal.c
@@ -151,7 +151,7 @@ void acpi_thermal_cpufreq_exit(struct cp
 	unsigned int cpu;
 
 	for_each_cpu(cpu, policy->related_cpus) {
-		struct acpi_processor *pr = per_cpu(processors, policy->cpu);
+		struct acpi_processor *pr = per_cpu(processors, cpu);
 
 		if (pr)
 			freq_qos_remove_request(&pr->thermal_req);



^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH 5.19 125/158] nouveau: explicitly wait on the fence in nouveau_bo_move_m2mf
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (123 preceding siblings ...)
  2022-08-29 10:59 ` [PATCH 5.19 124/158] ACPI: processor: Remove freq Qos request for all CPUs Greg Kroah-Hartman
@ 2022-08-29 10:59 ` Greg Kroah-Hartman
  2022-08-29 10:59 ` [PATCH 5.19 126/158] smb3: missing inode locks in punch hole Greg Kroah-Hartman
                   ` (45 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:59 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Karol Herbst, Lyude Paul

From: Karol Herbst <kherbst@redhat.com>

commit 6b04ce966a738ecdd9294c9593e48513c0dc90aa upstream.

It is a bit unlcear to us why that's helping, but it does and unbreaks
suspend/resume on a lot of GPUs without any known drawbacks.

Cc: stable@vger.kernel.org # v5.15+
Closes: https://gitlab.freedesktop.org/drm/nouveau/-/issues/156
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220819200928.401416-1-kherbst@redhat.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/gpu/drm/nouveau/nouveau_bo.c |    9 +++++++++
 1 file changed, 9 insertions(+)

--- a/drivers/gpu/drm/nouveau/nouveau_bo.c
+++ b/drivers/gpu/drm/nouveau/nouveau_bo.c
@@ -820,6 +820,15 @@ nouveau_bo_move_m2mf(struct ttm_buffer_o
 		if (ret == 0) {
 			ret = nouveau_fence_new(chan, false, &fence);
 			if (ret == 0) {
+				/* TODO: figure out a better solution here
+				 *
+				 * wait on the fence here explicitly as going through
+				 * ttm_bo_move_accel_cleanup somehow doesn't seem to do it.
+				 *
+				 * Without this the operation can timeout and we'll fallback to a
+				 * software copy, which might take several minutes to finish.
+				 */
+				nouveau_fence_wait(fence, false, false);
 				ret = ttm_bo_move_accel_cleanup(bo,
 								&fence->base,
 								evict, false,



^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH 5.19 126/158] smb3: missing inode locks in punch hole
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (124 preceding siblings ...)
  2022-08-29 10:59 ` [PATCH 5.19 125/158] nouveau: explicitly wait on the fence in nouveau_bo_move_m2mf Greg Kroah-Hartman
@ 2022-08-29 10:59 ` Greg Kroah-Hartman
  2022-08-29 10:59 ` [PATCH 5.19 127/158] ocfs2: fix freeing uninitialized resource on ocfs2_dlm_shutdown Greg Kroah-Hartman
                   ` (44 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:59 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, David Howells, Steve French

From: David Howells <dhowells@redhat.com>

commit ba0803050d610d5072666be727bca5e03e55b242 upstream.

smb3 fallocate punch hole was not grabbing the inode or filemap_invalidate
locks so could have race with pagemap reinstantiating the page.

Cc: stable@vger.kernel.org
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 fs/cifs/smb2ops.c |   12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

--- a/fs/cifs/smb2ops.c
+++ b/fs/cifs/smb2ops.c
@@ -3671,7 +3671,7 @@ static long smb3_zero_range(struct file
 static long smb3_punch_hole(struct file *file, struct cifs_tcon *tcon,
 			    loff_t offset, loff_t len)
 {
-	struct inode *inode;
+	struct inode *inode = file_inode(file);
 	struct cifsFileInfo *cfile = file->private_data;
 	struct file_zero_data_information fsctl_buf;
 	long rc;
@@ -3680,14 +3680,12 @@ static long smb3_punch_hole(struct file
 
 	xid = get_xid();
 
-	inode = d_inode(cfile->dentry);
-
+	inode_lock(inode);
 	/* Need to make file sparse, if not already, before freeing range. */
 	/* Consider adding equivalent for compressed since it could also work */
 	if (!smb2_set_sparse(xid, tcon, cfile, inode, set_sparse)) {
 		rc = -EOPNOTSUPP;
-		free_xid(xid);
-		return rc;
+		goto out;
 	}
 
 	filemap_invalidate_lock(inode->i_mapping);
@@ -3707,8 +3705,10 @@ static long smb3_punch_hole(struct file
 			true /* is_fctl */, (char *)&fsctl_buf,
 			sizeof(struct file_zero_data_information),
 			CIFSMaxBufSize, NULL, NULL);
-	free_xid(xid);
 	filemap_invalidate_unlock(inode->i_mapping);
+out:
+	inode_unlock(inode);
+	free_xid(xid);
 	return rc;
 }
 



^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH 5.19 127/158] ocfs2: fix freeing uninitialized resource on ocfs2_dlm_shutdown
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (125 preceding siblings ...)
  2022-08-29 10:59 ` [PATCH 5.19 126/158] smb3: missing inode locks in punch hole Greg Kroah-Hartman
@ 2022-08-29 10:59 ` Greg Kroah-Hartman
  2022-08-29 10:59 ` [PATCH 5.19 128/158] xen/privcmd: fix error exit of privcmd_ioctl_dm_op() Greg Kroah-Hartman
                   ` (43 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:59 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Heming Zhao, Joseph Qi, Mark Fasheh,
	Joel Becker, Junxiao Bi, Changwei Ge, Gang He, Jun Piao,
	Andrew Morton

From: Heming Zhao <ocfs2-devel@oss.oracle.com>

commit 550842cc60987b269e31b222283ade3e1b6c7fc8 upstream.

After commit 0737e01de9c4 ("ocfs2: ocfs2_mount_volume does cleanup job
before return error"), any procedure after ocfs2_dlm_init() fails will
trigger crash when calling ocfs2_dlm_shutdown().

ie: On local mount mode, no dlm resource is initialized.  If
ocfs2_mount_volume() fails in ocfs2_find_slot(), error handling will call
ocfs2_dlm_shutdown(), then does dlm resource cleanup job, which will
trigger kernel crash.

This solution should bypass uninitialized resources in
ocfs2_dlm_shutdown().

Link: https://lkml.kernel.org/r/20220815085754.20417-1-heming.zhao@suse.com
Fixes: 0737e01de9c4 ("ocfs2: ocfs2_mount_volume does cleanup job before return error")
Signed-off-by: Heming Zhao <heming.zhao@suse.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Jun Piao <piaojun@huawei.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 fs/ocfs2/dlmglue.c |    8 +++++---
 fs/ocfs2/super.c   |    3 +--
 2 files changed, 6 insertions(+), 5 deletions(-)

--- a/fs/ocfs2/dlmglue.c
+++ b/fs/ocfs2/dlmglue.c
@@ -3403,10 +3403,12 @@ void ocfs2_dlm_shutdown(struct ocfs2_sup
 	ocfs2_lock_res_free(&osb->osb_nfs_sync_lockres);
 	ocfs2_lock_res_free(&osb->osb_orphan_scan.os_lockres);
 
-	ocfs2_cluster_disconnect(osb->cconn, hangup_pending);
-	osb->cconn = NULL;
+	if (osb->cconn) {
+		ocfs2_cluster_disconnect(osb->cconn, hangup_pending);
+		osb->cconn = NULL;
 
-	ocfs2_dlm_shutdown_debug(osb);
+		ocfs2_dlm_shutdown_debug(osb);
+	}
 }
 
 static int ocfs2_drop_lock(struct ocfs2_super *osb,
--- a/fs/ocfs2/super.c
+++ b/fs/ocfs2/super.c
@@ -1914,8 +1914,7 @@ static void ocfs2_dismount_volume(struct
 	    !ocfs2_is_hard_readonly(osb))
 		hangup_needed = 1;
 
-	if (osb->cconn)
-		ocfs2_dlm_shutdown(osb, hangup_needed);
+	ocfs2_dlm_shutdown(osb, hangup_needed);
 
 	ocfs2_blockcheck_stats_debugfs_remove(&osb->osb_ecc_stats);
 	debugfs_remove_recursive(osb->osb_debug_root);



^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH 5.19 128/158] xen/privcmd: fix error exit of privcmd_ioctl_dm_op()
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (126 preceding siblings ...)
  2022-08-29 10:59 ` [PATCH 5.19 127/158] ocfs2: fix freeing uninitialized resource on ocfs2_dlm_shutdown Greg Kroah-Hartman
@ 2022-08-29 10:59 ` Greg Kroah-Hartman
  2022-08-29 10:59 ` [PATCH 5.19 129/158] riscv: signal: fix missing prototype warning Greg Kroah-Hartman
                   ` (42 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:59 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Rustam Subkhankulov, Juergen Gross,
	Jan Beulich, Oleksandr Tyshchenko

From: Juergen Gross <jgross@suse.com>

commit c5deb27895e017a0267de0a20d140ad5fcc55a54 upstream.

The error exit of privcmd_ioctl_dm_op() is calling unlock_pages()
potentially with pages being NULL, leading to a NULL dereference.

Additionally lock_pages() doesn't check for pin_user_pages_fast()
having been completely successful, resulting in potentially not
locking all pages into memory. This could result in sporadic failures
when using the related memory in user mode.

Fix all of that by calling unlock_pages() always with the real number
of pinned pages, which will be zero in case pages being NULL, and by
checking the number of pages pinned by pin_user_pages_fast() matching
the expected number of pages.

Cc: <stable@vger.kernel.org>
Fixes: ab520be8cd5d ("xen/privcmd: Add IOCTL_PRIVCMD_DM_OP")
Reported-by: Rustam Subkhankulov <subkhankulov@ispras.ru>
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
Link: https://lore.kernel.org/r/20220825141918.3581-1-jgross@suse.com
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/xen/privcmd.c |   21 +++++++++++----------
 1 file changed, 11 insertions(+), 10 deletions(-)

--- a/drivers/xen/privcmd.c
+++ b/drivers/xen/privcmd.c
@@ -581,27 +581,30 @@ static int lock_pages(
 	struct privcmd_dm_op_buf kbufs[], unsigned int num,
 	struct page *pages[], unsigned int nr_pages, unsigned int *pinned)
 {
-	unsigned int i;
+	unsigned int i, off = 0;
 
-	for (i = 0; i < num; i++) {
+	for (i = 0; i < num; ) {
 		unsigned int requested;
 		int page_count;
 
 		requested = DIV_ROUND_UP(
 			offset_in_page(kbufs[i].uptr) + kbufs[i].size,
-			PAGE_SIZE);
+			PAGE_SIZE) - off;
 		if (requested > nr_pages)
 			return -ENOSPC;
 
 		page_count = pin_user_pages_fast(
-			(unsigned long) kbufs[i].uptr,
+			(unsigned long)kbufs[i].uptr + off * PAGE_SIZE,
 			requested, FOLL_WRITE, pages);
-		if (page_count < 0)
-			return page_count;
+		if (page_count <= 0)
+			return page_count ? : -EFAULT;
 
 		*pinned += page_count;
 		nr_pages -= page_count;
 		pages += page_count;
+
+		off = (requested == page_count) ? 0 : off + page_count;
+		i += !off;
 	}
 
 	return 0;
@@ -677,10 +680,8 @@ static long privcmd_ioctl_dm_op(struct f
 	}
 
 	rc = lock_pages(kbufs, kdata.num, pages, nr_pages, &pinned);
-	if (rc < 0) {
-		nr_pages = pinned;
+	if (rc < 0)
 		goto out;
-	}
 
 	for (i = 0; i < kdata.num; i++) {
 		set_xen_guest_handle(xbufs[i].h, kbufs[i].uptr);
@@ -692,7 +693,7 @@ static long privcmd_ioctl_dm_op(struct f
 	xen_preemptible_hcall_end();
 
 out:
-	unlock_pages(pages, nr_pages);
+	unlock_pages(pages, pinned);
 	kfree(xbufs);
 	kfree(pages);
 	kfree(kbufs);



^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH 5.19 129/158] riscv: signal: fix missing prototype warning
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (127 preceding siblings ...)
  2022-08-29 10:59 ` [PATCH 5.19 128/158] xen/privcmd: fix error exit of privcmd_ioctl_dm_op() Greg Kroah-Hartman
@ 2022-08-29 10:59 ` Greg Kroah-Hartman
  2022-08-29 10:59 ` [PATCH 5.19 130/158] riscv: traps: add missing prototype Greg Kroah-Hartman
                   ` (41 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:59 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Conor Dooley, Palmer Dabbelt

From: Conor Dooley <conor.dooley@microchip.com>

commit b5c3aca86d2698c4850b6ee8b341938025d2780c upstream.

Fix the warning:
arch/riscv/kernel/signal.c:316:27: warning: no previous prototype for function 'do_notify_resume' [-Wmissing-prototypes]
asmlinkage __visible void do_notify_resume(struct pt_regs *regs,

All other functions in the file are static & none of the existing
headers stood out as an obvious location. Create signal.h to hold the
declaration.

Fixes: e2c0cdfba7f6 ("RISC-V: User-facing API")
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20220814141237.493457-4-mail@conchuod.ie
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/riscv/include/asm/signal.h |   12 ++++++++++++
 arch/riscv/kernel/signal.c      |    1 +
 2 files changed, 13 insertions(+)
 create mode 100644 arch/riscv/include/asm/signal.h

--- /dev/null
+++ b/arch/riscv/include/asm/signal.h
@@ -0,0 +1,12 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+
+#ifndef __ASM_SIGNAL_H
+#define __ASM_SIGNAL_H
+
+#include <uapi/asm/signal.h>
+#include <uapi/asm/ptrace.h>
+
+asmlinkage __visible
+void do_notify_resume(struct pt_regs *regs, unsigned long thread_info_flags);
+
+#endif
--- a/arch/riscv/kernel/signal.c
+++ b/arch/riscv/kernel/signal.c
@@ -15,6 +15,7 @@
 
 #include <asm/ucontext.h>
 #include <asm/vdso.h>
+#include <asm/signal.h>
 #include <asm/signal32.h>
 #include <asm/switch_to.h>
 #include <asm/csr.h>



^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH 5.19 130/158] riscv: traps: add missing prototype
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (128 preceding siblings ...)
  2022-08-29 10:59 ` [PATCH 5.19 129/158] riscv: signal: fix missing prototype warning Greg Kroah-Hartman
@ 2022-08-29 10:59 ` Greg Kroah-Hartman
  2022-08-29 10:59 ` [PATCH 5.19 131/158] riscv: dts: microchip: correct L2 cache interrupts Greg Kroah-Hartman
                   ` (40 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:59 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Conor Dooley, Palmer Dabbelt

From: Conor Dooley <conor.dooley@microchip.com>

commit d951b20b9def73dcc39a5379831525d0d2a537e9 upstream.

Sparse complains:
arch/riscv/kernel/traps.c:213:6: warning: symbol 'shadow_stack' was not declared. Should it be static?

The variable is used in entry.S, so declare shadow_stack there
alongside SHADOW_OVERFLOW_STACK_SIZE.

Fixes: 31da94c25aea ("riscv: add VMAP_STACK overflow detection")
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20220814141237.493457-5-mail@conchuod.ie
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/riscv/include/asm/thread_info.h |    2 ++
 arch/riscv/kernel/traps.c            |    3 ++-
 2 files changed, 4 insertions(+), 1 deletion(-)

--- a/arch/riscv/include/asm/thread_info.h
+++ b/arch/riscv/include/asm/thread_info.h
@@ -42,6 +42,8 @@
 
 #ifndef __ASSEMBLY__
 
+extern long shadow_stack[SHADOW_OVERFLOW_STACK_SIZE / sizeof(long)];
+
 #include <asm/processor.h>
 #include <asm/csr.h>
 
--- a/arch/riscv/kernel/traps.c
+++ b/arch/riscv/kernel/traps.c
@@ -20,9 +20,10 @@
 
 #include <asm/asm-prototypes.h>
 #include <asm/bug.h>
+#include <asm/csr.h>
 #include <asm/processor.h>
 #include <asm/ptrace.h>
-#include <asm/csr.h>
+#include <asm/thread_info.h>
 
 int show_unhandled_signals = 1;
 



^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH 5.19 131/158] riscv: dts: microchip: correct L2 cache interrupts
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (129 preceding siblings ...)
  2022-08-29 10:59 ` [PATCH 5.19 130/158] riscv: traps: add missing prototype Greg Kroah-Hartman
@ 2022-08-29 10:59 ` Greg Kroah-Hartman
  2022-08-29 10:59 ` [PATCH 5.19 132/158] Revert "zram: remove double compression logic" Greg Kroah-Hartman
                   ` (39 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:59 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Heinrich Schuchardt, Conor Dooley

From: Heinrich Schuchardt <heinrich.schuchardt@canonical.com>

commit 34fc9cc3aebe8b9e27d3bc821543dd482dc686ca upstream.

The "PolarFire SoC MSS Technical Reference Manual" documents the
following PLIC interrupts:

1 - L2 Cache Controller Signals when a metadata correction event occurs
2 - L2 Cache Controller Signals when an uncorrectable metadata event occurs
3 - L2 Cache Controller Signals when a data correction event occurs
4 - L2 Cache Controller Signals when an uncorrectable data event occurs

This differs from the SiFive FU540 which only has three L2 cache related
interrupts.

The sequence in the device tree is defined by an enum:

    enum {
            DIR_CORR = 0,
            DATA_CORR,
            DATA_UNCORR,
            DIR_UNCORR,
    };

So the correct sequence of the L2 cache interrupts is

    interrupts = <1>, <3>, <4>, <2>;

[Conor]
This manifests as an unusable system if the l2-cache driver is enabled,
as the wrong interrupt gets cleared & the handler prints errors to the
console ad infinitum.

Fixes: 0fa6107eca41 ("RISC-V: Initial DTS for Microchip ICICLE board")
CC: stable@vger.kernel.org # 5.15: e35b07a7df9b: riscv: dts: microchip: mpfs: Group tuples in interrupt properties
Signed-off-by: Heinrich Schuchardt <heinrich.schuchardt@canonical.com>
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/riscv/boot/dts/microchip/mpfs.dtsi |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/arch/riscv/boot/dts/microchip/mpfs.dtsi
+++ b/arch/riscv/boot/dts/microchip/mpfs.dtsi
@@ -169,7 +169,7 @@
 			cache-size = <2097152>;
 			cache-unified;
 			interrupt-parent = <&plic>;
-			interrupts = <1>, <2>, <3>;
+			interrupts = <1>, <3>, <4>, <2>;
 		};
 
 		clint: clint@2000000 {



^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH 5.19 132/158] Revert "zram: remove double compression logic"
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (130 preceding siblings ...)
  2022-08-29 10:59 ` [PATCH 5.19 131/158] riscv: dts: microchip: correct L2 cache interrupts Greg Kroah-Hartman
@ 2022-08-29 10:59 ` Greg Kroah-Hartman
  2022-08-29 10:59 ` [PATCH 5.19 133/158] io_uring: fix issue with io_write() not always undoing sb_start_write() Greg Kroah-Hartman
                   ` (38 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:59 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Jiri Slaby, Sergey Senozhatsky,
	Minchan Kim, Nitin Gupta, Alexey Romanov, Dmitry Rokosov,
	Lukas Czerner, Andrew Morton

From: Jiri Slaby <jslaby@suse.cz>

commit 37887783b3fef877bf34b8992c9199864da4afcb upstream.

This reverts commit e7be8d1dd983156b ("zram: remove double compression
logic") as it causes zram failures.  It does not revert cleanly, PTR_ERR
handling was introduced in the meantime.  This is handled by appropriate
IS_ERR.

When under memory pressure, zs_malloc() can fail.  Before the above
commit, the allocation was retried with direct reclaim enabled (GFP_NOIO).
After the commit, it is not -- only __GFP_KSWAPD_RECLAIM is tried.

So when the failure occurs under memory pressure, the overlaying
filesystem such as ext2 (mounted by ext4 module in this case) can emit
failures, making the (file)system unusable:
  EXT4-fs warning (device zram0): ext4_end_bio:343: I/O error 10 writing to inode 16386 starting block 159744)
  Buffer I/O error on device zram0, logical block 159744

With direct reclaim, memory is really reclaimed and allocation succeeds,
eventually.  In the worst case, the oom killer is invoked, which is proper
outcome if user sets up zram too large (in comparison to available RAM).

This very diff doesn't apply to 5.19 (stable) cleanly (see PTR_ERR note
above). Use revert of e7be8d1dd983 directly.

Link: https://bugzilla.suse.com/show_bug.cgi?id=1202203
Link: https://lkml.kernel.org/r/20220810070609.14402-1-jslaby@suse.cz
Fixes: e7be8d1dd983 ("zram: remove double compression logic")
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Nitin Gupta <ngupta@vflare.org>
Cc: Alexey Romanov <avromanov@sberdevices.ru>
Cc: Dmitry Rokosov <ddrokosov@sberdevices.ru>
Cc: Lukas Czerner <lczerner@redhat.com>
Cc: <stable@vger.kernel.org>	[5.19]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/block/zram/zram_drv.c |   42 ++++++++++++++++++++++++++++++++----------
 drivers/block/zram/zram_drv.h |    1 +
 2 files changed, 33 insertions(+), 10 deletions(-)

--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -1144,14 +1144,15 @@ static ssize_t bd_stat_show(struct devic
 static ssize_t debug_stat_show(struct device *dev,
 		struct device_attribute *attr, char *buf)
 {
-	int version = 2;
+	int version = 1;
 	struct zram *zram = dev_to_zram(dev);
 	ssize_t ret;
 
 	down_read(&zram->init_lock);
 	ret = scnprintf(buf, PAGE_SIZE,
-			"version: %d\n%8llu\n",
+			"version: %d\n%8llu %8llu\n",
 			version,
+			(u64)atomic64_read(&zram->stats.writestall),
 			(u64)atomic64_read(&zram->stats.miss_free));
 	up_read(&zram->init_lock);
 
@@ -1367,6 +1368,7 @@ static int __zram_bvec_write(struct zram
 	}
 	kunmap_atomic(mem);
 
+compress_again:
 	zstrm = zcomp_stream_get(zram->comp);
 	src = kmap_atomic(page);
 	ret = zcomp_compress(zstrm, src, &comp_len);
@@ -1375,20 +1377,39 @@ static int __zram_bvec_write(struct zram
 	if (unlikely(ret)) {
 		zcomp_stream_put(zram->comp);
 		pr_err("Compression failed! err=%d\n", ret);
+		zs_free(zram->mem_pool, handle);
 		return ret;
 	}
 
 	if (comp_len >= huge_class_size)
 		comp_len = PAGE_SIZE;
-
-	handle = zs_malloc(zram->mem_pool, comp_len,
-			__GFP_KSWAPD_RECLAIM |
-			__GFP_NOWARN |
-			__GFP_HIGHMEM |
-			__GFP_MOVABLE);
-
-	if (unlikely(!handle)) {
+	/*
+	 * handle allocation has 2 paths:
+	 * a) fast path is executed with preemption disabled (for
+	 *  per-cpu streams) and has __GFP_DIRECT_RECLAIM bit clear,
+	 *  since we can't sleep;
+	 * b) slow path enables preemption and attempts to allocate
+	 *  the page with __GFP_DIRECT_RECLAIM bit set. we have to
+	 *  put per-cpu compression stream and, thus, to re-do
+	 *  the compression once handle is allocated.
+	 *
+	 * if we have a 'non-null' handle here then we are coming
+	 * from the slow path and handle has already been allocated.
+	 */
+	if (!handle)
+		handle = zs_malloc(zram->mem_pool, comp_len,
+				__GFP_KSWAPD_RECLAIM |
+				__GFP_NOWARN |
+				__GFP_HIGHMEM |
+				__GFP_MOVABLE);
+	if (!handle) {
 		zcomp_stream_put(zram->comp);
+		atomic64_inc(&zram->stats.writestall);
+		handle = zs_malloc(zram->mem_pool, comp_len,
+				GFP_NOIO | __GFP_HIGHMEM |
+				__GFP_MOVABLE);
+		if (handle)
+			goto compress_again;
 		return -ENOMEM;
 	}
 
@@ -1946,6 +1967,7 @@ static int zram_add(void)
 	if (ZRAM_LOGICAL_BLOCK_SIZE == PAGE_SIZE)
 		blk_queue_max_write_zeroes_sectors(zram->disk->queue, UINT_MAX);
 
+	blk_queue_flag_set(QUEUE_FLAG_STABLE_WRITES, zram->disk->queue);
 	ret = device_add_disk(NULL, zram->disk, zram_disk_groups);
 	if (ret)
 		goto out_cleanup_disk;
--- a/drivers/block/zram/zram_drv.h
+++ b/drivers/block/zram/zram_drv.h
@@ -81,6 +81,7 @@ struct zram_stats {
 	atomic64_t huge_pages_since;	/* no. of huge pages since zram set up */
 	atomic64_t pages_stored;	/* no. of pages currently stored */
 	atomic_long_t max_used_pages;	/* no. of maximum pages stored */
+	atomic64_t writestall;		/* no. of write slow paths */
 	atomic64_t miss_free;		/* no. of missed free */
 #ifdef	CONFIG_ZRAM_WRITEBACK
 	atomic64_t bd_count;		/* no. of pages in backing device */



^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH 5.19 133/158] io_uring: fix issue with io_write() not always undoing sb_start_write()
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (131 preceding siblings ...)
  2022-08-29 10:59 ` [PATCH 5.19 132/158] Revert "zram: remove double compression logic" Greg Kroah-Hartman
@ 2022-08-29 10:59 ` Greg Kroah-Hartman
  2022-08-29 10:59 ` [PATCH 5.19 134/158] mm/hugetlb: fix hugetlb not supporting softdirty tracking Greg Kroah-Hartman
                   ` (37 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:59 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Jens Axboe

From: Jens Axboe <axboe@kernel.dk>

commit e053aaf4da56cbf0afb33a0fda4a62188e2c0637 upstream.

This is actually an older issue, but we never used to hit the -EAGAIN
path before having done sb_start_write(). Make sure that we always call
kiocb_end_write() if we need to retry the write, so that we keep the
calls to sb_start_write() etc balanced.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 io_uring/io_uring.c |    7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -4331,7 +4331,12 @@ done:
 copy_iov:
 		iov_iter_restore(&s->iter, &s->iter_state);
 		ret = io_setup_async_rw(req, iovec, s, false);
-		return ret ?: -EAGAIN;
+		if (!ret) {
+			if (kiocb->ki_flags & IOCB_WRITE)
+				kiocb_end_write(req);
+			return -EAGAIN;
+		}
+		return ret;
 	}
 out_free:
 	/* it's reportedly faster than delegating the null check to kfree() */



^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH 5.19 134/158] mm/hugetlb: fix hugetlb not supporting softdirty tracking
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (132 preceding siblings ...)
  2022-08-29 10:59 ` [PATCH 5.19 133/158] io_uring: fix issue with io_write() not always undoing sb_start_write() Greg Kroah-Hartman
@ 2022-08-29 10:59 ` Greg Kroah-Hartman
  2022-08-29 10:59 ` [PATCH 5.19 135/158] Revert "md-raid: destroy the bitmap after destroying the thread" Greg Kroah-Hartman
                   ` (36 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:59 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, David Hildenbrand, Mike Kravetz,
	Peter Feiner, Kirill A. Shutemov, Cyrill Gorcunov,
	Pavel Emelyanov, Jamie Liu, Hugh Dickins, Naoya Horiguchi,
	Bjorn Helgaas, Muchun Song, Peter Xu, Andrew Morton

From: David Hildenbrand <david@redhat.com>

commit f96f7a40874d7c746680c0b9f57cef2262ae551f upstream.

Patch series "mm/hugetlb: fix write-fault handling for shared mappings", v2.

I observed that hugetlb does not support/expect write-faults in shared
mappings that would have to map the R/O-mapped page writable -- and I
found two case where we could currently get such faults and would
erroneously map an anon page into a shared mapping.

Reproducers part of the patches.

I propose to backport both fixes to stable trees.  The first fix needs a
small adjustment.


This patch (of 2):

Staring at hugetlb_wp(), one might wonder where all the logic for shared
mappings is when stumbling over a write-protected page in a shared
mapping.  In fact, there is none, and so far we thought we could get away
with that because e.g., mprotect() should always do the right thing and
map all pages directly writable.

Looks like we were wrong:

--------------------------------------------------------------------------
 #include <stdio.h>
 #include <stdlib.h>
 #include <string.h>
 #include <fcntl.h>
 #include <unistd.h>
 #include <errno.h>
 #include <sys/mman.h>

 #define HUGETLB_SIZE (2 * 1024 * 1024u)

 static void clear_softdirty(void)
 {
         int fd = open("/proc/self/clear_refs", O_WRONLY);
         const char *ctrl = "4";
         int ret;

         if (fd < 0) {
                 fprintf(stderr, "open(clear_refs) failed\n");
                 exit(1);
         }
         ret = write(fd, ctrl, strlen(ctrl));
         if (ret != strlen(ctrl)) {
                 fprintf(stderr, "write(clear_refs) failed\n");
                 exit(1);
         }
         close(fd);
 }

 int main(int argc, char **argv)
 {
         char *map;
         int fd;

         fd = open("/dev/hugepages/tmp", O_RDWR | O_CREAT);
         if (!fd) {
                 fprintf(stderr, "open() failed\n");
                 return -errno;
         }
         if (ftruncate(fd, HUGETLB_SIZE)) {
                 fprintf(stderr, "ftruncate() failed\n");
                 return -errno;
         }

         map = mmap(NULL, HUGETLB_SIZE, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
         if (map == MAP_FAILED) {
                 fprintf(stderr, "mmap() failed\n");
                 return -errno;
         }

         *map = 0;

         if (mprotect(map, HUGETLB_SIZE, PROT_READ)) {
                 fprintf(stderr, "mmprotect() failed\n");
                 return -errno;
         }

         clear_softdirty();

         if (mprotect(map, HUGETLB_SIZE, PROT_READ|PROT_WRITE)) {
                 fprintf(stderr, "mmprotect() failed\n");
                 return -errno;
         }

         *map = 0;

         return 0;
 }
--------------------------------------------------------------------------

Above test fails with SIGBUS when there is only a single free hugetlb page.
 # echo 1 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
 # ./test
 Bus error (core dumped)

And worse, with sufficient free hugetlb pages it will map an anonymous page
into a shared mapping, for example, messing up accounting during unmap
and breaking MAP_SHARED semantics:
 # echo 2 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
 # ./test
 # cat /proc/meminfo | grep HugePages_
 HugePages_Total:       2
 HugePages_Free:        1
 HugePages_Rsvd:    18446744073709551615
 HugePages_Surp:        0

Reason in this particular case is that vma_wants_writenotify() will
return "true", removing VM_SHARED in vma_set_page_prot() to map pages
write-protected. Let's teach vma_wants_writenotify() that hugetlb does not
support softdirty tracking.

Link: https://lkml.kernel.org/r/20220811103435.188481-1-david@redhat.com
Link: https://lkml.kernel.org/r/20220811103435.188481-2-david@redhat.com
Fixes: 64e455079e1b ("mm: softdirty: enable write notifications on VMAs after VM_SOFTDIRTY cleared")
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Peter Feiner <pfeiner@google.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Jamie Liu <jamieliu@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: <stable@vger.kernel.org>	[3.18+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 mm/mmap.c |    8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1693,8 +1693,12 @@ int vma_wants_writenotify(struct vm_area
 	    pgprot_val(vm_pgprot_modify(vm_page_prot, vm_flags)))
 		return 0;
 
-	/* Do we need to track softdirty? */
-	if (IS_ENABLED(CONFIG_MEM_SOFT_DIRTY) && !(vm_flags & VM_SOFTDIRTY))
+	/*
+	 * Do we need to track softdirty? hugetlb does not support softdirty
+	 * tracking yet.
+	 */
+	if (IS_ENABLED(CONFIG_MEM_SOFT_DIRTY) && !(vm_flags & VM_SOFTDIRTY) &&
+	    !is_vm_hugetlb_page(vma))
 		return 1;
 
 	/* Specialty mapping? */



^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH 5.19 135/158] Revert "md-raid: destroy the bitmap after destroying the thread"
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (133 preceding siblings ...)
  2022-08-29 10:59 ` [PATCH 5.19 134/158] mm/hugetlb: fix hugetlb not supporting softdirty tracking Greg Kroah-Hartman
@ 2022-08-29 10:59 ` Greg Kroah-Hartman
  2022-08-29 10:59 ` [PATCH 5.19 136/158] md: call __md_stop_writes in md_stop Greg Kroah-Hartman
                   ` (35 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:59 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Guoqing Jiang, Song Liu

From: Guoqing Jiang <guoqing.jiang@linux.dev>

commit 1d258758cf06a0734482989911d184dd5837ed4e upstream.

This reverts commit e151db8ecfb019b7da31d076130a794574c89f6f. Because it
obviously breaks clustered raid as noticed by Neil though it fixed KASAN
issue for dm-raid, let's revert it and fix KASAN issue in next commit.

[1]. https://lore.kernel.org/linux-raid/a6657e08-b6a7-358b-2d2a-0ac37d49d23a@linux.dev/T/#m95ac225cab7409f66c295772483d091084a6d470

Fixes: e151db8ecfb0 ("md-raid: destroy the bitmap after destroying the thread")
Signed-off-by: Guoqing Jiang <guoqing.jiang@linux.dev>
Signed-off-by: Song Liu <song@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/md/md.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -6244,11 +6244,11 @@ static void mddev_detach(struct mddev *m
 static void __md_stop(struct mddev *mddev)
 {
 	struct md_personality *pers = mddev->pers;
+	md_bitmap_destroy(mddev);
 	mddev_detach(mddev);
 	/* Ensure ->event_work is done */
 	if (mddev->event_work.func)
 		flush_workqueue(md_misc_wq);
-	md_bitmap_destroy(mddev);
 	spin_lock(&mddev->lock);
 	mddev->pers = NULL;
 	spin_unlock(&mddev->lock);



^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH 5.19 136/158] md: call __md_stop_writes in md_stop
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (134 preceding siblings ...)
  2022-08-29 10:59 ` [PATCH 5.19 135/158] Revert "md-raid: destroy the bitmap after destroying the thread" Greg Kroah-Hartman
@ 2022-08-29 10:59 ` Greg Kroah-Hartman
  2022-08-29 10:59 ` [PATCH 5.19 137/158] arm64: Fix match_list for erratum 1286807 on Arm Cortex-A76 Greg Kroah-Hartman
                   ` (34 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:59 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Mikulas Patocka, Guoqing Jiang, Song Liu

From: Guoqing Jiang <guoqing.jiang@linux.dev>

commit 0dd84b319352bb8ba64752d4e45396d8b13e6018 upstream.

>From the link [1], we can see raid1d was running even after the path
raid_dtr -> md_stop -> __md_stop.

Let's stop write first in destructor to align with normal md-raid to
fix the KASAN issue.

[1]. https://lore.kernel.org/linux-raid/CAPhsuW5gc4AakdGNdF8ubpezAuDLFOYUO_sfMZcec6hQFm8nhg@mail.gmail.com/T/#m7f12bf90481c02c6d2da68c64aeed4779b7df74a

Fixes: 48df498daf62 ("md: move bitmap_destroy to the beginning of __md_stop")
Reported-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Guoqing Jiang <guoqing.jiang@linux.dev>
Signed-off-by: Song Liu <song@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/md/md.c |    1 +
 1 file changed, 1 insertion(+)

--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -6266,6 +6266,7 @@ void md_stop(struct mddev *mddev)
 	/* stop the array and free an attached data structures.
 	 * This is called from dm-raid
 	 */
+	__md_stop_writes(mddev);
 	__md_stop(mddev);
 	bioset_exit(&mddev->bio_set);
 	bioset_exit(&mddev->sync_set);



^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH 5.19 137/158] arm64: Fix match_list for erratum 1286807 on Arm Cortex-A76
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (135 preceding siblings ...)
  2022-08-29 10:59 ` [PATCH 5.19 136/158] md: call __md_stop_writes in md_stop Greg Kroah-Hartman
@ 2022-08-29 10:59 ` Greg Kroah-Hartman
  2022-08-29 10:59 ` [PATCH 5.19 138/158] binder_alloc: add missing mmap_lock calls when using the VMA Greg Kroah-Hartman
                   ` (33 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:59 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Shreyas K K, Zenghui Yu,
	Marc Zyngier, Will Deacon

From: Zenghui Yu <yuzenghui@huawei.com>

commit 5e1e087457c94ad7fafbe1cf6f774c6999ee29d4 upstream.

Since commit 51f559d66527 ("arm64: Enable repeat tlbi workaround on KRYO4XX
gold CPUs"), we failed to detect erratum 1286807 on Cortex-A76 because its
entry in arm64_repeat_tlbi_list[] was accidently corrupted by this commit.

Fix this issue by creating a separate entry for Kryo4xx Gold.

Fixes: 51f559d66527 ("arm64: Enable repeat tlbi workaround on KRYO4XX gold CPUs")
Cc: Shreyas K K <quic_shrekk@quicinc.com>
Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220809043848.969-1-yuzenghui@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/arm64/kernel/cpu_errata.c |    2 ++
 1 file changed, 2 insertions(+)

--- a/arch/arm64/kernel/cpu_errata.c
+++ b/arch/arm64/kernel/cpu_errata.c
@@ -208,6 +208,8 @@ static const struct arm64_cpu_capabiliti
 #ifdef CONFIG_ARM64_ERRATUM_1286807
 	{
 		ERRATA_MIDR_RANGE(MIDR_CORTEX_A76, 0, 0, 3, 0),
+	},
+	{
 		/* Kryo4xx Gold (rcpe to rfpe) => (r0p0 to r3p0) */
 		ERRATA_MIDR_RANGE(MIDR_QCOM_KRYO_4XX_GOLD, 0xc, 0xe, 0xf, 0xe),
 	},



^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH 5.19 138/158] binder_alloc: add missing mmap_lock calls when using the VMA
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (136 preceding siblings ...)
  2022-08-29 10:59 ` [PATCH 5.19 137/158] arm64: Fix match_list for erratum 1286807 on Arm Cortex-A76 Greg Kroah-Hartman
@ 2022-08-29 10:59 ` Greg Kroah-Hartman
  2022-08-29 10:59 ` [PATCH 5.19 139/158] x86/nospec: Fix i386 RSB stuffing Greg Kroah-Hartman
                   ` (32 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:59 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Liam R. Howlett, Ondrej Mosnacek,
	syzbot+a7b60a176ec13cafb793, Carlos Llamas, Minchan Kim,
	Christian Brauner (Microsoft),
	Hridya Valsaraju, Joel Fernandes, Martijn Coenen,
	Suren Baghdasaryan, Todd Kjos, Matthew Wilcox (Oracle),
	Arve Hjønnevåg, Andrew Morton

From: Liam Howlett <liam.howlett@oracle.com>

commit 44e602b4e52f70f04620bbbf4fe46ecb40170bde upstream.

Take the mmap_read_lock() when using the VMA in binder_alloc_print_pages()
and when checking for a VMA in binder_alloc_new_buf_locked().

It is worth noting binder_alloc_new_buf_locked() drops the VMA read lock
after it verifies a VMA exists, but may be taken again deeper in the call
stack, if necessary.

Link: https://lkml.kernel.org/r/20220810160209.1630707-1-Liam.Howlett@oracle.com
Fixes: a43cfc87caaf (android: binder: stop saving a pointer to the VMA)
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Reported-by: Ondrej Mosnacek <omosnace@redhat.com>
Reported-by: <syzbot+a7b60a176ec13cafb793@syzkaller.appspotmail.com>
Acked-by: Carlos Llamas <cmllamas@google.com>
Tested-by: Ondrej Mosnacek <omosnace@redhat.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Christian Brauner (Microsoft) <brauner@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Hridya Valsaraju <hridya@google.com>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Martijn Coenen <maco@android.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Todd Kjos <tkjos@android.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: "Arve Hjønnevåg" <arve@android.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/android/binder_alloc.c |   31 +++++++++++++++++++++----------
 1 file changed, 21 insertions(+), 10 deletions(-)

--- a/drivers/android/binder_alloc.c
+++ b/drivers/android/binder_alloc.c
@@ -395,12 +395,15 @@ static struct binder_buffer *binder_allo
 	size_t size, data_offsets_size;
 	int ret;
 
+	mmap_read_lock(alloc->vma_vm_mm);
 	if (!binder_alloc_get_vma(alloc)) {
+		mmap_read_unlock(alloc->vma_vm_mm);
 		binder_alloc_debug(BINDER_DEBUG_USER_ERROR,
 				   "%d: binder_alloc_buf, no vma\n",
 				   alloc->pid);
 		return ERR_PTR(-ESRCH);
 	}
+	mmap_read_unlock(alloc->vma_vm_mm);
 
 	data_offsets_size = ALIGN(data_size, sizeof(void *)) +
 		ALIGN(offsets_size, sizeof(void *));
@@ -922,17 +925,25 @@ void binder_alloc_print_pages(struct seq
 	 * Make sure the binder_alloc is fully initialized, otherwise we might
 	 * read inconsistent state.
 	 */
-	if (binder_alloc_get_vma(alloc) != NULL) {
-		for (i = 0; i < alloc->buffer_size / PAGE_SIZE; i++) {
-			page = &alloc->pages[i];
-			if (!page->page_ptr)
-				free++;
-			else if (list_empty(&page->lru))
-				active++;
-			else
-				lru++;
-		}
+
+	mmap_read_lock(alloc->vma_vm_mm);
+	if (binder_alloc_get_vma(alloc) == NULL) {
+		mmap_read_unlock(alloc->vma_vm_mm);
+		goto uninitialized;
+	}
+
+	mmap_read_unlock(alloc->vma_vm_mm);
+	for (i = 0; i < alloc->buffer_size / PAGE_SIZE; i++) {
+		page = &alloc->pages[i];
+		if (!page->page_ptr)
+			free++;
+		else if (list_empty(&page->lru))
+			active++;
+		else
+			lru++;
 	}
+
+uninitialized:
 	mutex_unlock(&alloc->mutex);
 	seq_printf(m, "  pages: %d:%d:%d\n", active, lru, free);
 	seq_printf(m, "  pages high watermark: %zu\n", alloc->pages_high);



^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH 5.19 139/158] x86/nospec: Fix i386 RSB stuffing
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (137 preceding siblings ...)
  2022-08-29 10:59 ` [PATCH 5.19 138/158] binder_alloc: add missing mmap_lock calls when using the VMA Greg Kroah-Hartman
@ 2022-08-29 10:59 ` Greg Kroah-Hartman
  2022-08-29 10:59 ` [PATCH 5.19 140/158] drm/amdkfd: Fix isa version for the GC 10.3.7 Greg Kroah-Hartman
                   ` (31 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:59 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Ben Hutchings, Peter Zijlstra (Intel)

From: Peter Zijlstra <peterz@infradead.org>

commit 332924973725e8cdcc783c175f68cf7e162cb9e5 upstream.

Turns out that i386 doesn't unconditionally have LFENCE, as such the
loop in __FILL_RETURN_BUFFER isn't actually speculation safe on such
chips.

Fixes: ba6e31af2be9 ("x86/speculation: Add LFENCE to RSB fill sequence")
Reported-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/Yv9tj9vbQ9nNlXoY@worktop.programming.kicks-ass.net
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/x86/include/asm/nospec-branch.h |   12 ++++++++++++
 1 file changed, 12 insertions(+)

--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -50,6 +50,7 @@
  * the optimal version - two calls, each with their own speculation
  * trap should their return address end up getting used, in a loop.
  */
+#ifdef CONFIG_X86_64
 #define __FILL_RETURN_BUFFER(reg, nr)			\
 	mov	$(nr/2), reg;				\
 771:							\
@@ -60,6 +61,17 @@
 	jnz	771b;					\
 	/* barrier for jnz misprediction */		\
 	lfence;
+#else
+/*
+ * i386 doesn't unconditionally have LFENCE, as such it can't
+ * do a loop.
+ */
+#define __FILL_RETURN_BUFFER(reg, nr)			\
+	.rept nr;					\
+	__FILL_RETURN_SLOT;				\
+	.endr;						\
+	add	$(BITS_PER_LONG/8) * nr, %_ASM_SP;
+#endif
 
 /*
  * Stuff a single RSB slot.



^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH 5.19 140/158] drm/amdkfd: Fix isa version for the GC 10.3.7
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (138 preceding siblings ...)
  2022-08-29 10:59 ` [PATCH 5.19 139/158] x86/nospec: Fix i386 RSB stuffing Greg Kroah-Hartman
@ 2022-08-29 10:59 ` Greg Kroah-Hartman
  2022-08-29 10:59 ` [PATCH 5.19 141/158] Documentation/ABI: Mention retbleed vulnerability info file for sysfs Greg Kroah-Hartman
                   ` (30 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:59 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Prike Liang, Aaron Liu, Alex Deucher

From: Prike Liang <Prike.Liang@amd.com>

commit ee8086dbc1585d9f4020a19447388246a5cff5c8 upstream.

Correct the isa version for handling KFD test.

Fixes: 7c4f4f197e0c ("drm/amdkfd: Add GC 10.3.6 and 10.3.7 KFD definitions")
Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Aaron Liu <aaron.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/gpu/drm/amd/amdkfd/kfd_device.c |    6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)

--- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
@@ -377,12 +377,8 @@ struct kfd_dev *kgd2kfd_probe(struct amd
 				f2g = &gfx_v10_3_kfd2kgd;
 			break;
 		case IP_VERSION(10, 3, 6):
-			gfx_target_version = 100306;
-			if (!vf)
-				f2g = &gfx_v10_3_kfd2kgd;
-			break;
 		case IP_VERSION(10, 3, 7):
-			gfx_target_version = 100307;
+			gfx_target_version = 100306;
 			if (!vf)
 				f2g = &gfx_v10_3_kfd2kgd;
 			break;



^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH 5.19 141/158] Documentation/ABI: Mention retbleed vulnerability info file for sysfs
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (139 preceding siblings ...)
  2022-08-29 10:59 ` [PATCH 5.19 140/158] drm/amdkfd: Fix isa version for the GC 10.3.7 Greg Kroah-Hartman
@ 2022-08-29 10:59 ` Greg Kroah-Hartman
  2022-08-29 10:59 ` [PATCH 5.19 142/158] blk-mq: fix io hung due to missing commit_rqs Greg Kroah-Hartman
                   ` (29 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:59 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Salvatore Bonaccorso, Borislav Petkov

From: Salvatore Bonaccorso <carnil@debian.org>

commit 00da0cb385d05a89226e150a102eb49d8abb0359 upstream.

While reporting for the AMD retbleed vulnerability was added in

  6b80b59b3555 ("x86/bugs: Report AMD retbleed vulnerability")

the new sysfs file was not mentioned so far in the ABI documentation for
sysfs-devices-system-cpu. Fix that.

Fixes: 6b80b59b3555 ("x86/bugs: Report AMD retbleed vulnerability")
Signed-off-by: Salvatore Bonaccorso <carnil@debian.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20220801091529.325327-1-carnil@debian.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 Documentation/ABI/testing/sysfs-devices-system-cpu |    1 +
 1 file changed, 1 insertion(+)

--- a/Documentation/ABI/testing/sysfs-devices-system-cpu
+++ b/Documentation/ABI/testing/sysfs-devices-system-cpu
@@ -527,6 +527,7 @@ What:		/sys/devices/system/cpu/vulnerabi
 		/sys/devices/system/cpu/vulnerabilities/tsx_async_abort
 		/sys/devices/system/cpu/vulnerabilities/itlb_multihit
 		/sys/devices/system/cpu/vulnerabilities/mmio_stale_data
+		/sys/devices/system/cpu/vulnerabilities/retbleed
 Date:		January 2018
 Contact:	Linux kernel mailing list <linux-kernel@vger.kernel.org>
 Description:	Information about CPU vulnerabilities



^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH 5.19 142/158] blk-mq: fix io hung due to missing commit_rqs
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (140 preceding siblings ...)
  2022-08-29 10:59 ` [PATCH 5.19 141/158] Documentation/ABI: Mention retbleed vulnerability info file for sysfs Greg Kroah-Hartman
@ 2022-08-29 10:59 ` Greg Kroah-Hartman
  2022-08-29 10:59 ` [PATCH 5.19 143/158] perf python: Fix build when PYTHON_CONFIG is user supplied Greg Kroah-Hartman
                   ` (28 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:59 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Yu Kuai, Ming Lei, Jens Axboe

From: Yu Kuai <yukuai3@huawei.com>

commit 65fac0d54f374625b43a9d6ad1f2c212bd41f518 upstream.

Currently, in virtio_scsi, if 'bd->last' is not set to true while
dispatching request, such io will stay in driver's queue, and driver
will wait for block layer to dispatch more rqs. However, if block
layer failed to dispatch more rq, it should trigger commit_rqs to
inform driver.

There is a problem in blk_mq_try_issue_list_directly() that commit_rqs
won't be called:

// assume that queue_depth is set to 1, list contains two rq
blk_mq_try_issue_list_directly
 blk_mq_request_issue_directly
 // dispatch first rq
 // last is false
  __blk_mq_try_issue_directly
   blk_mq_get_dispatch_budget
   // succeed to get first budget
   __blk_mq_issue_directly
    scsi_queue_rq
     cmd->flags |= SCMD_LAST
      virtscsi_queuecommand
       kick = (sc->flags & SCMD_LAST) != 0
       // kick is false, first rq won't issue to disk
 queued++

 blk_mq_request_issue_directly
 // dispatch second rq
  __blk_mq_try_issue_directly
   blk_mq_get_dispatch_budget
   // failed to get second budget
 ret == BLK_STS_RESOURCE
  blk_mq_request_bypass_insert
 // errors is still 0

 if (!list_empty(list) || errors && ...)
  // won't pass, commit_rqs won't be called

In this situation, first rq relied on second rq to dispatch, while
second rq relied on first rq to complete, thus they will both hung.

Fix the problem by also treat 'BLK_STS_*RESOURCE' as 'errors' since
it means that request is not queued successfully.

Same problem exists in blk_mq_dispatch_rq_list(), 'BLK_STS_*RESOURCE'
can't be treated as 'errors' here, fix the problem by calling
commit_rqs if queue_rq return 'BLK_STS_*RESOURCE'.

Fixes: d666ba98f849 ("blk-mq: add mq_ops->commit_rqs()")
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20220726122224.1790882-1-yukuai1@huaweicloud.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 block/blk-mq.c |    5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1925,7 +1925,8 @@ out:
 	/* If we didn't flush the entire list, we could have told the driver
 	 * there was more coming, but that turned out to be a lie.
 	 */
-	if ((!list_empty(list) || errors) && q->mq_ops->commit_rqs && queued)
+	if ((!list_empty(list) || errors || needs_resource ||
+	     ret == BLK_STS_DEV_RESOURCE) && q->mq_ops->commit_rqs && queued)
 		q->mq_ops->commit_rqs(hctx);
 	/*
 	 * Any items that need requeuing? Stuff them into hctx->dispatch,
@@ -2678,6 +2679,7 @@ void blk_mq_try_issue_list_directly(stru
 		list_del_init(&rq->queuelist);
 		ret = blk_mq_request_issue_directly(rq, list_empty(list));
 		if (ret != BLK_STS_OK) {
+			errors++;
 			if (ret == BLK_STS_RESOURCE ||
 					ret == BLK_STS_DEV_RESOURCE) {
 				blk_mq_request_bypass_insert(rq, false,
@@ -2685,7 +2687,6 @@ void blk_mq_try_issue_list_directly(stru
 				break;
 			}
 			blk_mq_end_request(rq, ret);
-			errors++;
 		} else
 			queued++;
 	}



^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH 5.19 143/158] perf python: Fix build when PYTHON_CONFIG is user supplied
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (141 preceding siblings ...)
  2022-08-29 10:59 ` [PATCH 5.19 142/158] blk-mq: fix io hung due to missing commit_rqs Greg Kroah-Hartman
@ 2022-08-29 10:59 ` Greg Kroah-Hartman
  2022-08-29 10:59 ` [PATCH 5.19 144/158] perf/x86/intel/uncore: Fix broken read_counter() for SNB IMC PMU Greg Kroah-Hartman
                   ` (27 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:59 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, James Clark, Ian Rogers,
	Alexander Shishkin, Ingo Molnar, Jiri Olsa, Mark Rutland,
	Namhyung Kim, Peter Zijlstra, Arnaldo Carvalho de Melo

From: James Clark <james.clark@arm.com>

commit bc9e7fe313d5e56d4d5f34bcc04d1165f94f86fb upstream.

The previous change to Python autodetection had a small mistake where
the auto value was used to determine the Python binary, rather than the
user supplied value. The Python binary is only used for one part of the
build process, rather than the final linking, so it was producing
correct builds in most scenarios, especially when the auto detected
value matched what the user wanted, or the system only had a valid set
of Pythons.

Change it so that the Python binary path is derived from either the
PYTHON_CONFIG value or PYTHON value, depending on what is specified by
the user. This was the original intention.

This error was spotted in a build failure an odd cross compilation
environment after commit 4c41cb46a732fe82 ("perf python: Prefer
python3") was merged.

Fixes: 630af16eee495f58 ("perf tools: Use Python devtools for version autodetection rather than runtime")
Signed-off-by: James Clark <james.clark@arm.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220728093946.1337642-1-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 tools/perf/Makefile.config |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/tools/perf/Makefile.config
+++ b/tools/perf/Makefile.config
@@ -265,7 +265,7 @@ endif
 # defined. get-executable-or-default fails with an error if the first argument is supplied but
 # doesn't exist.
 override PYTHON_CONFIG := $(call get-executable-or-default,PYTHON_CONFIG,$(PYTHON_AUTO))
-override PYTHON := $(call get-executable-or-default,PYTHON,$(subst -config,,$(PYTHON_AUTO)))
+override PYTHON := $(call get-executable-or-default,PYTHON,$(subst -config,,$(PYTHON_CONFIG)))
 
 grep-libs  = $(filter -l%,$(1))
 strip-libs  = $(filter-out -l%,$(1))



^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH 5.19 144/158] perf/x86/intel/uncore: Fix broken read_counter() for SNB IMC PMU
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (142 preceding siblings ...)
  2022-08-29 10:59 ` [PATCH 5.19 143/158] perf python: Fix build when PYTHON_CONFIG is user supplied Greg Kroah-Hartman
@ 2022-08-29 10:59 ` Greg Kroah-Hartman
  2022-08-29 10:59 ` [PATCH 5.19 145/158] perf/x86/intel/ds: Fix precise store latency handling Greg Kroah-Hartman
                   ` (26 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:59 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Stephane Eranian,
	Peter Zijlstra (Intel),
	Kan Liang

From: Stephane Eranian <eranian@google.com>

commit 11745ecfe8fea4b4a4c322967a7605d2ecbd5080 upstream.

Existing code was generating bogus counts for the SNB IMC bandwidth counters:

$ perf stat -a -I 1000 -e uncore_imc/data_reads/,uncore_imc/data_writes/
     1.000327813           1,024.03 MiB  uncore_imc/data_reads/
     1.000327813              20.73 MiB  uncore_imc/data_writes/
     2.000580153         261,120.00 MiB  uncore_imc/data_reads/
     2.000580153              23.28 MiB  uncore_imc/data_writes/

The problem was introduced by commit:
  07ce734dd8ad ("perf/x86/intel/uncore: Clean up client IMC")

Where the read_counter callback was replace to point to the generic
uncore_mmio_read_counter() function.

The SNB IMC counters are freerunnig 32-bit counters laid out contiguously in
MMIO. But uncore_mmio_read_counter() is using a readq() call to read from
MMIO therefore reading 64-bit from MMIO. Although this is okay for the
uncore_perf_event_update() function because it is shifting the value based
on the actual counter width to compute a delta, it is not okay for the
uncore_pmu_event_start() which is simply reading the counter  and therefore
priming the event->prev_count with a bogus value which is responsible for
causing bogus deltas in the perf stat command above.

The fix is to reintroduce the custom callback for read_counter for the SNB
IMC PMU and use readl() instead of readq(). With the change the output of
perf stat is back to normal:
$ perf stat -a -I 1000 -e uncore_imc/data_reads/,uncore_imc/data_writes/
     1.000120987             296.94 MiB  uncore_imc/data_reads/
     1.000120987             138.42 MiB  uncore_imc/data_writes/
     2.000403144             175.91 MiB  uncore_imc/data_reads/
     2.000403144              68.50 MiB  uncore_imc/data_writes/

Fixes: 07ce734dd8ad ("perf/x86/intel/uncore: Clean up client IMC")
Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Link: https://lore.kernel.org/r/20220803160031.1379788-1-eranian@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/x86/events/intel/uncore_snb.c |   18 +++++++++++++++++-
 1 file changed, 17 insertions(+), 1 deletion(-)

--- a/arch/x86/events/intel/uncore_snb.c
+++ b/arch/x86/events/intel/uncore_snb.c
@@ -841,6 +841,22 @@ int snb_pci2phy_map_init(int devid)
 	return 0;
 }
 
+static u64 snb_uncore_imc_read_counter(struct intel_uncore_box *box, struct perf_event *event)
+{
+	struct hw_perf_event *hwc = &event->hw;
+
+	/*
+	 * SNB IMC counters are 32-bit and are laid out back to back
+	 * in MMIO space. Therefore we must use a 32-bit accessor function
+	 * using readq() from uncore_mmio_read_counter() causes problems
+	 * because it is reading 64-bit at a time. This is okay for the
+	 * uncore_perf_event_update() function because it drops the upper
+	 * 32-bits but not okay for plain uncore_read_counter() as invoked
+	 * in uncore_pmu_event_start().
+	 */
+	return (u64)readl(box->io_addr + hwc->event_base);
+}
+
 static struct pmu snb_uncore_imc_pmu = {
 	.task_ctx_nr	= perf_invalid_context,
 	.event_init	= snb_uncore_imc_event_init,
@@ -860,7 +876,7 @@ static struct intel_uncore_ops snb_uncor
 	.disable_event	= snb_uncore_imc_disable_event,
 	.enable_event	= snb_uncore_imc_enable_event,
 	.hw_config	= snb_uncore_imc_hw_config,
-	.read_counter	= uncore_mmio_read_counter,
+	.read_counter	= snb_uncore_imc_read_counter,
 };
 
 static struct intel_uncore_type snb_uncore_imc = {



^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH 5.19 145/158] perf/x86/intel/ds: Fix precise store latency handling
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (143 preceding siblings ...)
  2022-08-29 10:59 ` [PATCH 5.19 144/158] perf/x86/intel/uncore: Fix broken read_counter() for SNB IMC PMU Greg Kroah-Hartman
@ 2022-08-29 10:59 ` Greg Kroah-Hartman
  2022-08-29 10:59 ` [PATCH 5.19 146/158] perf stat: Clear evsel->reset_group for each stat run Greg Kroah-Hartman
                   ` (25 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:59 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Stephane Eranian, Peter Zijlstra (Intel)

From: Stephane Eranian <eranian@google.com>

commit d4bdb0bebc5ba3299d74f123c782d99cd4e25c49 upstream.

With the existing code in store_latency_data(), the memory operation (mem_op)
returned to the user is always OP_LOAD where in fact, it should be OP_STORE.
This comes from the fact that the function is simply grabbing the information
from a data source map which covers only load accesses. Intel 12th gen CPU
offers precise store sampling that captures both the data source and latency.
Therefore it can use the data source mapping table but must override the
memory operation to reflect stores instead of loads.

Fixes: 61b985e3e775 ("perf/x86/intel: Add perf core PMU support for Sapphire Rapids")
Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20220818054613.1548130-1-eranian@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/x86/events/intel/ds.c |   10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -291,6 +291,7 @@ static u64 load_latency_data(struct perf
 static u64 store_latency_data(struct perf_event *event, u64 status)
 {
 	union intel_x86_pebs_dse dse;
+	union perf_mem_data_src src;
 	u64 val;
 
 	dse.val = status;
@@ -304,7 +305,14 @@ static u64 store_latency_data(struct per
 
 	val |= P(BLK, NA);
 
-	return val;
+	/*
+	 * the pebs_data_source table is only for loads
+	 * so override the mem_op to say STORE instead
+	 */
+	src.val = val;
+	src.mem_op = P(OP,STORE);
+
+	return src.val;
 }
 
 struct pebs_record_core {



^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH 5.19 146/158] perf stat: Clear evsel->reset_group for each stat run
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (144 preceding siblings ...)
  2022-08-29 10:59 ` [PATCH 5.19 145/158] perf/x86/intel/ds: Fix precise store latency handling Greg Kroah-Hartman
@ 2022-08-29 10:59 ` Greg Kroah-Hartman
  2022-08-29 10:59 ` [PATCH 5.19 147/158] arm64: fix rodata=full Greg Kroah-Hartman
                   ` (24 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:59 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Andi Kleen, Ian Rogers,
	Arnaldo Carvalho de Melo, Xing Zhengjun, Alexander Shishkin,
	Ingo Molnar, Jiri Olsa, Kan Liang, Mark Rutland, Namhyung Kim,
	Peter Zijlstra, Stephane Eranian

From: Ian Rogers <irogers@google.com>

commit bf515f024e4c0ca46a1b08c4f31860c01781d8a5 upstream.

If a weak group is broken then the reset_group flag remains set for
the next run. Having reset_group set means the counter isn't created
and ultimately a segfault.

A simple reproduction of this is:

  # perf stat -r2 -e '{cycles,cycles,cycles,cycles,cycles,cycles,cycles,cycles,cycles,cycles}:W

which will be added as a test in the next patch.

Fixes: 4804e0111662d7d8 ("perf stat: Use affinity for opening events")
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20220822213352.75721-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 tools/perf/builtin-stat.c |    1 +
 1 file changed, 1 insertion(+)

--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -826,6 +826,7 @@ static int __run_perf_stat(int argc, con
 	}
 
 	evlist__for_each_entry(evsel_list, counter) {
+		counter->reset_group = false;
 		if (bpf_counter__load(counter, &target))
 			return -1;
 		if (!evsel__is_bpf(counter))



^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH 5.19 147/158] arm64: fix rodata=full
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (145 preceding siblings ...)
  2022-08-29 10:59 ` [PATCH 5.19 146/158] perf stat: Clear evsel->reset_group for each stat run Greg Kroah-Hartman
@ 2022-08-29 10:59 ` Greg Kroah-Hartman
  2022-08-29 10:59 ` [PATCH 5.19 148/158] arm64/signal: Flush FPSIMD register state when disabling streaming mode Greg Kroah-Hartman
                   ` (23 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:59 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Mark Rutland, Andy Shevchenko,
	Ard Biesheuvel, Catalin Marinas, Jagdish Gediya, Matthew Wilcox,
	Randy Dunlap, Will Deacon

From: Mark Rutland <mark.rutland@arm.com>

commit 2e8cff0a0eee87b27f0cf87ad8310eb41b5886ab upstream.

On arm64, "rodata=full" has been suppored (but not documented) since
commit:

  c55191e96caa9d78 ("arm64: mm: apply r/o permissions of VM areas to its linear alias as well")

As it's necessary to determine the rodata configuration early during
boot, arm64 has an early_param() handler for this, whereas init/main.c
has a __setup() handler which is run later.

Unfortunately, this split meant that since commit:

  f9a40b0890658330 ("init/main.c: return 1 from handled __setup() functions")

... passing "rodata=full" would result in a spurious warning from the
__setup() handler (though RO permissions would be configured
appropriately).

Further, "rodata=full" has been broken since commit:

  0d6ea3ac94ca77c5 ("lib/kstrtox.c: add "false"/"true" support to kstrtobool()")

... which caused strtobool() to parse "full" as false (in addition to
many other values not documented for the "rodata=" kernel parameter.

This patch fixes this breakage by:

* Moving the core parameter parser to an __early_param(), such that it
  is available early.

* Adding an (optional) arch hook which arm64 can use to parse "full".

* Updating the documentation to mention that "full" is valid for arm64.

* Having the core parameter parser handle "on" and "off" explicitly,
  such that any undocumented values (e.g. typos such as "ful") are
  reported as errors rather than being silently accepted.

Note that __setup() and early_param() have opposite conventions for
their return values, where __setup() uses 1 to indicate a parameter was
handled and early_param() uses 0 to indicate a parameter was handled.

Fixes: f9a40b089065 ("init/main.c: return 1 from handled __setup() functions")
Fixes: 0d6ea3ac94ca ("lib/kstrtox.c: add "false"/"true" support to kstrtobool()")
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Jagdish Gediya <jvgediya@linux.ibm.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20220817154022.3974645-1-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 Documentation/admin-guide/kernel-parameters.txt |    2 ++
 arch/arm64/include/asm/setup.h                  |   17 +++++++++++++++++
 arch/arm64/mm/mmu.c                             |   18 ------------------
 init/main.c                                     |   18 +++++++++++++++---
 4 files changed, 34 insertions(+), 21 deletions(-)

--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -5260,6 +5260,8 @@
 	rodata=		[KNL]
 		on	Mark read-only kernel memory as read-only (default).
 		off	Leave read-only kernel memory writable for debugging.
+		full	Mark read-only kernel memory and aliases as read-only
+		        [arm64]
 
 	rockchip.usb_uart
 			Enable the uart passthrough on the designated usb port
--- a/arch/arm64/include/asm/setup.h
+++ b/arch/arm64/include/asm/setup.h
@@ -3,6 +3,8 @@
 #ifndef __ARM64_ASM_SETUP_H
 #define __ARM64_ASM_SETUP_H
 
+#include <linux/string.h>
+
 #include <uapi/asm/setup.h>
 
 void *get_early_fdt_ptr(void);
@@ -14,4 +16,19 @@ void early_fdt_map(u64 dt_phys);
 extern phys_addr_t __fdt_pointer __initdata;
 extern u64 __cacheline_aligned boot_args[4];
 
+static inline bool arch_parse_debug_rodata(char *arg)
+{
+	extern bool rodata_enabled;
+	extern bool rodata_full;
+
+	if (arg && !strcmp(arg, "full")) {
+		rodata_enabled = true;
+		rodata_full = true;
+		return true;
+	}
+
+	return false;
+}
+#define arch_parse_debug_rodata arch_parse_debug_rodata
+
 #endif
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -625,24 +625,6 @@ static void __init map_kernel_segment(pg
 	vm_area_add_early(vma);
 }
 
-static int __init parse_rodata(char *arg)
-{
-	int ret = strtobool(arg, &rodata_enabled);
-	if (!ret) {
-		rodata_full = false;
-		return 0;
-	}
-
-	/* permit 'full' in addition to boolean options */
-	if (strcmp(arg, "full"))
-		return -EINVAL;
-
-	rodata_enabled = true;
-	rodata_full = true;
-	return 0;
-}
-early_param("rodata", parse_rodata);
-
 #ifdef CONFIG_UNMAP_KERNEL_AT_EL0
 static int __init map_entry_trampoline(void)
 {
--- a/init/main.c
+++ b/init/main.c
@@ -1446,13 +1446,25 @@ static noinline void __init kernel_init_
 
 #if defined(CONFIG_STRICT_KERNEL_RWX) || defined(CONFIG_STRICT_MODULE_RWX)
 bool rodata_enabled __ro_after_init = true;
+
+#ifndef arch_parse_debug_rodata
+static inline bool arch_parse_debug_rodata(char *str) { return false; }
+#endif
+
 static int __init set_debug_rodata(char *str)
 {
-	if (strtobool(str, &rodata_enabled))
+	if (arch_parse_debug_rodata(str))
+		return 0;
+
+	if (str && !strcmp(str, "on"))
+		rodata_enabled = true;
+	else if (str && !strcmp(str, "off"))
+		rodata_enabled = false;
+	else
 		pr_warn("Invalid option string for rodata: '%s'\n", str);
-	return 1;
+	return 0;
 }
-__setup("rodata=", set_debug_rodata);
+early_param("rodata", set_debug_rodata);
 #endif
 
 #ifdef CONFIG_STRICT_KERNEL_RWX



^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH 5.19 148/158] arm64/signal: Flush FPSIMD register state when disabling streaming mode
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (146 preceding siblings ...)
  2022-08-29 10:59 ` [PATCH 5.19 147/158] arm64: fix rodata=full Greg Kroah-Hartman
@ 2022-08-29 10:59 ` Greg Kroah-Hartman
  2022-08-29 10:59 ` [PATCH 5.19 149/158] arm64/sme: Dont flush SVE register state when allocating SME storage Greg Kroah-Hartman
                   ` (22 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:59 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Mark Brown, Catalin Marinas, Will Deacon

From: Mark Brown <broonie@kernel.org>

commit ea64baacbc36a0d552aec0d87107182f40211131 upstream.

When handling a signal delivered to a context with streaming mode enabled
we will disable streaming mode for the signal handler, when doing so we
should also flush the saved FPSIMD register state like exiting streaming
mode in the hardware would do so that if that state is reloaded we get the
same behaviour. Without this we will reload whatever the last FPSIMD state
that was saved for the task was.

Fixes: 40a8e87bb328 ("arm64/sme: Disable ZA and streaming mode when handling signals")
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20220817182324.638214-3-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/arm64/kernel/signal.c |   10 ++++++++++
 1 file changed, 10 insertions(+)

--- a/arch/arm64/kernel/signal.c
+++ b/arch/arm64/kernel/signal.c
@@ -922,6 +922,16 @@ static void setup_return(struct pt_regs
 
 	/* Signal handlers are invoked with ZA and streaming mode disabled */
 	if (system_supports_sme()) {
+		/*
+		 * If we were in streaming mode the saved register
+		 * state was SVE but we will exit SM and use the
+		 * FPSIMD register state - flush the saved FPSIMD
+		 * register state in case it gets loaded.
+		 */
+		if (current->thread.svcr & SVCR_SM_MASK)
+			memset(&current->thread.uw.fpsimd_state, 0,
+			       sizeof(current->thread.uw.fpsimd_state));
+
 		current->thread.svcr &= ~(SVCR_ZA_MASK |
 					  SVCR_SM_MASK);
 		sme_smstop();



^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH 5.19 149/158] arm64/sme: Dont flush SVE register state when allocating SME storage
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (147 preceding siblings ...)
  2022-08-29 10:59 ` [PATCH 5.19 148/158] arm64/signal: Flush FPSIMD register state when disabling streaming mode Greg Kroah-Hartman
@ 2022-08-29 10:59 ` Greg Kroah-Hartman
  2022-08-29 11:00 ` [PATCH 5.19 150/158] arm64/sme: Dont flush SVE register state when handling SME traps Greg Kroah-Hartman
                   ` (21 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 10:59 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Mark Brown, Catalin Marinas, Will Deacon

From: Mark Brown <broonie@kernel.org>

commit 826a4fdd2ada9e5923c58bdd168f31a42e958ffc upstream.

Currently when taking a SME access trap we allocate storage for the SVE
register state in order to be able to handle storage of streaming mode SVE.
Due to the original usage in a purely SVE context the SVE register state
allocation this also flushes the register state for SVE if storage was
already allocated but in the SME context this is not desirable. For a SME
access trap to be taken the task must not be in streaming mode so either
there already is SVE register state present for regular SVE mode which would
be corrupted or the task does not have TIF_SVE and the flush is redundant.

Fix this by adding a flag to sve_alloc() indicating if we are in a SVE
context and need to flush the state. Freshly allocated storage is always
zeroed either way.

Fixes: 8bd7f91c03d8 ("arm64/sme: Implement traps and syscall handling for SME")
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20220817182324.638214-4-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/arm64/include/asm/fpsimd.h |    4 ++--
 arch/arm64/kernel/fpsimd.c      |   10 ++++++----
 arch/arm64/kernel/ptrace.c      |    6 +++---
 arch/arm64/kernel/signal.c      |    2 +-
 4 files changed, 12 insertions(+), 10 deletions(-)

--- a/arch/arm64/include/asm/fpsimd.h
+++ b/arch/arm64/include/asm/fpsimd.h
@@ -153,7 +153,7 @@ struct vl_info {
 
 #ifdef CONFIG_ARM64_SVE
 
-extern void sve_alloc(struct task_struct *task);
+extern void sve_alloc(struct task_struct *task, bool flush);
 extern void fpsimd_release_task(struct task_struct *task);
 extern void fpsimd_sync_to_sve(struct task_struct *task);
 extern void fpsimd_force_sync_to_sve(struct task_struct *task);
@@ -256,7 +256,7 @@ size_t sve_state_size(struct task_struct
 
 #else /* ! CONFIG_ARM64_SVE */
 
-static inline void sve_alloc(struct task_struct *task) { }
+static inline void sve_alloc(struct task_struct *task, bool flush) { }
 static inline void fpsimd_release_task(struct task_struct *task) { }
 static inline void sve_sync_to_fpsimd(struct task_struct *task) { }
 static inline void sve_sync_from_fpsimd_zeropad(struct task_struct *task) { }
--- a/arch/arm64/kernel/fpsimd.c
+++ b/arch/arm64/kernel/fpsimd.c
@@ -716,10 +716,12 @@ size_t sve_state_size(struct task_struct
  * do_sve_acc() case, there is no ABI requirement to hide stale data
  * written previously be task.
  */
-void sve_alloc(struct task_struct *task)
+void sve_alloc(struct task_struct *task, bool flush)
 {
 	if (task->thread.sve_state) {
-		memset(task->thread.sve_state, 0, sve_state_size(task));
+		if (flush)
+			memset(task->thread.sve_state, 0,
+			       sve_state_size(task));
 		return;
 	}
 
@@ -1389,7 +1391,7 @@ void do_sve_acc(unsigned long esr, struc
 		return;
 	}
 
-	sve_alloc(current);
+	sve_alloc(current, true);
 	if (!current->thread.sve_state) {
 		force_sig(SIGKILL);
 		return;
@@ -1440,7 +1442,7 @@ void do_sme_acc(unsigned long esr, struc
 		return;
 	}
 
-	sve_alloc(current);
+	sve_alloc(current, false);
 	sme_alloc(current);
 	if (!current->thread.sve_state || !current->thread.za_state) {
 		force_sig(SIGKILL);
--- a/arch/arm64/kernel/ptrace.c
+++ b/arch/arm64/kernel/ptrace.c
@@ -882,7 +882,7 @@ static int sve_set_common(struct task_st
 		 * state and ensure there's storage.
 		 */
 		if (target->thread.svcr != old_svcr)
-			sve_alloc(target);
+			sve_alloc(target, true);
 	}
 
 	/* Registers: FPSIMD-only case */
@@ -912,7 +912,7 @@ static int sve_set_common(struct task_st
 		goto out;
 	}
 
-	sve_alloc(target);
+	sve_alloc(target, true);
 	if (!target->thread.sve_state) {
 		ret = -ENOMEM;
 		clear_tsk_thread_flag(target, TIF_SVE);
@@ -1082,7 +1082,7 @@ static int za_set(struct task_struct *ta
 
 	/* Ensure there is some SVE storage for streaming mode */
 	if (!target->thread.sve_state) {
-		sve_alloc(target);
+		sve_alloc(target, false);
 		if (!target->thread.sve_state) {
 			clear_thread_flag(TIF_SME);
 			ret = -ENOMEM;
--- a/arch/arm64/kernel/signal.c
+++ b/arch/arm64/kernel/signal.c
@@ -307,7 +307,7 @@ static int restore_sve_fpsimd_context(st
 	fpsimd_flush_task_state(current);
 	/* From now, fpsimd_thread_switch() won't touch thread.sve_state */
 
-	sve_alloc(current);
+	sve_alloc(current, true);
 	if (!current->thread.sve_state) {
 		clear_thread_flag(TIF_SVE);
 		return -ENOMEM;



^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH 5.19 150/158] arm64/sme: Dont flush SVE register state when handling SME traps
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (148 preceding siblings ...)
  2022-08-29 10:59 ` [PATCH 5.19 149/158] arm64/sme: Dont flush SVE register state when allocating SME storage Greg Kroah-Hartman
@ 2022-08-29 11:00 ` Greg Kroah-Hartman
  2022-08-29 11:00 ` [PATCH 5.19 151/158] scsi: ufs: core: Enable link lost interrupt Greg Kroah-Hartman
                   ` (20 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 11:00 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Mark Brown, Catalin Marinas, Will Deacon

From: Mark Brown <broonie@kernel.org>

commit 714f3cbd70a4db9f9b7fe5b8a032896ed33fb824 upstream.

Currently as part of handling a SME access trap we flush the SVE register
state. This is not needed and would corrupt register state if the task has
access to the SVE registers already. For non-streaming mode accesses the
required flushing will be done in the SVE access trap. For streaming
mode SVE register accesses the architecture guarantees that the register
state will be flushed when streaming mode is entered or exited so there is
no need for us to do so. Simply remove the register initialisation.

Fixes: 8bd7f91c03d8 ("arm64/sme: Implement traps and syscall handling for SME")
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20220817182324.638214-5-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/arm64/kernel/fpsimd.c |   11 -----------
 1 file changed, 11 deletions(-)

--- a/arch/arm64/kernel/fpsimd.c
+++ b/arch/arm64/kernel/fpsimd.c
@@ -1463,17 +1463,6 @@ void do_sme_acc(unsigned long esr, struc
 		fpsimd_bind_task_to_cpu();
 	}
 
-	/*
-	 * If SVE was not already active initialise the SVE registers,
-	 * any non-shared state between the streaming and regular SVE
-	 * registers is architecturally guaranteed to be zeroed when
-	 * we enter streaming mode.  We do not need to initialize ZA
-	 * since ZA must be disabled at this point and enabling ZA is
-	 * architecturally defined to zero ZA.
-	 */
-	if (system_supports_sve() && !test_thread_flag(TIF_SVE))
-		sve_init_regs();
-
 	put_cpu_fpsimd_context();
 }
 



^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH 5.19 151/158] scsi: ufs: core: Enable link lost interrupt
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (149 preceding siblings ...)
  2022-08-29 11:00 ` [PATCH 5.19 150/158] arm64/sme: Dont flush SVE register state when handling SME traps Greg Kroah-Hartman
@ 2022-08-29 11:00 ` Greg Kroah-Hartman
  2022-08-29 11:00 ` [PATCH 5.19 152/158] scsi: storvsc: Remove WQ_MEM_RECLAIM from storvsc_error_wq Greg Kroah-Hartman
                   ` (19 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 11:00 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Bart Van Assche, Kiwoong Kim,
	Martin K. Petersen

From: Kiwoong Kim <kwmad.kim@samsung.com>

commit 6d17a112e9a63ff6a5edffd1676b99e0ffbcd269 upstream.

Link lost is treated as fatal error with commit c99b9b230149 ("scsi: ufs:
Treat link loss as fatal error"), but the event isn't registered as
interrupt source. Enable it.

Link: https://lore.kernel.org/r/1659404551-160958-1-git-send-email-kwmad.kim@samsung.com
Fixes: c99b9b230149 ("scsi: ufs: Treat link loss as fatal error")
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Kiwoong Kim <kwmad.kim@samsung.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 include/ufs/ufshci.h |    6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)

--- a/include/ufs/ufshci.h
+++ b/include/ufs/ufshci.h
@@ -135,11 +135,7 @@ static inline u32 ufshci_version(u32 maj
 
 #define UFSHCD_UIC_MASK		(UIC_COMMAND_COMPL | UFSHCD_UIC_PWR_MASK)
 
-#define UFSHCD_ERROR_MASK	(UIC_ERROR |\
-				DEVICE_FATAL_ERROR |\
-				CONTROLLER_FATAL_ERROR |\
-				SYSTEM_BUS_FATAL_ERROR |\
-				CRYPTO_ENGINE_FATAL_ERROR)
+#define UFSHCD_ERROR_MASK	(UIC_ERROR | INT_FATAL_ERRORS)
 
 #define INT_FATAL_ERRORS	(DEVICE_FATAL_ERROR |\
 				CONTROLLER_FATAL_ERROR |\



^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH 5.19 152/158] scsi: storvsc: Remove WQ_MEM_RECLAIM from storvsc_error_wq
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (150 preceding siblings ...)
  2022-08-29 11:00 ` [PATCH 5.19 151/158] scsi: ufs: core: Enable link lost interrupt Greg Kroah-Hartman
@ 2022-08-29 11:00 ` Greg Kroah-Hartman
  2022-08-29 11:00 ` [PATCH 5.19 153/158] scsi: core: Fix passthrough retry counter handling Greg Kroah-Hartman
                   ` (18 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 11:00 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Michael Kelley, Saurabh Sengar,
	Martin K. Petersen

From: Saurabh Sengar <ssengar@linux.microsoft.com>

commit d957e7ffb2c72410bcc1a514153a46719255a5da upstream.

storvsc_error_wq workqueue should not be marked as WQ_MEM_RECLAIM as it
doesn't need to make forward progress under memory pressure.  Marking this
workqueue as WQ_MEM_RECLAIM may cause deadlock while flushing a
non-WQ_MEM_RECLAIM workqueue.  In the current state it causes the following
warning:

[   14.506347] ------------[ cut here ]------------
[   14.506354] workqueue: WQ_MEM_RECLAIM storvsc_error_wq_0:storvsc_remove_lun is flushing !WQ_MEM_RECLAIM events_freezable_power_:disk_events_workfn
[   14.506360] WARNING: CPU: 0 PID: 8 at <-snip->kernel/workqueue.c:2623 check_flush_dependency+0xb5/0x130
[   14.506390] CPU: 0 PID: 8 Comm: kworker/u4:0 Not tainted 5.4.0-1086-azure #91~18.04.1-Ubuntu
[   14.506391] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS Hyper-V UEFI Release v4.1 05/09/2022
[   14.506393] Workqueue: storvsc_error_wq_0 storvsc_remove_lun
[   14.506395] RIP: 0010:check_flush_dependency+0xb5/0x130
		<-snip->
[   14.506408] Call Trace:
[   14.506412]  __flush_work+0xf1/0x1c0
[   14.506414]  __cancel_work_timer+0x12f/0x1b0
[   14.506417]  ? kernfs_put+0xf0/0x190
[   14.506418]  cancel_delayed_work_sync+0x13/0x20
[   14.506420]  disk_block_events+0x78/0x80
[   14.506421]  del_gendisk+0x3d/0x2f0
[   14.506423]  sr_remove+0x28/0x70
[   14.506427]  device_release_driver_internal+0xef/0x1c0
[   14.506428]  device_release_driver+0x12/0x20
[   14.506429]  bus_remove_device+0xe1/0x150
[   14.506431]  device_del+0x167/0x380
[   14.506432]  __scsi_remove_device+0x11d/0x150
[   14.506433]  scsi_remove_device+0x26/0x40
[   14.506434]  storvsc_remove_lun+0x40/0x60
[   14.506436]  process_one_work+0x209/0x400
[   14.506437]  worker_thread+0x34/0x400
[   14.506439]  kthread+0x121/0x140
[   14.506440]  ? process_one_work+0x400/0x400
[   14.506441]  ? kthread_park+0x90/0x90
[   14.506443]  ret_from_fork+0x35/0x40
[   14.506445] ---[ end trace 2d9633159fdc6ee7 ]---

Link: https://lore.kernel.org/r/1659628534-17539-1-git-send-email-ssengar@linux.microsoft.com
Fixes: 436ad9413353 ("scsi: storvsc: Allow only one remove lun work item to be issued per lun")
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Saurabh Sengar <ssengar@linux.microsoft.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/scsi/storvsc_drv.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/scsi/storvsc_drv.c
+++ b/drivers/scsi/storvsc_drv.c
@@ -2012,7 +2012,7 @@ static int storvsc_probe(struct hv_devic
 	 */
 	host_dev->handle_error_wq =
 			alloc_ordered_workqueue("storvsc_error_wq_%d",
-						WQ_MEM_RECLAIM,
+						0,
 						host->host_no);
 	if (!host_dev->handle_error_wq) {
 		ret = -ENOMEM;



^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH 5.19 153/158] scsi: core: Fix passthrough retry counter handling
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (151 preceding siblings ...)
  2022-08-29 11:00 ` [PATCH 5.19 152/158] scsi: storvsc: Remove WQ_MEM_RECLAIM from storvsc_error_wq Greg Kroah-Hartman
@ 2022-08-29 11:00 ` Greg Kroah-Hartman
  2022-08-29 11:00 ` [PATCH 5.19 154/158] riscv: dts: microchip: mpfs: fix incorrect pcie child node name Greg Kroah-Hartman
                   ` (17 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 11:00 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Christoph Hellwig, Mike Christie,
	Martin K. Petersen

From: Mike Christie <michael.christie@oracle.com>

commit fac8e558da9485e13a0ae0488aa0b8a8c307cd34 upstream.

Passthrough users will set the scsi_cmnd->allowed value and were expecting
up to $allowed retries. The problem is that before:

commit 6aded12b10e0 ("scsi: core: Remove struct scsi_request")

we used to set the retries on the scsi_request then copy them over to
scsi_cmnd->allowed in scsi_setup_scsi_cmnd. With that patch we now set
scsi_cmnd->allowed to 0 in scsi_prepare_cmd and overwrite what the
passthrough user set.

This moves the allowed initialization to after the blk_rq_is_passthrough()
check so it's only done for the non-passthrough path where the ULD
init_command will normally set an allowed value it prefers.

Link: https://lore.kernel.org/r/20220812011206.9157-1-michael.christie@oracle.com
Fixes: 6aded12b10e0 ("scsi: core: Remove struct scsi_request")
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/scsi/scsi_lib.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1549,7 +1549,6 @@ static blk_status_t scsi_prepare_cmd(str
 	scsi_init_command(sdev, cmd);
 
 	cmd->eh_eflags = 0;
-	cmd->allowed = 0;
 	cmd->prot_type = 0;
 	cmd->prot_flags = 0;
 	cmd->submitter = 0;
@@ -1600,6 +1599,8 @@ static blk_status_t scsi_prepare_cmd(str
 			return ret;
 	}
 
+	/* Usually overridden by the ULP */
+	cmd->allowed = 0;
 	memset(cmd->cmnd, 0, sizeof(cmd->cmnd));
 	return scsi_cmd_to_driver(cmd)->init_command(cmd);
 }



^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH 5.19 154/158] riscv: dts: microchip: mpfs: fix incorrect pcie child node name
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (152 preceding siblings ...)
  2022-08-29 11:00 ` [PATCH 5.19 153/158] scsi: core: Fix passthrough retry counter handling Greg Kroah-Hartman
@ 2022-08-29 11:00 ` Greg Kroah-Hartman
  2022-08-29 11:00 ` [PATCH 5.19 155/158] riscv: dts: microchip: mpfs: remove ti,fifo-depth property Greg Kroah-Hartman
                   ` (16 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 11:00 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Conor Dooley

From: Conor Dooley <conor.dooley@microchip.com>

commit 3f67e69976035352db110443916bcce32c7f64ac upstream.

Recent versions of dt-schema complain about the PCIe controller's child
node name:
arch/riscv/boot/dts/microchip/mpfs-icicle-kit.dtb: pcie@2000000000: Unevaluated properties are not allowed ('clock-names', 'clocks', 'legacy-interrupt-controller', 'microchip,axi-m-atr0' were unexpected)
            From schema: Documentation/devicetree/bindings/pci/microchip,pcie-host.yaml
Make the dts match the correct property name in the dts.

Fixes: 528a5b1f2556 ("riscv: dts: microchip: add new peripherals to icicle kit device tree")
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/riscv/boot/dts/microchip/mpfs.dtsi |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/arch/riscv/boot/dts/microchip/mpfs.dtsi
+++ b/arch/riscv/boot/dts/microchip/mpfs.dtsi
@@ -448,7 +448,7 @@
 			msi-controller;
 			microchip,axi-m-atr0 = <0x10 0x0>;
 			status = "disabled";
-			pcie_intc: legacy-interrupt-controller {
+			pcie_intc: interrupt-controller {
 				#address-cells = <0>;
 				#interrupt-cells = <1>;
 				interrupt-controller;



^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH 5.19 155/158] riscv: dts: microchip: mpfs: remove ti,fifo-depth property
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (153 preceding siblings ...)
  2022-08-29 11:00 ` [PATCH 5.19 154/158] riscv: dts: microchip: mpfs: fix incorrect pcie child node name Greg Kroah-Hartman
@ 2022-08-29 11:00 ` Greg Kroah-Hartman
  2022-08-29 11:00 ` [PATCH 5.19 156/158] riscv: dts: microchip: mpfs: remove bogus card-detect-delay Greg Kroah-Hartman
                   ` (15 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 11:00 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Conor Dooley

From: Conor Dooley <conor.dooley@microchip.com>

commit 72a05748cbd285567d69f173f8694e3471b79f20 upstream.

Recent versions of dt-schema warn about a previously undetected
undocument property on the icicle & polarberry devicetrees:

arch/riscv/boot/dts/microchip/mpfs-icicle-kit.dtb: ethernet@20112000: ethernet-phy@8: Unevaluated properties are not allowed ('ti,fifo-depth' was unexpected)
        From schema: Documentation/devicetree/bindings/net/cdns,macb.yaml

I know what you're thinking, the binding doesn't look to be the problem
and I agree. I am not sure why a TI vendor property was ever actually
added since it has no meaning... just get rid of it.

Fixes: bc47b2217f24 ("riscv: dts: microchip: add the sundance polarberry")
Fixes: 0fa6107eca41 ("RISC-V: Initial DTS for Microchip ICICLE board")
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/riscv/boot/dts/microchip/mpfs-icicle-kit.dts |    2 --
 arch/riscv/boot/dts/microchip/mpfs-polarberry.dts |    2 --
 2 files changed, 4 deletions(-)

--- a/arch/riscv/boot/dts/microchip/mpfs-icicle-kit.dts
+++ b/arch/riscv/boot/dts/microchip/mpfs-icicle-kit.dts
@@ -84,12 +84,10 @@
 
 	phy1: ethernet-phy@9 {
 		reg = <9>;
-		ti,fifo-depth = <0x1>;
 	};
 
 	phy0: ethernet-phy@8 {
 		reg = <8>;
-		ti,fifo-depth = <0x1>;
 	};
 };
 
--- a/arch/riscv/boot/dts/microchip/mpfs-polarberry.dts
+++ b/arch/riscv/boot/dts/microchip/mpfs-polarberry.dts
@@ -54,12 +54,10 @@
 
 	phy1: ethernet-phy@5 {
 		reg = <5>;
-		ti,fifo-depth = <0x01>;
 	};
 
 	phy0: ethernet-phy@4 {
 		reg = <4>;
-		ti,fifo-depth = <0x01>;
 	};
 };
 



^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH 5.19 156/158] riscv: dts: microchip: mpfs: remove bogus card-detect-delay
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (154 preceding siblings ...)
  2022-08-29 11:00 ` [PATCH 5.19 155/158] riscv: dts: microchip: mpfs: remove ti,fifo-depth property Greg Kroah-Hartman
@ 2022-08-29 11:00 ` Greg Kroah-Hartman
  2022-08-29 11:00 ` [PATCH 5.19 157/158] riscv: dts: microchip: mpfs: remove pci axi address translation property Greg Kroah-Hartman
                   ` (14 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 11:00 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Conor Dooley

From: Conor Dooley <conor.dooley@microchip.com>

commit 2b55915d27dcaa35f54bad7925af0a76001079bc upstream.

Recent versions of dt-schema warn about a previously undetected
undocumented property:
arch/riscv/boot/dts/microchip/mpfs-icicle-kit.dtb: mmc@20008000: Unevaluated properties are not allowed ('card-detect-delay' was unexpected)
        From schema: Documentation/devicetree/bindings/mmc/cdns,sdhci.yaml

There are no GPIOs connected to MSSIO6B4 pin K3 so adding the common
cd-debounce-delay-ms property makes no sense. The Cadence IP has a
register that sets the card detect delay as "DP * tclk". On MPFS, this
clock frequency is not configurable (it must be 200 MHz) & the FPGA
comes out of reset with this register already set.

Fixes: bc47b2217f24 ("riscv: dts: microchip: add the sundance polarberry")
Fixes: 0fa6107eca41 ("RISC-V: Initial DTS for Microchip ICICLE board")
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/riscv/boot/dts/microchip/mpfs-icicle-kit.dts | 1 -
 arch/riscv/boot/dts/microchip/mpfs-polarberry.dts | 1 -
 2 files changed, 2 deletions(-)

diff --git a/arch/riscv/boot/dts/microchip/mpfs-icicle-kit.dts b/arch/riscv/boot/dts/microchip/mpfs-icicle-kit.dts
index ee548ab61a2a..f3f87ed2007f 100644
--- a/arch/riscv/boot/dts/microchip/mpfs-icicle-kit.dts
+++ b/arch/riscv/boot/dts/microchip/mpfs-icicle-kit.dts
@@ -100,7 +100,6 @@ &mmc {
 	disable-wp;
 	cap-sd-highspeed;
 	cap-mmc-highspeed;
-	card-detect-delay = <200>;
 	mmc-ddr-1_8v;
 	mmc-hs200-1_8v;
 	sd-uhs-sdr12;
diff --git a/arch/riscv/boot/dts/microchip/mpfs-polarberry.dts b/arch/riscv/boot/dts/microchip/mpfs-polarberry.dts
index dc11bb8fc833..c87cc2d8fe29 100644
--- a/arch/riscv/boot/dts/microchip/mpfs-polarberry.dts
+++ b/arch/riscv/boot/dts/microchip/mpfs-polarberry.dts
@@ -70,7 +70,6 @@ &mmc {
 	disable-wp;
 	cap-sd-highspeed;
 	cap-mmc-highspeed;
-	card-detect-delay = <200>;
 	mmc-ddr-1_8v;
 	mmc-hs200-1_8v;
 	sd-uhs-sdr12;
-- 
2.37.2




^ permalink raw reply related	[flat|nested] 174+ messages in thread

* [PATCH 5.19 157/158] riscv: dts: microchip: mpfs: remove pci axi address translation property
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (155 preceding siblings ...)
  2022-08-29 11:00 ` [PATCH 5.19 156/158] riscv: dts: microchip: mpfs: remove bogus card-detect-delay Greg Kroah-Hartman
@ 2022-08-29 11:00 ` Greg Kroah-Hartman
  2022-08-29 11:00 ` [PATCH 5.19 158/158] bpf: Dont use tnum_range on array range checking for poke descriptors Greg Kroah-Hartman
                   ` (13 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 11:00 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Conor Dooley

From: Conor Dooley <conor.dooley@microchip.com>

commit e4009c5fa77b4356aa37ce002e9f9952dfd7a615 upstream.

An AXI master address translation table property was inadvertently
added to the device tree & this was not caught by dtbs_check at the
time. Remove the property - it should not be in mpfs.dtsi anyway as
it would be more suitable in -fabric.dtsi nor does it actually apply
to the version of the reference design we are using for upstream.

Link: https://www.microsemi.com/document-portal/doc_download/1245812-polarfire-fpga-and-polarfire-soc-fpga-pci-express-user-guide # Section 1.3.3
Fixes: 528a5b1f2556 ("riscv: dts: microchip: add new peripherals to icicle kit device tree")
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/riscv/boot/dts/microchip/mpfs.dtsi |    1 -
 1 file changed, 1 deletion(-)

--- a/arch/riscv/boot/dts/microchip/mpfs.dtsi
+++ b/arch/riscv/boot/dts/microchip/mpfs.dtsi
@@ -446,7 +446,6 @@
 			ranges = <0x3000000 0x0 0x8000000 0x20 0x8000000 0x0 0x80000000>;
 			msi-parent = <&pcie>;
 			msi-controller;
-			microchip,axi-m-atr0 = <0x10 0x0>;
 			status = "disabled";
 			pcie_intc: interrupt-controller {
 				#address-cells = <0>;



^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH 5.19 158/158] bpf: Dont use tnum_range on array range checking for poke descriptors
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (156 preceding siblings ...)
  2022-08-29 11:00 ` [PATCH 5.19 157/158] riscv: dts: microchip: mpfs: remove pci axi address translation property Greg Kroah-Hartman
@ 2022-08-29 11:00 ` Greg Kroah-Hartman
  2022-08-29 18:08 ` [PATCH 5.19 000/158] 5.19.6-rc1 review Florian Fainelli
                   ` (12 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-08-29 11:00 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Hsin-Wei Hung, Daniel Borkmann,
	Shung-Hsi Yu, John Fastabend, Alexei Starovoitov

From: Daniel Borkmann <daniel@iogearbox.net>

commit a657182a5c5150cdfacb6640aad1d2712571a409 upstream.

Hsin-Wei reported a KASAN splat triggered by their BPF runtime fuzzer which
is based on a customized syzkaller:

  BUG: KASAN: slab-out-of-bounds in bpf_int_jit_compile+0x1257/0x13f0
  Read of size 8 at addr ffff888004e90b58 by task syz-executor.0/1489
  CPU: 1 PID: 1489 Comm: syz-executor.0 Not tainted 5.19.0 #1
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
  1.13.0-1ubuntu1.1 04/01/2014
  Call Trace:
   <TASK>
   dump_stack_lvl+0x9c/0xc9
   print_address_description.constprop.0+0x1f/0x1f0
   ? bpf_int_jit_compile+0x1257/0x13f0
   kasan_report.cold+0xeb/0x197
   ? kvmalloc_node+0x170/0x200
   ? bpf_int_jit_compile+0x1257/0x13f0
   bpf_int_jit_compile+0x1257/0x13f0
   ? arch_prepare_bpf_dispatcher+0xd0/0xd0
   ? rcu_read_lock_sched_held+0x43/0x70
   bpf_prog_select_runtime+0x3e8/0x640
   ? bpf_obj_name_cpy+0x149/0x1b0
   bpf_prog_load+0x102f/0x2220
   ? __bpf_prog_put.constprop.0+0x220/0x220
   ? find_held_lock+0x2c/0x110
   ? __might_fault+0xd6/0x180
   ? lock_downgrade+0x6e0/0x6e0
   ? lock_is_held_type+0xa6/0x120
   ? __might_fault+0x147/0x180
   __sys_bpf+0x137b/0x6070
   ? bpf_perf_link_attach+0x530/0x530
   ? new_sync_read+0x600/0x600
   ? __fget_files+0x255/0x450
   ? lock_downgrade+0x6e0/0x6e0
   ? fput+0x30/0x1a0
   ? ksys_write+0x1a8/0x260
   __x64_sys_bpf+0x7a/0xc0
   ? syscall_enter_from_user_mode+0x21/0x70
   do_syscall_64+0x3b/0x90
   entry_SYSCALL_64_after_hwframe+0x63/0xcd
  RIP: 0033:0x7f917c4e2c2d

The problem here is that a range of tnum_range(0, map->max_entries - 1) has
limited ability to represent the concrete tight range with the tnum as the
set of resulting states from value + mask can result in a superset of the
actual intended range, and as such a tnum_in(range, reg->var_off) check may
yield true when it shouldn't, for example tnum_range(0, 2) would result in
00XX -> v = 0000, m = 0011 such that the intended set of {0, 1, 2} is here
represented by a less precise superset of {0, 1, 2, 3}. As the register is
known const scalar, really just use the concrete reg->var_off.value for the
upper index check.

Fixes: d2e4c1e6c294 ("bpf: Constant map key tracking for prog array pokes")
Reported-by: Hsin-Wei Hung <hsinweih@uci.edu>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Shung-Hsi Yu <shung-hsi.yu@suse.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/r/984b37f9fdf7ac36831d2137415a4a915744c1b6.1661462653.git.daniel@iogearbox.net
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 kernel/bpf/verifier.c |   10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)

--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -6999,8 +6999,7 @@ record_func_key(struct bpf_verifier_env
 	struct bpf_insn_aux_data *aux = &env->insn_aux_data[insn_idx];
 	struct bpf_reg_state *regs = cur_regs(env), *reg;
 	struct bpf_map *map = meta->map_ptr;
-	struct tnum range;
-	u64 val;
+	u64 val, max;
 	int err;
 
 	if (func_id != BPF_FUNC_tail_call)
@@ -7010,10 +7009,11 @@ record_func_key(struct bpf_verifier_env
 		return -EINVAL;
 	}
 
-	range = tnum_range(0, map->max_entries - 1);
 	reg = &regs[BPF_REG_3];
+	val = reg->var_off.value;
+	max = map->max_entries;
 
-	if (!register_is_const(reg) || !tnum_in(range, reg->var_off)) {
+	if (!(register_is_const(reg) && val < max)) {
 		bpf_map_key_store(aux, BPF_MAP_KEY_POISON);
 		return 0;
 	}
@@ -7021,8 +7021,6 @@ record_func_key(struct bpf_verifier_env
 	err = mark_chain_precision(env, BPF_REG_3);
 	if (err)
 		return err;
-
-	val = reg->var_off.value;
 	if (bpf_map_key_unseen(aux))
 		bpf_map_key_store(aux, val);
 	else if (!bpf_map_key_poisoned(aux) &&



^ permalink raw reply	[flat|nested] 174+ messages in thread

* Re: [PATCH 5.19 000/158] 5.19.6-rc1 review
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (157 preceding siblings ...)
  2022-08-29 11:00 ` [PATCH 5.19 158/158] bpf: Dont use tnum_range on array range checking for poke descriptors Greg Kroah-Hartman
@ 2022-08-29 18:08 ` Florian Fainelli
  2022-08-29 21:05 ` Ron Economos
                   ` (11 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Florian Fainelli @ 2022-08-29 18:08 UTC (permalink / raw)
  To: Greg Kroah-Hartman, linux-kernel
  Cc: stable, torvalds, akpm, linux, shuah, patches, lkft-triage,
	pavel, jonathanh, sudipm.mukherjee, slade



On 8/29/2022 3:57 AM, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 5.19.6 release.
> There are 158 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
> 
> Responses should be made by Wed, 31 Aug 2022 10:57:37 +0000.
> Anything received after that time might be too late.
> 
> The whole patch series can be found in one patch at:
> 	https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.19.6-rc1.gz
> or in the git tree and branch at:
> 	git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.19.y
> and the diffstat can be found below.
> 
> thanks,
> 
> greg k-h

On ARCH_BRCMSTB using 32-bit and 64-bit ARM kernels and build tested on 
BMIPS_GENERIC:

Tested-by: Florian Fainelli <f.fainelli@gmail.com>
-- 
Florian

^ permalink raw reply	[flat|nested] 174+ messages in thread

* Re: [PATCH 5.19 000/158] 5.19.6-rc1 review
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (158 preceding siblings ...)
  2022-08-29 18:08 ` [PATCH 5.19 000/158] 5.19.6-rc1 review Florian Fainelli
@ 2022-08-29 21:05 ` Ron Economos
  2022-08-29 22:13 ` Shuah Khan
                   ` (10 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Ron Economos @ 2022-08-29 21:05 UTC (permalink / raw)
  To: Greg Kroah-Hartman, linux-kernel
  Cc: stable, torvalds, akpm, linux, shuah, patches, lkft-triage,
	pavel, jonathanh, f.fainelli, sudipm.mukherjee, slade

On 8/29/22 3:57 AM, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 5.19.6 release.
> There are 158 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Wed, 31 Aug 2022 10:57:37 +0000.
> Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
> 	https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.19.6-rc1.gz
> or in the git tree and branch at:
> 	git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.19.y
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h

Built and booted successfully on RISC-V RV64 (HiFive Unmatched).

Tested-by: Ron Economos <re@w6rz.net>


^ permalink raw reply	[flat|nested] 174+ messages in thread

* Re: [PATCH 5.19 000/158] 5.19.6-rc1 review
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (159 preceding siblings ...)
  2022-08-29 21:05 ` Ron Economos
@ 2022-08-29 22:13 ` Shuah Khan
  2022-08-29 23:12 ` Zan Aziz
                   ` (9 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Shuah Khan @ 2022-08-29 22:13 UTC (permalink / raw)
  To: Greg Kroah-Hartman, linux-kernel
  Cc: stable, torvalds, akpm, linux, shuah, patches, lkft-triage,
	pavel, jonathanh, f.fainelli, sudipm.mukherjee, slade,
	Shuah Khan

On 8/29/22 04:57, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 5.19.6 release.
> There are 158 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
> 
> Responses should be made by Wed, 31 Aug 2022 10:57:37 +0000.
> Anything received after that time might be too late.
> 
> The whole patch series can be found in one patch at:
> 	https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.19.6-rc1.gz
> or in the git tree and branch at:
> 	git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.19.y
> and the diffstat can be found below.
> 
> thanks,
> 
> greg k-h
> 

Compiled and booted on my test system. No dmesg regressions.

Tested-by: Shuah Khan <skhan@linuxfoundation.org>

thanks,
-- Shuah

^ permalink raw reply	[flat|nested] 174+ messages in thread

* Re: [PATCH 5.19 000/158] 5.19.6-rc1 review
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (160 preceding siblings ...)
  2022-08-29 22:13 ` Shuah Khan
@ 2022-08-29 23:12 ` Zan Aziz
  2022-08-30  0:54 ` Guenter Roeck
                   ` (8 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Zan Aziz @ 2022-08-29 23:12 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: linux-kernel, stable, torvalds, akpm, linux, shuah, patches,
	lkft-triage, pavel, jonathanh, f.fainelli, sudipm.mukherjee,
	slade

On Mon, Aug 29, 2022 at 1:03 PM Greg Kroah-Hartman
<gregkh@linuxfoundation.org> wrote:
>
> This is the start of the stable review cycle for the 5.19.6 release.
> There are 158 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Wed, 31 Aug 2022 10:57:37 +0000.
> Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
>         https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.19.6-rc1.gz
> or in the git tree and branch at:
>         git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.19.y
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h

Hi Greg,

Compiled and booted on my test system Lenovo P50s: Intel Core i7
No emergency and critical messages in the dmesg

./perf bench sched all
# Running sched/messaging benchmark...
# 20 sender and receiver processes per group
# 10 groups == 400 processes run

     Total time: 0.692 [sec]

# Running sched/pipe benchmark...
# Executed 1000000 pipe operations between two processes

     Total time: 8.521 [sec]

       8.521891 usecs/op
         117344 ops/sec

Tested-by: Zan Aziz <zanaziz313@gmail.com>

Thanks
-Zan

^ permalink raw reply	[flat|nested] 174+ messages in thread

* Re: [PATCH 5.19 000/158] 5.19.6-rc1 review
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (161 preceding siblings ...)
  2022-08-29 23:12 ` Zan Aziz
@ 2022-08-30  0:54 ` Guenter Roeck
  2022-08-30  2:14 ` Daniel Díaz
                   ` (7 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Guenter Roeck @ 2022-08-30  0:54 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: linux-kernel, stable, torvalds, akpm, shuah, patches,
	lkft-triage, pavel, jonathanh, f.fainelli, sudipm.mukherjee,
	slade

On Mon, Aug 29, 2022 at 12:57:30PM +0200, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 5.19.6 release.
> There are 158 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
> 
> Responses should be made by Wed, 31 Aug 2022 10:57:37 +0000.
> Anything received after that time might be too late.
> 

Build results:
	total: 150 pass: 150 fail: 0
Qemu test results:
	total: 489 pass: 489 fail: 0

Tested-by: Guenter Roeck <linux@roeck-us.net>

Guenter

^ permalink raw reply	[flat|nested] 174+ messages in thread

* Re: [PATCH 5.19 000/158] 5.19.6-rc1 review
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (162 preceding siblings ...)
  2022-08-30  0:54 ` Guenter Roeck
@ 2022-08-30  2:14 ` Daniel Díaz
  2022-08-30 10:37 ` Sudip Mukherjee (Codethink)
                   ` (6 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Daniel Díaz @ 2022-08-30  2:14 UTC (permalink / raw)
  To: Greg Kroah-Hartman, linux-kernel
  Cc: stable, torvalds, akpm, linux, shuah, patches, lkft-triage,
	pavel, jonathanh, f.fainelli, sudipm.mukherjee, slade

Hello!

On 29/08/22 05:57, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 5.19.6 release.
> There are 158 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
> 
> Responses should be made by Wed, 31 Aug 2022 10:57:37 +0000.
> Anything received after that time might be too late.
> 
> The whole patch series can be found in one patch at:
> 	https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.19.6-rc1.gz
> or in the git tree and branch at:
> 	git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.19.y
> and the diffstat can be found below.
> 
> thanks,
> 
> greg k-h

Results from Linaro's test farm.
No regressions on arm64, arm, x86_64, and i386.

Tested-by: Linux Kernel Functional Testing <lkft@linaro.org>

## Build
* kernel: 5.19.6-rc1
* git: https://gitlab.com/Linaro/lkft/mirrors/stable/linux-stable-rc
* git branch: linux-5.19.y
* git commit: 5cfa6cf5bd86ffb0f956ff2ac4876ec6905dd3cf
* git describe: v5.19.4-161-g5cfa6cf5bd86
* test details: https://qa-reports.linaro.org/lkft/linux-stable-rc-linux-5.19.y/build/v5.19.4-161-g5cfa6cf5bd86

## No test regressions (compared to v5.19.5)

## No metric regressions (compared to v5.19.5)

## No test fixes (compared to v5.19.5)

## No metric fixes (compared to v5.19.5)

## Test result summary
total: 112530, pass: 101420, fail: 894, skip: 9953, xfail: 263

## Build Summary
* arc: 10 total, 10 passed, 0 failed
* arm: 306 total, 303 passed, 3 failed
* arm64: 68 total, 65 passed, 3 failed
* i386: 57 total, 51 passed, 6 failed
* mips: 50 total, 47 passed, 3 failed
* parisc: 14 total, 14 passed, 0 failed
* powerpc: 65 total, 56 passed, 9 failed
* riscv: 32 total, 26 passed, 6 failed
* s390: 22 total, 20 passed, 2 failed
* sh: 26 total, 24 passed, 2 failed
* sparc: 14 total, 14 passed, 0 failed
* x86_64: 61 total, 58 passed, 3 failed

## Test suites summary
* fwts
* igt-gpu-tools
* kunit
* kvm-unit-tests
* libgpiod
* libhugetlbfs
* log-parser-boot
* log-parser-test
* ltp-cap_bounds
* ltp-commands
* ltp-containers
* ltp-controllers
* ltp-cpuhotplug
* ltp-crypto
* ltp-cve
* ltp-dio
* ltp-fcntl-locktests
* ltp-filecaps
* ltp-fs
* ltp-fs_bind
* ltp-fs_perms_simple
* ltp-fsx
* ltp-hugetlb
* ltp-io
* ltp-ipc
* ltp-math
* ltp-mm
* ltp-nptl
* ltp-open-posix-tests
* ltp-pty
* ltp-sched
* ltp-securebits
* ltp-smoke
* ltp-syscalls
* ltp-tracing
* network-basic-tests
* packetdrill
* rcutorture
* v4l2-compliance
* vdso


Greetings!

Daniel Díaz
daniel.diaz@linaro.org

-- 
Linaro LKFT
https://lkft.linaro.org

^ permalink raw reply	[flat|nested] 174+ messages in thread

* Re: [PATCH 5.19 000/158] 5.19.6-rc1 review
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (163 preceding siblings ...)
  2022-08-30  2:14 ` Daniel Díaz
@ 2022-08-30 10:37 ` Sudip Mukherjee (Codethink)
  2022-09-01  9:47   ` Greg Kroah-Hartman
  2022-08-30 12:04 ` Bagas Sanjaya
                   ` (5 subsequent siblings)
  170 siblings, 1 reply; 174+ messages in thread
From: Sudip Mukherjee (Codethink) @ 2022-08-30 10:37 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: linux-kernel, stable, torvalds, akpm, linux, shuah, patches,
	lkft-triage, pavel, jonathanh, f.fainelli, slade

Hi Greg,

On Mon, Aug 29, 2022 at 12:57:30PM +0200, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 5.19.6 release.
> There are 158 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
> 
> Responses should be made by Wed, 31 Aug 2022 10:57:37 +0000.
> Anything received after that time might be too late.

Build test (gcc version 12.2.1 20220819):
mips: 59 configs -> 1 failure
arm: 99 configs -> no failure
arm64: 3 configs -> no failure
x86_64: 4 configs -> no failure
alpha allmodconfig -> no failure
csky allmodconfig -> fails
powerpc allmodconfig -> no failure
riscv allmodconfig -> no failure
s390 allmodconfig -> no failure
xtensa allmodconfig -> no failure

mips and csky are known failure. Fix not yet in mainline.

Boot test:
x86_64: Booted on my test laptop. No regression.
x86_64: Booted on qemu. No regression. [1]
arm64: Booted on rpi4b (4GB model). No regression. [2]
mips: Booted on ci20 board. No regression. [3]

DRM warnings in rpi4b, now fixed in mainline.
Will need:
258e483a4d5e ("drm/vc4: hdmi: Rework power up")
72e2329e7c9b ("drm/vc4: hdmi: Depends on CONFIG_PM")

[1]. https://openqa.qa.codethink.co.uk/tests/1731
[2]. https://openqa.qa.codethink.co.uk/tests/1734
[3]. https://openqa.qa.codethink.co.uk/tests/1736

Tested-by: Sudip Mukherjee <sudip.mukherjee@codethink.co.uk>

--
Regards
Sudip

^ permalink raw reply	[flat|nested] 174+ messages in thread

* Re: [PATCH 5.19 000/158] 5.19.6-rc1 review
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (164 preceding siblings ...)
  2022-08-30 10:37 ` Sudip Mukherjee (Codethink)
@ 2022-08-30 12:04 ` Bagas Sanjaya
  2022-08-30 12:12 ` Bagas Sanjaya
                   ` (4 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Bagas Sanjaya @ 2022-08-30 12:04 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: linux-kernel, stable, torvalds, akpm, linux, shuah, patches,
	lkft-triage, pavel, jonathanh, f.fainelli, sudipm.mukherjee,
	slade

[-- Attachment #1: Type: text/plain, Size: 539 bytes --]

On Mon, Aug 29, 2022 at 12:57:30PM +0200, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 5.19.6 release.
> There are 158 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
> 

Successfully cross-compiled for arm64 (bcm2711_defconfig, GCC 10.2.0) and
powerpc (ps3_defconfig, GCC 12.1.0).

Tested-by: Bagas Sanjaya <bagasdotme@gmail.com>
 
-- 
An old man doll... just what I always wanted! - Clara

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]

^ permalink raw reply	[flat|nested] 174+ messages in thread

* Re: [PATCH 5.19 000/158] 5.19.6-rc1 review
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (165 preceding siblings ...)
  2022-08-30 12:04 ` Bagas Sanjaya
@ 2022-08-30 12:12 ` Bagas Sanjaya
  2022-08-30 12:30 ` Fenil Jain
                   ` (3 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Bagas Sanjaya @ 2022-08-30 12:12 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: linux-kernel, stable, torvalds, akpm, linux, shuah, patches,
	lkft-triage, pavel, jonathanh, f.fainelli, sudipm.mukherjee,
	slade

[-- Attachment #1: Type: text/plain, Size: 538 bytes --]

On Mon, Aug 29, 2022 at 12:57:30PM +0200, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 5.19.6 release.
> There are 158 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
> 

Successfully cross-compiled for arm64 (bcm2711_defconfig, GCC 10.2.0) and
powerpc (ps3_defconfig, GCC 12.1.0).

Tested-by: Bagas Sanjaya <bagasdotme@gmail.com>

-- 
An old man doll... just what I always wanted! - Clara

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]

^ permalink raw reply	[flat|nested] 174+ messages in thread

* Re: [PATCH 5.19 000/158] 5.19.6-rc1 review
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (166 preceding siblings ...)
  2022-08-30 12:12 ` Bagas Sanjaya
@ 2022-08-30 12:30 ` Fenil Jain
  2022-08-30 14:33 ` Rudi Heitbaum
                   ` (2 subsequent siblings)
  170 siblings, 0 replies; 174+ messages in thread
From: Fenil Jain @ 2022-08-30 12:30 UTC (permalink / raw)
  To: Greg Kroah-Hartman; +Cc: stable

Hey Greg,

Ran tests and boot tested on my system, no regression found

Tested-by: Fenil Jain <fkjainco@gmail.com>

^ permalink raw reply	[flat|nested] 174+ messages in thread

* Re: [PATCH 5.19 000/158] 5.19.6-rc1 review
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (167 preceding siblings ...)
  2022-08-30 12:30 ` Fenil Jain
@ 2022-08-30 14:33 ` Rudi Heitbaum
  2022-08-30 21:09 ` Justin Forbes
  2022-08-31  5:53 ` Jiri Slaby
  170 siblings, 0 replies; 174+ messages in thread
From: Rudi Heitbaum @ 2022-08-30 14:33 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: linux-kernel, stable, torvalds, akpm, linux, shuah, patches,
	lkft-triage, pavel, jonathanh, f.fainelli, sudipm.mukherjee,
	slade

On Mon, Aug 29, 2022 at 12:57:30PM +0200, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 5.19.6 release.
> There are 158 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
> 
> Responses should be made by Wed, 31 Aug 2022 10:57:37 +0000.
> Anything received after that time might be too late.

Hi Greg,

5.19.6-rc1 tested.

Run tested on:
- Intel Tiger Lake x86_64 (nuc11 i7-1165G7)

In addition - build tested for:
- Allwinner A64
- Allwinner H3
- Allwinner H5
- Allwinner H6
- NXP iMX6
- NXP iMX8
- Qualcomm Dragonboard
- Rockchip RK3288
- Rockchip RK3328
- Rockchip RK3399pro
- Samsung Exynos

Tested-by: Rudi Heitbaum <rudi@heitbaum.com>
--
Rudi

^ permalink raw reply	[flat|nested] 174+ messages in thread

* Re: [PATCH 5.19 000/158] 5.19.6-rc1 review
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (168 preceding siblings ...)
  2022-08-30 14:33 ` Rudi Heitbaum
@ 2022-08-30 21:09 ` Justin Forbes
  2022-08-31  5:53 ` Jiri Slaby
  170 siblings, 0 replies; 174+ messages in thread
From: Justin Forbes @ 2022-08-30 21:09 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: linux-kernel, stable, torvalds, akpm, linux, shuah, patches,
	lkft-triage, pavel, jonathanh, f.fainelli, sudipm.mukherjee,
	slade

On Mon, Aug 29, 2022 at 12:57:30PM +0200, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 5.19.6 release.
> There are 158 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
> 
> Responses should be made by Wed, 31 Aug 2022 10:57:37 +0000.
> Anything received after that time might be too late.
> 
> The whole patch series can be found in one patch at:
> 	https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.19.6-rc1.gz
> or in the git tree and branch at:
> 	git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.19.y
> and the diffstat can be found below.
> 
> thanks,
> 
> greg k-h

Tested rc1 against the Fedora build system (aarch64, armv7, ppc64le,
s390x, x86_64), and boot tested x86_64. No regressions noted.

Tested-by: Justin M. Forbes <jforbes@fedoraproject.org>

^ permalink raw reply	[flat|nested] 174+ messages in thread

* Re: [PATCH 5.19 000/158] 5.19.6-rc1 review
  2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
                   ` (169 preceding siblings ...)
  2022-08-30 21:09 ` Justin Forbes
@ 2022-08-31  5:53 ` Jiri Slaby
  170 siblings, 0 replies; 174+ messages in thread
From: Jiri Slaby @ 2022-08-31  5:53 UTC (permalink / raw)
  To: Greg Kroah-Hartman, linux-kernel
  Cc: stable, torvalds, akpm, linux, shuah, patches, lkft-triage,
	pavel, jonathanh, f.fainelli, sudipm.mukherjee, slade

On 29. 08. 22, 12:57, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 5.19.6 release.
> There are 158 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
> 
> Responses should be made by Wed, 31 Aug 2022 10:57:37 +0000.
> Anything received after that time might be too late.
> 
> The whole patch series can be found in one patch at:
> 	https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.19.6-rc1.gz
> or in the git tree and branch at:
> 	git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.19.y
> and the diffstat can be found below.

openSUSE configs¹⁾ all green.

Tested-by: Jiri Slaby <jirislaby@kernel.org>

¹⁾ armv6hl armv7hl arm64 i386 ppc64 ppc64le riscv64 s390x x86_64

-- 
js
suse labs


^ permalink raw reply	[flat|nested] 174+ messages in thread

* Re: [PATCH 5.19 000/158] 5.19.6-rc1 review
  2022-08-30 10:37 ` Sudip Mukherjee (Codethink)
@ 2022-09-01  9:47   ` Greg Kroah-Hartman
  0 siblings, 0 replies; 174+ messages in thread
From: Greg Kroah-Hartman @ 2022-09-01  9:47 UTC (permalink / raw)
  To: Sudip Mukherjee (Codethink)
  Cc: linux-kernel, stable, torvalds, akpm, linux, shuah, patches,
	lkft-triage, pavel, jonathanh, f.fainelli, slade

On Tue, Aug 30, 2022 at 11:37:26AM +0100, Sudip Mukherjee (Codethink) wrote:
> Hi Greg,
> 
> On Mon, Aug 29, 2022 at 12:57:30PM +0200, Greg Kroah-Hartman wrote:
> > This is the start of the stable review cycle for the 5.19.6 release.
> > There are 158 patches in this series, all will be posted as a response
> > to this one.  If anyone has any issues with these being applied, please
> > let me know.
> > 
> > Responses should be made by Wed, 31 Aug 2022 10:57:37 +0000.
> > Anything received after that time might be too late.
> 
> Build test (gcc version 12.2.1 20220819):
> mips: 59 configs -> 1 failure
> arm: 99 configs -> no failure
> arm64: 3 configs -> no failure
> x86_64: 4 configs -> no failure
> alpha allmodconfig -> no failure
> csky allmodconfig -> fails
> powerpc allmodconfig -> no failure
> riscv allmodconfig -> no failure
> s390 allmodconfig -> no failure
> xtensa allmodconfig -> no failure
> 
> mips and csky are known failure. Fix not yet in mainline.
> 
> Boot test:
> x86_64: Booted on my test laptop. No regression.
> x86_64: Booted on qemu. No regression. [1]
> arm64: Booted on rpi4b (4GB model). No regression. [2]
> mips: Booted on ci20 board. No regression. [3]
> 
> DRM warnings in rpi4b, now fixed in mainline.
> Will need:
> 258e483a4d5e ("drm/vc4: hdmi: Rework power up")
> 72e2329e7c9b ("drm/vc4: hdmi: Depends on CONFIG_PM")

Both now queued up, thanks!

greg k-h

^ permalink raw reply	[flat|nested] 174+ messages in thread

* Re: [PATCH 5.19 109/158] x86/PAT: Have pat_enabled() properly reflect state when running on Xen
  2022-08-29 10:59 ` [PATCH 5.19 109/158] x86/PAT: Have pat_enabled() properly reflect state when running on Xen Greg Kroah-Hartman
@ 2022-09-02 18:16   ` Chuck Zmudzinski
  0 siblings, 0 replies; 174+ messages in thread
From: Chuck Zmudzinski @ 2022-09-02 18:16 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Jan Beulich, Borislav Petkov, Juergen Gross,
	Thorsten Leemhuis
  Cc: stable, Ingo Molnar, Lucas De Marchi, regressions, lkml

On 8/29/2022 6:59 AM, Greg Kroah-Hartman wrote:
> From: Jan Beulich <jbeulich@suse.com>
>
> commit 72cbc8f04fe2fa93443c0fcccb7ad91dfea3d9ce upstream.
>
> After commit ID in the Fixes: tag, pat_enabled() returns false (because
> of PAT initialization being suppressed in the absence of MTRRs being
> announced to be available).
>
> This has become a problem: the i915 driver now fails to initialize when
> running PV on Xen (i915_gem_object_pin_map() is where I located the
> induced failure), and its error handling is flaky enough to (at least
> sometimes) result in a hung system.
>
> Yet even beyond that problem the keying of the use of WC mappings to
> pat_enabled() (see arch_can_pci_mmap_wc()) means that in particular
> graphics frame buffer accesses would have been quite a bit less optimal
> than possible.
>
> Arrange for the function to return true in such environments, without
> undermining the rest of PAT MSR management logic considering PAT to be
> disabled: specifically, no writes to the PAT MSR should occur.
>
> For the new boolean to live in .init.data, init_cache_modes() also needs
> moving to .init.text (where it could/should have lived already before).
>
>   [ bp: This is the "small fix" variant for stable. It'll get replaced
>     with a proper PAT and MTRR detection split upstream but that is too
>     involved for a stable backport.
>     - additional touchups to commit msg. Use cpu_feature_enabled(). ]
>
> Fixes: bdd8b6c98239 ("drm/i915: replace X86_FEATURE_PAT with pat_enabled()")
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> Signed-off-by: Borislav Petkov <bp@suse.de>
> Acked-by: Ingo Molnar <mingo@kernel.org>
> Cc: <stable@vger.kernel.org>
> Cc: Juergen Gross <jgross@suse.com>
> Cc: Lucas De Marchi <lucas.demarchi@intel.com>
> Link: https://lore.kernel.org/r/9385fa60-fa5d-f559-a137-6608408f88b0@suse.com
> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> ---
>  arch/x86/mm/pat/memtype.c |   10 +++++++++-
>  1 file changed, 9 insertions(+), 1 deletion(-)
>
> --- a/arch/x86/mm/pat/memtype.c
> +++ b/arch/x86/mm/pat/memtype.c
> @@ -62,6 +62,7 @@
>  
>  static bool __read_mostly pat_bp_initialized;
>  static bool __read_mostly pat_disabled = !IS_ENABLED(CONFIG_X86_PAT);
> +static bool __initdata pat_force_disabled = !IS_ENABLED(CONFIG_X86_PAT);
>  static bool __read_mostly pat_bp_enabled;
>  static bool __read_mostly pat_cm_initialized;
>  
> @@ -86,6 +87,7 @@ void pat_disable(const char *msg_reason)
>  static int __init nopat(char *str)
>  {
>  	pat_disable("PAT support disabled via boot option.");
> +	pat_force_disabled = true;
>  	return 0;
>  }
>  early_param("nopat", nopat);
> @@ -272,7 +274,7 @@ static void pat_ap_init(u64 pat)
>  	wrmsrl(MSR_IA32_CR_PAT, pat);
>  }
>  
> -void init_cache_modes(void)
> +void __init init_cache_modes(void)
>  {
>  	u64 pat = 0;
>  
> @@ -313,6 +315,12 @@ void init_cache_modes(void)
>  		 */
>  		pat = PAT(0, WB) | PAT(1, WT) | PAT(2, UC_MINUS) | PAT(3, UC) |
>  		      PAT(4, WB) | PAT(5, WT) | PAT(6, UC_MINUS) | PAT(7, UC);
> +	} else if (!pat_force_disabled && cpu_feature_enabled(X86_FEATURE_HYPERVISOR)) {
> +		/*
> +		 * Clearly PAT is enabled underneath. Allow pat_enabled() to
> +		 * reflect this.
> +		 */
> +		pat_bp_enabled = true;
>  	}
>  
>  	__init_cache_modes(pat);
>
>

Dear Boris, Jan, Juergen, Thorsten, and Greg,

Just a note to say thanks to everyone who worked on this regression.
I just upgraded my Xen PV dom0 on fc36 that was affected with the
Fedora build of 5.19.6, which has the fix, and it works fine.

Thanks again,

Chuck

^ permalink raw reply	[flat|nested] 174+ messages in thread

end of thread, other threads:[~2022-09-02 18:16 UTC | newest]

Thread overview: 174+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-08-29 10:57 [PATCH 5.19 000/158] 5.19.6-rc1 review Greg Kroah-Hartman
2022-08-29 10:57 ` [PATCH 5.19 001/158] mm/gup: fix FOLL_FORCE COW security issue and remove FOLL_COW Greg Kroah-Hartman
2022-08-29 10:57 ` [PATCH 5.19 002/158] NFS: Fix another fsync() issue after a server reboot Greg Kroah-Hartman
2022-08-29 10:57 ` [PATCH 5.19 003/158] audit: fix potential double free on error path from fsnotify_add_inode_mark Greg Kroah-Hartman
2022-08-29 10:57 ` [PATCH 5.19 004/158] cgroup: Fix race condition at rebind_subsystems() Greg Kroah-Hartman
2022-08-29 10:57 ` [PATCH 5.19 005/158] parisc: Make CONFIG_64BIT available for ARCH=parisc64 only Greg Kroah-Hartman
2022-08-29 10:57 ` [PATCH 5.19 006/158] parisc: Fix exception handler for fldw and fstw instructions Greg Kroah-Hartman
2022-08-29 10:57 ` [PATCH 5.19 007/158] kernel/sys_ni: add compat entry for fadvise64_64 Greg Kroah-Hartman
2022-08-29 10:57 ` [PATCH 5.19 008/158] kprobes: dont call disarm_kprobe() for disabled kprobes Greg Kroah-Hartman
2022-08-29 10:57 ` [PATCH 5.19 009/158] mm/uffd: reset write protection when unregister with wp-mode Greg Kroah-Hartman
2022-08-29 10:57 ` [PATCH 5.19 010/158] mm/hugetlb: support write-faults in shared mappings Greg Kroah-Hartman
2022-08-29 10:57 ` [PATCH 5.19 011/158] mt76: mt7921: fix command timeout in AP stop period Greg Kroah-Hartman
2022-08-29 10:57 ` [PATCH 5.19 012/158] xfrm: fix refcount leak in __xfrm_policy_check() Greg Kroah-Hartman
2022-08-29 10:57 ` [PATCH 5.19 013/158] Revert "xfrm: update SA curlft.use_time" Greg Kroah-Hartman
2022-08-29 10:57 ` [PATCH 5.19 014/158] xfrm: clone missing x->lastused in xfrm_do_migrate Greg Kroah-Hartman
2022-08-29 10:57 ` [PATCH 5.19 015/158] af_key: Do not call xfrm_probe_algs in parallel Greg Kroah-Hartman
2022-08-29 10:57 ` [PATCH 5.19 016/158] xfrm: policy: fix metadata dst->dev xmit null pointer dereference Greg Kroah-Hartman
2022-08-29 10:57 ` [PATCH 5.19 017/158] fs: require CAP_SYS_ADMIN in target namespace for idmapped mounts Greg Kroah-Hartman
2022-08-29 10:57 ` [PATCH 5.19 018/158] Revert "net: macsec: update SCI upon MAC address change." Greg Kroah-Hartman
2022-08-29 10:57 ` [PATCH 5.19 019/158] NFSv4.2 fix problems with __nfs42_ssc_open Greg Kroah-Hartman
2022-08-29 10:57 ` [PATCH 5.19 020/158] SUNRPC: RPC level errors should set task->tk_rpc_status Greg Kroah-Hartman
2022-08-29 10:57 ` [PATCH 5.19 021/158] mm/smaps: dont access young/dirty bit if pte unpresent Greg Kroah-Hartman
2022-08-29 10:57 ` [PATCH 5.19 022/158] ntfs: fix acl handling Greg Kroah-Hartman
2022-08-29 10:57 ` [PATCH 5.19 023/158] rose: check NULL rose_loopback_neigh->loopback Greg Kroah-Hartman
2022-08-29 10:57 ` [PATCH 5.19 024/158] r8152: fix the units of some registers for RTL8156A Greg Kroah-Hartman
2022-08-29 10:57 ` [PATCH 5.19 025/158] r8152: fix the RX FIFO settings when suspending Greg Kroah-Hartman
2022-08-29 10:57 ` [PATCH 5.19 026/158] nfc: pn533: Fix use-after-free bugs caused by pn532_cmd_timeout Greg Kroah-Hartman
2022-08-29 10:57 ` [PATCH 5.19 027/158] ice: xsk: prohibit usage of non-balanced queue id Greg Kroah-Hartman
2022-08-29 10:57 ` [PATCH 5.19 028/158] ice: xsk: use Rx rings XDP ring when picking NAPI context Greg Kroah-Hartman
2022-08-29 10:57 ` [PATCH 5.19 029/158] net/mlx5e: Properly disable vlan strip on non-UL reps Greg Kroah-Hartman
2022-08-29 10:58 ` [PATCH 5.19 030/158] net/mlx5: LAG, fix logic over MLX5_LAG_FLAG_NDEVS_READY Greg Kroah-Hartman
2022-08-29 10:58 ` [PATCH 5.19 031/158] net/mlx5: Eswitch, Fix forwarding decision to uplink Greg Kroah-Hartman
2022-08-29 10:58 ` [PATCH 5.19 032/158] net/mlx5: Disable irq when locking lag_lock Greg Kroah-Hartman
2022-08-29 10:58 ` [PATCH 5.19 033/158] net/mlx5: Fix cmd error logging for manage pages cmd Greg Kroah-Hartman
2022-08-29 10:58 ` [PATCH 5.19 034/158] net/mlx5: Avoid false positive lockdep warning by adding lock_class_key Greg Kroah-Hartman
2022-08-29 10:58 ` [PATCH 5.19 035/158] net/mlx5e: Fix wrong application of the LRO state Greg Kroah-Hartman
2022-08-29 10:58 ` [PATCH 5.19 036/158] net/mlx5e: Fix wrong tc flag used when set hw-tc-offload off Greg Kroah-Hartman
2022-08-29 10:58 ` [PATCH 5.19 037/158] net: dsa: microchip: ksz9477: cleanup the ksz9477_switch_detect Greg Kroah-Hartman
2022-08-29 10:58 ` [PATCH 5.19 038/158] net: dsa: microchip: move switch chip_id detection to ksz_common Greg Kroah-Hartman
2022-08-29 10:58 ` [PATCH 5.19 039/158] net: dsa: microchip: move tag_protocol " Greg Kroah-Hartman
2022-08-29 10:58 ` [PATCH 5.19 040/158] net: dsa: microchip: move vlan functionality " Greg Kroah-Hartman
2022-08-29 10:58 ` [PATCH 5.19 041/158] net: dsa: microchip: move the port mirror " Greg Kroah-Hartman
2022-08-29 10:58 ` [PATCH 5.19 042/158] net: dsa: microchip: update the ksz_phylink_get_caps Greg Kroah-Hartman
2022-08-29 10:58 ` [PATCH 5.19 043/158] net: dsa: microchip: keep compatibility with device tree blobs with no phy-mode Greg Kroah-Hartman
2022-08-29 10:58 ` [PATCH 5.19 044/158] net: ipa: dont assume SMEM is page-aligned Greg Kroah-Hartman
2022-08-29 10:58 ` [PATCH 5.19 045/158] net: phy: Dont WARN for PHY_READY state in mdio_bus_phy_resume() Greg Kroah-Hartman
2022-08-29 10:58 ` [PATCH 5.19 046/158] net: moxa: get rid of asymmetry in DMA mapping/unmapping Greg Kroah-Hartman
2022-08-29 10:58 ` [PATCH 5.19 047/158] bonding: 802.3ad: fix no transmission of LACPDUs Greg Kroah-Hartman
2022-08-29 10:58 ` [PATCH 5.19 048/158] net: ipvtap - add __init/__exit annotations to module init/exit funcs Greg Kroah-Hartman
2022-08-29 10:58 ` [PATCH 5.19 049/158] netfilter: ebtables: reject blobs that dont provide all entry points Greg Kroah-Hartman
2022-08-29 10:58 ` [PATCH 5.19 050/158] netfilter: nft_tproxy: restrict to prerouting hook Greg Kroah-Hartman
2022-08-29 10:58 ` [PATCH 5.19 051/158] bnxt_en: Use PAGE_SIZE to init buffer when multi buffer XDP is not in use Greg Kroah-Hartman
2022-08-29 10:58 ` [PATCH 5.19 052/158] bnxt_en: set missing reload flag in devlink features Greg Kroah-Hartman
2022-08-29 10:58 ` [PATCH 5.19 053/158] bnxt_en: fix NQ resource accounting during vf creation on 57500 chips Greg Kroah-Hartman
2022-08-29 10:58 ` [PATCH 5.19 054/158] bnxt_en: fix LRO/GRO_HW features in ndo_fix_features callback Greg Kroah-Hartman
2022-08-29 10:58 ` [PATCH 5.19 055/158] netfilter: nf_tables: disallow updates of implicit chain Greg Kroah-Hartman
2022-08-29 10:58 ` [PATCH 5.19 056/158] netfilter: nf_tables: make table handle allocation per-netns friendly Greg Kroah-Hartman
2022-08-29 10:58 ` [PATCH 5.19 057/158] netfilter: nft_payload: report ERANGE for too long offset and length Greg Kroah-Hartman
2022-08-29 10:58 ` [PATCH 5.19 058/158] netfilter: nft_payload: do not truncate csum_offset and csum_type Greg Kroah-Hartman
2022-08-29 10:58 ` [PATCH 5.19 059/158] netfilter: nf_tables: do not leave chain stats enabled on error Greg Kroah-Hartman
2022-08-29 10:58 ` [PATCH 5.19 060/158] netfilter: nft_osf: restrict osf to ipv4, ipv6 and inet families Greg Kroah-Hartman
2022-08-29 10:58 ` [PATCH 5.19 061/158] netfilter: nft_tunnel: restrict it to netdev family Greg Kroah-Hartman
2022-08-29 10:58 ` [PATCH 5.19 062/158] netfilter: nf_tables: disallow binding to already bound chain Greg Kroah-Hartman
2022-08-29 10:58 ` [PATCH 5.19 063/158] netfilter: flowtable: add function to invoke garbage collection immediately Greg Kroah-Hartman
2022-08-29 10:58 ` [PATCH 5.19 064/158] netfilter: flowtable: fix stuck flows on cleanup due to pending work Greg Kroah-Hartman
2022-08-29 10:58 ` [PATCH 5.19 065/158] net: Fix data-races around sysctl_[rw]mem_(max|default) Greg Kroah-Hartman
2022-08-29 10:58 ` [PATCH 5.19 066/158] net: Fix data-races around weight_p and dev_weight_[rt]x_bias Greg Kroah-Hartman
2022-08-29 10:58 ` [PATCH 5.19 067/158] net: Fix data-races around netdev_max_backlog Greg Kroah-Hartman
2022-08-29 10:58 ` [PATCH 5.19 068/158] net: Fix data-races around netdev_tstamp_prequeue Greg Kroah-Hartman
2022-08-29 10:58 ` [PATCH 5.19 069/158] ratelimit: Fix data-races in ___ratelimit() Greg Kroah-Hartman
2022-08-29 10:58 ` [PATCH 5.19 070/158] net: Fix data-races around sysctl_optmem_max Greg Kroah-Hartman
2022-08-29 10:58 ` [PATCH 5.19 071/158] net: Fix a data-race around sysctl_tstamp_allow_data Greg Kroah-Hartman
2022-08-29 10:58 ` [PATCH 5.19 072/158] net: Fix a data-race around sysctl_net_busy_poll Greg Kroah-Hartman
2022-08-29 10:58 ` [PATCH 5.19 073/158] net: Fix a data-race around sysctl_net_busy_read Greg Kroah-Hartman
2022-08-29 10:58 ` [PATCH 5.19 074/158] net: Fix a data-race around netdev_budget Greg Kroah-Hartman
2022-08-29 10:58 ` [PATCH 5.19 075/158] net: Fix data-races around sysctl_max_skb_frags Greg Kroah-Hartman
2022-08-29 10:58 ` [PATCH 5.19 076/158] net: Fix a data-race around netdev_budget_usecs Greg Kroah-Hartman
2022-08-29 10:58 ` [PATCH 5.19 077/158] net: Fix data-races around sysctl_fb_tunnels_only_for_init_net Greg Kroah-Hartman
2022-08-29 10:58 ` [PATCH 5.19 078/158] net: Fix data-races around sysctl_devconf_inherit_init_net Greg Kroah-Hartman
2022-08-29 10:58 ` [PATCH 5.19 079/158] net: Fix a data-race around gro_normal_batch Greg Kroah-Hartman
2022-08-29 10:58 ` [PATCH 5.19 080/158] net: Fix a data-race around netdev_unregister_timeout_secs Greg Kroah-Hartman
2022-08-29 10:58 ` [PATCH 5.19 081/158] net: Fix a data-race around sysctl_somaxconn Greg Kroah-Hartman
2022-08-29 10:58 ` [PATCH 5.19 082/158] ixgbe: stop resetting SYSTIME in ixgbe_ptp_start_cyclecounter Greg Kroah-Hartman
2022-08-29 10:58 ` [PATCH 5.19 083/158] i40e: Fix incorrect address type for IPv6 flow rules Greg Kroah-Hartman
2022-08-29 10:58 ` [PATCH 5.19 084/158] net: ethernet: mtk_eth_soc: enable rx cksum offload for MTK_NETSYS_V2 Greg Kroah-Hartman
2022-08-29 10:58 ` [PATCH 5.19 085/158] net: ethernet: mtk_eth_soc: fix hw hash reporting " Greg Kroah-Hartman
2022-08-29 10:58 ` [PATCH 5.19 086/158] rxrpc: Fix locking in rxrpcs sendmsg Greg Kroah-Hartman
2022-08-29 10:58 ` [PATCH 5.19 087/158] ionic: clear broken state on generation change Greg Kroah-Hartman
2022-08-29 10:58 ` [PATCH 5.19 088/158] ionic: fix up issues with handling EAGAIN on FW cmds Greg Kroah-Hartman
2022-08-29 10:58 ` [PATCH 5.19 089/158] ionic: VF initial random MAC address if no assigned mac Greg Kroah-Hartman
2022-08-29 10:59 ` [PATCH 5.19 090/158] net: stmmac: work around sporadic tx issue on link-up Greg Kroah-Hartman
2022-08-29 10:59 ` [PATCH 5.19 091/158] net: lantiq_xrx200: confirm skb is allocated before using Greg Kroah-Hartman
2022-08-29 10:59 ` [PATCH 5.19 092/158] net: lantiq_xrx200: fix lock under memory pressure Greg Kroah-Hartman
2022-08-29 10:59 ` [PATCH 5.19 093/158] net: lantiq_xrx200: restore buffer if memory allocation failed Greg Kroah-Hartman
2022-08-29 10:59 ` [PATCH 5.19 094/158] btrfs: fix silent failure when deleting root reference Greg Kroah-Hartman
2022-08-29 10:59 ` [PATCH 5.19 095/158] btrfs: replace: drop assert for suspended replace Greg Kroah-Hartman
2022-08-29 10:59 ` [PATCH 5.19 096/158] btrfs: add info when mount fails due to stale replace target Greg Kroah-Hartman
2022-08-29 10:59 ` [PATCH 5.19 097/158] btrfs: fix space cache corruption and potential double allocations Greg Kroah-Hartman
2022-08-29 10:59 ` [PATCH 5.19 098/158] btrfs: check if root is readonly while setting security xattr Greg Kroah-Hartman
2022-08-29 10:59 ` [PATCH 5.19 099/158] btrfs: fix possible memory leak in btrfs_get_dev_args_from_path() Greg Kroah-Hartman
2022-08-29 10:59 ` [PATCH 5.19 100/158] btrfs: update generation of hole file extent item when merging holes Greg Kroah-Hartman
2022-08-29 10:59 ` [PATCH 5.19 101/158] x86/boot: Dont propagate uninitialized boot_params->cc_blob_address Greg Kroah-Hartman
2022-08-29 10:59 ` [PATCH 5.19 102/158] perf/x86/intel: Fix pebs event constraints for ADL Greg Kroah-Hartman
2022-08-29 10:59 ` [PATCH 5.19 103/158] perf/x86/lbr: Enable the branch type for the Arch LBR by default Greg Kroah-Hartman
2022-08-29 10:59 ` [PATCH 5.19 104/158] x86/entry: Fix entry_INT80_compat for Xen PV guests Greg Kroah-Hartman
2022-08-29 10:59 ` [PATCH 5.19 105/158] x86/unwind/orc: Unwind ftrace trampolines with correct ORC entry Greg Kroah-Hartman
2022-08-29 10:59 ` [PATCH 5.19 106/158] x86/sev: Dont use cc_platform_has() for early SEV-SNP calls Greg Kroah-Hartman
2022-08-29 10:59 ` [PATCH 5.19 107/158] x86/bugs: Add "unknown" reporting for MMIO Stale Data Greg Kroah-Hartman
2022-08-29 10:59 ` [PATCH 5.19 108/158] x86/nospec: Unwreck the RSB stuffing Greg Kroah-Hartman
2022-08-29 10:59 ` [PATCH 5.19 109/158] x86/PAT: Have pat_enabled() properly reflect state when running on Xen Greg Kroah-Hartman
2022-09-02 18:16   ` Chuck Zmudzinski
2022-08-29 10:59 ` [PATCH 5.19 110/158] loop: Check for overflow while configuring loop Greg Kroah-Hartman
2022-08-29 10:59 ` [PATCH 5.19 111/158] writeback: avoid use-after-free after removing device Greg Kroah-Hartman
2022-08-29 10:59 ` [PATCH 5.19 112/158] audit: move audit_return_fixup before the filters Greg Kroah-Hartman
2022-08-29 10:59 ` [PATCH 5.19 113/158] asm-generic: sections: refactor memory_intersects Greg Kroah-Hartman
2022-08-29 10:59 ` [PATCH 5.19 114/158] mm/damon/dbgfs: avoid duplicate context directory creation Greg Kroah-Hartman
2022-08-29 10:59 ` [PATCH 5.19 115/158] s390/mm: do not trigger write fault when vma does not allow VM_WRITE Greg Kroah-Hartman
2022-08-29 10:59 ` [PATCH 5.19 116/158] bootmem: remove the vmemmap pages from kmemleak in put_page_bootmem Greg Kroah-Hartman
2022-08-29 10:59 ` [PATCH 5.19 117/158] mm/hugetlb: avoid corrupting page->mapping in hugetlb_mcopy_atomic_pte Greg Kroah-Hartman
2022-08-29 10:59 ` [PATCH 5.19 118/158] mm/mprotect: only reference swap pfn page if type match Greg Kroah-Hartman
2022-08-29 10:59 ` [PATCH 5.19 119/158] cifs: skip extra NULL byte in filenames Greg Kroah-Hartman
2022-08-29 10:59 ` [PATCH 5.19 120/158] s390: fix double free of GS and RI CBs on fork() failure Greg Kroah-Hartman
2022-08-29 10:59 ` [PATCH 5.19 121/158] fbdev: fbcon: Properly revert changes when vc_resize() failed Greg Kroah-Hartman
2022-08-29 10:59 ` [PATCH 5.19 122/158] Revert "memcg: cleanup racy sum avoidance code" Greg Kroah-Hartman
2022-08-29 10:59 ` [PATCH 5.19 123/158] shmem: update folio if shmem_replace_page() updates the page Greg Kroah-Hartman
2022-08-29 10:59 ` [PATCH 5.19 124/158] ACPI: processor: Remove freq Qos request for all CPUs Greg Kroah-Hartman
2022-08-29 10:59 ` [PATCH 5.19 125/158] nouveau: explicitly wait on the fence in nouveau_bo_move_m2mf Greg Kroah-Hartman
2022-08-29 10:59 ` [PATCH 5.19 126/158] smb3: missing inode locks in punch hole Greg Kroah-Hartman
2022-08-29 10:59 ` [PATCH 5.19 127/158] ocfs2: fix freeing uninitialized resource on ocfs2_dlm_shutdown Greg Kroah-Hartman
2022-08-29 10:59 ` [PATCH 5.19 128/158] xen/privcmd: fix error exit of privcmd_ioctl_dm_op() Greg Kroah-Hartman
2022-08-29 10:59 ` [PATCH 5.19 129/158] riscv: signal: fix missing prototype warning Greg Kroah-Hartman
2022-08-29 10:59 ` [PATCH 5.19 130/158] riscv: traps: add missing prototype Greg Kroah-Hartman
2022-08-29 10:59 ` [PATCH 5.19 131/158] riscv: dts: microchip: correct L2 cache interrupts Greg Kroah-Hartman
2022-08-29 10:59 ` [PATCH 5.19 132/158] Revert "zram: remove double compression logic" Greg Kroah-Hartman
2022-08-29 10:59 ` [PATCH 5.19 133/158] io_uring: fix issue with io_write() not always undoing sb_start_write() Greg Kroah-Hartman
2022-08-29 10:59 ` [PATCH 5.19 134/158] mm/hugetlb: fix hugetlb not supporting softdirty tracking Greg Kroah-Hartman
2022-08-29 10:59 ` [PATCH 5.19 135/158] Revert "md-raid: destroy the bitmap after destroying the thread" Greg Kroah-Hartman
2022-08-29 10:59 ` [PATCH 5.19 136/158] md: call __md_stop_writes in md_stop Greg Kroah-Hartman
2022-08-29 10:59 ` [PATCH 5.19 137/158] arm64: Fix match_list for erratum 1286807 on Arm Cortex-A76 Greg Kroah-Hartman
2022-08-29 10:59 ` [PATCH 5.19 138/158] binder_alloc: add missing mmap_lock calls when using the VMA Greg Kroah-Hartman
2022-08-29 10:59 ` [PATCH 5.19 139/158] x86/nospec: Fix i386 RSB stuffing Greg Kroah-Hartman
2022-08-29 10:59 ` [PATCH 5.19 140/158] drm/amdkfd: Fix isa version for the GC 10.3.7 Greg Kroah-Hartman
2022-08-29 10:59 ` [PATCH 5.19 141/158] Documentation/ABI: Mention retbleed vulnerability info file for sysfs Greg Kroah-Hartman
2022-08-29 10:59 ` [PATCH 5.19 142/158] blk-mq: fix io hung due to missing commit_rqs Greg Kroah-Hartman
2022-08-29 10:59 ` [PATCH 5.19 143/158] perf python: Fix build when PYTHON_CONFIG is user supplied Greg Kroah-Hartman
2022-08-29 10:59 ` [PATCH 5.19 144/158] perf/x86/intel/uncore: Fix broken read_counter() for SNB IMC PMU Greg Kroah-Hartman
2022-08-29 10:59 ` [PATCH 5.19 145/158] perf/x86/intel/ds: Fix precise store latency handling Greg Kroah-Hartman
2022-08-29 10:59 ` [PATCH 5.19 146/158] perf stat: Clear evsel->reset_group for each stat run Greg Kroah-Hartman
2022-08-29 10:59 ` [PATCH 5.19 147/158] arm64: fix rodata=full Greg Kroah-Hartman
2022-08-29 10:59 ` [PATCH 5.19 148/158] arm64/signal: Flush FPSIMD register state when disabling streaming mode Greg Kroah-Hartman
2022-08-29 10:59 ` [PATCH 5.19 149/158] arm64/sme: Dont flush SVE register state when allocating SME storage Greg Kroah-Hartman
2022-08-29 11:00 ` [PATCH 5.19 150/158] arm64/sme: Dont flush SVE register state when handling SME traps Greg Kroah-Hartman
2022-08-29 11:00 ` [PATCH 5.19 151/158] scsi: ufs: core: Enable link lost interrupt Greg Kroah-Hartman
2022-08-29 11:00 ` [PATCH 5.19 152/158] scsi: storvsc: Remove WQ_MEM_RECLAIM from storvsc_error_wq Greg Kroah-Hartman
2022-08-29 11:00 ` [PATCH 5.19 153/158] scsi: core: Fix passthrough retry counter handling Greg Kroah-Hartman
2022-08-29 11:00 ` [PATCH 5.19 154/158] riscv: dts: microchip: mpfs: fix incorrect pcie child node name Greg Kroah-Hartman
2022-08-29 11:00 ` [PATCH 5.19 155/158] riscv: dts: microchip: mpfs: remove ti,fifo-depth property Greg Kroah-Hartman
2022-08-29 11:00 ` [PATCH 5.19 156/158] riscv: dts: microchip: mpfs: remove bogus card-detect-delay Greg Kroah-Hartman
2022-08-29 11:00 ` [PATCH 5.19 157/158] riscv: dts: microchip: mpfs: remove pci axi address translation property Greg Kroah-Hartman
2022-08-29 11:00 ` [PATCH 5.19 158/158] bpf: Dont use tnum_range on array range checking for poke descriptors Greg Kroah-Hartman
2022-08-29 18:08 ` [PATCH 5.19 000/158] 5.19.6-rc1 review Florian Fainelli
2022-08-29 21:05 ` Ron Economos
2022-08-29 22:13 ` Shuah Khan
2022-08-29 23:12 ` Zan Aziz
2022-08-30  0:54 ` Guenter Roeck
2022-08-30  2:14 ` Daniel Díaz
2022-08-30 10:37 ` Sudip Mukherjee (Codethink)
2022-09-01  9:47   ` Greg Kroah-Hartman
2022-08-30 12:04 ` Bagas Sanjaya
2022-08-30 12:12 ` Bagas Sanjaya
2022-08-30 12:30 ` Fenil Jain
2022-08-30 14:33 ` Rudi Heitbaum
2022-08-30 21:09 ` Justin Forbes
2022-08-31  5:53 ` Jiri Slaby

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).