All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 4.9 00/93] 4.9.17-stable review
@ 2017-03-20 17:50 Greg Kroah-Hartman
  2017-03-20 17:50 ` [PATCH 4.9 01/93] net/mlx5e: Register/unregister vport representors on interface attach/detach Greg Kroah-Hartman
                   ` (89 more replies)
  0 siblings, 90 replies; 92+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-20 17:50 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, torvalds, akpm, linux, shuahkh, patches,
	ben.hutchings, stable

This is the start of the stable review cycle for the 4.9.17 release.
There are 93 patches in this series, all will be posted as a response
to this one.  If anyone has any issues with these being applied, please
let me know.

Responses should be made by Wed Mar 22 17:47:16 UTC 2017.
Anything received after that time might be too late.

The whole patch series can be found in one patch at:
	kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.9.17-rc1.gz
or in the git tree and branch at:
  git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.9.y
and the diffstat can be found below.

thanks,

greg k-h

-------------
Pseudo-Shortlog of commits:

Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    Linux 4.9.17-rc1

Daniel Axtens <dja@axtens.net>
    crypto: powerpc - Fix initialisation of crc32c context

Niklas Cassel <niklas.cassel@axis.com>
    locking/rwsem: Fix down_write_killable() for CONFIG_RWSEM_GENERIC_SPINLOCK=y

Peter Zijlstra <peterz@infradead.org>
    futex: Add missing error handling to FUTEX_REQUEUE_PI

Peter Zijlstra <peterz@infradead.org>
    futex: Fix potential use-after-free in FUTEX_REQUEUE_PI

Andy Lutomirski <luto@kernel.org>
    x86/perf: Fix CR4.PCE propagation to use active_mm instead of mm

Andrey Ryabinin <aryabinin@virtuozzo.com>
    x86/kasan: Fix boot with KASAN=y and PROFILE_ANNOTATED_BRANCHES=y

Peter Zijlstra <peterz@infradead.org>
    x86/tsc: Fix ART for TSC_KNOWN_FREQ

Shanker Donthineni <shankerd@codeaurora.org>
    irqchip/gicv3-its: Add workaround for QDF2400 ITS erratum 0065

Marc Zyngier <marc.zyngier@arm.com>
    arm64: KVM: VHE: Clear HCR_TGE when invalidating guest TLBs

Boris Brezillon <boris.brezillon@free-electrons.com>
    drm/vc4: Fix ->clock_select setting for the VEC encoder

Derek Foreman <derekf@osg.samsung.com>
    drm/vc4: Fix race between page flip completion event and clean-up

Boris Brezillon <boris.brezillon@free-electrons.com>
    clk: bcm2835: Fix ->fixed_divider of pllh_aux

Michael Ellerman <mpe@ellerman.id.au>
    powerpc/mm: Fix build break when CMA=n && SPAPR_TCE_IOMMU=y

Alexandre Belloni <alexandre.belloni@free-electrons.com>
    usb: gadget: udc: atmel: remove memory leak

Gabriel Krisman Bertazi <krisman@linux.vnet.ibm.com>
    serial: 8250_pci: Detach low-level driver during PCI error recovery

Michael Pobega <mpobega@neverware.com>
    ACPI / blacklist: Make Dell Latitude 3350 ethernet work

Alex Hung <alex.hung@canonical.com>
    ACPI / blacklist: add _REV quirks for Dell Precision 5520 and 3520

Vladimir Davydov <vdavydov.dev@gmail.com>
    slub: move synchronize_sched out of slab_mutex on shrink

Henrik Ingo <henrik.ingo@avoinelama.fi>
    uvcvideo: uvc_scan_fallback() for webcams with broken chain

Harald Freudenberger <freude@linux.vnet.ibm.com>
    s390/zcrypt: Introduce CEX6 toleration

Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com>
    block: allow WRITE_SAME commands with the SG_IO ioctl

Ben Skeggs <bskeggs@redhat.com>
    drm/nouveau/disp/nv50-: specify ctrl/user separately when constructing classes

Ben Skeggs <bskeggs@redhat.com>
    drm/nouveau/disp/nv50-: split chid into chid.ctrl and chid.user

Ben Skeggs <bskeggs@redhat.com>
    drm/nouveau/disp/gp102: fix cursor/overlay immediate channel indices

Alexey Kardashevskiy <aik@ozlabs.ru>
    vfio/spapr: Postpone default window creation

Alexey Kardashevskiy <aik@ozlabs.ru>
    vfio/spapr: Add a helper to create default DMA window

Alexey Kardashevskiy <aik@ozlabs.ru>
    powerpc/mm/iommu, vfio/spapr: Put pages on VFIO container shutdown

Alexey Kardashevskiy <aik@ozlabs.ru>
    vfio/spapr: Reference mm in tce_container

Alexey Kardashevskiy <aik@ozlabs.ru>
    powerpc/iommu: Stop using @current in mm_iommu_xxx

Alexey Kardashevskiy <aik@ozlabs.ru>
    powerpc/iommu: Pass mm_struct to init/cleanup helpers

Alexey Kardashevskiy <aik@ozlabs.ru>
    vfio/spapr: Postpone allocation of userspace version of TCE table

Vitaly Kuznetsov <vkuznets@redhat.com>
    Drivers: hv: ring_buffer: count on wrap around mappings in get_next_pkt_raw() (v2)

Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
    ibmveth: calculate gso_segs for large packets

Gavin Shan <gwshan@linux.vnet.ibm.com>
    PCI: Do any VF BAR updates before enabling the BARs

Bjorn Helgaas <bhelgaas@google.com>
    PCI: Ignore BAR updates on virtual functions

Bjorn Helgaas <bhelgaas@google.com>
    PCI: Update BARs using property bits appropriate for type

Bjorn Helgaas <bhelgaas@google.com>
    PCI: Don't update VF BARs while VF memory space is enabled

Bjorn Helgaas <bhelgaas@google.com>
    PCI: Decouple IORESOURCE_ROM_ENABLE and PCI_ROM_ADDRESS_ENABLE

Bjorn Helgaas <bhelgaas@google.com>
    PCI: Add comments about ROM BAR updating

Bjorn Helgaas <bhelgaas@google.com>
    PCI: Remove pci_resource_bar() and pci_iov_resource_bar()

Bjorn Helgaas <bhelgaas@google.com>
    PCI: Separate VF BAR updates from standard BAR updates

Vitaly Kuznetsov <vkuznets@redhat.com>
    x86/hyperv: Handle unknown NMIs on one CPU when unknown_nmi_panic

Michael Cyr <mikecyr@us.ibm.com>
    scsi: ibmvscsis: Synchronize cmds at remove time

Michael Cyr <mikecyr@us.ibm.com>
    scsi: ibmvscsis: Synchronize cmds at tpg_enable_store time

Michael Cyr <mikecyr@us.ibm.com>
    scsi: ibmvscsis: Rearrange functions for future patches

Michael Cyr <mikecyr@us.ibm.com>
    scsi: ibmvscsis: Clean up properly if target_submit_cmd/tmr fails

Michael Cyr <mikecyr@us.ibm.com>
    scsi: ibmvscsis: Return correct partition name/# to client

Michael Cyr <mikecyr@us.ibm.com>
    scsi: ibmvscsis: Issues from Dan Carpenter/Smatch

Todd Fujinaka <todd.fujinaka@intel.com>
    igb: add i211 to i210 PHY workaround

Chris J Arges <christopherarges@gmail.com>
    igb: Workaround for igb i210 firmware issue

Dan Streetman <ddstreet@ieee.org>
    xen: do not re-use pirq number cached in pci device msi msg data

Krister Johansen <kjlx@templeofstupid.com>
    dmaengine: iota: ioat_alloc_chan_resources should not perform sleeping allocations.

Daniel Borkmann <daniel@iogearbox.net>
    bpf: fix mark_reg_unknown_value for spilled regs on map value marking

Daniel Borkmann <daniel@iogearbox.net>
    bpf: fix regression on verifier pruning wrt map lookups

Alexei Starovoitov <ast@fb.com>
    bpf: fix state equivalence

Thomas Graf <tgraf@suug.ch>
    bpf: Detect identical PTR_TO_MAP_VALUE_OR_NULL registers

Hannes Frederic Sowa <hannes@stressinduktion.org>
    dccp: fix memory leak during tear-down of unsuccessful connection request

Hannes Frederic Sowa <hannes@stressinduktion.org>
    tun: fix premature POLLOUT notification on tun devices

Jon Maxwell <jmaxwell37@gmail.com>
    dccp/tcp: fix routing redirect race

Florian Westphal <fw@strlen.de>
    bridge: drop netfilter fake rtable unconditionally

Florian Westphal <fw@strlen.de>
    ipv6: avoid write to a possibly cloned skb

Sabrina Dubroca <sd@queasysnail.net>
    ipv6: make ECMP route replacement less greedy

David Ahern <dsa@cumulusnetworks.com>
    mpls: Do not decrement alive counter for unregister events

David Ahern <dsa@cumulusnetworks.com>
    mpls: Send route delete notifications when router module is unloaded

Etienne Noss <etienne.noss@wifirst.fr>
    act_connmark: avoid crashing on malformed nlattrs with null parms

Dmitry V. Levin <ldv@altlinux.org>
    uapi: fix linux/packet_diag.h userspace compilation error

Paolo Abeni <pabeni@redhat.com>
    net/tunnel: set inner protocol in network gro hooks

David Ahern <dsa@cumulusnetworks.com>
    vrf: Fix use-after-free in vrf_xmit

Eric Dumazet <edumazet@google.com>
    dccp: fix use-after-free in dccp_feat_activate_values

Alexey Khoroshilov <khoroshilov@ispras.ru>
    net/sched: act_skbmod: remove unneeded rcu_read_unlock in tcf_skbmod_dump

Eric Dumazet <edumazet@google.com>
    net: fix socket refcounting in skb_complete_tx_timestamp()

Eric Dumazet <edumazet@google.com>
    net: fix socket refcounting in skb_complete_wifi_ack()

Eric Dumazet <edumazet@google.com>
    tcp: fix various issues for sockets morphing to listen state

WANG Cong <xiyou.wangcong@gmail.com>
    strparser: destroy workqueue on module exit

Arnaldo Carvalho de Melo <acme@redhat.com>
    dccp: Unlock sock before calling sk_free()

Eric Dumazet <edumazet@google.com>
    ipv6: orphan skbs in reassembly unit

Eric Dumazet <edumazet@google.com>
    net: net_enable_timestamp() can be called from irq contexts

Alexander Potapenko <glider@google.com>
    net: don't call strlen() on the user buffer in packet_bind_spkt()

Mike Manning <mmanning@brocade.com>
    net: bridge: allow IPv6 when multicast flood is disabled

Eric Dumazet <edumazet@google.com>
    tcp/dccp: block BH for SYN processing

Ido Schimmel <idosch@mellanox.com>
    mlxsw: spectrum_router: Avoid potential packets loss

Jakub Kicinski <jakub.kicinski@netronome.com>
    geneve: lock RCU on TX path

Jakub Kicinski <jakub.kicinski@netronome.com>
    vxlan: lock RCU on TX path

Florian Fainelli <f.fainelli@gmail.com>
    net: phy: Avoid deadlock during phy_error()

Paul Hüber <phueber@kernsp.in>
    l2tp: avoid use-after-free caused by l2tp_ip_backlog_recv

Roman Mashak <mrv@mojatatu.com>
    net sched actions: decrement module reference count after table flush.

Julian Anastasov <ja@ssi.bg>
    ipv4: mask tos for input route

Brian Russell <brussell@brocade.com>
    vxlan: don't allow overwrite of config src addr

David Forster <dforster@brocade.com>
    vti6: return GRE_KEY for vti6

Matthias Schiffer <mschiffer@universe-factory.net>
    vxlan: correctly validate VXLAN ID against VXLAN_N_VID

Tariq Toukan <tariqt@mellanox.com>
    net/mlx5e: Fix wrong CQE decompression

Tariq Toukan <tariqt@mellanox.com>
    net/mlx5e: Do not reduce LRO WQE size when not using build_skb

Saeed Mahameed <saeedm@mellanox.com>
    net/mlx5e: Register/unregister vport representors on interface attach/detach


-------------

Diffstat:

 Documentation/arm64/silicon-errata.txt             |   44 +-
 Makefile                                           |    4 +-
 arch/arm64/Kconfig                                 |   10 +
 arch/arm64/kvm/hyp/tlb.c                           |   64 +-
 arch/powerpc/crypto/crc32c-vpmsum_glue.c           |    2 +-
 arch/powerpc/include/asm/mmu_context.h             |   20 +-
 arch/powerpc/kernel/setup-common.c                 |    2 +-
 arch/powerpc/mm/mmu_context_book3s64.c             |    6 +-
 arch/powerpc/mm/mmu_context_iommu.c                |   62 +-
 arch/x86/events/core.c                             |    4 +-
 arch/x86/kernel/cpu/mshyperv.c                     |   24 +
 arch/x86/kernel/head64.c                           |    1 +
 arch/x86/kernel/tsc.c                              |    2 +
 arch/x86/mm/kasan_init_64.c                        |    1 +
 arch/x86/pci/xen.c                                 |   23 +-
 block/scsi_ioctl.c                                 |    3 +
 drivers/acpi/blacklist.c                           |   28 +
 drivers/clk/bcm/clk-bcm2835.c                      |    2 +-
 drivers/dma/ioat/init.c                            |    4 +-
 drivers/gpu/drm/nouveau/nvkm/engine/disp/Kbuild    |    2 +
 .../gpu/drm/nouveau/nvkm/engine/disp/channv50.c    |   30 +-
 .../gpu/drm/nouveau/nvkm/engine/disp/channv50.h    |   23 +-
 drivers/gpu/drm/nouveau/nvkm/engine/disp/cursg84.c |    2 +-
 .../gpu/drm/nouveau/nvkm/engine/disp/cursgf119.c   |    2 +-
 .../gpu/drm/nouveau/nvkm/engine/disp/cursgk104.c   |    2 +-
 .../gpu/drm/nouveau/nvkm/engine/disp/cursgp102.c   |   37 +
 .../gpu/drm/nouveau/nvkm/engine/disp/cursgt215.c   |    2 +-
 .../gpu/drm/nouveau/nvkm/engine/disp/cursnv50.c    |    6 +-
 .../gpu/drm/nouveau/nvkm/engine/disp/dmacgf119.c   |   44 +-
 .../gpu/drm/nouveau/nvkm/engine/disp/dmacgp104.c   |   23 +-
 .../gpu/drm/nouveau/nvkm/engine/disp/dmacnv50.c    |   46 +-
 drivers/gpu/drm/nouveau/nvkm/engine/disp/oimmg84.c |    2 +-
 .../gpu/drm/nouveau/nvkm/engine/disp/oimmgf119.c   |    2 +-
 .../gpu/drm/nouveau/nvkm/engine/disp/oimmgk104.c   |    2 +-
 .../gpu/drm/nouveau/nvkm/engine/disp/oimmgp102.c   |   37 +
 .../gpu/drm/nouveau/nvkm/engine/disp/oimmgt215.c   |    2 +-
 .../gpu/drm/nouveau/nvkm/engine/disp/oimmnv50.c    |    6 +-
 .../gpu/drm/nouveau/nvkm/engine/disp/piocgf119.c   |   28 +-
 .../gpu/drm/nouveau/nvkm/engine/disp/piocnv50.c    |   30 +-
 .../gpu/drm/nouveau/nvkm/engine/disp/rootgp104.c   |    4 +-
 .../gpu/drm/nouveau/nvkm/engine/disp/rootnv50.c    |    4 +-
 drivers/gpu/drm/vc4/vc4_crtc.c                     |   46 +-
 drivers/gpu/drm/vc4/vc4_drv.h                      |    2 +
 drivers/gpu/drm/vc4/vc4_kms.c                      |   33 +-
 drivers/gpu/drm/vc4/vc4_regs.h                     |    3 +-
 drivers/irqchip/irq-gic-v3-its.c                   |   16 +
 drivers/media/usb/uvc/uvc_driver.c                 |  118 ++-
 drivers/net/ethernet/ibm/ibmveth.c                 |   12 +-
 drivers/net/ethernet/intel/igb/e1000_phy.c         |    4 +
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c  |   34 +-
 drivers/net/ethernet/mellanox/mlx5/core/en_rx.c    |   13 +-
 .../net/ethernet/mellanox/mlxsw/spectrum_router.c  |   30 +-
 drivers/net/geneve.c                               |   10 +-
 drivers/net/phy/phy.c                              |   14 +-
 drivers/net/tun.c                                  |   18 +-
 drivers/net/vrf.c                                  |    3 +-
 drivers/net/vxlan.c                                |   27 +-
 drivers/pci/iov.c                                  |   70 +-
 drivers/pci/pci.c                                  |   34 -
 drivers/pci/pci.h                                  |    7 +-
 drivers/pci/probe.c                                |    3 +-
 drivers/pci/rom.c                                  |    5 +
 drivers/pci/setup-res.c                            |   48 +-
 drivers/s390/crypto/ap_bus.c                       |    3 +
 drivers/s390/crypto/ap_bus.h                       |    1 +
 drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.c           | 1096 +++++++++-----------
 drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.h           |    5 +-
 drivers/tty/serial/8250/8250_pci.c                 |   23 +-
 drivers/usb/gadget/udc/atmel_usba_udc.c            |    3 +-
 drivers/usb/gadget/udc/atmel_usba_udc.h            |    1 +
 drivers/vfio/vfio_iommu_spapr_tce.c                |  328 ++++--
 include/linux/bpf_verifier.h                       |   14 +-
 include/linux/dccp.h                               |    1 +
 include/linux/hyperv.h                             |   32 +-
 include/uapi/linux/packet_diag.h                   |    2 +-
 kernel/bpf/verifier.c                              |   77 +-
 kernel/futex.c                                     |   22 +-
 kernel/locking/rwsem-spinlock.c                    |   15 +-
 mm/slab.c                                          |    4 +-
 mm/slab.h                                          |    2 +-
 mm/slab_common.c                                   |   27 +-
 mm/slob.c                                          |    2 +-
 mm/slub.c                                          |   19 +-
 net/bridge/br_forward.c                            |    3 +-
 net/bridge/br_input.c                              |    1 +
 net/bridge/br_netfilter_hooks.c                    |   21 -
 net/core/dev.c                                     |   35 +-
 net/core/skbuff.c                                  |   30 +-
 net/dccp/ccids/ccid2.c                             |    1 +
 net/dccp/input.c                                   |   10 +-
 net/dccp/ipv4.c                                    |    3 +-
 net/dccp/ipv6.c                                    |    8 +-
 net/dccp/minisocks.c                               |   25 +-
 net/ipv4/af_inet.c                                 |    4 +-
 net/ipv4/route.c                                   |    1 +
 net/ipv4/tcp_input.c                               |   10 +-
 net/ipv4/tcp_ipv4.c                                |   10 +-
 net/ipv4/tcp_timer.c                               |    6 +-
 net/ipv6/ip6_fib.c                                 |    2 +
 net/ipv6/ip6_offload.c                             |    4 +-
 net/ipv6/ip6_output.c                              |    7 +-
 net/ipv6/ip6_vti.c                                 |    4 +
 net/ipv6/netfilter/nf_conntrack_reasm.c            |    1 +
 net/ipv6/tcp_ipv6.c                                |    8 +-
 net/l2tp/l2tp_ip.c                                 |    2 +-
 net/mpls/af_mpls.c                                 |    4 +-
 net/openvswitch/conntrack.c                        |    1 -
 net/packet/af_packet.c                             |    8 +-
 net/sched/act_api.c                                |    5 +-
 net/sched/act_connmark.c                           |    3 +
 net/sched/act_skbmod.c                             |    1 -
 net/strparser/strparser.c                          |    1 +
 112 files changed, 1816 insertions(+), 1272 deletions(-)

^ permalink raw reply	[flat|nested] 92+ messages in thread

* [PATCH 4.9 01/93] net/mlx5e: Register/unregister vport representors on interface attach/detach
  2017-03-20 17:50 [PATCH 4.9 00/93] 4.9.17-stable review Greg Kroah-Hartman
@ 2017-03-20 17:50 ` Greg Kroah-Hartman
  2017-03-20 17:50 ` [PATCH 4.9 02/93] net/mlx5e: Do not reduce LRO WQE size when not using build_skb Greg Kroah-Hartman
                   ` (88 subsequent siblings)
  89 siblings, 0 replies; 92+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-20 17:50 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Saeed Mahameed, Hadar Hen Zion,
	Roi Dayan, David S. Miller

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Saeed Mahameed <saeedm@mellanox.com>


[ Upstream commit 6f08a22c5fb2b9aefb8ecd8496758e7a677c1fde ]

Currently vport representors are added only on driver load and removed on
driver unload.  Apparently we forgot to handle them when we added the
seamless reset flow feature.  This caused to leave the representors
netdevs alive and active with open HW resources on pci shutdown and on
error reset flows.

To overcome this we move their handling to interface attach/detach, so
they would be cleaned up on shutdown and recreated on reset flows.

Fixes: 26e59d8077a3 ("net/mlx5e: Implement mlx5e interface attach/detach callbacks")
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Hadar Hen Zion <hadarh@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c |   23 ++++++++++++++--------
 1 file changed, 15 insertions(+), 8 deletions(-)

--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -3936,6 +3936,19 @@ static void mlx5e_register_vport_rep(str
 	}
 }
 
+static void mlx5e_unregister_vport_rep(struct mlx5_core_dev *mdev)
+{
+	struct mlx5_eswitch *esw = mdev->priv.eswitch;
+	int total_vfs = MLX5_TOTAL_VPORTS(mdev);
+	int vport;
+
+	if (!MLX5_CAP_GEN(mdev, vport_group_manager))
+		return;
+
+	for (vport = 1; vport < total_vfs; vport++)
+		mlx5_eswitch_unregister_vport_rep(esw, vport);
+}
+
 void mlx5e_detach_netdev(struct mlx5_core_dev *mdev, struct net_device *netdev)
 {
 	struct mlx5e_priv *priv = netdev_priv(netdev);
@@ -3983,6 +3996,7 @@ static int mlx5e_attach(struct mlx5_core
 		return err;
 	}
 
+	mlx5e_register_vport_rep(mdev);
 	return 0;
 }
 
@@ -3994,6 +4008,7 @@ static void mlx5e_detach(struct mlx5_cor
 	if (!netif_device_present(netdev))
 		return;
 
+	mlx5e_unregister_vport_rep(mdev);
 	mlx5e_detach_netdev(mdev, netdev);
 	mlx5e_destroy_mdev_resources(mdev);
 }
@@ -4012,8 +4027,6 @@ static void *mlx5e_add(struct mlx5_core_
 	if (err)
 		return NULL;
 
-	mlx5e_register_vport_rep(mdev);
-
 	if (MLX5_CAP_GEN(mdev, vport_group_manager))
 		ppriv = &esw->offloads.vport_reps[0];
 
@@ -4065,13 +4078,7 @@ void mlx5e_destroy_netdev(struct mlx5_co
 
 static void mlx5e_remove(struct mlx5_core_dev *mdev, void *vpriv)
 {
-	struct mlx5_eswitch *esw = mdev->priv.eswitch;
-	int total_vfs = MLX5_TOTAL_VPORTS(mdev);
 	struct mlx5e_priv *priv = vpriv;
-	int vport;
-
-	for (vport = 1; vport < total_vfs; vport++)
-		mlx5_eswitch_unregister_vport_rep(esw, vport);
 
 	unregister_netdev(priv->netdev);
 	mlx5e_detach(mdev, vpriv);

^ permalink raw reply	[flat|nested] 92+ messages in thread

* [PATCH 4.9 02/93] net/mlx5e: Do not reduce LRO WQE size when not using build_skb
  2017-03-20 17:50 [PATCH 4.9 00/93] 4.9.17-stable review Greg Kroah-Hartman
  2017-03-20 17:50 ` [PATCH 4.9 01/93] net/mlx5e: Register/unregister vport representors on interface attach/detach Greg Kroah-Hartman
@ 2017-03-20 17:50 ` Greg Kroah-Hartman
  2017-03-20 17:50 ` [PATCH 4.9 03/93] net/mlx5e: Fix wrong CQE decompression Greg Kroah-Hartman
                   ` (87 subsequent siblings)
  89 siblings, 0 replies; 92+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-20 17:50 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Tariq Toukan, Saeed Mahameed,
	David S. Miller

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Tariq Toukan <tariqt@mellanox.com>


[ Upstream commit 4078e637c12f1e0a74293f1ec9563f42bff14a03 ]

When rq_type is Striding RQ, no room of SKB_RESERVE is needed
as SKB allocation is not done via build_skb.

Fixes: e4b85508072b ("net/mlx5e: Slightly reduce hardware LRO size")
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c |   11 +++++------
 1 file changed, 5 insertions(+), 6 deletions(-)

--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -81,6 +81,7 @@ static bool mlx5e_check_fragmented_strid
 static void mlx5e_set_rq_type_params(struct mlx5e_priv *priv, u8 rq_type)
 {
 	priv->params.rq_wq_type = rq_type;
+	priv->params.lro_wqe_sz = MLX5E_PARAMS_DEFAULT_LRO_WQE_SZ;
 	switch (priv->params.rq_wq_type) {
 	case MLX5_WQ_TYPE_LINKED_LIST_STRIDING_RQ:
 		priv->params.log_rq_size = MLX5E_PARAMS_DEFAULT_LOG_RQ_SIZE_MPW;
@@ -92,6 +93,10 @@ static void mlx5e_set_rq_type_params(str
 		break;
 	default: /* MLX5_WQ_TYPE_LINKED_LIST */
 		priv->params.log_rq_size = MLX5E_PARAMS_DEFAULT_LOG_RQ_SIZE;
+
+		/* Extra room needed for build_skb */
+		priv->params.lro_wqe_sz -= MLX5_RX_HEADROOM +
+			SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
 	}
 	priv->params.min_rx_wqes = mlx5_min_rx_wqes(priv->params.rq_wq_type,
 					       BIT(priv->params.log_rq_size));
@@ -3473,12 +3478,6 @@ static void mlx5e_build_nic_netdev_priv(
 	mlx5e_build_default_indir_rqt(mdev, priv->params.indirection_rqt,
 				      MLX5E_INDIR_RQT_SIZE, profile->max_nch(mdev));
 
-	priv->params.lro_wqe_sz =
-		MLX5E_PARAMS_DEFAULT_LRO_WQE_SZ -
-		/* Extra room needed for build_skb */
-		MLX5_RX_HEADROOM -
-		SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
-
 	/* Initialize pflags */
 	MLX5E_SET_PRIV_FLAG(priv, MLX5E_PFLAG_RX_CQE_BASED_MODER,
 			    priv->params.rx_cq_period_mode == MLX5_CQ_PERIOD_MODE_START_FROM_CQE);

^ permalink raw reply	[flat|nested] 92+ messages in thread

* [PATCH 4.9 03/93] net/mlx5e: Fix wrong CQE decompression
  2017-03-20 17:50 [PATCH 4.9 00/93] 4.9.17-stable review Greg Kroah-Hartman
  2017-03-20 17:50 ` [PATCH 4.9 01/93] net/mlx5e: Register/unregister vport representors on interface attach/detach Greg Kroah-Hartman
  2017-03-20 17:50 ` [PATCH 4.9 02/93] net/mlx5e: Do not reduce LRO WQE size when not using build_skb Greg Kroah-Hartman
@ 2017-03-20 17:50 ` Greg Kroah-Hartman
  2017-03-20 17:50 ` [PATCH 4.9 04/93] vxlan: correctly validate VXLAN ID against VXLAN_N_VID Greg Kroah-Hartman
                   ` (86 subsequent siblings)
  89 siblings, 0 replies; 92+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-20 17:50 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Tariq Toukan, Tom Herbert,
	kernel-team, Saeed Mahameed, David S. Miller

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Tariq Toukan <tariqt@mellanox.com>


[ Upstream commit 36154be40a28e4afaa0416da2681d80b7e2ca319 ]

In cqe compression with striding RQ, the decompression of the CQE field
wqe_counter was done with a wrong wraparound value.
This caused handling cqes with a wrong pointer to wqe (rx descriptor)
and creating SKBs with wrong data, pointing to wrong (and already consumed)
strides/pages.

The meaning of the CQE field wqe_counter in striding RQ holds the
stride index instead of the WQE index. Hence, when decompressing
a CQE, wqe_counter should have wrapped-around the number of strides
in a single multi-packet WQE.

We dropped this wrap-around mask at all in CQE decompression of striding
RQ. It is not needed as in such cases the CQE compression session would
break because of different value of wqe_id field, starting a new
compression session.

Tested:
 ethtool -K ethxx lro off/on
 ethtool --set-priv-flags ethxx rx_cqe_compress on
 super_netperf 16 {ipv4,ipv6} -t TCP_STREAM -m 50 -D
 verified no csum errors and no page refcount issues.

Fixes: 7219ab34f184 ("net/mlx5e: CQE compression")
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reported-by: Tom Herbert <tom@herbertland.com>
Cc: kernel-team@fb.com
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/net/ethernet/mellanox/mlx5/core/en_rx.c |   13 ++++++-------
 1 file changed, 6 insertions(+), 7 deletions(-)

--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
@@ -92,19 +92,18 @@ static inline void mlx5e_cqes_update_own
 static inline void mlx5e_decompress_cqe(struct mlx5e_rq *rq,
 					struct mlx5e_cq *cq, u32 cqcc)
 {
-	u16 wqe_cnt_step;
-
 	cq->title.byte_cnt     = cq->mini_arr[cq->mini_arr_idx].byte_cnt;
 	cq->title.check_sum    = cq->mini_arr[cq->mini_arr_idx].checksum;
 	cq->title.op_own      &= 0xf0;
 	cq->title.op_own      |= 0x01 & (cqcc >> cq->wq.log_sz);
 	cq->title.wqe_counter  = cpu_to_be16(cq->decmprs_wqe_counter);
 
-	wqe_cnt_step =
-		rq->wq_type == MLX5_WQ_TYPE_LINKED_LIST_STRIDING_RQ ?
-		mpwrq_get_cqe_consumed_strides(&cq->title) : 1;
-	cq->decmprs_wqe_counter =
-		(cq->decmprs_wqe_counter + wqe_cnt_step) & rq->wq.sz_m1;
+	if (rq->wq_type == MLX5_WQ_TYPE_LINKED_LIST_STRIDING_RQ)
+		cq->decmprs_wqe_counter +=
+			mpwrq_get_cqe_consumed_strides(&cq->title);
+	else
+		cq->decmprs_wqe_counter =
+			(cq->decmprs_wqe_counter + 1) & rq->wq.sz_m1;
 }
 
 static inline void mlx5e_decompress_cqe_no_hash(struct mlx5e_rq *rq,

^ permalink raw reply	[flat|nested] 92+ messages in thread

* [PATCH 4.9 04/93] vxlan: correctly validate VXLAN ID against VXLAN_N_VID
  2017-03-20 17:50 [PATCH 4.9 00/93] 4.9.17-stable review Greg Kroah-Hartman
                   ` (2 preceding siblings ...)
  2017-03-20 17:50 ` [PATCH 4.9 03/93] net/mlx5e: Fix wrong CQE decompression Greg Kroah-Hartman
@ 2017-03-20 17:50 ` Greg Kroah-Hartman
  2017-03-20 17:50 ` [PATCH 4.9 05/93] vti6: return GRE_KEY for vti6 Greg Kroah-Hartman
                   ` (85 subsequent siblings)
  89 siblings, 0 replies; 92+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-20 17:50 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Matthias Schiffer, Jiri Benc,
	David S. Miller

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Matthias Schiffer <mschiffer@universe-factory.net>


[ Upstream commit 4e37d6911f36545b286d15073f6f2222f840e81c ]

The incorrect check caused an off-by-one error: the maximum VID 0xffffff
was unusable.

Fixes: d342894c5d2f ("vxlan: virtual extensible lan")
Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net>
Acked-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/net/vxlan.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -2637,7 +2637,7 @@ static int vxlan_validate(struct nlattr
 
 	if (data[IFLA_VXLAN_ID]) {
 		__u32 id = nla_get_u32(data[IFLA_VXLAN_ID]);
-		if (id >= VXLAN_VID_MASK)
+		if (id >= VXLAN_N_VID)
 			return -ERANGE;
 	}
 

^ permalink raw reply	[flat|nested] 92+ messages in thread

* [PATCH 4.9 05/93] vti6: return GRE_KEY for vti6
  2017-03-20 17:50 [PATCH 4.9 00/93] 4.9.17-stable review Greg Kroah-Hartman
                   ` (3 preceding siblings ...)
  2017-03-20 17:50 ` [PATCH 4.9 04/93] vxlan: correctly validate VXLAN ID against VXLAN_N_VID Greg Kroah-Hartman
@ 2017-03-20 17:50 ` Greg Kroah-Hartman
  2017-03-20 17:50 ` [PATCH 4.9 06/93] vxlan: dont allow overwrite of config src addr Greg Kroah-Hartman
                   ` (84 subsequent siblings)
  89 siblings, 0 replies; 92+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-20 17:50 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, David Forster, David S. Miller

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: David Forster <dforster@brocade.com>


[ Upstream commit 7dcdf941cdc96692ab99fd790c8cc68945514851 ]

Align vti6 with vti by returning GRE_KEY flag. This enables iproute2
to display tunnel keys on "ip -6 tunnel show"

Signed-off-by: David Forster <dforster@brocade.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/ipv6/ip6_vti.c |    4 ++++
 1 file changed, 4 insertions(+)

--- a/net/ipv6/ip6_vti.c
+++ b/net/ipv6/ip6_vti.c
@@ -691,6 +691,10 @@ vti6_parm_to_user(struct ip6_tnl_parm2 *
 	u->link = p->link;
 	u->i_key = p->i_key;
 	u->o_key = p->o_key;
+	if (u->i_key)
+		u->i_flags |= GRE_KEY;
+	if (u->o_key)
+		u->o_flags |= GRE_KEY;
 	u->proto = p->proto;
 
 	memcpy(u->name, p->name, sizeof(u->name));

^ permalink raw reply	[flat|nested] 92+ messages in thread

* [PATCH 4.9 06/93] vxlan: dont allow overwrite of config src addr
  2017-03-20 17:50 [PATCH 4.9 00/93] 4.9.17-stable review Greg Kroah-Hartman
                   ` (4 preceding siblings ...)
  2017-03-20 17:50 ` [PATCH 4.9 05/93] vti6: return GRE_KEY for vti6 Greg Kroah-Hartman
@ 2017-03-20 17:50 ` Greg Kroah-Hartman
  2017-03-20 17:50 ` [PATCH 4.9 07/93] ipv4: mask tos for input route Greg Kroah-Hartman
                   ` (83 subsequent siblings)
  89 siblings, 0 replies; 92+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-20 17:50 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Brian Russell, Jiri Benc, David S. Miller

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Brian Russell <brussell@brocade.com>


[ Upstream commit 1158632b5a2dcce0786c1b1b99654e81cc867981 ]

When using IPv6 transport and a default dst, a pointer to the configured
source address is passed into the route lookup. If no source address is
configured, then the value is overwritten.

IPv6 route lookup ignores egress ifindex match if the source address is set,
so if egress ifindex match is desired, the source address must be passed
as any. The overwrite breaks this for subsequent lookups.

Avoid this by copying the configured address to an existing stack variable
and pass a pointer to that instead.

Fixes: 272d96a5ab10 ("net: vxlan: lwt: Use source ip address during route lookup.")

Signed-off-by: Brian Russell <brussell@brocade.com>
Acked-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/net/vxlan.c |   12 +++++-------
 1 file changed, 5 insertions(+), 7 deletions(-)

--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -1942,7 +1942,6 @@ static void vxlan_xmit_one(struct sk_buf
 	const struct iphdr *old_iph;
 	union vxlan_addr *dst;
 	union vxlan_addr remote_ip, local_ip;
-	union vxlan_addr *src;
 	struct vxlan_metadata _md;
 	struct vxlan_metadata *md = &_md;
 	__be16 src_port = 0, dst_port;
@@ -1960,7 +1959,7 @@ static void vxlan_xmit_one(struct sk_buf
 		dst_port = rdst->remote_port ? rdst->remote_port : vxlan->cfg.dst_port;
 		vni = rdst->remote_vni;
 		dst = &rdst->remote_ip;
-		src = &vxlan->cfg.saddr;
+		local_ip = vxlan->cfg.saddr;
 		dst_cache = &rdst->dst_cache;
 	} else {
 		if (!info) {
@@ -1979,7 +1978,6 @@ static void vxlan_xmit_one(struct sk_buf
 			local_ip.sin6.sin6_addr = info->key.u.ipv6.src;
 		}
 		dst = &remote_ip;
-		src = &local_ip;
 		dst_cache = &info->dst_cache;
 	}
 
@@ -2028,7 +2026,7 @@ static void vxlan_xmit_one(struct sk_buf
 		rt = vxlan_get_route(vxlan, skb,
 				     rdst ? rdst->remote_ifindex : 0, tos,
 				     dst->sin.sin_addr.s_addr,
-				     &src->sin.sin_addr.s_addr,
+				     &local_ip.sin.sin_addr.s_addr,
 				     dst_cache, info);
 		if (IS_ERR(rt)) {
 			netdev_dbg(dev, "no route to %pI4\n",
@@ -2071,7 +2069,7 @@ static void vxlan_xmit_one(struct sk_buf
 		if (err < 0)
 			goto xmit_tx_error;
 
-		udp_tunnel_xmit_skb(rt, sk, skb, src->sin.sin_addr.s_addr,
+		udp_tunnel_xmit_skb(rt, sk, skb, local_ip.sin.sin_addr.s_addr,
 				    dst->sin.sin_addr.s_addr, tos, ttl, df,
 				    src_port, dst_port, xnet, !udp_sum);
 #if IS_ENABLED(CONFIG_IPV6)
@@ -2087,7 +2085,7 @@ static void vxlan_xmit_one(struct sk_buf
 		ndst = vxlan6_get_route(vxlan, skb,
 					rdst ? rdst->remote_ifindex : 0, tos,
 					label, &dst->sin6.sin6_addr,
-					&src->sin6.sin6_addr,
+					&local_ip.sin6.sin6_addr,
 					dst_cache, info);
 		if (IS_ERR(ndst)) {
 			netdev_dbg(dev, "no route to %pI6\n",
@@ -2134,7 +2132,7 @@ static void vxlan_xmit_one(struct sk_buf
 			return;
 		}
 		udp_tunnel6_xmit_skb(ndst, sk, skb, dev,
-				     &src->sin6.sin6_addr,
+				     &local_ip.sin6.sin6_addr,
 				     &dst->sin6.sin6_addr, tos, ttl,
 				     label, src_port, dst_port, !udp_sum);
 #endif

^ permalink raw reply	[flat|nested] 92+ messages in thread

* [PATCH 4.9 07/93] ipv4: mask tos for input route
  2017-03-20 17:50 [PATCH 4.9 00/93] 4.9.17-stable review Greg Kroah-Hartman
                   ` (5 preceding siblings ...)
  2017-03-20 17:50 ` [PATCH 4.9 06/93] vxlan: dont allow overwrite of config src addr Greg Kroah-Hartman
@ 2017-03-20 17:50 ` Greg Kroah-Hartman
  2017-03-20 17:50 ` [PATCH 4.9 08/93] net sched actions: decrement module reference count after table flush Greg Kroah-Hartman
                   ` (82 subsequent siblings)
  89 siblings, 0 replies; 92+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-20 17:50 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Julian Anastasov, David S. Miller

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Julian Anastasov <ja@ssi.bg>


[ Upstream commit 6e28099d38c0e50d62c1afc054e37e573adf3d21 ]

Restore the lost masking of TOS in input route code to
allow ip rules to match it properly.

Problem [1] noticed by Shmulik Ladkani <shmulik.ladkani@gmail.com>

[1] http://marc.info/?t=137331755300040&r=1&w=2

Fixes: 89aef8921bfb ("ipv4: Delete routing cache.")
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/ipv4/route.c |    1 +
 1 file changed, 1 insertion(+)

--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -1968,6 +1968,7 @@ int ip_route_input_noref(struct sk_buff
 {
 	int res;
 
+	tos &= IPTOS_RT_MASK;
 	rcu_read_lock();
 
 	/* Multicast recognition logic is moved from route cache to here.

^ permalink raw reply	[flat|nested] 92+ messages in thread

* [PATCH 4.9 08/93] net sched actions: decrement module reference count after table flush.
  2017-03-20 17:50 [PATCH 4.9 00/93] 4.9.17-stable review Greg Kroah-Hartman
                   ` (6 preceding siblings ...)
  2017-03-20 17:50 ` [PATCH 4.9 07/93] ipv4: mask tos for input route Greg Kroah-Hartman
@ 2017-03-20 17:50 ` Greg Kroah-Hartman
  2017-03-20 17:50 ` [PATCH 4.9 10/93] net: phy: Avoid deadlock during phy_error() Greg Kroah-Hartman
                   ` (81 subsequent siblings)
  89 siblings, 0 replies; 92+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-20 17:50 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Roman Mashak, Jamal Hadi Salim,
	Cong Wang, David S. Miller

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Roman Mashak <mrv@mojatatu.com>


[ Upstream commit edb9d1bff4bbe19b8ae0e71b1f38732591a9eeb2 ]

When tc actions are loaded as a module and no actions have been installed,
flushing them would result in actions removed from the memory, but modules
reference count not being decremented, so that the modules would not be
unloaded.

Following is example with GACT action:

% sudo modprobe act_gact
% lsmod
Module                  Size  Used by
act_gact               16384  0
%
% sudo tc actions ls action gact
%
% sudo tc actions flush action gact
% lsmod
Module                  Size  Used by
act_gact               16384  1
% sudo tc actions flush action gact
% lsmod
Module                  Size  Used by
act_gact               16384  2
% sudo rmmod act_gact
rmmod: ERROR: Module act_gact is in use
....

After the fix:
% lsmod
Module                  Size  Used by
act_gact               16384  0
%
% sudo tc actions add action pass index 1
% sudo tc actions add action pass index 2
% sudo tc actions add action pass index 3
% lsmod
Module                  Size  Used by
act_gact               16384  3
%
% sudo tc actions flush action gact
% lsmod
Module                  Size  Used by
act_gact               16384  0
%
% sudo tc actions flush action gact
% lsmod
Module                  Size  Used by
act_gact               16384  0
% sudo rmmod act_gact
% lsmod
Module                  Size  Used by
%

Fixes: f97017cdefef ("net-sched: Fix actions flushing")
Signed-off-by: Roman Mashak <mrv@mojatatu.com>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/sched/act_api.c |    5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

--- a/net/sched/act_api.c
+++ b/net/sched/act_api.c
@@ -820,10 +820,8 @@ static int tca_action_flush(struct net *
 		goto out_module_put;
 
 	err = ops->walk(net, skb, &dcb, RTM_DELACTION, ops);
-	if (err < 0)
+	if (err <= 0)
 		goto out_module_put;
-	if (err == 0)
-		goto noflush_out;
 
 	nla_nest_end(skb, nest);
 
@@ -840,7 +838,6 @@ static int tca_action_flush(struct net *
 out_module_put:
 	module_put(ops->owner);
 err_out:
-noflush_out:
 	kfree_skb(skb);
 	return err;
 }

^ permalink raw reply	[flat|nested] 92+ messages in thread

* [PATCH 4.9 10/93] net: phy: Avoid deadlock during phy_error()
  2017-03-20 17:50 [PATCH 4.9 00/93] 4.9.17-stable review Greg Kroah-Hartman
                   ` (7 preceding siblings ...)
  2017-03-20 17:50 ` [PATCH 4.9 08/93] net sched actions: decrement module reference count after table flush Greg Kroah-Hartman
@ 2017-03-20 17:50 ` Greg Kroah-Hartman
  2017-03-20 17:50 ` [PATCH 4.9 11/93] vxlan: lock RCU on TX path Greg Kroah-Hartman
                   ` (80 subsequent siblings)
  89 siblings, 0 replies; 92+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-20 17:50 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Russell King, Florian Fainelli,
	David S. Miller

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Florian Fainelli <f.fainelli@gmail.com>


[ Upstream commit eab127717a6af54401ba534790c793ec143cd1fc ]

phy_error() is called in the PHY state machine workqueue context, and
calls phy_trigger_machine() which does a cancel_delayed_work_sync() of
the workqueue we execute from, causing a deadlock situation.

Augment phy_trigger_machine() machine with a sync boolean indicating
whether we should use cancel_*_sync() or just cancel_*_work().

Fixes: 3c293f4e08b5 ("net: phy: Trigger state machine on state change and not polling.")
Reported-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/net/phy/phy.c |   14 +++++++++-----
 1 file changed, 9 insertions(+), 5 deletions(-)

--- a/drivers/net/phy/phy.c
+++ b/drivers/net/phy/phy.c
@@ -611,14 +611,18 @@ void phy_start_machine(struct phy_device
  * phy_trigger_machine - trigger the state machine to run
  *
  * @phydev: the phy_device struct
+ * @sync: indicate whether we should wait for the workqueue cancelation
  *
  * Description: There has been a change in state which requires that the
  *   state machine runs.
  */
 
-static void phy_trigger_machine(struct phy_device *phydev)
+static void phy_trigger_machine(struct phy_device *phydev, bool sync)
 {
-	cancel_delayed_work_sync(&phydev->state_queue);
+	if (sync)
+		cancel_delayed_work_sync(&phydev->state_queue);
+	else
+		cancel_delayed_work(&phydev->state_queue);
 	queue_delayed_work(system_power_efficient_wq, &phydev->state_queue, 0);
 }
 
@@ -655,7 +659,7 @@ static void phy_error(struct phy_device
 	phydev->state = PHY_HALTED;
 	mutex_unlock(&phydev->lock);
 
-	phy_trigger_machine(phydev);
+	phy_trigger_machine(phydev, false);
 }
 
 /**
@@ -817,7 +821,7 @@ void phy_change(struct work_struct *work
 	}
 
 	/* reschedule state queue work to run as soon as possible */
-	phy_trigger_machine(phydev);
+	phy_trigger_machine(phydev, true);
 	return;
 
 ignore:
@@ -907,7 +911,7 @@ void phy_start(struct phy_device *phydev
 	if (do_resume)
 		phy_resume(phydev);
 
-	phy_trigger_machine(phydev);
+	phy_trigger_machine(phydev, true);
 }
 EXPORT_SYMBOL(phy_start);
 

^ permalink raw reply	[flat|nested] 92+ messages in thread

* [PATCH 4.9 11/93] vxlan: lock RCU on TX path
  2017-03-20 17:50 [PATCH 4.9 00/93] 4.9.17-stable review Greg Kroah-Hartman
                   ` (8 preceding siblings ...)
  2017-03-20 17:50 ` [PATCH 4.9 10/93] net: phy: Avoid deadlock during phy_error() Greg Kroah-Hartman
@ 2017-03-20 17:50 ` Greg Kroah-Hartman
  2017-03-20 17:50 ` [PATCH 4.9 12/93] geneve: " Greg Kroah-Hartman
                   ` (79 subsequent siblings)
  89 siblings, 0 replies; 92+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-20 17:50 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Jakub Kicinski, David S. Miller

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Jakub Kicinski <jakub.kicinski@netronome.com>


[ Upstream commit 56de859e9967c070464a9a9f4f18d73f9447298e ]

There is no guarantees that callers of the TX path will hold
the RCU lock.  Grab it explicitly.

Fixes: c6fcc4fc5f8b ("vxlan: avoid using stale vxlan socket.")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/net/vxlan.c |   13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)

--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -1955,6 +1955,7 @@ static void vxlan_xmit_one(struct sk_buf
 
 	info = skb_tunnel_info(skb);
 
+	rcu_read_lock();
 	if (rdst) {
 		dst_port = rdst->remote_port ? rdst->remote_port : vxlan->cfg.dst_port;
 		vni = rdst->remote_vni;
@@ -1985,7 +1986,7 @@ static void vxlan_xmit_one(struct sk_buf
 		if (did_rsc) {
 			/* short-circuited back to local bridge */
 			vxlan_encap_bypass(skb, vxlan, vxlan);
-			return;
+			goto out_unlock;
 		}
 		goto drop;
 	}
@@ -2054,7 +2055,7 @@ static void vxlan_xmit_one(struct sk_buf
 			if (!dst_vxlan)
 				goto tx_error;
 			vxlan_encap_bypass(skb, vxlan, dst_vxlan);
-			return;
+			goto out_unlock;
 		}
 
 		if (!info)
@@ -2115,7 +2116,7 @@ static void vxlan_xmit_one(struct sk_buf
 			if (!dst_vxlan)
 				goto tx_error;
 			vxlan_encap_bypass(skb, vxlan, dst_vxlan);
-			return;
+			goto out_unlock;
 		}
 
 		if (!info)
@@ -2129,7 +2130,7 @@ static void vxlan_xmit_one(struct sk_buf
 		if (err < 0) {
 			dst_release(ndst);
 			dev->stats.tx_errors++;
-			return;
+			goto out_unlock;
 		}
 		udp_tunnel6_xmit_skb(ndst, sk, skb, dev,
 				     &local_ip.sin6.sin6_addr,
@@ -2137,7 +2138,8 @@ static void vxlan_xmit_one(struct sk_buf
 				     label, src_port, dst_port, !udp_sum);
 #endif
 	}
-
+out_unlock:
+	rcu_read_unlock();
 	return;
 
 drop:
@@ -2153,6 +2155,7 @@ tx_error:
 	dev->stats.tx_errors++;
 tx_free:
 	dev_kfree_skb(skb);
+	rcu_read_unlock();
 }
 
 /* Transmit local packets over Vxlan

^ permalink raw reply	[flat|nested] 92+ messages in thread

* [PATCH 4.9 12/93] geneve: lock RCU on TX path
  2017-03-20 17:50 [PATCH 4.9 00/93] 4.9.17-stable review Greg Kroah-Hartman
                   ` (9 preceding siblings ...)
  2017-03-20 17:50 ` [PATCH 4.9 11/93] vxlan: lock RCU on TX path Greg Kroah-Hartman
@ 2017-03-20 17:50 ` Greg Kroah-Hartman
  2017-03-20 17:50 ` [PATCH 4.9 13/93] mlxsw: spectrum_router: Avoid potential packets loss Greg Kroah-Hartman
                   ` (78 subsequent siblings)
  89 siblings, 0 replies; 92+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-20 17:50 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Jakub Kicinski, David S. Miller

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Jakub Kicinski <jakub.kicinski@netronome.com>


[ Upstream commit a717e3f740803cc88bd5c9a70c93504f6a368663 ]

There is no guarantees that callers of the TX path will hold
the RCU lock.  Grab it explicitly.

Fixes: fceb9c3e3825 ("geneve: avoid using stale geneve socket.")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/net/geneve.c |   10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

--- a/drivers/net/geneve.c
+++ b/drivers/net/geneve.c
@@ -1039,16 +1039,22 @@ static netdev_tx_t geneve_xmit(struct sk
 {
 	struct geneve_dev *geneve = netdev_priv(dev);
 	struct ip_tunnel_info *info = NULL;
+	int err;
 
 	if (geneve->collect_md)
 		info = skb_tunnel_info(skb);
 
+	rcu_read_lock();
 #if IS_ENABLED(CONFIG_IPV6)
 	if ((info && ip_tunnel_info_af(info) == AF_INET6) ||
 	    (!info && geneve->remote.sa.sa_family == AF_INET6))
-		return geneve6_xmit_skb(skb, dev, info);
+		err = geneve6_xmit_skb(skb, dev, info);
+	else
 #endif
-	return geneve_xmit_skb(skb, dev, info);
+		err = geneve_xmit_skb(skb, dev, info);
+	rcu_read_unlock();
+
+	return err;
 }
 
 static int __geneve_change_mtu(struct net_device *dev, int new_mtu, bool strict)

^ permalink raw reply	[flat|nested] 92+ messages in thread

* [PATCH 4.9 13/93] mlxsw: spectrum_router: Avoid potential packets loss
  2017-03-20 17:50 [PATCH 4.9 00/93] 4.9.17-stable review Greg Kroah-Hartman
                   ` (10 preceding siblings ...)
  2017-03-20 17:50 ` [PATCH 4.9 12/93] geneve: " Greg Kroah-Hartman
@ 2017-03-20 17:50 ` Greg Kroah-Hartman
  2017-03-20 17:50 ` [PATCH 4.9 14/93] tcp/dccp: block BH for SYN processing Greg Kroah-Hartman
                   ` (77 subsequent siblings)
  89 siblings, 0 replies; 92+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-20 17:50 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Ido Schimmel, Jiri Pirko, David S. Miller

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Ido Schimmel <idosch@mellanox.com>


[ Upstream commit f7df4923fa986247e93ec2cdff5ca168fff14dcf ]

When the structure of the LPM tree changes (f.e., due to the addition of
a new prefix), we unbind the old tree and then bind the new one. This
may result in temporary packet loss.

Instead, overwrite the old binding with the new one.

Fixes: 6b75c4807db3 ("mlxsw: spectrum_router: Add virtual router management")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c |   30 ++++++++++++------
 1 file changed, 20 insertions(+), 10 deletions(-)

--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
@@ -500,30 +500,40 @@ static int
 mlxsw_sp_vr_lpm_tree_check(struct mlxsw_sp *mlxsw_sp, struct mlxsw_sp_vr *vr,
 			   struct mlxsw_sp_prefix_usage *req_prefix_usage)
 {
-	struct mlxsw_sp_lpm_tree *lpm_tree;
+	struct mlxsw_sp_lpm_tree *lpm_tree = vr->lpm_tree;
+	struct mlxsw_sp_lpm_tree *new_tree;
+	int err;
 
-	if (mlxsw_sp_prefix_usage_eq(req_prefix_usage,
-				     &vr->lpm_tree->prefix_usage))
+	if (mlxsw_sp_prefix_usage_eq(req_prefix_usage, &lpm_tree->prefix_usage))
 		return 0;
 
-	lpm_tree = mlxsw_sp_lpm_tree_get(mlxsw_sp, req_prefix_usage,
+	new_tree = mlxsw_sp_lpm_tree_get(mlxsw_sp, req_prefix_usage,
 					 vr->proto, false);
-	if (IS_ERR(lpm_tree)) {
+	if (IS_ERR(new_tree)) {
 		/* We failed to get a tree according to the required
 		 * prefix usage. However, the current tree might be still good
 		 * for us if our requirement is subset of the prefixes used
 		 * in the tree.
 		 */
 		if (mlxsw_sp_prefix_usage_subset(req_prefix_usage,
-						 &vr->lpm_tree->prefix_usage))
+						 &lpm_tree->prefix_usage))
 			return 0;
-		return PTR_ERR(lpm_tree);
+		return PTR_ERR(new_tree);
 	}
 
-	mlxsw_sp_vr_lpm_tree_unbind(mlxsw_sp, vr);
-	mlxsw_sp_lpm_tree_put(mlxsw_sp, vr->lpm_tree);
+	/* Prevent packet loss by overwriting existing binding */
+	vr->lpm_tree = new_tree;
+	err = mlxsw_sp_vr_lpm_tree_bind(mlxsw_sp, vr);
+	if (err)
+		goto err_tree_bind;
+	mlxsw_sp_lpm_tree_put(mlxsw_sp, lpm_tree);
+
+	return 0;
+
+err_tree_bind:
 	vr->lpm_tree = lpm_tree;
-	return mlxsw_sp_vr_lpm_tree_bind(mlxsw_sp, vr);
+	mlxsw_sp_lpm_tree_put(mlxsw_sp, new_tree);
+	return err;
 }
 
 static struct mlxsw_sp_vr *mlxsw_sp_vr_get(struct mlxsw_sp *mlxsw_sp,

^ permalink raw reply	[flat|nested] 92+ messages in thread

* [PATCH 4.9 14/93] tcp/dccp: block BH for SYN processing
  2017-03-20 17:50 [PATCH 4.9 00/93] 4.9.17-stable review Greg Kroah-Hartman
                   ` (11 preceding siblings ...)
  2017-03-20 17:50 ` [PATCH 4.9 13/93] mlxsw: spectrum_router: Avoid potential packets loss Greg Kroah-Hartman
@ 2017-03-20 17:50 ` Greg Kroah-Hartman
  2017-03-20 17:50 ` [PATCH 4.9 15/93] net: bridge: allow IPv6 when multicast flood is disabled Greg Kroah-Hartman
                   ` (76 subsequent siblings)
  89 siblings, 0 replies; 92+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-20 17:50 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Eric Dumazet, Andrey Konovalov,
	Soheil Hassas Yeganeh, David S. Miller

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Eric Dumazet <edumazet@google.com>


[ Upstream commit 449809a66c1d0b1563dee84493e14bf3104d2d7e ]

SYN processing really was meant to be handled from BH.

When I got rid of BH blocking while processing socket backlog
in commit 5413d1babe8f ("net: do not block BH while processing socket
backlog"), I forgot that a malicious user could transition to TCP_LISTEN
from a state that allowed (SYN) packets to be parked in the socket
backlog while socket is owned by the thread doing the listen() call.

Sure enough syzkaller found this and reported the bug ;)

=================================
[ INFO: inconsistent lock state ]
4.10.0+ #60 Not tainted
---------------------------------
inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage.
syz-executor0/5090 [HC0[0]:SC0[0]:HE1:SE1] takes:
 (&(&hashinfo->ehash_locks[i])->rlock){+.?...}, at:
[<ffffffff83a6a370>] spin_lock include/linux/spinlock.h:299 [inline]
 (&(&hashinfo->ehash_locks[i])->rlock){+.?...}, at:
[<ffffffff83a6a370>] inet_ehash_insert+0x240/0xad0
net/ipv4/inet_hashtables.c:407
{IN-SOFTIRQ-W} state was registered at:
  mark_irqflags kernel/locking/lockdep.c:2923 [inline]
  __lock_acquire+0xbcf/0x3270 kernel/locking/lockdep.c:3295
  lock_acquire+0x241/0x580 kernel/locking/lockdep.c:3753
  __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
  _raw_spin_lock+0x33/0x50 kernel/locking/spinlock.c:151
  spin_lock include/linux/spinlock.h:299 [inline]
  inet_ehash_insert+0x240/0xad0 net/ipv4/inet_hashtables.c:407
  reqsk_queue_hash_req net/ipv4/inet_connection_sock.c:753 [inline]
  inet_csk_reqsk_queue_hash_add+0x1b7/0x2a0 net/ipv4/inet_connection_sock.c:764
  tcp_conn_request+0x25cc/0x3310 net/ipv4/tcp_input.c:6399
  tcp_v4_conn_request+0x157/0x220 net/ipv4/tcp_ipv4.c:1262
  tcp_rcv_state_process+0x802/0x4130 net/ipv4/tcp_input.c:5889
  tcp_v4_do_rcv+0x56b/0x940 net/ipv4/tcp_ipv4.c:1433
  tcp_v4_rcv+0x2e12/0x3210 net/ipv4/tcp_ipv4.c:1711
  ip_local_deliver_finish+0x4ce/0xc40 net/ipv4/ip_input.c:216
  NF_HOOK include/linux/netfilter.h:257 [inline]
  ip_local_deliver+0x1ce/0x710 net/ipv4/ip_input.c:257
  dst_input include/net/dst.h:492 [inline]
  ip_rcv_finish+0xb1d/0x2110 net/ipv4/ip_input.c:396
  NF_HOOK include/linux/netfilter.h:257 [inline]
  ip_rcv+0xd90/0x19c0 net/ipv4/ip_input.c:487
  __netif_receive_skb_core+0x1ad1/0x3400 net/core/dev.c:4179
  __netif_receive_skb+0x2a/0x170 net/core/dev.c:4217
  netif_receive_skb_internal+0x1d6/0x430 net/core/dev.c:4245
  napi_skb_finish net/core/dev.c:4602 [inline]
  napi_gro_receive+0x4e6/0x680 net/core/dev.c:4636
  e1000_receive_skb drivers/net/ethernet/intel/e1000/e1000_main.c:4033 [inline]
  e1000_clean_rx_irq+0x5e0/0x1490
drivers/net/ethernet/intel/e1000/e1000_main.c:4489
  e1000_clean+0xb9a/0x2910 drivers/net/ethernet/intel/e1000/e1000_main.c:3834
  napi_poll net/core/dev.c:5171 [inline]
  net_rx_action+0xe70/0x1900 net/core/dev.c:5236
  __do_softirq+0x2fb/0xb7d kernel/softirq.c:284
  invoke_softirq kernel/softirq.c:364 [inline]
  irq_exit+0x19e/0x1d0 kernel/softirq.c:405
  exiting_irq arch/x86/include/asm/apic.h:658 [inline]
  do_IRQ+0x81/0x1a0 arch/x86/kernel/irq.c:250
  ret_from_intr+0x0/0x20
  native_safe_halt+0x6/0x10 arch/x86/include/asm/irqflags.h:53
  arch_safe_halt arch/x86/include/asm/paravirt.h:98 [inline]
  default_idle+0x8f/0x410 arch/x86/kernel/process.c:271
  arch_cpu_idle+0xa/0x10 arch/x86/kernel/process.c:262
  default_idle_call+0x36/0x60 kernel/sched/idle.c:96
  cpuidle_idle_call kernel/sched/idle.c:154 [inline]
  do_idle+0x348/0x440 kernel/sched/idle.c:243
  cpu_startup_entry+0x18/0x20 kernel/sched/idle.c:345
  start_secondary+0x344/0x440 arch/x86/kernel/smpboot.c:272
  verify_cpu+0x0/0xfc
irq event stamp: 1741
hardirqs last  enabled at (1741): [<ffffffff84d49d77>]
__raw_spin_unlock_irqrestore include/linux/spinlock_api_smp.h:160
[inline]
hardirqs last  enabled at (1741): [<ffffffff84d49d77>]
_raw_spin_unlock_irqrestore+0xf7/0x1a0 kernel/locking/spinlock.c:191
hardirqs last disabled at (1740): [<ffffffff84d4a732>]
__raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:108 [inline]
hardirqs last disabled at (1740): [<ffffffff84d4a732>]
_raw_spin_lock_irqsave+0xa2/0x110 kernel/locking/spinlock.c:159
softirqs last  enabled at (1738): [<ffffffff84d4deff>]
__do_softirq+0x7cf/0xb7d kernel/softirq.c:310
softirqs last disabled at (1571): [<ffffffff84d4b92c>]
do_softirq_own_stack+0x1c/0x30 arch/x86/entry/entry_64.S:902

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock(&(&hashinfo->ehash_locks[i])->rlock);
  <Interrupt>
    lock(&(&hashinfo->ehash_locks[i])->rlock);

 *** DEADLOCK ***

1 lock held by syz-executor0/5090:
 #0:  (sk_lock-AF_INET6){+.+.+.}, at: [<ffffffff83406b43>] lock_sock
include/net/sock.h:1460 [inline]
 #0:  (sk_lock-AF_INET6){+.+.+.}, at: [<ffffffff83406b43>]
sock_setsockopt+0x233/0x1e40 net/core/sock.c:683

stack backtrace:
CPU: 1 PID: 5090 Comm: syz-executor0 Not tainted 4.10.0+ #60
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:15 [inline]
 dump_stack+0x292/0x398 lib/dump_stack.c:51
 print_usage_bug+0x3ef/0x450 kernel/locking/lockdep.c:2387
 valid_state kernel/locking/lockdep.c:2400 [inline]
 mark_lock_irq kernel/locking/lockdep.c:2602 [inline]
 mark_lock+0xf30/0x1410 kernel/locking/lockdep.c:3065
 mark_irqflags kernel/locking/lockdep.c:2941 [inline]
 __lock_acquire+0x6dc/0x3270 kernel/locking/lockdep.c:3295
 lock_acquire+0x241/0x580 kernel/locking/lockdep.c:3753
 __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
 _raw_spin_lock+0x33/0x50 kernel/locking/spinlock.c:151
 spin_lock include/linux/spinlock.h:299 [inline]
 inet_ehash_insert+0x240/0xad0 net/ipv4/inet_hashtables.c:407
 reqsk_queue_hash_req net/ipv4/inet_connection_sock.c:753 [inline]
 inet_csk_reqsk_queue_hash_add+0x1b7/0x2a0 net/ipv4/inet_connection_sock.c:764
 dccp_v6_conn_request+0xada/0x11b0 net/dccp/ipv6.c:380
 dccp_rcv_state_process+0x51e/0x1660 net/dccp/input.c:606
 dccp_v6_do_rcv+0x213/0x350 net/dccp/ipv6.c:632
 sk_backlog_rcv include/net/sock.h:896 [inline]
 __release_sock+0x127/0x3a0 net/core/sock.c:2052
 release_sock+0xa5/0x2b0 net/core/sock.c:2539
 sock_setsockopt+0x60f/0x1e40 net/core/sock.c:1016
 SYSC_setsockopt net/socket.c:1782 [inline]
 SyS_setsockopt+0x2fb/0x3a0 net/socket.c:1765
 entry_SYSCALL_64_fastpath+0x1f/0xc2
RIP: 0033:0x4458b9
RSP: 002b:00007fe8b26c2b58 EFLAGS: 00000292 ORIG_RAX: 0000000000000036
RAX: ffffffffffffffda RBX: 0000000000000006 RCX: 00000000004458b9
RDX: 000000000000001a RSI: 0000000000000001 RDI: 0000000000000006
RBP: 00000000006e2110 R08: 0000000000000010 R09: 0000000000000000
R10: 00000000208c3000 R11: 0000000000000292 R12: 0000000000708000
R13: 0000000020000000 R14: 0000000000001000 R15: 0000000000000000

Fixes: 5413d1babe8f ("net: do not block BH while processing socket backlog")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/dccp/input.c     |   10 ++++++++--
 net/ipv4/tcp_input.c |   10 ++++++++--
 2 files changed, 16 insertions(+), 4 deletions(-)

--- a/net/dccp/input.c
+++ b/net/dccp/input.c
@@ -577,6 +577,7 @@ int dccp_rcv_state_process(struct sock *
 	struct dccp_sock *dp = dccp_sk(sk);
 	struct dccp_skb_cb *dcb = DCCP_SKB_CB(skb);
 	const int old_state = sk->sk_state;
+	bool acceptable;
 	int queued = 0;
 
 	/*
@@ -603,8 +604,13 @@ int dccp_rcv_state_process(struct sock *
 	 */
 	if (sk->sk_state == DCCP_LISTEN) {
 		if (dh->dccph_type == DCCP_PKT_REQUEST) {
-			if (inet_csk(sk)->icsk_af_ops->conn_request(sk,
-								    skb) < 0)
+			/* It is possible that we process SYN packets from backlog,
+			 * so we need to make sure to disable BH right there.
+			 */
+			local_bh_disable();
+			acceptable = inet_csk(sk)->icsk_af_ops->conn_request(sk, skb) >= 0;
+			local_bh_enable();
+			if (!acceptable)
 				return 1;
 			consume_skb(skb);
 			return 0;
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -5916,9 +5916,15 @@ int tcp_rcv_state_process(struct sock *s
 		if (th->syn) {
 			if (th->fin)
 				goto discard;
-			if (icsk->icsk_af_ops->conn_request(sk, skb) < 0)
-				return 1;
+			/* It is possible that we process SYN packets from backlog,
+			 * so we need to make sure to disable BH right there.
+			 */
+			local_bh_disable();
+			acceptable = icsk->icsk_af_ops->conn_request(sk, skb) >= 0;
+			local_bh_enable();
 
+			if (!acceptable)
+				return 1;
 			consume_skb(skb);
 			return 0;
 		}

^ permalink raw reply	[flat|nested] 92+ messages in thread

* [PATCH 4.9 15/93] net: bridge: allow IPv6 when multicast flood is disabled
  2017-03-20 17:50 [PATCH 4.9 00/93] 4.9.17-stable review Greg Kroah-Hartman
                   ` (12 preceding siblings ...)
  2017-03-20 17:50 ` [PATCH 4.9 14/93] tcp/dccp: block BH for SYN processing Greg Kroah-Hartman
@ 2017-03-20 17:50 ` Greg Kroah-Hartman
  2017-03-20 17:50 ` [PATCH 4.9 16/93] net: dont call strlen() on the user buffer in packet_bind_spkt() Greg Kroah-Hartman
                   ` (75 subsequent siblings)
  89 siblings, 0 replies; 92+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-20 17:50 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Nikolay Aleksandrov, Mike Manning,
	David S. Miller

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Mike Manning <mmanning@brocade.com>


[ Upstream commit 8953de2f02ad7b15e4964c82f9afd60f128e4e98 ]

Even with multicast flooding turned off, IPv6 ND should still work so
that IPv6 connectivity is provided. Allow this by continuing to flood
multicast traffic originated by us.

Fixes: b6cb5ac8331b ("net: bridge: add per-port multicast flood flag")
Cc: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: Mike Manning <mmanning@brocade.com>
Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/bridge/br_forward.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- a/net/bridge/br_forward.c
+++ b/net/bridge/br_forward.c
@@ -186,8 +186,9 @@ void br_flood(struct net_bridge *br, str
 		/* Do not flood unicast traffic to ports that turn it off */
 		if (pkt_type == BR_PKT_UNICAST && !(p->flags & BR_FLOOD))
 			continue;
+		/* Do not flood if mc off, except for traffic we originate */
 		if (pkt_type == BR_PKT_MULTICAST &&
-		    !(p->flags & BR_MCAST_FLOOD))
+		    !(p->flags & BR_MCAST_FLOOD) && skb->dev != br->dev)
 			continue;
 
 		/* Do not flood to ports that enable proxy ARP */

^ permalink raw reply	[flat|nested] 92+ messages in thread

* [PATCH 4.9 16/93] net: dont call strlen() on the user buffer in packet_bind_spkt()
  2017-03-20 17:50 [PATCH 4.9 00/93] 4.9.17-stable review Greg Kroah-Hartman
                   ` (13 preceding siblings ...)
  2017-03-20 17:50 ` [PATCH 4.9 15/93] net: bridge: allow IPv6 when multicast flood is disabled Greg Kroah-Hartman
@ 2017-03-20 17:50 ` Greg Kroah-Hartman
  2017-03-20 17:50 ` [PATCH 4.9 17/93] net: net_enable_timestamp() can be called from irq contexts Greg Kroah-Hartman
                   ` (74 subsequent siblings)
  89 siblings, 0 replies; 92+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-20 17:50 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Eric Dumazet, Alexander Potapenko,
	David S. Miller

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Alexander Potapenko <glider@google.com>


[ Upstream commit 540e2894f7905538740aaf122bd8e0548e1c34a4 ]

KMSAN (KernelMemorySanitizer, a new error detection tool) reports use of
uninitialized memory in packet_bind_spkt():
Acked-by: Eric Dumazet <edumazet@google.com>

==================================================================
BUG: KMSAN: use of unitialized memory
CPU: 0 PID: 1074 Comm: packet Not tainted 4.8.0-rc6+ #1891
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs
01/01/2011
 0000000000000000 ffff88006b6dfc08 ffffffff82559ae8 ffff88006b6dfb48
 ffffffff818a7c91 ffffffff85b9c870 0000000000000092 ffffffff85b9c550
 0000000000000000 0000000000000092 00000000ec400911 0000000000000002
Call Trace:
 [<     inline     >] __dump_stack lib/dump_stack.c:15
 [<ffffffff82559ae8>] dump_stack+0x238/0x290 lib/dump_stack.c:51
 [<ffffffff818a6626>] kmsan_report+0x276/0x2e0 mm/kmsan/kmsan.c:1003
 [<ffffffff818a783b>] __msan_warning+0x5b/0xb0
mm/kmsan/kmsan_instr.c:424
 [<     inline     >] strlen lib/string.c:484
 [<ffffffff8259b58d>] strlcpy+0x9d/0x200 lib/string.c:144
 [<ffffffff84b2eca4>] packet_bind_spkt+0x144/0x230
net/packet/af_packet.c:3132
 [<ffffffff84242e4d>] SYSC_bind+0x40d/0x5f0 net/socket.c:1370
 [<ffffffff84242a22>] SyS_bind+0x82/0xa0 net/socket.c:1356
 [<ffffffff8515991b>] entry_SYSCALL_64_fastpath+0x13/0x8f
arch/x86/entry/entry_64.o:?
chained origin: 00000000eba00911
 [<ffffffff810bb787>] save_stack_trace+0x27/0x50
arch/x86/kernel/stacktrace.c:67
 [<     inline     >] kmsan_save_stack_with_flags mm/kmsan/kmsan.c:322
 [<     inline     >] kmsan_save_stack mm/kmsan/kmsan.c:334
 [<ffffffff818a59f8>] kmsan_internal_chain_origin+0x118/0x1e0
mm/kmsan/kmsan.c:527
 [<ffffffff818a7773>] __msan_set_alloca_origin4+0xc3/0x130
mm/kmsan/kmsan_instr.c:380
 [<ffffffff84242b69>] SYSC_bind+0x129/0x5f0 net/socket.c:1356
 [<ffffffff84242a22>] SyS_bind+0x82/0xa0 net/socket.c:1356
 [<ffffffff8515991b>] entry_SYSCALL_64_fastpath+0x13/0x8f
arch/x86/entry/entry_64.o:?
origin description: ----address@SYSC_bind (origin=00000000eb400911)
==================================================================
(the line numbers are relative to 4.8-rc6, but the bug persists
upstream)

, when I run the following program as root:

=====================================
 #include <string.h>
 #include <sys/socket.h>
 #include <netpacket/packet.h>
 #include <net/ethernet.h>

 int main() {
   struct sockaddr addr;
   memset(&addr, 0xff, sizeof(addr));
   addr.sa_family = AF_PACKET;
   int fd = socket(PF_PACKET, SOCK_PACKET, htons(ETH_P_ALL));
   bind(fd, &addr, sizeof(addr));
   return 0;
 }
=====================================

This happens because addr.sa_data copied from the userspace is not
zero-terminated, and copying it with strlcpy() in packet_bind_spkt()
results in calling strlen() on the kernel copy of that non-terminated
buffer.

Signed-off-by: Alexander Potapenko <glider@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/packet/af_packet.c |    8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -3140,7 +3140,7 @@ static int packet_bind_spkt(struct socke
 			    int addr_len)
 {
 	struct sock *sk = sock->sk;
-	char name[15];
+	char name[sizeof(uaddr->sa_data) + 1];
 
 	/*
 	 *	Check legality
@@ -3148,7 +3148,11 @@ static int packet_bind_spkt(struct socke
 
 	if (addr_len != sizeof(struct sockaddr))
 		return -EINVAL;
-	strlcpy(name, uaddr->sa_data, sizeof(name));
+	/* uaddr->sa_data comes from the userspace, it's not guaranteed to be
+	 * zero-terminated.
+	 */
+	memcpy(name, uaddr->sa_data, sizeof(uaddr->sa_data));
+	name[sizeof(uaddr->sa_data)] = 0;
 
 	return packet_do_bind(sk, name, 0, pkt_sk(sk)->num);
 }

^ permalink raw reply	[flat|nested] 92+ messages in thread

* [PATCH 4.9 17/93] net: net_enable_timestamp() can be called from irq contexts
  2017-03-20 17:50 [PATCH 4.9 00/93] 4.9.17-stable review Greg Kroah-Hartman
                   ` (14 preceding siblings ...)
  2017-03-20 17:50 ` [PATCH 4.9 16/93] net: dont call strlen() on the user buffer in packet_bind_spkt() Greg Kroah-Hartman
@ 2017-03-20 17:50 ` Greg Kroah-Hartman
  2017-03-20 17:50 ` [PATCH 4.9 18/93] ipv6: orphan skbs in reassembly unit Greg Kroah-Hartman
                   ` (73 subsequent siblings)
  89 siblings, 0 replies; 92+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-20 17:50 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Eric Dumazet, Dmitry Vyukov, David S. Miller

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Eric Dumazet <edumazet@google.com>


[ Upstream commit 13baa00ad01bb3a9f893e3a08cbc2d072fc0c15d ]

It is now very clear that silly TCP listeners might play with
enabling/disabling timestamping while new children are added
to their accept queue.

Meaning net_enable_timestamp() can be called from BH context
while current state of the static key is not enabled.

Lets play safe and allow all contexts.

The work queue is scheduled only under the problematic cases,
which are the static key enable/disable transition, to not slow down
critical paths.

This extends and improves what we did in commit 5fa8bbda38c6 ("net: use
a work queue to defer net_disable_timestamp() work")

Fixes: b90e5794c5bd ("net: dont call jump_label_dec from irq context")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/core/dev.c |   35 +++++++++++++++++++++++++++++++----
 1 file changed, 31 insertions(+), 4 deletions(-)

--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -1697,27 +1697,54 @@ EXPORT_SYMBOL_GPL(net_dec_egress_queue);
 static struct static_key netstamp_needed __read_mostly;
 #ifdef HAVE_JUMP_LABEL
 static atomic_t netstamp_needed_deferred;
+static atomic_t netstamp_wanted;
 static void netstamp_clear(struct work_struct *work)
 {
 	int deferred = atomic_xchg(&netstamp_needed_deferred, 0);
+	int wanted;
 
-	while (deferred--)
-		static_key_slow_dec(&netstamp_needed);
+	wanted = atomic_add_return(deferred, &netstamp_wanted);
+	if (wanted > 0)
+		static_key_enable(&netstamp_needed);
+	else
+		static_key_disable(&netstamp_needed);
 }
 static DECLARE_WORK(netstamp_work, netstamp_clear);
 #endif
 
 void net_enable_timestamp(void)
 {
+#ifdef HAVE_JUMP_LABEL
+	int wanted;
+
+	while (1) {
+		wanted = atomic_read(&netstamp_wanted);
+		if (wanted <= 0)
+			break;
+		if (atomic_cmpxchg(&netstamp_wanted, wanted, wanted + 1) == wanted)
+			return;
+	}
+	atomic_inc(&netstamp_needed_deferred);
+	schedule_work(&netstamp_work);
+#else
 	static_key_slow_inc(&netstamp_needed);
+#endif
 }
 EXPORT_SYMBOL(net_enable_timestamp);
 
 void net_disable_timestamp(void)
 {
 #ifdef HAVE_JUMP_LABEL
-	/* net_disable_timestamp() can be called from non process context */
-	atomic_inc(&netstamp_needed_deferred);
+	int wanted;
+
+	while (1) {
+		wanted = atomic_read(&netstamp_wanted);
+		if (wanted <= 1)
+			break;
+		if (atomic_cmpxchg(&netstamp_wanted, wanted, wanted - 1) == wanted)
+			return;
+	}
+	atomic_dec(&netstamp_needed_deferred);
 	schedule_work(&netstamp_work);
 #else
 	static_key_slow_dec(&netstamp_needed);

^ permalink raw reply	[flat|nested] 92+ messages in thread

* [PATCH 4.9 18/93] ipv6: orphan skbs in reassembly unit
  2017-03-20 17:50 [PATCH 4.9 00/93] 4.9.17-stable review Greg Kroah-Hartman
                   ` (15 preceding siblings ...)
  2017-03-20 17:50 ` [PATCH 4.9 17/93] net: net_enable_timestamp() can be called from irq contexts Greg Kroah-Hartman
@ 2017-03-20 17:50 ` Greg Kroah-Hartman
  2017-03-20 17:50 ` [PATCH 4.9 19/93] dccp: Unlock sock before calling sk_free() Greg Kroah-Hartman
                   ` (72 subsequent siblings)
  89 siblings, 0 replies; 92+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-20 17:50 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Joe Stringer, Andrey Konovalov,
	Eric Dumazet, David S. Miller

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Eric Dumazet <edumazet@google.com>


[ Upstream commit 48cac18ecf1de82f76259a54402c3adb7839ad01 ]

Andrey reported a use-after-free in IPv6 stack.

Issue here is that we free the socket while it still has skb
in TX path and in some queues.

It happens here because IPv6 reassembly unit messes skb->truesize,
breaking skb_set_owner_w() badly.

We fixed a similar issue for IPV4 in commit 8282f27449bf ("inet: frag:
Always orphan skbs inside ip_defrag()")
Acked-by: Joe Stringer <joe@ovn.org>

==================================================================
BUG: KASAN: use-after-free in sock_wfree+0x118/0x120
Read of size 8 at addr ffff880062da0060 by task a.out/4140

page:ffffea00018b6800 count:1 mapcount:0 mapping:          (null)
index:0x0 compound_mapcount: 0
flags: 0x100000000008100(slab|head)
raw: 0100000000008100 0000000000000000 0000000000000000 0000000180130013
raw: dead000000000100 dead000000000200 ffff88006741f140 0000000000000000
page dumped because: kasan: bad access detected

CPU: 0 PID: 4140 Comm: a.out Not tainted 4.10.0-rc3+ #59
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:15
 dump_stack+0x292/0x398 lib/dump_stack.c:51
 describe_address mm/kasan/report.c:262
 kasan_report_error+0x121/0x560 mm/kasan/report.c:370
 kasan_report mm/kasan/report.c:392
 __asan_report_load8_noabort+0x3e/0x40 mm/kasan/report.c:413
 sock_flag ./arch/x86/include/asm/bitops.h:324
 sock_wfree+0x118/0x120 net/core/sock.c:1631
 skb_release_head_state+0xfc/0x250 net/core/skbuff.c:655
 skb_release_all+0x15/0x60 net/core/skbuff.c:668
 __kfree_skb+0x15/0x20 net/core/skbuff.c:684
 kfree_skb+0x16e/0x4e0 net/core/skbuff.c:705
 inet_frag_destroy+0x121/0x290 net/ipv4/inet_fragment.c:304
 inet_frag_put ./include/net/inet_frag.h:133
 nf_ct_frag6_gather+0x1125/0x38b0 net/ipv6/netfilter/nf_conntrack_reasm.c:617
 ipv6_defrag+0x21b/0x350 net/ipv6/netfilter/nf_defrag_ipv6_hooks.c:68
 nf_hook_entry_hookfn ./include/linux/netfilter.h:102
 nf_hook_slow+0xc3/0x290 net/netfilter/core.c:310
 nf_hook ./include/linux/netfilter.h:212
 __ip6_local_out+0x52c/0xaf0 net/ipv6/output_core.c:160
 ip6_local_out+0x2d/0x170 net/ipv6/output_core.c:170
 ip6_send_skb+0xa1/0x340 net/ipv6/ip6_output.c:1722
 ip6_push_pending_frames+0xb3/0xe0 net/ipv6/ip6_output.c:1742
 rawv6_push_pending_frames net/ipv6/raw.c:613
 rawv6_sendmsg+0x2cff/0x4130 net/ipv6/raw.c:927
 inet_sendmsg+0x164/0x5b0 net/ipv4/af_inet.c:744
 sock_sendmsg_nosec net/socket.c:635
 sock_sendmsg+0xca/0x110 net/socket.c:645
 sock_write_iter+0x326/0x620 net/socket.c:848
 new_sync_write fs/read_write.c:499
 __vfs_write+0x483/0x760 fs/read_write.c:512
 vfs_write+0x187/0x530 fs/read_write.c:560
 SYSC_write fs/read_write.c:607
 SyS_write+0xfb/0x230 fs/read_write.c:599
 entry_SYSCALL_64_fastpath+0x1f/0xc2 arch/x86/entry/entry_64.S:203
RIP: 0033:0x7ff26e6f5b79
RSP: 002b:00007ff268e0ed98 EFLAGS: 00000206 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 00007ff268e0f9c0 RCX: 00007ff26e6f5b79
RDX: 0000000000000010 RSI: 0000000020f50fe1 RDI: 0000000000000003
RBP: 00007ff26ebc1220 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000206 R12: 0000000000000000
R13: 00007ff268e0f9c0 R14: 00007ff26efec040 R15: 0000000000000003

The buggy address belongs to the object at ffff880062da0000
 which belongs to the cache RAWv6 of size 1504
The buggy address ffff880062da0060 is located 96 bytes inside
 of 1504-byte region [ffff880062da0000, ffff880062da05e0)

Freed by task 4113:
 save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57
 save_stack+0x43/0xd0 mm/kasan/kasan.c:502
 set_track mm/kasan/kasan.c:514
 kasan_slab_free+0x73/0xc0 mm/kasan/kasan.c:578
 slab_free_hook mm/slub.c:1352
 slab_free_freelist_hook mm/slub.c:1374
 slab_free mm/slub.c:2951
 kmem_cache_free+0xb2/0x2c0 mm/slub.c:2973
 sk_prot_free net/core/sock.c:1377
 __sk_destruct+0x49c/0x6e0 net/core/sock.c:1452
 sk_destruct+0x47/0x80 net/core/sock.c:1460
 __sk_free+0x57/0x230 net/core/sock.c:1468
 sk_free+0x23/0x30 net/core/sock.c:1479
 sock_put ./include/net/sock.h:1638
 sk_common_release+0x31e/0x4e0 net/core/sock.c:2782
 rawv6_close+0x54/0x80 net/ipv6/raw.c:1214
 inet_release+0xed/0x1c0 net/ipv4/af_inet.c:425
 inet6_release+0x50/0x70 net/ipv6/af_inet6.c:431
 sock_release+0x8d/0x1e0 net/socket.c:599
 sock_close+0x16/0x20 net/socket.c:1063
 __fput+0x332/0x7f0 fs/file_table.c:208
 ____fput+0x15/0x20 fs/file_table.c:244
 task_work_run+0x19b/0x270 kernel/task_work.c:116
 exit_task_work ./include/linux/task_work.h:21
 do_exit+0x186b/0x2800 kernel/exit.c:839
 do_group_exit+0x149/0x420 kernel/exit.c:943
 SYSC_exit_group kernel/exit.c:954
 SyS_exit_group+0x1d/0x20 kernel/exit.c:952
 entry_SYSCALL_64_fastpath+0x1f/0xc2 arch/x86/entry/entry_64.S:203

Allocated by task 4115:
 save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57
 save_stack+0x43/0xd0 mm/kasan/kasan.c:502
 set_track mm/kasan/kasan.c:514
 kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:605
 kasan_slab_alloc+0x12/0x20 mm/kasan/kasan.c:544
 slab_post_alloc_hook mm/slab.h:432
 slab_alloc_node mm/slub.c:2708
 slab_alloc mm/slub.c:2716
 kmem_cache_alloc+0x1af/0x250 mm/slub.c:2721
 sk_prot_alloc+0x65/0x2a0 net/core/sock.c:1334
 sk_alloc+0x105/0x1010 net/core/sock.c:1396
 inet6_create+0x44d/0x1150 net/ipv6/af_inet6.c:183
 __sock_create+0x4f6/0x880 net/socket.c:1199
 sock_create net/socket.c:1239
 SYSC_socket net/socket.c:1269
 SyS_socket+0xf9/0x230 net/socket.c:1249
 entry_SYSCALL_64_fastpath+0x1f/0xc2 arch/x86/entry/entry_64.S:203

Memory state around the buggy address:
 ffff880062d9ff00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 ffff880062d9ff80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>ffff880062da0000: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                                                       ^
 ffff880062da0080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff880062da0100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==================================================================

Reported-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/ipv6/netfilter/nf_conntrack_reasm.c |    1 +
 net/openvswitch/conntrack.c             |    1 -
 2 files changed, 1 insertion(+), 1 deletion(-)

--- a/net/ipv6/netfilter/nf_conntrack_reasm.c
+++ b/net/ipv6/netfilter/nf_conntrack_reasm.c
@@ -589,6 +589,7 @@ int nf_ct_frag6_gather(struct net *net,
 	hdr = ipv6_hdr(skb);
 	fhdr = (struct frag_hdr *)skb_transport_header(skb);
 
+	skb_orphan(skb);
 	fq = fq_find(net, fhdr->identification, user, &hdr->saddr, &hdr->daddr,
 		     skb->dev ? skb->dev->ifindex : 0, ip6_frag_ecn(hdr));
 	if (fq == NULL) {
--- a/net/openvswitch/conntrack.c
+++ b/net/openvswitch/conntrack.c
@@ -367,7 +367,6 @@ static int handle_fragments(struct net *
 	} else if (key->eth.type == htons(ETH_P_IPV6)) {
 		enum ip6_defrag_users user = IP6_DEFRAG_CONNTRACK_IN + zone;
 
-		skb_orphan(skb);
 		memset(IP6CB(skb), 0, sizeof(struct inet6_skb_parm));
 		err = nf_ct_frag6_gather(net, skb, user);
 		if (err) {

^ permalink raw reply	[flat|nested] 92+ messages in thread

* [PATCH 4.9 19/93] dccp: Unlock sock before calling sk_free()
  2017-03-20 17:50 [PATCH 4.9 00/93] 4.9.17-stable review Greg Kroah-Hartman
                   ` (16 preceding siblings ...)
  2017-03-20 17:50 ` [PATCH 4.9 18/93] ipv6: orphan skbs in reassembly unit Greg Kroah-Hartman
@ 2017-03-20 17:50 ` Greg Kroah-Hartman
  2017-03-20 17:50 ` [PATCH 4.9 20/93] strparser: destroy workqueue on module exit Greg Kroah-Hartman
                   ` (71 subsequent siblings)
  89 siblings, 0 replies; 92+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-20 17:50 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Dmitry Vyukov, Cong Wang,
	Eric Dumazet, Gerrit Renker, Thomas Gleixner,
	Arnaldo Carvalho de Melo, David S. Miller

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Arnaldo Carvalho de Melo <acme@redhat.com>


[ Upstream commit d5afb6f9b6bb2c57bd0c05e76e12489dc0d037d9 ]

The code where sk_clone() came from created a new socket and locked it,
but then, on the error path didn't unlock it.

This problem stayed there for a long while, till b0691c8ee7c2 ("net:
Unlock sock before calling sk_free()") fixed it, but unfortunately the
callers of sk_clone() (now sk_clone_locked()) were not audited and the
one in dccp_create_openreq_child() remained.

Now in the age of the syskaller fuzzer, this was finally uncovered, as
reported by Dmitry:

 ---- 8< ----

I've got the following report while running syzkaller fuzzer on
86292b33d4b7 ("Merge branch 'akpm' (patches from Andrew)")

  [ BUG: held lock freed! ]
  4.10.0+ #234 Not tainted
  -------------------------
  syz-executor6/6898 is freeing memory
  ffff88006286cac0-ffff88006286d3b7, with a lock still held there!
   (slock-AF_INET6){+.-...}, at: [<ffffffff8362c2c9>] spin_lock
  include/linux/spinlock.h:299 [inline]
   (slock-AF_INET6){+.-...}, at: [<ffffffff8362c2c9>]
  sk_clone_lock+0x3d9/0x12c0 net/core/sock.c:1504
  5 locks held by syz-executor6/6898:
   #0:  (sk_lock-AF_INET6){+.+.+.}, at: [<ffffffff839a34b4>] lock_sock
  include/net/sock.h:1460 [inline]
   #0:  (sk_lock-AF_INET6){+.+.+.}, at: [<ffffffff839a34b4>]
  inet_stream_connect+0x44/0xa0 net/ipv4/af_inet.c:681
   #1:  (rcu_read_lock){......}, at: [<ffffffff83bc1c2a>]
  inet6_csk_xmit+0x12a/0x5d0 net/ipv6/inet6_connection_sock.c:126
   #2:  (rcu_read_lock){......}, at: [<ffffffff8369b424>] __skb_unlink
  include/linux/skbuff.h:1767 [inline]
   #2:  (rcu_read_lock){......}, at: [<ffffffff8369b424>] __skb_dequeue
  include/linux/skbuff.h:1783 [inline]
   #2:  (rcu_read_lock){......}, at: [<ffffffff8369b424>]
  process_backlog+0x264/0x730 net/core/dev.c:4835
   #3:  (rcu_read_lock){......}, at: [<ffffffff83aeb5c0>]
  ip6_input_finish+0x0/0x1700 net/ipv6/ip6_input.c:59
   #4:  (slock-AF_INET6){+.-...}, at: [<ffffffff8362c2c9>] spin_lock
  include/linux/spinlock.h:299 [inline]
   #4:  (slock-AF_INET6){+.-...}, at: [<ffffffff8362c2c9>]
  sk_clone_lock+0x3d9/0x12c0 net/core/sock.c:1504

Fix it just like was done by b0691c8ee7c2 ("net: Unlock sock before calling
sk_free()").

Reported-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20170301153510.GE15145@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/dccp/minisocks.c |    1 +
 1 file changed, 1 insertion(+)

--- a/net/dccp/minisocks.c
+++ b/net/dccp/minisocks.c
@@ -122,6 +122,7 @@ struct sock *dccp_create_openreq_child(c
 			/* It is still raw copy of parent, so invalidate
 			 * destructor and make plain sk_free() */
 			newsk->sk_destruct = NULL;
+			bh_unlock_sock(newsk);
 			sk_free(newsk);
 			return NULL;
 		}

^ permalink raw reply	[flat|nested] 92+ messages in thread

* [PATCH 4.9 20/93] strparser: destroy workqueue on module exit
  2017-03-20 17:50 [PATCH 4.9 00/93] 4.9.17-stable review Greg Kroah-Hartman
                   ` (17 preceding siblings ...)
  2017-03-20 17:50 ` [PATCH 4.9 19/93] dccp: Unlock sock before calling sk_free() Greg Kroah-Hartman
@ 2017-03-20 17:50 ` Greg Kroah-Hartman
  2017-03-20 17:50 ` [PATCH 4.9 21/93] tcp: fix various issues for sockets morphing to listen state Greg Kroah-Hartman
                   ` (70 subsequent siblings)
  89 siblings, 0 replies; 92+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-20 17:50 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Tom Herbert, Cong Wang, David S. Miller

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: WANG Cong <xiyou.wangcong@gmail.com>


[ Upstream commit f78ef7cd9a0686b979679d0de061c6dbfd8d649e ]

Fixes: 43a0c6751a32 ("strparser: Stream parser for messages")
Cc: Tom Herbert <tom@herbertland.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/strparser/strparser.c |    1 +
 1 file changed, 1 insertion(+)

--- a/net/strparser/strparser.c
+++ b/net/strparser/strparser.c
@@ -504,6 +504,7 @@ static int __init strp_mod_init(void)
 
 static void __exit strp_mod_exit(void)
 {
+	destroy_workqueue(strp_wq);
 }
 module_init(strp_mod_init);
 module_exit(strp_mod_exit);

^ permalink raw reply	[flat|nested] 92+ messages in thread

* [PATCH 4.9 21/93] tcp: fix various issues for sockets morphing to listen state
  2017-03-20 17:50 [PATCH 4.9 00/93] 4.9.17-stable review Greg Kroah-Hartman
                   ` (18 preceding siblings ...)
  2017-03-20 17:50 ` [PATCH 4.9 20/93] strparser: destroy workqueue on module exit Greg Kroah-Hartman
@ 2017-03-20 17:50 ` Greg Kroah-Hartman
  2017-03-20 17:50 ` [PATCH 4.9 22/93] net: fix socket refcounting in skb_complete_wifi_ack() Greg Kroah-Hartman
                   ` (69 subsequent siblings)
  89 siblings, 0 replies; 92+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-20 17:50 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Eric Dumazet, Dmitry Vyukov, David S. Miller

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Eric Dumazet <edumazet@google.com>


[ Upstream commit 02b2faaf0af1d85585f6d6980e286d53612acfc2 ]

Dmitry Vyukov reported a divide by 0 triggered by syzkaller, exploiting
tcp_disconnect() path that was never really considered and/or used
before syzkaller ;)

I was not able to reproduce the bug, but it seems issues here are the
three possible actions that assumed they would never trigger on a
listener.

1) tcp_write_timer_handler
2) tcp_delack_timer_handler
3) MTU reduction

Only IPv6 MTU reduction was properly testing TCP_CLOSE and TCP_LISTEN
 states from tcp_v6_mtu_reduced()

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/ipv4/tcp_ipv4.c  |    7 +++++--
 net/ipv4/tcp_timer.c |    6 ++++--
 2 files changed, 9 insertions(+), 4 deletions(-)

--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -269,10 +269,13 @@ EXPORT_SYMBOL(tcp_v4_connect);
  */
 void tcp_v4_mtu_reduced(struct sock *sk)
 {
-	struct dst_entry *dst;
 	struct inet_sock *inet = inet_sk(sk);
-	u32 mtu = tcp_sk(sk)->mtu_info;
+	struct dst_entry *dst;
+	u32 mtu;
 
+	if ((1 << sk->sk_state) & (TCPF_LISTEN | TCPF_CLOSE))
+		return;
+	mtu = tcp_sk(sk)->mtu_info;
 	dst = inet_csk_update_pmtu(sk, mtu);
 	if (!dst)
 		return;
--- a/net/ipv4/tcp_timer.c
+++ b/net/ipv4/tcp_timer.c
@@ -249,7 +249,8 @@ void tcp_delack_timer_handler(struct soc
 
 	sk_mem_reclaim_partial(sk);
 
-	if (sk->sk_state == TCP_CLOSE || !(icsk->icsk_ack.pending & ICSK_ACK_TIMER))
+	if (((1 << sk->sk_state) & (TCPF_CLOSE | TCPF_LISTEN)) ||
+	    !(icsk->icsk_ack.pending & ICSK_ACK_TIMER))
 		goto out;
 
 	if (time_after(icsk->icsk_ack.timeout, jiffies)) {
@@ -552,7 +553,8 @@ void tcp_write_timer_handler(struct sock
 	struct inet_connection_sock *icsk = inet_csk(sk);
 	int event;
 
-	if (sk->sk_state == TCP_CLOSE || !icsk->icsk_pending)
+	if (((1 << sk->sk_state) & (TCPF_CLOSE | TCPF_LISTEN)) ||
+	    !icsk->icsk_pending)
 		goto out;
 
 	if (time_after(icsk->icsk_timeout, jiffies)) {

^ permalink raw reply	[flat|nested] 92+ messages in thread

* [PATCH 4.9 22/93] net: fix socket refcounting in skb_complete_wifi_ack()
  2017-03-20 17:50 [PATCH 4.9 00/93] 4.9.17-stable review Greg Kroah-Hartman
                   ` (19 preceding siblings ...)
  2017-03-20 17:50 ` [PATCH 4.9 21/93] tcp: fix various issues for sockets morphing to listen state Greg Kroah-Hartman
@ 2017-03-20 17:50 ` Greg Kroah-Hartman
  2017-03-20 17:50 ` [PATCH 4.9 23/93] net: fix socket refcounting in skb_complete_tx_timestamp() Greg Kroah-Hartman
                   ` (68 subsequent siblings)
  89 siblings, 0 replies; 92+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-20 17:50 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Eric Dumazet, Alexander Duyck,
	Johannes Berg, Soheil Hassas Yeganeh, Willem de Bruijn,
	David S. Miller

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Eric Dumazet <edumazet@google.com>


[ Upstream commit dd4f10722aeb10f4f582948839f066bebe44e5fb ]

TX skbs do not necessarily hold a reference on skb->sk->sk_refcnt
By the time TX completion happens, sk_refcnt might be already 0.

sock_hold()/sock_put() would then corrupt critical state, like
sk_wmem_alloc.

Fixes: bf7fa551e0ce ("mac80211: Resolve sk_refcnt/sk_wmem_alloc issue in wifi ack path")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Alexander Duyck <alexander.h.duyck@intel.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Soheil Hassas Yeganeh <soheil@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/core/skbuff.c |   15 ++++++++-------
 1 file changed, 8 insertions(+), 7 deletions(-)

--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -3871,7 +3871,7 @@ void skb_complete_wifi_ack(struct sk_buf
 {
 	struct sock *sk = skb->sk;
 	struct sock_exterr_skb *serr;
-	int err;
+	int err = 1;
 
 	skb->wifi_acked_valid = 1;
 	skb->wifi_acked = acked;
@@ -3881,14 +3881,15 @@ void skb_complete_wifi_ack(struct sk_buf
 	serr->ee.ee_errno = ENOMSG;
 	serr->ee.ee_origin = SO_EE_ORIGIN_TXSTATUS;
 
-	/* take a reference to prevent skb_orphan() from freeing the socket */
-	sock_hold(sk);
-
-	err = sock_queue_err_skb(sk, skb);
+	/* Take a reference to prevent skb_orphan() from freeing the socket,
+	 * but only if the socket refcount is not zero.
+	 */
+	if (likely(atomic_inc_not_zero(&sk->sk_refcnt))) {
+		err = sock_queue_err_skb(sk, skb);
+		sock_put(sk);
+	}
 	if (err)
 		kfree_skb(skb);
-
-	sock_put(sk);
 }
 EXPORT_SYMBOL_GPL(skb_complete_wifi_ack);
 

^ permalink raw reply	[flat|nested] 92+ messages in thread

* [PATCH 4.9 23/93] net: fix socket refcounting in skb_complete_tx_timestamp()
  2017-03-20 17:50 [PATCH 4.9 00/93] 4.9.17-stable review Greg Kroah-Hartman
                   ` (20 preceding siblings ...)
  2017-03-20 17:50 ` [PATCH 4.9 22/93] net: fix socket refcounting in skb_complete_wifi_ack() Greg Kroah-Hartman
@ 2017-03-20 17:50 ` Greg Kroah-Hartman
  2017-03-20 17:50 ` [PATCH 4.9 24/93] net/sched: act_skbmod: remove unneeded rcu_read_unlock in tcf_skbmod_dump Greg Kroah-Hartman
                   ` (67 subsequent siblings)
  89 siblings, 0 replies; 92+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-20 17:50 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Eric Dumazet, Alexander Duyck,
	Johannes Berg, Soheil Hassas Yeganeh, Willem de Bruijn,
	David S. Miller

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Eric Dumazet <edumazet@google.com>


[ Upstream commit 9ac25fc063751379cb77434fef9f3b088cd3e2f7 ]

TX skbs do not necessarily hold a reference on skb->sk->sk_refcnt
By the time TX completion happens, sk_refcnt might be already 0.

sock_hold()/sock_put() would then corrupt critical state, like
sk_wmem_alloc and lead to leaks or use after free.

Fixes: 62bccb8cdb69 ("net-timestamp: Make the clone operation stand-alone from phy timestamping")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Alexander Duyck <alexander.h.duyck@intel.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Soheil Hassas Yeganeh <soheil@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/core/skbuff.c |   15 ++++++++-------
 1 file changed, 8 insertions(+), 7 deletions(-)

--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -3814,13 +3814,14 @@ void skb_complete_tx_timestamp(struct sk
 	if (!skb_may_tx_timestamp(sk, false))
 		return;
 
-	/* take a reference to prevent skb_orphan() from freeing the socket */
-	sock_hold(sk);
-
-	*skb_hwtstamps(skb) = *hwtstamps;
-	__skb_complete_tx_timestamp(skb, sk, SCM_TSTAMP_SND);
-
-	sock_put(sk);
+	/* Take a reference to prevent skb_orphan() from freeing the socket,
+	 * but only if the socket refcount is not zero.
+	 */
+	if (likely(atomic_inc_not_zero(&sk->sk_refcnt))) {
+		*skb_hwtstamps(skb) = *hwtstamps;
+		__skb_complete_tx_timestamp(skb, sk, SCM_TSTAMP_SND);
+		sock_put(sk);
+	}
 }
 EXPORT_SYMBOL_GPL(skb_complete_tx_timestamp);
 

^ permalink raw reply	[flat|nested] 92+ messages in thread

* [PATCH 4.9 24/93] net/sched: act_skbmod: remove unneeded rcu_read_unlock in tcf_skbmod_dump
  2017-03-20 17:50 [PATCH 4.9 00/93] 4.9.17-stable review Greg Kroah-Hartman
                   ` (21 preceding siblings ...)
  2017-03-20 17:50 ` [PATCH 4.9 23/93] net: fix socket refcounting in skb_complete_tx_timestamp() Greg Kroah-Hartman
@ 2017-03-20 17:50 ` Greg Kroah-Hartman
  2017-03-20 17:51 ` [PATCH 4.9 25/93] dccp: fix use-after-free in dccp_feat_activate_values Greg Kroah-Hartman
                   ` (66 subsequent siblings)
  89 siblings, 0 replies; 92+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-20 17:50 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Alexey Khoroshilov, Jamal Hadi Salim,
	David S. Miller

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Alexey Khoroshilov <khoroshilov@ispras.ru>


[ Upstream commit 6c4dc75c251721f517e9daeb5370ea606b5b35ce ]

Found by Linux Driver Verification project (linuxtesting.org).

Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/sched/act_skbmod.c |    1 -
 1 file changed, 1 deletion(-)

--- a/net/sched/act_skbmod.c
+++ b/net/sched/act_skbmod.c
@@ -228,7 +228,6 @@ static int tcf_skbmod_dump(struct sk_buf
 
 	return skb->len;
 nla_put_failure:
-	rcu_read_unlock();
 	nlmsg_trim(skb, b);
 	return -1;
 }

^ permalink raw reply	[flat|nested] 92+ messages in thread

* [PATCH 4.9 25/93] dccp: fix use-after-free in dccp_feat_activate_values
  2017-03-20 17:50 [PATCH 4.9 00/93] 4.9.17-stable review Greg Kroah-Hartman
                   ` (22 preceding siblings ...)
  2017-03-20 17:50 ` [PATCH 4.9 24/93] net/sched: act_skbmod: remove unneeded rcu_read_unlock in tcf_skbmod_dump Greg Kroah-Hartman
@ 2017-03-20 17:51 ` Greg Kroah-Hartman
  2017-03-20 17:51 ` [PATCH 4.9 26/93] vrf: Fix use-after-free in vrf_xmit Greg Kroah-Hartman
                   ` (65 subsequent siblings)
  89 siblings, 0 replies; 92+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-20 17:51 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Eric Dumazet, Dmitry Vyukov, David S. Miller

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Eric Dumazet <edumazet@google.com>


[ Upstream commit 62f8f4d9066c1c6f2474845d1ca7e2891f2ae3fd ]

Dmitry reported crashes in DCCP stack [1]

Problem here is that when I got rid of listener spinlock, I missed the
fact that DCCP stores a complex state in struct dccp_request_sock,
while TCP does not.

Since multiple cpus could access it at the same time, we need to add
protection.

[1]
BUG: KASAN: use-after-free in dccp_feat_activate_values+0x967/0xab0
net/dccp/feat.c:1541 at addr ffff88003713be68
Read of size 8 by task syz-executor2/8457
CPU: 2 PID: 8457 Comm: syz-executor2 Not tainted 4.10.0-rc7+ #127
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Call Trace:
 <IRQ>
 __dump_stack lib/dump_stack.c:15 [inline]
 dump_stack+0x292/0x398 lib/dump_stack.c:51
 kasan_object_err+0x1c/0x70 mm/kasan/report.c:162
 print_address_description mm/kasan/report.c:200 [inline]
 kasan_report_error mm/kasan/report.c:289 [inline]
 kasan_report.part.1+0x20e/0x4e0 mm/kasan/report.c:311
 kasan_report mm/kasan/report.c:332 [inline]
 __asan_report_load8_noabort+0x29/0x30 mm/kasan/report.c:332
 dccp_feat_activate_values+0x967/0xab0 net/dccp/feat.c:1541
 dccp_create_openreq_child+0x464/0x610 net/dccp/minisocks.c:121
 dccp_v6_request_recv_sock+0x1f6/0x1960 net/dccp/ipv6.c:457
 dccp_check_req+0x335/0x5a0 net/dccp/minisocks.c:186
 dccp_v6_rcv+0x69e/0x1d00 net/dccp/ipv6.c:711
 ip6_input_finish+0x46d/0x17a0 net/ipv6/ip6_input.c:279
 NF_HOOK include/linux/netfilter.h:257 [inline]
 ip6_input+0xdb/0x590 net/ipv6/ip6_input.c:322
 dst_input include/net/dst.h:507 [inline]
 ip6_rcv_finish+0x289/0x890 net/ipv6/ip6_input.c:69
 NF_HOOK include/linux/netfilter.h:257 [inline]
 ipv6_rcv+0x12ec/0x23d0 net/ipv6/ip6_input.c:203
 __netif_receive_skb_core+0x1ae5/0x3400 net/core/dev.c:4190
 __netif_receive_skb+0x2a/0x170 net/core/dev.c:4228
 process_backlog+0xe5/0x6c0 net/core/dev.c:4839
 napi_poll net/core/dev.c:5202 [inline]
 net_rx_action+0xe70/0x1900 net/core/dev.c:5267
 __do_softirq+0x2fb/0xb7d kernel/softirq.c:284
 do_softirq_own_stack+0x1c/0x30 arch/x86/entry/entry_64.S:902
 </IRQ>
 do_softirq.part.17+0x1e8/0x230 kernel/softirq.c:328
 do_softirq kernel/softirq.c:176 [inline]
 __local_bh_enable_ip+0x1f2/0x200 kernel/softirq.c:181
 local_bh_enable include/linux/bottom_half.h:31 [inline]
 rcu_read_unlock_bh include/linux/rcupdate.h:971 [inline]
 ip6_finish_output2+0xbb0/0x23d0 net/ipv6/ip6_output.c:123
 ip6_finish_output+0x302/0x960 net/ipv6/ip6_output.c:148
 NF_HOOK_COND include/linux/netfilter.h:246 [inline]
 ip6_output+0x1cb/0x8d0 net/ipv6/ip6_output.c:162
 ip6_xmit+0xcdf/0x20d0 include/net/dst.h:501
 inet6_csk_xmit+0x320/0x5f0 net/ipv6/inet6_connection_sock.c:179
 dccp_transmit_skb+0xb09/0x1120 net/dccp/output.c:141
 dccp_xmit_packet+0x215/0x760 net/dccp/output.c:280
 dccp_write_xmit+0x168/0x1d0 net/dccp/output.c:362
 dccp_sendmsg+0x79c/0xb10 net/dccp/proto.c:796
 inet_sendmsg+0x164/0x5b0 net/ipv4/af_inet.c:744
 sock_sendmsg_nosec net/socket.c:635 [inline]
 sock_sendmsg+0xca/0x110 net/socket.c:645
 SYSC_sendto+0x660/0x810 net/socket.c:1687
 SyS_sendto+0x40/0x50 net/socket.c:1655
 entry_SYSCALL_64_fastpath+0x1f/0xc2
RIP: 0033:0x4458b9
RSP: 002b:00007f8ceb77bb58 EFLAGS: 00000282 ORIG_RAX: 000000000000002c
RAX: ffffffffffffffda RBX: 0000000000000017 RCX: 00000000004458b9
RDX: 0000000000000023 RSI: 0000000020e60000 RDI: 0000000000000017
RBP: 00000000006e1b90 R08: 00000000200f9fe1 R09: 0000000000000020
R10: 0000000000008010 R11: 0000000000000282 R12: 00000000007080a8
R13: 0000000000000000 R14: 00007f8ceb77c9c0 R15: 00007f8ceb77c700
Object at ffff88003713be50, in cache kmalloc-64 size: 64
Allocated:
PID = 8446
 save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57
 save_stack+0x43/0xd0 mm/kasan/kasan.c:502
 set_track mm/kasan/kasan.c:514 [inline]
 kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:605
 kmem_cache_alloc_trace+0x82/0x270 mm/slub.c:2738
 kmalloc include/linux/slab.h:490 [inline]
 dccp_feat_entry_new+0x214/0x410 net/dccp/feat.c:467
 dccp_feat_push_change+0x38/0x220 net/dccp/feat.c:487
 __feat_register_sp+0x223/0x2f0 net/dccp/feat.c:741
 dccp_feat_propagate_ccid+0x22b/0x2b0 net/dccp/feat.c:949
 dccp_feat_server_ccid_dependencies+0x1b3/0x250 net/dccp/feat.c:1012
 dccp_make_response+0x1f1/0xc90 net/dccp/output.c:423
 dccp_v6_send_response+0x4ec/0xc20 net/dccp/ipv6.c:217
 dccp_v6_conn_request+0xaba/0x11b0 net/dccp/ipv6.c:377
 dccp_rcv_state_process+0x51e/0x1650 net/dccp/input.c:606
 dccp_v6_do_rcv+0x213/0x350 net/dccp/ipv6.c:632
 sk_backlog_rcv include/net/sock.h:893 [inline]
 __sk_receive_skb+0x36f/0xcc0 net/core/sock.c:479
 dccp_v6_rcv+0xba5/0x1d00 net/dccp/ipv6.c:742
 ip6_input_finish+0x46d/0x17a0 net/ipv6/ip6_input.c:279
 NF_HOOK include/linux/netfilter.h:257 [inline]
 ip6_input+0xdb/0x590 net/ipv6/ip6_input.c:322
 dst_input include/net/dst.h:507 [inline]
 ip6_rcv_finish+0x289/0x890 net/ipv6/ip6_input.c:69
 NF_HOOK include/linux/netfilter.h:257 [inline]
 ipv6_rcv+0x12ec/0x23d0 net/ipv6/ip6_input.c:203
 __netif_receive_skb_core+0x1ae5/0x3400 net/core/dev.c:4190
 __netif_receive_skb+0x2a/0x170 net/core/dev.c:4228
 process_backlog+0xe5/0x6c0 net/core/dev.c:4839
 napi_poll net/core/dev.c:5202 [inline]
 net_rx_action+0xe70/0x1900 net/core/dev.c:5267
 __do_softirq+0x2fb/0xb7d kernel/softirq.c:284
Freed:
PID = 15
 save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57
 save_stack+0x43/0xd0 mm/kasan/kasan.c:502
 set_track mm/kasan/kasan.c:514 [inline]
 kasan_slab_free+0x73/0xc0 mm/kasan/kasan.c:578
 slab_free_hook mm/slub.c:1355 [inline]
 slab_free_freelist_hook mm/slub.c:1377 [inline]
 slab_free mm/slub.c:2954 [inline]
 kfree+0xe8/0x2b0 mm/slub.c:3874
 dccp_feat_entry_destructor.part.4+0x48/0x60 net/dccp/feat.c:418
 dccp_feat_entry_destructor net/dccp/feat.c:416 [inline]
 dccp_feat_list_pop net/dccp/feat.c:541 [inline]
 dccp_feat_activate_values+0x57f/0xab0 net/dccp/feat.c:1543
 dccp_create_openreq_child+0x464/0x610 net/dccp/minisocks.c:121
 dccp_v6_request_recv_sock+0x1f6/0x1960 net/dccp/ipv6.c:457
 dccp_check_req+0x335/0x5a0 net/dccp/minisocks.c:186
 dccp_v6_rcv+0x69e/0x1d00 net/dccp/ipv6.c:711
 ip6_input_finish+0x46d/0x17a0 net/ipv6/ip6_input.c:279
 NF_HOOK include/linux/netfilter.h:257 [inline]
 ip6_input+0xdb/0x590 net/ipv6/ip6_input.c:322
 dst_input include/net/dst.h:507 [inline]
 ip6_rcv_finish+0x289/0x890 net/ipv6/ip6_input.c:69
 NF_HOOK include/linux/netfilter.h:257 [inline]
 ipv6_rcv+0x12ec/0x23d0 net/ipv6/ip6_input.c:203
 __netif_receive_skb_core+0x1ae5/0x3400 net/core/dev.c:4190
 __netif_receive_skb+0x2a/0x170 net/core/dev.c:4228
 process_backlog+0xe5/0x6c0 net/core/dev.c:4839
 napi_poll net/core/dev.c:5202 [inline]
 net_rx_action+0xe70/0x1900 net/core/dev.c:5267
 __do_softirq+0x2fb/0xb7d kernel/softirq.c:284
Memory state around the buggy address:
 ffff88003713bd00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 ffff88003713bd80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>ffff88003713be00: fc fc fc fc fc fc fc fc fc fc fb fb fb fb fb fb
                                                          ^

Fixes: 079096f103fa ("tcp/dccp: install syn_recv requests into ehash table")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Tested-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 include/linux/dccp.h |    1 +
 net/dccp/minisocks.c |   24 ++++++++++++++++--------
 2 files changed, 17 insertions(+), 8 deletions(-)

--- a/include/linux/dccp.h
+++ b/include/linux/dccp.h
@@ -163,6 +163,7 @@ struct dccp_request_sock {
 	__u64			 dreq_isr;
 	__u64			 dreq_gsr;
 	__be32			 dreq_service;
+	spinlock_t		 dreq_lock;
 	struct list_head	 dreq_featneg;
 	__u32			 dreq_timestamp_echo;
 	__u32			 dreq_timestamp_time;
--- a/net/dccp/minisocks.c
+++ b/net/dccp/minisocks.c
@@ -146,6 +146,13 @@ struct sock *dccp_check_req(struct sock
 	struct dccp_request_sock *dreq = dccp_rsk(req);
 	bool own_req;
 
+	/* TCP/DCCP listeners became lockless.
+	 * DCCP stores complex state in its request_sock, so we need
+	 * a protection for them, now this code runs without being protected
+	 * by the parent (listener) lock.
+	 */
+	spin_lock_bh(&dreq->dreq_lock);
+
 	/* Check for retransmitted REQUEST */
 	if (dccp_hdr(skb)->dccph_type == DCCP_PKT_REQUEST) {
 
@@ -160,7 +167,7 @@ struct sock *dccp_check_req(struct sock
 			inet_rtx_syn_ack(sk, req);
 		}
 		/* Network Duplicate, discard packet */
-		return NULL;
+		goto out;
 	}
 
 	DCCP_SKB_CB(skb)->dccpd_reset_code = DCCP_RESET_CODE_PACKET_ERROR;
@@ -186,20 +193,20 @@ struct sock *dccp_check_req(struct sock
 
 	child = inet_csk(sk)->icsk_af_ops->syn_recv_sock(sk, skb, req, NULL,
 							 req, &own_req);
-	if (!child)
-		goto listen_overflow;
-
-	return inet_csk_complete_hashdance(sk, child, req, own_req);
+	if (child) {
+		child = inet_csk_complete_hashdance(sk, child, req, own_req);
+		goto out;
+	}
 
-listen_overflow:
-	dccp_pr_debug("listen_overflow!\n");
 	DCCP_SKB_CB(skb)->dccpd_reset_code = DCCP_RESET_CODE_TOO_BUSY;
 drop:
 	if (dccp_hdr(skb)->dccph_type != DCCP_PKT_RESET)
 		req->rsk_ops->send_reset(sk, skb);
 
 	inet_csk_reqsk_queue_drop(sk, req);
-	return NULL;
+out:
+	spin_unlock_bh(&dreq->dreq_lock);
+	return child;
 }
 
 EXPORT_SYMBOL_GPL(dccp_check_req);
@@ -250,6 +257,7 @@ int dccp_reqsk_init(struct request_sock
 {
 	struct dccp_request_sock *dreq = dccp_rsk(req);
 
+	spin_lock_init(&dreq->dreq_lock);
 	inet_rsk(req)->ir_rmt_port = dccp_hdr(skb)->dccph_sport;
 	inet_rsk(req)->ir_num	   = ntohs(dccp_hdr(skb)->dccph_dport);
 	inet_rsk(req)->acked	   = 0;

^ permalink raw reply	[flat|nested] 92+ messages in thread

* [PATCH 4.9 26/93] vrf: Fix use-after-free in vrf_xmit
  2017-03-20 17:50 [PATCH 4.9 00/93] 4.9.17-stable review Greg Kroah-Hartman
                   ` (23 preceding siblings ...)
  2017-03-20 17:51 ` [PATCH 4.9 25/93] dccp: fix use-after-free in dccp_feat_activate_values Greg Kroah-Hartman
@ 2017-03-20 17:51 ` Greg Kroah-Hartman
  2017-03-20 17:51 ` [PATCH 4.9 27/93] net/tunnel: set inner protocol in network gro hooks Greg Kroah-Hartman
                   ` (64 subsequent siblings)
  89 siblings, 0 replies; 92+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-20 17:51 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, David Ahern, David S. Miller

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: David Ahern <dsa@cumulusnetworks.com>


[ Upstream commit f7887d40e541f74402df0684a1463c0a0bb68c68 ]

KASAN detected a use-after-free:

[  269.467067] BUG: KASAN: use-after-free in vrf_xmit+0x7f1/0x827 [vrf] at addr ffff8800350a21c0
[  269.467067] Read of size 4 by task ssh/1879
[  269.467067] CPU: 1 PID: 1879 Comm: ssh Not tainted 4.10.0+ #249
[  269.467067] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014
[  269.467067] Call Trace:
[  269.467067]  dump_stack+0x81/0xb6
[  269.467067]  kasan_object_err+0x21/0x78
[  269.467067]  kasan_report+0x2f7/0x450
[  269.467067]  ? vrf_xmit+0x7f1/0x827 [vrf]
[  269.467067]  ? ip_output+0xa4/0xdb
[  269.467067]  __asan_load4+0x6b/0x6d
[  269.467067]  vrf_xmit+0x7f1/0x827 [vrf]
...

Which corresponds to the skb access after xmit handling. Fix by saving
skb->len and using the saved value to update stats.

Fixes: 193125dbd8eb2 ("net: Introduce VRF device driver")
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/net/vrf.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- a/drivers/net/vrf.c
+++ b/drivers/net/vrf.c
@@ -346,6 +346,7 @@ static netdev_tx_t is_ip_tx_frame(struct
 
 static netdev_tx_t vrf_xmit(struct sk_buff *skb, struct net_device *dev)
 {
+	int len = skb->len;
 	netdev_tx_t ret = is_ip_tx_frame(skb, dev);
 
 	if (likely(ret == NET_XMIT_SUCCESS || ret == NET_XMIT_CN)) {
@@ -353,7 +354,7 @@ static netdev_tx_t vrf_xmit(struct sk_bu
 
 		u64_stats_update_begin(&dstats->syncp);
 		dstats->tx_pkts++;
-		dstats->tx_bytes += skb->len;
+		dstats->tx_bytes += len;
 		u64_stats_update_end(&dstats->syncp);
 	} else {
 		this_cpu_inc(dev->dstats->tx_drps);

^ permalink raw reply	[flat|nested] 92+ messages in thread

* [PATCH 4.9 27/93] net/tunnel: set inner protocol in network gro hooks
  2017-03-20 17:50 [PATCH 4.9 00/93] 4.9.17-stable review Greg Kroah-Hartman
                   ` (24 preceding siblings ...)
  2017-03-20 17:51 ` [PATCH 4.9 26/93] vrf: Fix use-after-free in vrf_xmit Greg Kroah-Hartman
@ 2017-03-20 17:51 ` Greg Kroah-Hartman
  2017-03-20 17:51 ` [PATCH 4.9 28/93] uapi: fix linux/packet_diag.h userspace compilation error Greg Kroah-Hartman
                   ` (63 subsequent siblings)
  89 siblings, 0 replies; 92+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-20 17:51 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Paolo Abeni, Alexander Duyck,
	David S. Miller

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Paolo Abeni <pabeni@redhat.com>


[ Upstream commit 294acf1c01bace5cea5d30b510504238bf5f7c25 ]

The gso code of several tunnels type (gre and udp tunnels)
takes for granted that the skb->inner_protocol is properly
initialized and drops the packet elsewhere.

On the forwarding path no one is initializing such field,
so gro encapsulated packets are dropped on forward.

Since commit 38720352412a ("gre: Use inner_proto to obtain
inner header protocol"), this can be reproduced when the
encapsulated packets use gre as the tunneling protocol.

The issue happens also with vxlan and geneve tunnels since
commit 8bce6d7d0d1e ("udp: Generalize skb_udp_segment"), if the
forwarding host's ingress nic has h/w offload for such tunnel
and a vxlan/geneve device is configured on top of it, regardless
of the configured peer address and vni.

To address the issue, this change initialize the inner_protocol
field for encapsulated packets in both ipv4 and ipv6 gro complete
callbacks.

Fixes: 38720352412a ("gre: Use inner_proto to obtain inner header protocol")
Fixes: 8bce6d7d0d1e ("udp: Generalize skb_udp_segment")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/ipv4/af_inet.c     |    4 +++-
 net/ipv6/ip6_offload.c |    4 +++-
 2 files changed, 6 insertions(+), 2 deletions(-)

--- a/net/ipv4/af_inet.c
+++ b/net/ipv4/af_inet.c
@@ -1460,8 +1460,10 @@ int inet_gro_complete(struct sk_buff *sk
 	int proto = iph->protocol;
 	int err = -ENOSYS;
 
-	if (skb->encapsulation)
+	if (skb->encapsulation) {
+		skb_set_inner_protocol(skb, cpu_to_be16(ETH_P_IP));
 		skb_set_inner_network_header(skb, nhoff);
+	}
 
 	csum_replace2(&iph->check, iph->tot_len, newlen);
 	iph->tot_len = newlen;
--- a/net/ipv6/ip6_offload.c
+++ b/net/ipv6/ip6_offload.c
@@ -294,8 +294,10 @@ static int ipv6_gro_complete(struct sk_b
 	struct ipv6hdr *iph = (struct ipv6hdr *)(skb->data + nhoff);
 	int err = -ENOSYS;
 
-	if (skb->encapsulation)
+	if (skb->encapsulation) {
+		skb_set_inner_protocol(skb, cpu_to_be16(ETH_P_IPV6));
 		skb_set_inner_network_header(skb, nhoff);
+	}
 
 	iph->payload_len = htons(skb->len - nhoff - sizeof(*iph));
 

^ permalink raw reply	[flat|nested] 92+ messages in thread

* [PATCH 4.9 28/93] uapi: fix linux/packet_diag.h userspace compilation error
  2017-03-20 17:50 [PATCH 4.9 00/93] 4.9.17-stable review Greg Kroah-Hartman
                   ` (25 preceding siblings ...)
  2017-03-20 17:51 ` [PATCH 4.9 27/93] net/tunnel: set inner protocol in network gro hooks Greg Kroah-Hartman
@ 2017-03-20 17:51 ` Greg Kroah-Hartman
  2017-03-20 17:51 ` [PATCH 4.9 30/93] mpls: Send route delete notifications when router module is unloaded Greg Kroah-Hartman
                   ` (62 subsequent siblings)
  89 siblings, 0 replies; 92+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-20 17:51 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Dmitry V. Levin, Pavel Emelyanov,
	David S. Miller

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: "Dmitry V. Levin" <ldv@altlinux.org>


[ Upstream commit 745cb7f8a5de0805cade3de3991b7a95317c7c73 ]

Replace MAX_ADDR_LEN with its numeric value to fix the following
linux/packet_diag.h userspace compilation error:

/usr/include/linux/packet_diag.h:67:17: error: 'MAX_ADDR_LEN' undeclared here (not in a function)
  __u8 pdmc_addr[MAX_ADDR_LEN];

This is not the first case in the UAPI where the numeric value
of MAX_ADDR_LEN is used instead of symbolic one, uapi/linux/if_link.h
already does the same:

$ grep MAX_ADDR_LEN include/uapi/linux/if_link.h
	__u8 mac[32]; /* MAX_ADDR_LEN */

There are no UAPI headers besides these two that use MAX_ADDR_LEN.

Signed-off-by: Dmitry V. Levin <ldv@altlinux.org>
Acked-by: Pavel Emelyanov <xemul@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 include/uapi/linux/packet_diag.h |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/include/uapi/linux/packet_diag.h
+++ b/include/uapi/linux/packet_diag.h
@@ -64,7 +64,7 @@ struct packet_diag_mclist {
 	__u32	pdmc_count;
 	__u16	pdmc_type;
 	__u16	pdmc_alen;
-	__u8	pdmc_addr[MAX_ADDR_LEN];
+	__u8	pdmc_addr[32]; /* MAX_ADDR_LEN */
 };
 
 struct packet_diag_ring {

^ permalink raw reply	[flat|nested] 92+ messages in thread

* [PATCH 4.9 30/93] mpls: Send route delete notifications when router module is unloaded
  2017-03-20 17:50 [PATCH 4.9 00/93] 4.9.17-stable review Greg Kroah-Hartman
                   ` (26 preceding siblings ...)
  2017-03-20 17:51 ` [PATCH 4.9 28/93] uapi: fix linux/packet_diag.h userspace compilation error Greg Kroah-Hartman
@ 2017-03-20 17:51 ` Greg Kroah-Hartman
  2017-03-20 17:51 ` [PATCH 4.9 31/93] mpls: Do not decrement alive counter for unregister events Greg Kroah-Hartman
                   ` (61 subsequent siblings)
  89 siblings, 0 replies; 92+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-20 17:51 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, David Ahern, David S. Miller

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: David Ahern <dsa@cumulusnetworks.com>


[ Upstream commit e37791ec1ad785b59022ae211f63a16189bacebf ]

When the mpls_router module is unloaded, mpls routes are deleted but
notifications are not sent to userspace leaving userspace caches
out of sync. Add the call to mpls_notify_route in mpls_net_exit as
routes are freed.

Fixes: 0189197f44160 ("mpls: Basic routing support")
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/mpls/af_mpls.c |    1 +
 1 file changed, 1 insertion(+)

--- a/net/mpls/af_mpls.c
+++ b/net/mpls/af_mpls.c
@@ -1696,6 +1696,7 @@ static void mpls_net_exit(struct net *ne
 	for (index = 0; index < platform_labels; index++) {
 		struct mpls_route *rt = rtnl_dereference(platform_label[index]);
 		RCU_INIT_POINTER(platform_label[index], NULL);
+		mpls_notify_route(net, index, rt, NULL, NULL);
 		mpls_rt_free(rt);
 	}
 	rtnl_unlock();

^ permalink raw reply	[flat|nested] 92+ messages in thread

* [PATCH 4.9 31/93] mpls: Do not decrement alive counter for unregister events
  2017-03-20 17:50 [PATCH 4.9 00/93] 4.9.17-stable review Greg Kroah-Hartman
                   ` (27 preceding siblings ...)
  2017-03-20 17:51 ` [PATCH 4.9 30/93] mpls: Send route delete notifications when router module is unloaded Greg Kroah-Hartman
@ 2017-03-20 17:51 ` Greg Kroah-Hartman
  2017-03-20 17:51 ` [PATCH 4.9 32/93] ipv6: make ECMP route replacement less greedy Greg Kroah-Hartman
                   ` (60 subsequent siblings)
  89 siblings, 0 replies; 92+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-20 17:51 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, David Ahern, David S. Miller

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: David Ahern <dsa@cumulusnetworks.com>


[ Upstream commit 79099aab38c8f5c746748b066ae74ba984fe2cc8 ]

Multipath routes can be rendered usesless when a device in one of the
paths is deleted. For example:

$ ip -f mpls ro ls
100
	nexthop as to 200 via inet 172.16.2.2  dev virt12
	nexthop as to 300 via inet 172.16.3.2  dev br0
101
	nexthop as to 201 via inet6 2000:2::2  dev virt12
	nexthop as to 301 via inet6 2000:3::2  dev br0

$ ip li del br0

When br0 is deleted the other hop is not considered in
mpls_select_multipath because of the alive check -- rt_nhn_alive
is 0.

rt_nhn_alive is decremented once in mpls_ifdown when the device is taken
down (NETDEV_DOWN) and again when it is deleted (NETDEV_UNREGISTER). For
a 2 hop route, deleting one device drops the alive count to 0. Since
devices are taken down before unregistering, the decrement on
NETDEV_UNREGISTER is redundant.

Fixes: c89359a42e2a4 ("mpls: support for dead routes")
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/mpls/af_mpls.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- a/net/mpls/af_mpls.c
+++ b/net/mpls/af_mpls.c
@@ -956,7 +956,8 @@ static void mpls_ifdown(struct net_devic
 				/* fall through */
 			case NETDEV_CHANGE:
 				nh->nh_flags |= RTNH_F_LINKDOWN;
-				ACCESS_ONCE(rt->rt_nhn_alive) = rt->rt_nhn_alive - 1;
+				if (event != NETDEV_UNREGISTER)
+					ACCESS_ONCE(rt->rt_nhn_alive) = rt->rt_nhn_alive - 1;
 				break;
 			}
 			if (event == NETDEV_UNREGISTER)

^ permalink raw reply	[flat|nested] 92+ messages in thread

* [PATCH 4.9 32/93] ipv6: make ECMP route replacement less greedy
  2017-03-20 17:50 [PATCH 4.9 00/93] 4.9.17-stable review Greg Kroah-Hartman
                   ` (28 preceding siblings ...)
  2017-03-20 17:51 ` [PATCH 4.9 31/93] mpls: Do not decrement alive counter for unregister events Greg Kroah-Hartman
@ 2017-03-20 17:51 ` Greg Kroah-Hartman
  2017-03-20 17:51 ` [PATCH 4.9 33/93] ipv6: avoid write to a possibly cloned skb Greg Kroah-Hartman
                   ` (59 subsequent siblings)
  89 siblings, 0 replies; 92+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-20 17:51 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Sabrina Dubroca, Nicolas Dichtel,
	Xin Long, Michal Kubecek, David S. Miller

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Sabrina Dubroca <sd@queasysnail.net>


[ Upstream commit 67e194007be08d071294456274dd53e0a04fdf90 ]

Commit 27596472473a ("ipv6: fix ECMP route replacement") introduced a
loop that removes all siblings of an ECMP route that is being
replaced. However, this loop doesn't stop when it has replaced
siblings, and keeps removing other routes with a higher metric.
We also end up triggering the WARN_ON after the loop, because after
this nsiblings < 0.

Instead, stop the loop when we have taken care of all routes with the
same metric as the route being replaced.

  Reproducer:
  ===========
    #!/bin/sh

    ip netns add ns1
    ip netns add ns2
    ip -net ns1 link set lo up

    for x in 0 1 2 ; do
        ip link add veth$x netns ns2 type veth peer name eth$x netns ns1
        ip -net ns1 link set eth$x up
        ip -net ns2 link set veth$x up
    done

    ip -net ns1 -6 r a 2000::/64 nexthop via fe80::0 dev eth0 \
            nexthop via fe80::1 dev eth1 nexthop via fe80::2 dev eth2
    ip -net ns1 -6 r a 2000::/64 via fe80::42 dev eth0 metric 256
    ip -net ns1 -6 r a 2000::/64 via fe80::43 dev eth0 metric 2048

    echo "before replace, 3 routes"
    ip -net ns1 -6 r | grep -v '^fe80\|^ff00'
    echo

    ip -net ns1 -6 r c 2000::/64 nexthop via fe80::4 dev eth0 \
            nexthop via fe80::5 dev eth1 nexthop via fe80::6 dev eth2

    echo "after replace, only 2 routes, metric 2048 is gone"
    ip -net ns1 -6 r | grep -v '^fe80\|^ff00'

Fixes: 27596472473a ("ipv6: fix ECMP route replacement")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Reviewed-by: Xin Long <lucien.xin@gmail.com>
Reviewed-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/ipv6/ip6_fib.c |    2 ++
 1 file changed, 2 insertions(+)

--- a/net/ipv6/ip6_fib.c
+++ b/net/ipv6/ip6_fib.c
@@ -908,6 +908,8 @@ add:
 			ins = &rt->dst.rt6_next;
 			iter = *ins;
 			while (iter) {
+				if (iter->rt6i_metric > rt->rt6i_metric)
+					break;
 				if (rt6_qualify_for_ecmp(iter)) {
 					*ins = iter->dst.rt6_next;
 					fib6_purge_rt(iter, fn, info->nl_net);

^ permalink raw reply	[flat|nested] 92+ messages in thread

* [PATCH 4.9 33/93] ipv6: avoid write to a possibly cloned skb
  2017-03-20 17:50 [PATCH 4.9 00/93] 4.9.17-stable review Greg Kroah-Hartman
                   ` (29 preceding siblings ...)
  2017-03-20 17:51 ` [PATCH 4.9 32/93] ipv6: make ECMP route replacement less greedy Greg Kroah-Hartman
@ 2017-03-20 17:51 ` Greg Kroah-Hartman
  2017-03-20 17:51 ` [PATCH 4.9 34/93] bridge: drop netfilter fake rtable unconditionally Greg Kroah-Hartman
                   ` (58 subsequent siblings)
  89 siblings, 0 replies; 92+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-20 17:51 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Hannes Frederic Sowa, Andreas Karis,
	Florian Westphal, David S. Miller

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Florian Westphal <fw@strlen.de>


[ Upstream commit 79e49503efe53a8c51d8b695bedc8a346c5e4a87 ]

ip6_fragment, in case skb has a fraglist, checks if the
skb is cloned.  If it is, it will move to the 'slow path' and allocates
new skbs for each fragment.

However, right before entering the slowpath loop, it updates the
nexthdr value of the last ipv6 extension header to NEXTHDR_FRAGMENT,
to account for the fragment header that will be inserted in the new
ipv6-fragment skbs.

In case original skb is cloned this munges nexthdr value of another
skb.  Avoid this by doing the nexthdr update for each of the new fragment
skbs separately.

This was observed with tcpdump on a bridge device where netfilter ipv6
reassembly is active:  tcpdump shows malformed fragment headers as
the l4 header (icmpv6, tcp, etc). is decoded as a fragment header.

Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Reported-by: Andreas Karis <akaris@redhat.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/ipv6/ip6_output.c |    7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -757,13 +757,14 @@ slow_path:
 	 *	Fragment the datagram.
 	 */
 
-	*prevhdr = NEXTHDR_FRAGMENT;
 	troom = rt->dst.dev->needed_tailroom;
 
 	/*
 	 *	Keep copying data until we run out.
 	 */
 	while (left > 0)	{
+		u8 *fragnexthdr_offset;
+
 		len = left;
 		/* IF: it doesn't fit, use 'mtu' - the data space left */
 		if (len > mtu)
@@ -808,6 +809,10 @@ slow_path:
 		 */
 		skb_copy_from_linear_data(skb, skb_network_header(frag), hlen);
 
+		fragnexthdr_offset = skb_network_header(frag);
+		fragnexthdr_offset += prevhdr - skb_network_header(skb);
+		*fragnexthdr_offset = NEXTHDR_FRAGMENT;
+
 		/*
 		 *	Build fragment header.
 		 */

^ permalink raw reply	[flat|nested] 92+ messages in thread

* [PATCH 4.9 34/93] bridge: drop netfilter fake rtable unconditionally
  2017-03-20 17:50 [PATCH 4.9 00/93] 4.9.17-stable review Greg Kroah-Hartman
                   ` (30 preceding siblings ...)
  2017-03-20 17:51 ` [PATCH 4.9 33/93] ipv6: avoid write to a possibly cloned skb Greg Kroah-Hartman
@ 2017-03-20 17:51 ` Greg Kroah-Hartman
  2017-03-20 17:51 ` [PATCH 4.9 37/93] dccp: fix memory leak during tear-down of unsuccessful connection request Greg Kroah-Hartman
                   ` (57 subsequent siblings)
  89 siblings, 0 replies; 92+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-20 17:51 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Andreas Karis, Florian Westphal,
	Pablo Neira Ayuso, David S. Miller

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Florian Westphal <fw@strlen.de>


[ Upstream commit a13b2082ece95247779b9995c4e91b4246bed023 ]

Andreas reports kernel oops during rmmod of the br_netfilter module.
Hannes debugged the oops down to a NULL rt6info->rt6i_indev.

Problem is that br_netfilter has the nasty concept of adding a fake
rtable to skb->dst; this happens in a br_netfilter prerouting hook.

A second hook (in bridge LOCAL_IN) is supposed to remove these again
before the skb is handed up the stack.

However, on module unload hooks get unregistered which means an
skb could traverse the prerouting hook that attaches the fake_rtable,
while the 'fake rtable remove' hook gets removed from the hooklist
immediately after.

Fixes: 34666d467cbf1e2e3c7 ("netfilter: bridge: move br_netfilter out of the core")
Reported-by: Andreas Karis <akaris@redhat.com>
Debugged-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/bridge/br_input.c           |    1 +
 net/bridge/br_netfilter_hooks.c |   21 ---------------------
 2 files changed, 1 insertion(+), 21 deletions(-)

--- a/net/bridge/br_input.c
+++ b/net/bridge/br_input.c
@@ -29,6 +29,7 @@ EXPORT_SYMBOL(br_should_route_hook);
 static int
 br_netif_receive_skb(struct net *net, struct sock *sk, struct sk_buff *skb)
 {
+	br_drop_fake_rtable(skb);
 	return netif_receive_skb(skb);
 }
 
--- a/net/bridge/br_netfilter_hooks.c
+++ b/net/bridge/br_netfilter_hooks.c
@@ -521,21 +521,6 @@ static unsigned int br_nf_pre_routing(vo
 }
 
 
-/* PF_BRIDGE/LOCAL_IN ************************************************/
-/* The packet is locally destined, which requires a real
- * dst_entry, so detach the fake one.  On the way up, the
- * packet would pass through PRE_ROUTING again (which already
- * took place when the packet entered the bridge), but we
- * register an IPv4 PRE_ROUTING 'sabotage' hook that will
- * prevent this from happening. */
-static unsigned int br_nf_local_in(void *priv,
-				   struct sk_buff *skb,
-				   const struct nf_hook_state *state)
-{
-	br_drop_fake_rtable(skb);
-	return NF_ACCEPT;
-}
-
 /* PF_BRIDGE/FORWARD *************************************************/
 static int br_nf_forward_finish(struct net *net, struct sock *sk, struct sk_buff *skb)
 {
@@ -906,12 +891,6 @@ static struct nf_hook_ops br_nf_ops[] __
 		.priority = NF_BR_PRI_BRNF,
 	},
 	{
-		.hook = br_nf_local_in,
-		.pf = NFPROTO_BRIDGE,
-		.hooknum = NF_BR_LOCAL_IN,
-		.priority = NF_BR_PRI_BRNF,
-	},
-	{
 		.hook = br_nf_forward_ip,
 		.pf = NFPROTO_BRIDGE,
 		.hooknum = NF_BR_FORWARD,

^ permalink raw reply	[flat|nested] 92+ messages in thread

* [PATCH 4.9 37/93] dccp: fix memory leak during tear-down of unsuccessful connection request
  2017-03-20 17:50 [PATCH 4.9 00/93] 4.9.17-stable review Greg Kroah-Hartman
                   ` (31 preceding siblings ...)
  2017-03-20 17:51 ` [PATCH 4.9 34/93] bridge: drop netfilter fake rtable unconditionally Greg Kroah-Hartman
@ 2017-03-20 17:51 ` Greg Kroah-Hartman
  2017-03-20 17:51 ` [PATCH 4.9 38/93] bpf: Detect identical PTR_TO_MAP_VALUE_OR_NULL registers Greg Kroah-Hartman
                   ` (56 subsequent siblings)
  89 siblings, 0 replies; 92+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-20 17:51 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Jianwen Ji, Hannes Frederic Sowa,
	David S. Miller

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Hannes Frederic Sowa <hannes@stressinduktion.org>


[ Upstream commit 72ef9c4125c7b257e3a714d62d778ab46583d6a3 ]

This patch fixes a memory leak, which happens if the connection request
is not fulfilled between parsing the DCCP options and handling the SYN
(because e.g. the backlog is full), because we forgot to free the
list of ack vectors.

Reported-by: Jianwen Ji <jiji@redhat.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/dccp/ccids/ccid2.c |    1 +
 1 file changed, 1 insertion(+)

--- a/net/dccp/ccids/ccid2.c
+++ b/net/dccp/ccids/ccid2.c
@@ -749,6 +749,7 @@ static void ccid2_hc_tx_exit(struct sock
 	for (i = 0; i < hc->tx_seqbufc; i++)
 		kfree(hc->tx_seqbuf[i]);
 	hc->tx_seqbufc = 0;
+	dccp_ackvec_parsed_cleanup(&hc->tx_av_chunks);
 }
 
 static void ccid2_hc_rx_packet_recv(struct sock *sk, struct sk_buff *skb)

^ permalink raw reply	[flat|nested] 92+ messages in thread

* [PATCH 4.9 38/93] bpf: Detect identical PTR_TO_MAP_VALUE_OR_NULL registers
  2017-03-20 17:50 [PATCH 4.9 00/93] 4.9.17-stable review Greg Kroah-Hartman
                   ` (32 preceding siblings ...)
  2017-03-20 17:51 ` [PATCH 4.9 37/93] dccp: fix memory leak during tear-down of unsuccessful connection request Greg Kroah-Hartman
@ 2017-03-20 17:51 ` Greg Kroah-Hartman
  2017-03-20 17:51 ` [PATCH 4.9 39/93] bpf: fix state equivalence Greg Kroah-Hartman
                   ` (55 subsequent siblings)
  89 siblings, 0 replies; 92+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-20 17:51 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Thomas Graf, Josef Bacik,
	Daniel Borkmann, Alexei Starovoitov, David S. Miller

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Thomas Graf <tgraf@suug.ch>


[ Upstream commit 57a09bf0a416700676e77102c28f9cfcb48267e0 ]

A BPF program is required to check the return register of a
map_elem_lookup() call before accessing memory. The verifier keeps
track of this by converting the type of the result register from
PTR_TO_MAP_VALUE_OR_NULL to PTR_TO_MAP_VALUE after a conditional
jump ensures safety. This check is currently exclusively performed
for the result register 0.

In the event the compiler reorders instructions, BPF_MOV64_REG
instructions may be moved before the conditional jump which causes
them to keep their type PTR_TO_MAP_VALUE_OR_NULL to which the
verifier objects when the register is accessed:

0: (b7) r1 = 10
1: (7b) *(u64 *)(r10 -8) = r1
2: (bf) r2 = r10
3: (07) r2 += -8
4: (18) r1 = 0x59c00000
6: (85) call 1
7: (bf) r4 = r0
8: (15) if r0 == 0x0 goto pc+1
 R0=map_value(ks=8,vs=8) R4=map_value_or_null(ks=8,vs=8) R10=fp
9: (7a) *(u64 *)(r4 +0) = 0
R4 invalid mem access 'map_value_or_null'

This commit extends the verifier to keep track of all identical
PTR_TO_MAP_VALUE_OR_NULL registers after a map_elem_lookup() by
assigning them an ID and then marking them all when the conditional
jump is observed.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Reviewed-by: Josef Bacik <jbacik@fb.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 include/linux/bpf_verifier.h |    2 -
 kernel/bpf/verifier.c        |   61 +++++++++++++++++++++++++++++++------------
 2 files changed, 46 insertions(+), 17 deletions(-)

--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -24,13 +24,13 @@ struct bpf_reg_state {
 	 */
 	s64 min_value;
 	u64 max_value;
+	u32 id;
 	union {
 		/* valid when type == CONST_IMM | PTR_TO_STACK | UNKNOWN_VALUE */
 		s64 imm;
 
 		/* valid when type == PTR_TO_PACKET* */
 		struct {
-			u32 id;
 			u16 off;
 			u16 range;
 		};
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -212,9 +212,10 @@ static void print_verifier_state(struct
 		else if (t == CONST_PTR_TO_MAP || t == PTR_TO_MAP_VALUE ||
 			 t == PTR_TO_MAP_VALUE_OR_NULL ||
 			 t == PTR_TO_MAP_VALUE_ADJ)
-			verbose("(ks=%d,vs=%d)",
+			verbose("(ks=%d,vs=%d,id=%u)",
 				reg->map_ptr->key_size,
-				reg->map_ptr->value_size);
+				reg->map_ptr->value_size,
+				reg->id);
 		if (reg->min_value != BPF_REGISTER_MIN_RANGE)
 			verbose(",min_value=%lld",
 				(long long)reg->min_value);
@@ -447,6 +448,7 @@ static void mark_reg_unknown_value(struc
 {
 	BUG_ON(regno >= MAX_BPF_REG);
 	regs[regno].type = UNKNOWN_VALUE;
+	regs[regno].id = 0;
 	regs[regno].imm = 0;
 }
 
@@ -1252,6 +1254,7 @@ static int check_call(struct bpf_verifie
 			return -EINVAL;
 		}
 		regs[BPF_REG_0].map_ptr = meta.map_ptr;
+		regs[BPF_REG_0].id = ++env->id_gen;
 	} else {
 		verbose("unknown return type %d of func %d\n",
 			fn->ret_type, func_id);
@@ -1668,8 +1671,7 @@ static int check_alu_op(struct bpf_verif
 						insn->src_reg);
 					return -EACCES;
 				}
-				regs[insn->dst_reg].type = UNKNOWN_VALUE;
-				regs[insn->dst_reg].map_ptr = NULL;
+				mark_reg_unknown_value(regs, insn->dst_reg);
 			}
 		} else {
 			/* case: R = imm
@@ -1931,6 +1933,38 @@ static void reg_set_min_max_inv(struct b
 	check_reg_overflow(true_reg);
 }
 
+static void mark_map_reg(struct bpf_reg_state *regs, u32 regno, u32 id,
+			 enum bpf_reg_type type)
+{
+	struct bpf_reg_state *reg = &regs[regno];
+
+	if (reg->type == PTR_TO_MAP_VALUE_OR_NULL && reg->id == id) {
+		reg->type = type;
+		if (type == UNKNOWN_VALUE)
+			mark_reg_unknown_value(regs, regno);
+	}
+}
+
+/* The logic is similar to find_good_pkt_pointers(), both could eventually
+ * be folded together at some point.
+ */
+static void mark_map_regs(struct bpf_verifier_state *state, u32 regno,
+			  enum bpf_reg_type type)
+{
+	struct bpf_reg_state *regs = state->regs;
+	int i;
+
+	for (i = 0; i < MAX_BPF_REG; i++)
+		mark_map_reg(regs, i, regs[regno].id, type);
+
+	for (i = 0; i < MAX_BPF_STACK; i += BPF_REG_SIZE) {
+		if (state->stack_slot_type[i] != STACK_SPILL)
+			continue;
+		mark_map_reg(state->spilled_regs, i / BPF_REG_SIZE,
+			     regs[regno].id, type);
+	}
+}
+
 static int check_cond_jmp_op(struct bpf_verifier_env *env,
 			     struct bpf_insn *insn, int *insn_idx)
 {
@@ -2018,18 +2052,13 @@ static int check_cond_jmp_op(struct bpf_
 	if (BPF_SRC(insn->code) == BPF_K &&
 	    insn->imm == 0 && (opcode == BPF_JEQ || opcode == BPF_JNE) &&
 	    dst_reg->type == PTR_TO_MAP_VALUE_OR_NULL) {
-		if (opcode == BPF_JEQ) {
-			/* next fallthrough insn can access memory via
-			 * this register
-			 */
-			regs[insn->dst_reg].type = PTR_TO_MAP_VALUE;
-			/* branch targer cannot access it, since reg == 0 */
-			mark_reg_unknown_value(other_branch->regs,
-					       insn->dst_reg);
-		} else {
-			other_branch->regs[insn->dst_reg].type = PTR_TO_MAP_VALUE;
-			mark_reg_unknown_value(regs, insn->dst_reg);
-		}
+		/* Mark all identical map registers in each branch as either
+		 * safe or unknown depending R == 0 or R != 0 conditional.
+		 */
+		mark_map_regs(this_branch, insn->dst_reg,
+			      opcode == BPF_JEQ ? PTR_TO_MAP_VALUE : UNKNOWN_VALUE);
+		mark_map_regs(other_branch, insn->dst_reg,
+			      opcode == BPF_JEQ ? UNKNOWN_VALUE : PTR_TO_MAP_VALUE);
 	} else if (BPF_SRC(insn->code) == BPF_X && opcode == BPF_JGT &&
 		   dst_reg->type == PTR_TO_PACKET &&
 		   regs[insn->src_reg].type == PTR_TO_PACKET_END) {

^ permalink raw reply	[flat|nested] 92+ messages in thread

* [PATCH 4.9 39/93] bpf: fix state equivalence
  2017-03-20 17:50 [PATCH 4.9 00/93] 4.9.17-stable review Greg Kroah-Hartman
                   ` (33 preceding siblings ...)
  2017-03-20 17:51 ` [PATCH 4.9 38/93] bpf: Detect identical PTR_TO_MAP_VALUE_OR_NULL registers Greg Kroah-Hartman
@ 2017-03-20 17:51 ` Greg Kroah-Hartman
  2017-03-20 17:51 ` [PATCH 4.9 40/93] bpf: fix regression on verifier pruning wrt map lookups Greg Kroah-Hartman
                   ` (54 subsequent siblings)
  89 siblings, 0 replies; 92+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-20 17:51 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Alexei Starovoitov, Daniel Borkmann,
	Thomas Graf, David S. Miller

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Alexei Starovoitov <ast@fb.com>


[ Upstream commit d2a4dd37f6b41fbcad76efbf63124eb3126c66fe ]

Commmits 57a09bf0a416 ("bpf: Detect identical PTR_TO_MAP_VALUE_OR_NULL registers")
and 484611357c19 ("bpf: allow access into map value arrays") by themselves
are correct, but in combination they make state equivalence ignore 'id' field
of the register state which can lead to accepting invalid program.

Fixes: 57a09bf0a416 ("bpf: Detect identical PTR_TO_MAP_VALUE_OR_NULL registers")
Fixes: 484611357c19 ("bpf: allow access into map value arrays")
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 include/linux/bpf_verifier.h |   14 +++++++-------
 kernel/bpf/verifier.c        |    2 +-
 2 files changed, 8 insertions(+), 8 deletions(-)

--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -18,13 +18,6 @@
 
 struct bpf_reg_state {
 	enum bpf_reg_type type;
-	/*
-	 * Used to determine if any memory access using this register will
-	 * result in a bad access.
-	 */
-	s64 min_value;
-	u64 max_value;
-	u32 id;
 	union {
 		/* valid when type == CONST_IMM | PTR_TO_STACK | UNKNOWN_VALUE */
 		s64 imm;
@@ -40,6 +33,13 @@ struct bpf_reg_state {
 		 */
 		struct bpf_map *map_ptr;
 	};
+	u32 id;
+	/* Used to determine if any memory access using this register will
+	 * result in a bad access. These two fields must be last.
+	 * See states_equal()
+	 */
+	s64 min_value;
+	u64 max_value;
 };
 
 enum bpf_stack_slot_type {
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -2498,7 +2498,7 @@ static bool states_equal(struct bpf_veri
 		 * we didn't do a variable access into a map then we are a-ok.
 		 */
 		if (!varlen_map_access &&
-		    rold->type == rcur->type && rold->imm == rcur->imm)
+		    memcmp(rold, rcur, offsetofend(struct bpf_reg_state, id)) == 0)
 			continue;
 
 		/* If we didn't map access then again we don't care about the

^ permalink raw reply	[flat|nested] 92+ messages in thread

* [PATCH 4.9 40/93] bpf: fix regression on verifier pruning wrt map lookups
  2017-03-20 17:50 [PATCH 4.9 00/93] 4.9.17-stable review Greg Kroah-Hartman
                   ` (34 preceding siblings ...)
  2017-03-20 17:51 ` [PATCH 4.9 39/93] bpf: fix state equivalence Greg Kroah-Hartman
@ 2017-03-20 17:51 ` Greg Kroah-Hartman
  2017-03-20 17:51 ` [PATCH 4.9 41/93] bpf: fix mark_reg_unknown_value for spilled regs on map value marking Greg Kroah-Hartman
                   ` (53 subsequent siblings)
  89 siblings, 0 replies; 92+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-20 17:51 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Daniel Borkmann, Thomas Graf,
	Alexei Starovoitov, David S. Miller

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Daniel Borkmann <daniel@iogearbox.net>


[ Upstream commit a08dd0da5307ba01295c8383923e51e7997c3576 ]

Commit 57a09bf0a416 ("bpf: Detect identical PTR_TO_MAP_VALUE_OR_NULL
registers") introduced a regression where existing programs stopped
loading due to reaching the verifier's maximum complexity limit,
whereas prior to this commit they were loading just fine; the affected
program has roughly 2k instructions.

What was found is that state pruning couldn't be performed effectively
anymore due to mismatches of the verifier's register state, in particular
in the id tracking. It doesn't mean that 57a09bf0a416 is incorrect per
se, but rather that verifier needs to perform a lot more work for the
same program with regards to involved map lookups.

Since commit 57a09bf0a416 is only about tracking registers with type
PTR_TO_MAP_VALUE_OR_NULL, the id is only needed to follow registers
until they are promoted through pattern matching with a NULL check to
either PTR_TO_MAP_VALUE or UNKNOWN_VALUE type. After that point, the
id becomes irrelevant for the transitioned types.

For UNKNOWN_VALUE, id is already reset to 0 via mark_reg_unknown_value(),
but not so for PTR_TO_MAP_VALUE where id is becoming stale. It's even
transferred further into other types that don't make use of it. Among
others, one example is where UNKNOWN_VALUE is set on function call
return with RET_INTEGER return type.

states_equal() will then fall through the memcmp() on register state;
note that the second memcmp() uses offsetofend(), so the id is part of
that since d2a4dd37f6b4 ("bpf: fix state equivalence"). But the bisect
pointed already to 57a09bf0a416, where we really reach beyond complexity
limit. What I found was that states_equal() often failed in this
case due to id mismatches in spilled regs with registers in type
PTR_TO_MAP_VALUE. Unlike non-spilled regs, spilled regs just perform
a memcmp() on their reg state and don't have any other optimizations
in place, therefore also id was relevant in this case for making a
pruning decision.

We can safely reset id to 0 as well when converting to PTR_TO_MAP_VALUE.
For the affected program, it resulted in a ~17 fold reduction of
complexity and let the program load fine again. Selftest suite also
runs fine. The only other place where env->id_gen is used currently is
through direct packet access, but for these cases id is long living, thus
a different scenario.

Also, the current logic in mark_map_regs() is not fully correct when
marking NULL branch with UNKNOWN_VALUE. We need to cache the destination
reg's id in any case. Otherwise, once we marked that reg as UNKNOWN_VALUE,
it's id is reset and any subsequent registers that hold the original id
and are of type PTR_TO_MAP_VALUE_OR_NULL won't be marked UNKNOWN_VALUE
anymore, since mark_map_reg() reuses the uncached regs[regno].id that
was just overridden. Note, we don't need to cache it outside of
mark_map_regs(), since it's called once on this_branch and the other
time on other_branch, which are both two independent verifier states.
A test case for this is added here, too.

Fixes: 57a09bf0a416 ("bpf: Detect identical PTR_TO_MAP_VALUE_OR_NULL registers")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Thomas Graf <tgraf@suug.ch>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 kernel/bpf/verifier.c |   11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -1940,6 +1940,11 @@ static void mark_map_reg(struct bpf_reg_
 
 	if (reg->type == PTR_TO_MAP_VALUE_OR_NULL && reg->id == id) {
 		reg->type = type;
+		/* We don't need id from this point onwards anymore, thus we
+		 * should better reset it, so that state pruning has chances
+		 * to take effect.
+		 */
+		reg->id = 0;
 		if (type == UNKNOWN_VALUE)
 			mark_reg_unknown_value(regs, regno);
 	}
@@ -1952,16 +1957,16 @@ static void mark_map_regs(struct bpf_ver
 			  enum bpf_reg_type type)
 {
 	struct bpf_reg_state *regs = state->regs;
+	u32 id = regs[regno].id;
 	int i;
 
 	for (i = 0; i < MAX_BPF_REG; i++)
-		mark_map_reg(regs, i, regs[regno].id, type);
+		mark_map_reg(regs, i, id, type);
 
 	for (i = 0; i < MAX_BPF_STACK; i += BPF_REG_SIZE) {
 		if (state->stack_slot_type[i] != STACK_SPILL)
 			continue;
-		mark_map_reg(state->spilled_regs, i / BPF_REG_SIZE,
-			     regs[regno].id, type);
+		mark_map_reg(state->spilled_regs, i / BPF_REG_SIZE, id, type);
 	}
 }
 

^ permalink raw reply	[flat|nested] 92+ messages in thread

* [PATCH 4.9 41/93] bpf: fix mark_reg_unknown_value for spilled regs on map value marking
  2017-03-20 17:50 [PATCH 4.9 00/93] 4.9.17-stable review Greg Kroah-Hartman
                   ` (35 preceding siblings ...)
  2017-03-20 17:51 ` [PATCH 4.9 40/93] bpf: fix regression on verifier pruning wrt map lookups Greg Kroah-Hartman
@ 2017-03-20 17:51 ` Greg Kroah-Hartman
  2017-03-20 17:51 ` [PATCH 4.9 42/93] dmaengine: iota: ioat_alloc_chan_resources should not perform sleeping allocations Greg Kroah-Hartman
                   ` (52 subsequent siblings)
  89 siblings, 0 replies; 92+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-20 17:51 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Martin KaFai Lau, Daniel Borkmann,
	Alexei Starovoitov, David S. Miller

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Daniel Borkmann <daniel@iogearbox.net>


[ Upstream commit 6760bf2ddde8ad64f8205a651223a93de3a35494 ]

Martin reported a verifier issue that hit the BUG_ON() for his
test case in the mark_reg_unknown_value() function:

  [  202.861380] kernel BUG at kernel/bpf/verifier.c:467!
  [...]
  [  203.291109] Call Trace:
  [  203.296501]  [<ffffffff811364d5>] mark_map_reg+0x45/0x50
  [  203.308225]  [<ffffffff81136558>] mark_map_regs+0x78/0x90
  [  203.320140]  [<ffffffff8113938d>] do_check+0x226d/0x2c90
  [  203.331865]  [<ffffffff8113a6ab>] bpf_check+0x48b/0x780
  [  203.343403]  [<ffffffff81134c8e>] bpf_prog_load+0x27e/0x440
  [  203.355705]  [<ffffffff8118a38f>] ? handle_mm_fault+0x11af/0x1230
  [  203.369158]  [<ffffffff812d8188>] ? security_capable+0x48/0x60
  [  203.382035]  [<ffffffff811351a4>] SyS_bpf+0x124/0x960
  [  203.393185]  [<ffffffff810515f6>] ? __do_page_fault+0x276/0x490
  [  203.406258]  [<ffffffff816db320>] entry_SYSCALL_64_fastpath+0x13/0x94

This issue got uncovered after the fix in a08dd0da5307 ("bpf: fix
regression on verifier pruning wrt map lookups"). The reason why it
wasn't noticed before was, because as mentioned in a08dd0da5307,
mark_map_regs() was doing the id matching incorrectly based on the
uncached regs[regno].id. So, in the first loop, we walked all regs
and as soon as we found regno == i, then this reg's id was cleared
when calling mark_reg_unknown_value() thus that every subsequent
register was probed against id of 0 (which, in combination with the
PTR_TO_MAP_VALUE_OR_NULL type is an invalid condition that no other
register state can hold), and therefore wasn't type transitioned such
as in the spilled register case for the second loop.

Now since that got fixed, it turned out that 57a09bf0a416 ("bpf:
Detect identical PTR_TO_MAP_VALUE_OR_NULL registers") used
mark_reg_unknown_value() incorrectly for the spilled regs, and thus
hitting the BUG_ON() in some cases due to regno >= MAX_BPF_REG.

Although spilled regs have the same type as the non-spilled regs
for the verifier state, that is, struct bpf_reg_state, they are
semantically different from the non-spilled regs. In other words,
there can be up to 64 (MAX_BPF_STACK / BPF_REG_SIZE) spilled regs
in the stack, for example, register R<x> could have been spilled by
the program to stack location X, Y, Z, and in mark_map_regs() we
need to scan these stack slots of type STACK_SPILL for potential
registers that we have to transition from PTR_TO_MAP_VALUE_OR_NULL.
Therefore, depending on the location, the spilled_regs regno can
be a lot higher than just MAX_BPF_REG's value since we operate on
stack instead. The reset in mark_reg_unknown_value() itself is
just fine, only that the BUG_ON() was inappropriate for this. Fix
it by making a __mark_reg_unknown_value() version that can be
called from mark_map_reg() generically; we know for the non-spilled
case that the regno is always < MAX_BPF_REG anyway.

Fixes: 57a09bf0a416 ("bpf: Detect identical PTR_TO_MAP_VALUE_OR_NULL registers")
Reported-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 kernel/bpf/verifier.c |   11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -444,14 +444,19 @@ static void init_reg_state(struct bpf_re
 	regs[BPF_REG_1].type = PTR_TO_CTX;
 }
 
-static void mark_reg_unknown_value(struct bpf_reg_state *regs, u32 regno)
+static void __mark_reg_unknown_value(struct bpf_reg_state *regs, u32 regno)
 {
-	BUG_ON(regno >= MAX_BPF_REG);
 	regs[regno].type = UNKNOWN_VALUE;
 	regs[regno].id = 0;
 	regs[regno].imm = 0;
 }
 
+static void mark_reg_unknown_value(struct bpf_reg_state *regs, u32 regno)
+{
+	BUG_ON(regno >= MAX_BPF_REG);
+	__mark_reg_unknown_value(regs, regno);
+}
+
 static void reset_reg_range_values(struct bpf_reg_state *regs, u32 regno)
 {
 	regs[regno].min_value = BPF_REGISTER_MIN_RANGE;
@@ -1946,7 +1951,7 @@ static void mark_map_reg(struct bpf_reg_
 		 */
 		reg->id = 0;
 		if (type == UNKNOWN_VALUE)
-			mark_reg_unknown_value(regs, regno);
+			__mark_reg_unknown_value(regs, regno);
 	}
 }
 

^ permalink raw reply	[flat|nested] 92+ messages in thread

* [PATCH 4.9 42/93] dmaengine: iota: ioat_alloc_chan_resources should not perform sleeping allocations.
  2017-03-20 17:50 [PATCH 4.9 00/93] 4.9.17-stable review Greg Kroah-Hartman
                   ` (36 preceding siblings ...)
  2017-03-20 17:51 ` [PATCH 4.9 41/93] bpf: fix mark_reg_unknown_value for spilled regs on map value marking Greg Kroah-Hartman
@ 2017-03-20 17:51 ` Greg Kroah-Hartman
  2017-03-20 17:51 ` [PATCH 4.9 43/93] xen: do not re-use pirq number cached in pci device msi msg data Greg Kroah-Hartman
                   ` (51 subsequent siblings)
  89 siblings, 0 replies; 92+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-20 17:51 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Krister Johansen, Dave Jiang, Vinod Koul

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Krister Johansen <kjlx@templeofstupid.com>

commit 21d25f6a4217e755906cb548b55ddab39d0e88b9 upstream.

On a kernel with DEBUG_LOCKS, ioat_free_chan_resources triggers an
in_interrupt() warning.  With PROVE_LOCKING, it reports detecting a
SOFTIRQ-safe to SOFTIRQ-unsafe lock ordering in the same code path.

This is because dma_generic_alloc_coherent() checks if the GFP flags
permit blocking.  It allocates from different subsystems if blocking is
permitted.  The free path knows how to return the memory to the correct
allocator.  If GFP_KERNEL is specified then the alloc and free end up
going through cma_alloc(), which uses mutexes.

Given that ioat_free_chan_resources() can be called in interrupt
context, ioat_alloc_chan_resources() must specify GFP_NOWAIT so that the
allocations do not block and instead use an allocator that uses
spinlocks.

Signed-off-by: Krister Johansen <kjlx@templeofstupid.com>
Acked-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/dma/ioat/init.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/drivers/dma/ioat/init.c
+++ b/drivers/dma/ioat/init.c
@@ -691,7 +691,7 @@ static int ioat_alloc_chan_resources(str
 	/* doing 2 32bit writes to mmio since 1 64b write doesn't work */
 	ioat_chan->completion =
 		dma_pool_zalloc(ioat_chan->ioat_dma->completion_pool,
-				GFP_KERNEL, &ioat_chan->completion_dma);
+				GFP_NOWAIT, &ioat_chan->completion_dma);
 	if (!ioat_chan->completion)
 		return -ENOMEM;
 
@@ -701,7 +701,7 @@ static int ioat_alloc_chan_resources(str
 	       ioat_chan->reg_base + IOAT_CHANCMP_OFFSET_HIGH);
 
 	order = IOAT_MAX_ORDER;
-	ring = ioat_alloc_ring(c, order, GFP_KERNEL);
+	ring = ioat_alloc_ring(c, order, GFP_NOWAIT);
 	if (!ring)
 		return -ENOMEM;
 

^ permalink raw reply	[flat|nested] 92+ messages in thread

* [PATCH 4.9 43/93] xen: do not re-use pirq number cached in pci device msi msg data
  2017-03-20 17:50 [PATCH 4.9 00/93] 4.9.17-stable review Greg Kroah-Hartman
                   ` (37 preceding siblings ...)
  2017-03-20 17:51 ` [PATCH 4.9 42/93] dmaengine: iota: ioat_alloc_chan_resources should not perform sleeping allocations Greg Kroah-Hartman
@ 2017-03-20 17:51 ` Greg Kroah-Hartman
  2017-03-20 17:51 ` [PATCH 4.9 44/93] igb: Workaround for igb i210 firmware issue Greg Kroah-Hartman
                   ` (50 subsequent siblings)
  89 siblings, 0 replies; 92+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-20 17:51 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Dan Streetman, Stefano Stabellini,
	Konrad Rzeszutek Wilk, Boris Ostrovsky, Sasha Levin

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Dan Streetman <ddstreet@ieee.org>

[ Upstream commit c74fd80f2f41d05f350bb478151021f88551afe8 ]

Revert the main part of commit:
af42b8d12f8a ("xen: fix MSI setup and teardown for PV on HVM guests")

That commit introduced reading the pci device's msi message data to see
if a pirq was previously configured for the device's msi/msix, and re-use
that pirq.  At the time, that was the correct behavior.  However, a
later change to Qemu caused it to call into the Xen hypervisor to unmap
all pirqs for a pci device, when the pci device disables its MSI/MSIX
vectors; specifically the Qemu commit:
c976437c7dba9c7444fb41df45468968aaa326ad
("qemu-xen: free all the pirqs for msi/msix when driver unload")

Once Qemu added this pirq unmapping, it was no longer correct for the
kernel to re-use the pirq number cached in the pci device msi message
data.  All Qemu releases since 2.1.0 contain the patch that unmaps the
pirqs when the pci device disables its MSI/MSIX vectors.

This bug is causing failures to initialize multiple NVMe controllers
under Xen, because the NVMe driver sets up a single MSIX vector for
each controller (concurrently), and then after using that to talk to
the controller for some configuration data, it disables the single MSIX
vector and re-configures all the MSIX vectors it needs.  So the MSIX
setup code tries to re-use the cached pirq from the first vector
for each controller, but the hypervisor has already given away that
pirq to another controller, and its initialization fails.

This is discussed in more detail at:
https://lists.xen.org/archives/html/xen-devel/2017-01/msg00447.html

Fixes: af42b8d12f8a ("xen: fix MSI setup and teardown for PV on HVM guests")
Signed-off-by: Dan Streetman <dan.streetman@canonical.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/x86/pci/xen.c |   23 +++++++----------------
 1 file changed, 7 insertions(+), 16 deletions(-)

--- a/arch/x86/pci/xen.c
+++ b/arch/x86/pci/xen.c
@@ -234,23 +234,14 @@ static int xen_hvm_setup_msi_irqs(struct
 		return 1;
 
 	for_each_pci_msi_entry(msidesc, dev) {
-		__pci_read_msi_msg(msidesc, &msg);
-		pirq = MSI_ADDR_EXT_DEST_ID(msg.address_hi) |
-			((msg.address_lo >> MSI_ADDR_DEST_ID_SHIFT) & 0xff);
-		if (msg.data != XEN_PIRQ_MSI_DATA ||
-		    xen_irq_from_pirq(pirq) < 0) {
-			pirq = xen_allocate_pirq_msi(dev, msidesc);
-			if (pirq < 0) {
-				irq = -ENODEV;
-				goto error;
-			}
-			xen_msi_compose_msg(dev, pirq, &msg);
-			__pci_write_msi_msg(msidesc, &msg);
-			dev_dbg(&dev->dev, "xen: msi bound to pirq=%d\n", pirq);
-		} else {
-			dev_dbg(&dev->dev,
-				"xen: msi already bound to pirq=%d\n", pirq);
+		pirq = xen_allocate_pirq_msi(dev, msidesc);
+		if (pirq < 0) {
+			irq = -ENODEV;
+			goto error;
 		}
+		xen_msi_compose_msg(dev, pirq, &msg);
+		__pci_write_msi_msg(msidesc, &msg);
+		dev_dbg(&dev->dev, "xen: msi bound to pirq=%d\n", pirq);
 		irq = xen_bind_pirq_msi_to_irq(dev, msidesc, pirq,
 					       (type == PCI_CAP_ID_MSI) ? nvec : 1,
 					       (type == PCI_CAP_ID_MSIX) ?

^ permalink raw reply	[flat|nested] 92+ messages in thread

* [PATCH 4.9 44/93] igb: Workaround for igb i210 firmware issue
  2017-03-20 17:50 [PATCH 4.9 00/93] 4.9.17-stable review Greg Kroah-Hartman
                   ` (38 preceding siblings ...)
  2017-03-20 17:51 ` [PATCH 4.9 43/93] xen: do not re-use pirq number cached in pci device msi msg data Greg Kroah-Hartman
@ 2017-03-20 17:51 ` Greg Kroah-Hartman
  2017-03-20 17:51 ` [PATCH 4.9 45/93] igb: add i211 to i210 PHY workaround Greg Kroah-Hartman
                   ` (49 subsequent siblings)
  89 siblings, 0 replies; 92+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-20 17:51 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Chris J Arges, Aaron Brown,
	Jeff Kirsher, Sasha Levin

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Chris J Arges <christopherarges@gmail.com>

[ Upstream commit 4e684f59d760a2c7c716bb60190783546e2d08a1 ]

Sometimes firmware may not properly initialize I347AT4_PAGE_SELECT causing
the probe of an igb i210 NIC to fail. This patch adds an addition zeroing
of this register during igb_get_phy_id to workaround this issue.

Thanks for Jochen Henneberg for the idea and original patch.

Signed-off-by: Chris J Arges <christopherarges@gmail.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/net/ethernet/intel/igb/e1000_phy.c |    4 ++++
 1 file changed, 4 insertions(+)

--- a/drivers/net/ethernet/intel/igb/e1000_phy.c
+++ b/drivers/net/ethernet/intel/igb/e1000_phy.c
@@ -77,6 +77,10 @@ s32 igb_get_phy_id(struct e1000_hw *hw)
 	s32 ret_val = 0;
 	u16 phy_id;
 
+	/* ensure PHY page selection to fix misconfigured i210 */
+	if (hw->mac.type == e1000_i210)
+		phy->ops.write_reg(hw, I347AT4_PAGE_SELECT, 0);
+
 	ret_val = phy->ops.read_reg(hw, PHY_ID1, &phy_id);
 	if (ret_val)
 		goto out;

^ permalink raw reply	[flat|nested] 92+ messages in thread

* [PATCH 4.9 45/93] igb: add i211 to i210 PHY workaround
  2017-03-20 17:50 [PATCH 4.9 00/93] 4.9.17-stable review Greg Kroah-Hartman
                   ` (39 preceding siblings ...)
  2017-03-20 17:51 ` [PATCH 4.9 44/93] igb: Workaround for igb i210 firmware issue Greg Kroah-Hartman
@ 2017-03-20 17:51 ` Greg Kroah-Hartman
  2017-03-20 17:51 ` [PATCH 4.9 46/93] scsi: ibmvscsis: Issues from Dan Carpenter/Smatch Greg Kroah-Hartman
                   ` (48 subsequent siblings)
  89 siblings, 0 replies; 92+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-20 17:51 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Todd Fujinaka, Aaron Brown,
	Jeff Kirsher, Sasha Levin

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Todd Fujinaka <todd.fujinaka@intel.com>

[ Upstream commit 5bc8c230e2a993b49244f9457499f17283da9ec7 ]

i210 and i211 share the same PHY but have different PCI IDs. Don't
forget i211 for any i210 workarounds.

Signed-off-by: Todd Fujinaka <todd.fujinaka@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/net/ethernet/intel/igb/e1000_phy.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/net/ethernet/intel/igb/e1000_phy.c
+++ b/drivers/net/ethernet/intel/igb/e1000_phy.c
@@ -78,7 +78,7 @@ s32 igb_get_phy_id(struct e1000_hw *hw)
 	u16 phy_id;
 
 	/* ensure PHY page selection to fix misconfigured i210 */
-	if (hw->mac.type == e1000_i210)
+	if ((hw->mac.type == e1000_i210) || (hw->mac.type == e1000_i211))
 		phy->ops.write_reg(hw, I347AT4_PAGE_SELECT, 0);
 
 	ret_val = phy->ops.read_reg(hw, PHY_ID1, &phy_id);

^ permalink raw reply	[flat|nested] 92+ messages in thread

* [PATCH 4.9 46/93] scsi: ibmvscsis: Issues from Dan Carpenter/Smatch
  2017-03-20 17:50 [PATCH 4.9 00/93] 4.9.17-stable review Greg Kroah-Hartman
                   ` (40 preceding siblings ...)
  2017-03-20 17:51 ` [PATCH 4.9 45/93] igb: add i211 to i210 PHY workaround Greg Kroah-Hartman
@ 2017-03-20 17:51 ` Greg Kroah-Hartman
  2017-03-20 17:51 ` [PATCH 4.9 47/93] scsi: ibmvscsis: Return correct partition name/# to client Greg Kroah-Hartman
                   ` (47 subsequent siblings)
  89 siblings, 0 replies; 92+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-20 17:51 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Michael Cyr, Bryant G. Ly,
	Steven Royer, Martin K. Petersen, Sasha Levin

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Michael Cyr <mikecyr@us.ibm.com>

[ Upstream commit 11950d70b52d2bc5e3580da8cd63909ef38d67db ]

Signed-off-by: Michael Cyr <mikecyr@us.ibm.com>
Signed-off-by: Bryant G. Ly <bryantly@linux.vnet.ibm.com>
Tested-by: Steven Royer <seroyer@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.c |   13 +++----------
 1 file changed, 3 insertions(+), 10 deletions(-)

--- a/drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.c
+++ b/drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.c
@@ -1746,14 +1746,7 @@ static long ibmvscsis_mad(struct scsi_in
 
 		pr_debug("mad: type %d\n", be32_to_cpu(mad->type));
 
-		if (be16_to_cpu(mad->length) < 0) {
-			dev_err(&vscsi->dev, "mad: length is < 0\n");
-			ibmvscsis_post_disconnect(vscsi,
-						  ERR_DISCONNECT_RECONNECT, 0);
-			rc = SRP_VIOLATION;
-		} else {
-			rc = ibmvscsis_process_mad(vscsi, iue);
-		}
+		rc = ibmvscsis_process_mad(vscsi, iue);
 
 		pr_debug("mad: status %hd, rc %ld\n", be16_to_cpu(mad->status),
 			 rc);
@@ -2523,7 +2516,6 @@ static void ibmvscsis_parse_cmd(struct s
 		dev_err(&vscsi->dev, "0x%llx: parsing SRP descriptor table failed.\n",
 			srp->tag);
 		goto fail;
-		return;
 	}
 
 	cmd->rsp.sol_not = srp->sol_not;
@@ -3379,7 +3371,8 @@ static int ibmvscsis_probe(struct vio_de
 	INIT_LIST_HEAD(&vscsi->waiting_rsp);
 	INIT_LIST_HEAD(&vscsi->active_q);
 
-	snprintf(vscsi->tport.tport_name, 256, "%s", dev_name(&vdev->dev));
+	snprintf(vscsi->tport.tport_name, IBMVSCSIS_NAMELEN, "%s",
+		 dev_name(&vdev->dev));
 
 	pr_debug("probe tport_name: %s\n", vscsi->tport.tport_name);
 

^ permalink raw reply	[flat|nested] 92+ messages in thread

* [PATCH 4.9 47/93] scsi: ibmvscsis: Return correct partition name/# to client
  2017-03-20 17:50 [PATCH 4.9 00/93] 4.9.17-stable review Greg Kroah-Hartman
                   ` (41 preceding siblings ...)
  2017-03-20 17:51 ` [PATCH 4.9 46/93] scsi: ibmvscsis: Issues from Dan Carpenter/Smatch Greg Kroah-Hartman
@ 2017-03-20 17:51 ` Greg Kroah-Hartman
  2017-03-20 17:51 ` [PATCH 4.9 48/93] scsi: ibmvscsis: Clean up properly if target_submit_cmd/tmr fails Greg Kroah-Hartman
                   ` (46 subsequent siblings)
  89 siblings, 0 replies; 92+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-20 17:51 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Michael Cyr, Bryant G. Ly,
	Steven Royer, Martin K. Petersen, Sasha Levin

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Michael Cyr <mikecyr@us.ibm.com>

[ Upstream commit 9c93cf03d4eb3dc58931ff7cac0af9c344fe5e0b ]

Signed-off-by: Michael Cyr <mikecyr@us.ibm.com>
Signed-off-by: Bryant G. Ly <bryantly@linux.vnet.ibm.com>
Tested-by: Steven Royer <seroyer@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.c |    5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

--- a/drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.c
+++ b/drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.c
@@ -3387,6 +3387,9 @@ static int ibmvscsis_probe(struct vio_de
 	strncat(vscsi->eye, vdev->name, MAX_EYE);
 
 	vscsi->dds.unit_id = vdev->unit_address;
+	strncpy(vscsi->dds.partition_name, partition_name,
+		sizeof(vscsi->dds.partition_name));
+	vscsi->dds.partition_num = partition_number;
 
 	spin_lock_bh(&ibmvscsis_dev_lock);
 	list_add_tail(&vscsi->list, &ibmvscsis_dev_list);
@@ -3603,7 +3606,7 @@ static int ibmvscsis_get_system_info(voi
 
 	num = of_get_property(rootdn, "ibm,partition-no", NULL);
 	if (num)
-		partition_number = *num;
+		partition_number = of_read_number(num, 1);
 
 	of_node_put(rootdn);
 

^ permalink raw reply	[flat|nested] 92+ messages in thread

* [PATCH 4.9 48/93] scsi: ibmvscsis: Clean up properly if target_submit_cmd/tmr fails
  2017-03-20 17:50 [PATCH 4.9 00/93] 4.9.17-stable review Greg Kroah-Hartman
                   ` (42 preceding siblings ...)
  2017-03-20 17:51 ` [PATCH 4.9 47/93] scsi: ibmvscsis: Return correct partition name/# to client Greg Kroah-Hartman
@ 2017-03-20 17:51 ` Greg Kroah-Hartman
  2017-03-20 17:51 ` [PATCH 4.9 49/93] scsi: ibmvscsis: Rearrange functions for future patches Greg Kroah-Hartman
                   ` (45 subsequent siblings)
  89 siblings, 0 replies; 92+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-20 17:51 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Michael Cyr, Bryant G. Ly,
	Steven Royer, Martin K. Petersen, Sasha Levin

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Michael Cyr <mikecyr@us.ibm.com>

[ Upstream commit 7435b32e2d2fb5da6c2ae9b9c8ce56d8a3cb3bc3 ]

Signed-off-by: Michael Cyr <mikecyr@us.ibm.com>
Signed-off-by: Bryant G. Ly <bryantly@linux.vnet.ibm.com>
Tested-by: Steven Royer <seroyer@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.c |    7 +++++++
 1 file changed, 7 insertions(+)

--- a/drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.c
+++ b/drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.c
@@ -2552,6 +2552,10 @@ static void ibmvscsis_parse_cmd(struct s
 			       data_len, attr, dir, 0);
 	if (rc) {
 		dev_err(&vscsi->dev, "target_submit_cmd failed, rc %d\n", rc);
+		spin_lock_bh(&vscsi->intr_lock);
+		list_del(&cmd->list);
+		ibmvscsis_free_cmd_resources(vscsi, cmd);
+		spin_unlock_bh(&vscsi->intr_lock);
 		goto fail;
 	}
 	return;
@@ -2631,6 +2635,9 @@ static void ibmvscsis_parse_task(struct
 		if (rc) {
 			dev_err(&vscsi->dev, "target_submit_tmr failed, rc %d\n",
 				rc);
+			spin_lock_bh(&vscsi->intr_lock);
+			list_del(&cmd->list);
+			spin_unlock_bh(&vscsi->intr_lock);
 			cmd->se_cmd.se_tmr_req->response =
 				TMR_FUNCTION_REJECTED;
 		}

^ permalink raw reply	[flat|nested] 92+ messages in thread

* [PATCH 4.9 49/93] scsi: ibmvscsis: Rearrange functions for future patches
  2017-03-20 17:50 [PATCH 4.9 00/93] 4.9.17-stable review Greg Kroah-Hartman
                   ` (43 preceding siblings ...)
  2017-03-20 17:51 ` [PATCH 4.9 48/93] scsi: ibmvscsis: Clean up properly if target_submit_cmd/tmr fails Greg Kroah-Hartman
@ 2017-03-20 17:51 ` Greg Kroah-Hartman
  2017-03-20 17:51 ` [PATCH 4.9 50/93] scsi: ibmvscsis: Synchronize cmds at tpg_enable_store time Greg Kroah-Hartman
                   ` (44 subsequent siblings)
  89 siblings, 0 replies; 92+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-20 17:51 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Michael Cyr, Bryant G. Ly,
	Steven Royer, Martin K. Petersen, Sasha Levin

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Michael Cyr <mikecyr@us.ibm.com>

[ Upstream commit 79fac9c9b74f4951c9ce82b22e714bcc34ae4a56 ]

This patch reorders functions in a manner necessary for a follow-on
patch.  It also makes some minor styling changes (mostly removing extra
spaces) and fixes some typos.

There are no code changes in this patch, with one exception: due to the
reordering of the functions, I needed to explicitly declare a function
at the top of the file.  However, this will be removed in the next patch,
since the code requiring the predeclaration will be removed.

Signed-off-by: Michael Cyr <mikecyr@us.ibm.com>
Signed-off-by: Bryant G. Ly <bryantly@linux.vnet.ibm.com>
Tested-by: Steven Royer <seroyer@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.c |  658 +++++++++++++++----------------
 1 file changed, 330 insertions(+), 328 deletions(-)

--- a/drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.c
+++ b/drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.c
@@ -22,7 +22,7 @@
  *
  ****************************************************************************/
 
-#define pr_fmt(fmt)     KBUILD_MODNAME ": " fmt
+#define pr_fmt(fmt)	KBUILD_MODNAME ": " fmt
 
 #include <linux/module.h>
 #include <linux/kernel.h>
@@ -62,6 +62,8 @@ static long ibmvscsis_parse_command(stru
 
 static void ibmvscsis_adapter_idle(struct scsi_info *vscsi);
 
+static void ibmvscsis_reset_queue(struct scsi_info *vscsi, uint new_state);
+
 static void ibmvscsis_determine_resid(struct se_cmd *se_cmd,
 				      struct srp_rsp *rsp)
 {
@@ -82,7 +84,7 @@ static void ibmvscsis_determine_resid(st
 		}
 	} else if (se_cmd->se_cmd_flags & SCF_OVERFLOW_BIT) {
 		if (se_cmd->data_direction == DMA_TO_DEVICE) {
-			/*  residual data from an overflow write */
+			/* residual data from an overflow write */
 			rsp->flags = SRP_RSP_FLAG_DOOVER;
 			rsp->data_out_res_cnt = cpu_to_be32(residual_count);
 		} else if (se_cmd->data_direction == DMA_FROM_DEVICE) {
@@ -102,7 +104,7 @@ static void ibmvscsis_determine_resid(st
  * and the function returns TRUE.
  *
  * EXECUTION ENVIRONMENT:
- *      Interrupt or Process environment
+ *	Interrupt or Process environment
  */
 static bool connection_broken(struct scsi_info *vscsi)
 {
@@ -325,7 +327,7 @@ static struct viosrp_crq *ibmvscsis_cmd_
 }
 
 /**
- * ibmvscsis_send_init_message() -  send initialize message to the client
+ * ibmvscsis_send_init_message() - send initialize message to the client
  * @vscsi:	Pointer to our adapter structure
  * @format:	Which Init Message format to send
  *
@@ -383,13 +385,13 @@ static long ibmvscsis_check_init_msg(str
 					      vscsi->cmd_q.base_addr);
 		if (crq) {
 			*format = (uint)(crq->format);
-			rc =  ERROR;
+			rc = ERROR;
 			crq->valid = INVALIDATE_CMD_RESP_EL;
 			dma_rmb();
 		}
 	} else {
 		*format = (uint)(crq->format);
-		rc =  ERROR;
+		rc = ERROR;
 		crq->valid = INVALIDATE_CMD_RESP_EL;
 		dma_rmb();
 	}
@@ -398,166 +400,6 @@ static long ibmvscsis_check_init_msg(str
 }
 
 /**
- * ibmvscsis_establish_new_q() - Establish new CRQ queue
- * @vscsi:	Pointer to our adapter structure
- * @new_state:	New state being established after resetting the queue
- *
- * Must be called with interrupt lock held.
- */
-static long ibmvscsis_establish_new_q(struct scsi_info *vscsi,  uint new_state)
-{
-	long rc = ADAPT_SUCCESS;
-	uint format;
-
-	vscsi->flags &= PRESERVE_FLAG_FIELDS;
-	vscsi->rsp_q_timer.timer_pops = 0;
-	vscsi->debit = 0;
-	vscsi->credit = 0;
-
-	rc = vio_enable_interrupts(vscsi->dma_dev);
-	if (rc) {
-		pr_warn("reset_queue: failed to enable interrupts, rc %ld\n",
-			rc);
-		return rc;
-	}
-
-	rc = ibmvscsis_check_init_msg(vscsi, &format);
-	if (rc) {
-		dev_err(&vscsi->dev, "reset_queue: check_init_msg failed, rc %ld\n",
-			rc);
-		return rc;
-	}
-
-	if (format == UNUSED_FORMAT && new_state == WAIT_CONNECTION) {
-		rc = ibmvscsis_send_init_message(vscsi, INIT_MSG);
-		switch (rc) {
-		case H_SUCCESS:
-		case H_DROPPED:
-		case H_CLOSED:
-			rc = ADAPT_SUCCESS;
-			break;
-
-		case H_PARAMETER:
-		case H_HARDWARE:
-			break;
-
-		default:
-			vscsi->state = UNDEFINED;
-			rc = H_HARDWARE;
-			break;
-		}
-	}
-
-	return rc;
-}
-
-/**
- * ibmvscsis_reset_queue() - Reset CRQ Queue
- * @vscsi:	Pointer to our adapter structure
- * @new_state:	New state to establish after resetting the queue
- *
- * This function calls h_free_q and then calls h_reg_q and does all
- * of the bookkeeping to get us back to where we can communicate.
- *
- * Actually, we don't always call h_free_crq.  A problem was discovered
- * where one partition would close and reopen his queue, which would
- * cause his partner to get a transport event, which would cause him to
- * close and reopen his queue, which would cause the original partition
- * to get a transport event, etc., etc.  To prevent this, we don't
- * actually close our queue if the client initiated the reset, (i.e.
- * either we got a transport event or we have detected that the client's
- * queue is gone)
- *
- * EXECUTION ENVIRONMENT:
- *	Process environment, called with interrupt lock held
- */
-static void ibmvscsis_reset_queue(struct scsi_info *vscsi, uint new_state)
-{
-	int bytes;
-	long rc = ADAPT_SUCCESS;
-
-	pr_debug("reset_queue: flags 0x%x\n", vscsi->flags);
-
-	/* don't reset, the client did it for us */
-	if (vscsi->flags & (CLIENT_FAILED | TRANS_EVENT)) {
-		vscsi->flags &=  PRESERVE_FLAG_FIELDS;
-		vscsi->rsp_q_timer.timer_pops = 0;
-		vscsi->debit = 0;
-		vscsi->credit = 0;
-		vscsi->state = new_state;
-		vio_enable_interrupts(vscsi->dma_dev);
-	} else {
-		rc = ibmvscsis_free_command_q(vscsi);
-		if (rc == ADAPT_SUCCESS) {
-			vscsi->state = new_state;
-
-			bytes = vscsi->cmd_q.size * PAGE_SIZE;
-			rc = h_reg_crq(vscsi->dds.unit_id,
-				       vscsi->cmd_q.crq_token, bytes);
-			if (rc == H_CLOSED || rc == H_SUCCESS) {
-				rc = ibmvscsis_establish_new_q(vscsi,
-							       new_state);
-			}
-
-			if (rc != ADAPT_SUCCESS) {
-				pr_debug("reset_queue: reg_crq rc %ld\n", rc);
-
-				vscsi->state = ERR_DISCONNECTED;
-				vscsi->flags |=  RESPONSE_Q_DOWN;
-				ibmvscsis_free_command_q(vscsi);
-			}
-		} else {
-			vscsi->state = ERR_DISCONNECTED;
-			vscsi->flags |= RESPONSE_Q_DOWN;
-		}
-	}
-}
-
-/**
- * ibmvscsis_free_cmd_resources() - Free command resources
- * @vscsi:	Pointer to our adapter structure
- * @cmd:	Command which is not longer in use
- *
- * Must be called with interrupt lock held.
- */
-static void ibmvscsis_free_cmd_resources(struct scsi_info *vscsi,
-					 struct ibmvscsis_cmd *cmd)
-{
-	struct iu_entry *iue = cmd->iue;
-
-	switch (cmd->type) {
-	case TASK_MANAGEMENT:
-	case SCSI_CDB:
-		/*
-		 * When the queue goes down this value is cleared, so it
-		 * cannot be cleared in this general purpose function.
-		 */
-		if (vscsi->debit)
-			vscsi->debit -= 1;
-		break;
-	case ADAPTER_MAD:
-		vscsi->flags &= ~PROCESSING_MAD;
-		break;
-	case UNSET_TYPE:
-		break;
-	default:
-		dev_err(&vscsi->dev, "free_cmd_resources unknown type %d\n",
-			cmd->type);
-		break;
-	}
-
-	cmd->iue = NULL;
-	list_add_tail(&cmd->list, &vscsi->free_cmd);
-	srp_iu_put(iue);
-
-	if (list_empty(&vscsi->active_q) && list_empty(&vscsi->schedule_q) &&
-	    list_empty(&vscsi->waiting_rsp) && (vscsi->flags & WAIT_FOR_IDLE)) {
-		vscsi->flags &= ~WAIT_FOR_IDLE;
-		complete(&vscsi->wait_idle);
-	}
-}
-
-/**
  * ibmvscsis_disconnect() - Helper function to disconnect
  * @work:	Pointer to work_struct, gives access to our adapter structure
  *
@@ -590,7 +432,7 @@ static void ibmvscsis_disconnect(struct
 	 * should transitition to the new state
 	 */
 	switch (vscsi->state) {
-	/*  Should never be called while in this state. */
+	/* Should never be called while in this state. */
 	case NO_QUEUE:
 	/*
 	 * Can never transition from this state;
@@ -807,6 +649,316 @@ static void ibmvscsis_post_disconnect(st
 }
 
 /**
+ * ibmvscsis_handle_init_compl_msg() - Respond to an Init Complete Message
+ * @vscsi:	Pointer to our adapter structure
+ *
+ * Must be called with interrupt lock held.
+ */
+static long ibmvscsis_handle_init_compl_msg(struct scsi_info *vscsi)
+{
+	long rc = ADAPT_SUCCESS;
+
+	switch (vscsi->state) {
+	case NO_QUEUE:
+	case ERR_DISCONNECT:
+	case ERR_DISCONNECT_RECONNECT:
+	case ERR_DISCONNECTED:
+	case UNCONFIGURING:
+	case UNDEFINED:
+		rc = ERROR;
+		break;
+
+	case WAIT_CONNECTION:
+		vscsi->state = CONNECTED;
+		break;
+
+	case WAIT_IDLE:
+	case SRP_PROCESSING:
+	case CONNECTED:
+	case WAIT_ENABLED:
+	case PART_UP_WAIT_ENAB:
+	default:
+		rc = ERROR;
+		dev_err(&vscsi->dev, "init_msg: invalid state %d to get init compl msg\n",
+			vscsi->state);
+		ibmvscsis_post_disconnect(vscsi, ERR_DISCONNECT_RECONNECT, 0);
+		break;
+	}
+
+	return rc;
+}
+
+/**
+ * ibmvscsis_handle_init_msg() - Respond to an Init Message
+ * @vscsi:	Pointer to our adapter structure
+ *
+ * Must be called with interrupt lock held.
+ */
+static long ibmvscsis_handle_init_msg(struct scsi_info *vscsi)
+{
+	long rc = ADAPT_SUCCESS;
+
+	switch (vscsi->state) {
+	case WAIT_ENABLED:
+		vscsi->state = PART_UP_WAIT_ENAB;
+		break;
+
+	case WAIT_CONNECTION:
+		rc = ibmvscsis_send_init_message(vscsi, INIT_COMPLETE_MSG);
+		switch (rc) {
+		case H_SUCCESS:
+			vscsi->state = CONNECTED;
+			break;
+
+		case H_PARAMETER:
+			dev_err(&vscsi->dev, "init_msg: failed to send, rc %ld\n",
+				rc);
+			ibmvscsis_post_disconnect(vscsi, ERR_DISCONNECT, 0);
+			break;
+
+		case H_DROPPED:
+			dev_err(&vscsi->dev, "init_msg: failed to send, rc %ld\n",
+				rc);
+			rc = ERROR;
+			ibmvscsis_post_disconnect(vscsi,
+						  ERR_DISCONNECT_RECONNECT, 0);
+			break;
+
+		case H_CLOSED:
+			pr_warn("init_msg: failed to send, rc %ld\n", rc);
+			rc = 0;
+			break;
+		}
+		break;
+
+	case UNDEFINED:
+		rc = ERROR;
+		break;
+
+	case UNCONFIGURING:
+		break;
+
+	case PART_UP_WAIT_ENAB:
+	case CONNECTED:
+	case SRP_PROCESSING:
+	case WAIT_IDLE:
+	case NO_QUEUE:
+	case ERR_DISCONNECT:
+	case ERR_DISCONNECT_RECONNECT:
+	case ERR_DISCONNECTED:
+	default:
+		rc = ERROR;
+		dev_err(&vscsi->dev, "init_msg: invalid state %d to get init msg\n",
+			vscsi->state);
+		ibmvscsis_post_disconnect(vscsi, ERR_DISCONNECT_RECONNECT, 0);
+		break;
+	}
+
+	return rc;
+}
+
+/**
+ * ibmvscsis_init_msg() - Respond to an init message
+ * @vscsi:	Pointer to our adapter structure
+ * @crq:	Pointer to CRQ element containing the Init Message
+ *
+ * EXECUTION ENVIRONMENT:
+ *	Interrupt, interrupt lock held
+ */
+static long ibmvscsis_init_msg(struct scsi_info *vscsi, struct viosrp_crq *crq)
+{
+	long rc = ADAPT_SUCCESS;
+
+	pr_debug("init_msg: state 0x%hx\n", vscsi->state);
+
+	rc = h_vioctl(vscsi->dds.unit_id, H_GET_PARTNER_INFO,
+		      (u64)vscsi->map_ioba | ((u64)PAGE_SIZE << 32), 0, 0, 0,
+		      0);
+	if (rc == H_SUCCESS) {
+		vscsi->client_data.partition_number =
+			be64_to_cpu(*(u64 *)vscsi->map_buf);
+		pr_debug("init_msg, part num %d\n",
+			 vscsi->client_data.partition_number);
+	} else {
+		pr_debug("init_msg h_vioctl rc %ld\n", rc);
+		rc = ADAPT_SUCCESS;
+	}
+
+	if (crq->format == INIT_MSG) {
+		rc = ibmvscsis_handle_init_msg(vscsi);
+	} else if (crq->format == INIT_COMPLETE_MSG) {
+		rc = ibmvscsis_handle_init_compl_msg(vscsi);
+	} else {
+		rc = ERROR;
+		dev_err(&vscsi->dev, "init_msg: invalid format %d\n",
+			(uint)crq->format);
+		ibmvscsis_post_disconnect(vscsi, ERR_DISCONNECT_RECONNECT, 0);
+	}
+
+	return rc;
+}
+
+/**
+ * ibmvscsis_establish_new_q() - Establish new CRQ queue
+ * @vscsi:	Pointer to our adapter structure
+ * @new_state:	New state being established after resetting the queue
+ *
+ * Must be called with interrupt lock held.
+ */
+static long ibmvscsis_establish_new_q(struct scsi_info *vscsi, uint new_state)
+{
+	long rc = ADAPT_SUCCESS;
+	uint format;
+
+	vscsi->flags &= PRESERVE_FLAG_FIELDS;
+	vscsi->rsp_q_timer.timer_pops = 0;
+	vscsi->debit = 0;
+	vscsi->credit = 0;
+
+	rc = vio_enable_interrupts(vscsi->dma_dev);
+	if (rc) {
+		pr_warn("reset_queue: failed to enable interrupts, rc %ld\n",
+			rc);
+		return rc;
+	}
+
+	rc = ibmvscsis_check_init_msg(vscsi, &format);
+	if (rc) {
+		dev_err(&vscsi->dev, "reset_queue: check_init_msg failed, rc %ld\n",
+			rc);
+		return rc;
+	}
+
+	if (format == UNUSED_FORMAT && new_state == WAIT_CONNECTION) {
+		rc = ibmvscsis_send_init_message(vscsi, INIT_MSG);
+		switch (rc) {
+		case H_SUCCESS:
+		case H_DROPPED:
+		case H_CLOSED:
+			rc = ADAPT_SUCCESS;
+			break;
+
+		case H_PARAMETER:
+		case H_HARDWARE:
+			break;
+
+		default:
+			vscsi->state = UNDEFINED;
+			rc = H_HARDWARE;
+			break;
+		}
+	}
+
+	return rc;
+}
+
+/**
+ * ibmvscsis_reset_queue() - Reset CRQ Queue
+ * @vscsi:	Pointer to our adapter structure
+ * @new_state:	New state to establish after resetting the queue
+ *
+ * This function calls h_free_q and then calls h_reg_q and does all
+ * of the bookkeeping to get us back to where we can communicate.
+ *
+ * Actually, we don't always call h_free_crq.  A problem was discovered
+ * where one partition would close and reopen his queue, which would
+ * cause his partner to get a transport event, which would cause him to
+ * close and reopen his queue, which would cause the original partition
+ * to get a transport event, etc., etc.  To prevent this, we don't
+ * actually close our queue if the client initiated the reset, (i.e.
+ * either we got a transport event or we have detected that the client's
+ * queue is gone)
+ *
+ * EXECUTION ENVIRONMENT:
+ *	Process environment, called with interrupt lock held
+ */
+static void ibmvscsis_reset_queue(struct scsi_info *vscsi, uint new_state)
+{
+	int bytes;
+	long rc = ADAPT_SUCCESS;
+
+	pr_debug("reset_queue: flags 0x%x\n", vscsi->flags);
+
+	/* don't reset, the client did it for us */
+	if (vscsi->flags & (CLIENT_FAILED | TRANS_EVENT)) {
+		vscsi->flags &= PRESERVE_FLAG_FIELDS;
+		vscsi->rsp_q_timer.timer_pops = 0;
+		vscsi->debit = 0;
+		vscsi->credit = 0;
+		vscsi->state = new_state;
+		vio_enable_interrupts(vscsi->dma_dev);
+	} else {
+		rc = ibmvscsis_free_command_q(vscsi);
+		if (rc == ADAPT_SUCCESS) {
+			vscsi->state = new_state;
+
+			bytes = vscsi->cmd_q.size * PAGE_SIZE;
+			rc = h_reg_crq(vscsi->dds.unit_id,
+				       vscsi->cmd_q.crq_token, bytes);
+			if (rc == H_CLOSED || rc == H_SUCCESS) {
+				rc = ibmvscsis_establish_new_q(vscsi,
+							       new_state);
+			}
+
+			if (rc != ADAPT_SUCCESS) {
+				pr_debug("reset_queue: reg_crq rc %ld\n", rc);
+
+				vscsi->state = ERR_DISCONNECTED;
+				vscsi->flags |= RESPONSE_Q_DOWN;
+				ibmvscsis_free_command_q(vscsi);
+			}
+		} else {
+			vscsi->state = ERR_DISCONNECTED;
+			vscsi->flags |= RESPONSE_Q_DOWN;
+		}
+	}
+}
+
+/**
+ * ibmvscsis_free_cmd_resources() - Free command resources
+ * @vscsi:	Pointer to our adapter structure
+ * @cmd:	Command which is not longer in use
+ *
+ * Must be called with interrupt lock held.
+ */
+static void ibmvscsis_free_cmd_resources(struct scsi_info *vscsi,
+					 struct ibmvscsis_cmd *cmd)
+{
+	struct iu_entry *iue = cmd->iue;
+
+	switch (cmd->type) {
+	case TASK_MANAGEMENT:
+	case SCSI_CDB:
+		/*
+		 * When the queue goes down this value is cleared, so it
+		 * cannot be cleared in this general purpose function.
+		 */
+		if (vscsi->debit)
+			vscsi->debit -= 1;
+		break;
+	case ADAPTER_MAD:
+		vscsi->flags &= ~PROCESSING_MAD;
+		break;
+	case UNSET_TYPE:
+		break;
+	default:
+		dev_err(&vscsi->dev, "free_cmd_resources unknown type %d\n",
+			cmd->type);
+		break;
+	}
+
+	cmd->iue = NULL;
+	list_add_tail(&cmd->list, &vscsi->free_cmd);
+	srp_iu_put(iue);
+
+	if (list_empty(&vscsi->active_q) && list_empty(&vscsi->schedule_q) &&
+	    list_empty(&vscsi->waiting_rsp) && (vscsi->flags & WAIT_FOR_IDLE)) {
+		vscsi->flags &= ~WAIT_FOR_IDLE;
+		complete(&vscsi->wait_idle);
+	}
+}
+
+/**
  * ibmvscsis_trans_event() - Handle a Transport Event
  * @vscsi:	Pointer to our adapter structure
  * @crq:	Pointer to CRQ entry containing the Transport Event
@@ -896,7 +1048,7 @@ static long ibmvscsis_trans_event(struct
 		}
 	}
 
-	rc =  vscsi->flags & SCHEDULE_DISCONNECT;
+	rc = vscsi->flags & SCHEDULE_DISCONNECT;
 
 	pr_debug("Leaving trans_event: flags 0x%x, state 0x%hx, rc %ld\n",
 		 vscsi->flags, vscsi->state, rc);
@@ -1221,7 +1373,7 @@ static long ibmvscsis_copy_crq_packet(st
  * @iue:	Information Unit containing the Adapter Info MAD request
  *
  * EXECUTION ENVIRONMENT:
- *	Interrupt adpater lock is held
+ *	Interrupt adapter lock is held
  */
 static long ibmvscsis_adapter_info(struct scsi_info *vscsi,
 				   struct iu_entry *iue)
@@ -1692,7 +1844,7 @@ static void ibmvscsis_send_mad_resp(stru
  * @crq:	Pointer to the CRQ entry containing the MAD request
  *
  * EXECUTION ENVIRONMENT:
- *	Interrupt  called with adapter lock held
+ *	Interrupt, called with adapter lock held
  */
 static long ibmvscsis_mad(struct scsi_info *vscsi, struct viosrp_crq *crq)
 {
@@ -1858,7 +2010,7 @@ static long ibmvscsis_srp_login_rej(stru
 		break;
 	case H_PERMISSION:
 		if (connection_broken(vscsi))
-			flag_bits =  RESPONSE_Q_DOWN | CLIENT_FAILED;
+			flag_bits = RESPONSE_Q_DOWN | CLIENT_FAILED;
 		dev_err(&vscsi->dev, "login_rej: error copying to client, rc %ld\n",
 			rc);
 		ibmvscsis_post_disconnect(vscsi, ERR_DISCONNECT_RECONNECT,
@@ -2181,156 +2333,6 @@ static long ibmvscsis_ping_response(stru
 }
 
 /**
- * ibmvscsis_handle_init_compl_msg() - Respond to an Init Complete Message
- * @vscsi:	Pointer to our adapter structure
- *
- * Must be called with interrupt lock held.
- */
-static long ibmvscsis_handle_init_compl_msg(struct scsi_info *vscsi)
-{
-	long rc = ADAPT_SUCCESS;
-
-	switch (vscsi->state) {
-	case NO_QUEUE:
-	case ERR_DISCONNECT:
-	case ERR_DISCONNECT_RECONNECT:
-	case ERR_DISCONNECTED:
-	case UNCONFIGURING:
-	case UNDEFINED:
-		rc = ERROR;
-		break;
-
-	case WAIT_CONNECTION:
-		vscsi->state = CONNECTED;
-		break;
-
-	case WAIT_IDLE:
-	case SRP_PROCESSING:
-	case CONNECTED:
-	case WAIT_ENABLED:
-	case PART_UP_WAIT_ENAB:
-	default:
-		rc = ERROR;
-		dev_err(&vscsi->dev, "init_msg: invalid state %d to get init compl msg\n",
-			vscsi->state);
-		ibmvscsis_post_disconnect(vscsi, ERR_DISCONNECT_RECONNECT, 0);
-		break;
-	}
-
-	return rc;
-}
-
-/**
- * ibmvscsis_handle_init_msg() - Respond to an Init Message
- * @vscsi:	Pointer to our adapter structure
- *
- * Must be called with interrupt lock held.
- */
-static long ibmvscsis_handle_init_msg(struct scsi_info *vscsi)
-{
-	long rc = ADAPT_SUCCESS;
-
-	switch (vscsi->state) {
-	case WAIT_ENABLED:
-		vscsi->state = PART_UP_WAIT_ENAB;
-		break;
-
-	case WAIT_CONNECTION:
-		rc = ibmvscsis_send_init_message(vscsi, INIT_COMPLETE_MSG);
-		switch (rc) {
-		case H_SUCCESS:
-			vscsi->state = CONNECTED;
-			break;
-
-		case H_PARAMETER:
-			dev_err(&vscsi->dev, "init_msg: failed to send, rc %ld\n",
-				rc);
-			ibmvscsis_post_disconnect(vscsi, ERR_DISCONNECT, 0);
-			break;
-
-		case H_DROPPED:
-			dev_err(&vscsi->dev, "init_msg: failed to send, rc %ld\n",
-				rc);
-			rc = ERROR;
-			ibmvscsis_post_disconnect(vscsi,
-						  ERR_DISCONNECT_RECONNECT, 0);
-			break;
-
-		case H_CLOSED:
-			pr_warn("init_msg: failed to send, rc %ld\n", rc);
-			rc = 0;
-			break;
-		}
-		break;
-
-	case UNDEFINED:
-		rc = ERROR;
-		break;
-
-	case UNCONFIGURING:
-		break;
-
-	case PART_UP_WAIT_ENAB:
-	case CONNECTED:
-	case SRP_PROCESSING:
-	case WAIT_IDLE:
-	case NO_QUEUE:
-	case ERR_DISCONNECT:
-	case ERR_DISCONNECT_RECONNECT:
-	case ERR_DISCONNECTED:
-	default:
-		rc = ERROR;
-		dev_err(&vscsi->dev, "init_msg: invalid state %d to get init msg\n",
-			vscsi->state);
-		ibmvscsis_post_disconnect(vscsi, ERR_DISCONNECT_RECONNECT, 0);
-		break;
-	}
-
-	return rc;
-}
-
-/**
- * ibmvscsis_init_msg() - Respond to an init message
- * @vscsi:	Pointer to our adapter structure
- * @crq:	Pointer to CRQ element containing the Init Message
- *
- * EXECUTION ENVIRONMENT:
- *	Interrupt, interrupt lock held
- */
-static long ibmvscsis_init_msg(struct scsi_info *vscsi, struct viosrp_crq *crq)
-{
-	long rc = ADAPT_SUCCESS;
-
-	pr_debug("init_msg: state 0x%hx\n", vscsi->state);
-
-	rc = h_vioctl(vscsi->dds.unit_id, H_GET_PARTNER_INFO,
-		      (u64)vscsi->map_ioba | ((u64)PAGE_SIZE << 32), 0, 0, 0,
-		      0);
-	if (rc == H_SUCCESS) {
-		vscsi->client_data.partition_number =
-			be64_to_cpu(*(u64 *)vscsi->map_buf);
-		pr_debug("init_msg, part num %d\n",
-			 vscsi->client_data.partition_number);
-	} else {
-		pr_debug("init_msg h_vioctl rc %ld\n", rc);
-		rc = ADAPT_SUCCESS;
-	}
-
-	if (crq->format == INIT_MSG) {
-		rc = ibmvscsis_handle_init_msg(vscsi);
-	} else if (crq->format == INIT_COMPLETE_MSG) {
-		rc = ibmvscsis_handle_init_compl_msg(vscsi);
-	} else {
-		rc = ERROR;
-		dev_err(&vscsi->dev, "init_msg: invalid format %d\n",
-			(uint)crq->format);
-		ibmvscsis_post_disconnect(vscsi, ERR_DISCONNECT_RECONNECT, 0);
-	}
-
-	return rc;
-}
-
-/**
  * ibmvscsis_parse_command() - Parse an element taken from the cmd rsp queue.
  * @vscsi:	Pointer to our adapter structure
  * @crq:	Pointer to CRQ element containing the SRP request
@@ -2385,7 +2387,7 @@ static long ibmvscsis_parse_command(stru
 		break;
 
 	case VALID_TRANS_EVENT:
-		rc =  ibmvscsis_trans_event(vscsi, crq);
+		rc = ibmvscsis_trans_event(vscsi, crq);
 		break;
 
 	case VALID_INIT_MSG:
@@ -3270,7 +3272,7 @@ static void ibmvscsis_handle_crq(unsigne
 	/*
 	 * if we are in a path where we are waiting for all pending commands
 	 * to complete because we received a transport event and anything in
-	 * the command queue is for a new connection,  do nothing
+	 * the command queue is for a new connection, do nothing
 	 */
 	if (TARGET_STOP(vscsi)) {
 		vio_enable_interrupts(vscsi->dma_dev);
@@ -3314,7 +3316,7 @@ cmd_work:
 				 * everything but transport events on the queue
 				 *
 				 * need to decrement the queue index so we can
-				 * look at the elment again
+				 * look at the element again
 				 */
 				if (vscsi->cmd_q.index)
 					vscsi->cmd_q.index -= 1;
@@ -3988,10 +3990,10 @@ static struct attribute *ibmvscsis_dev_a
 ATTRIBUTE_GROUPS(ibmvscsis_dev);
 
 static struct class ibmvscsis_class = {
-	.name           = "ibmvscsis",
-	.dev_release    = ibmvscsis_dev_release,
-	.class_attrs    = ibmvscsis_class_attrs,
-	.dev_groups     = ibmvscsis_dev_groups,
+	.name		= "ibmvscsis",
+	.dev_release	= ibmvscsis_dev_release,
+	.class_attrs	= ibmvscsis_class_attrs,
+	.dev_groups	= ibmvscsis_dev_groups,
 };
 
 static struct vio_device_id ibmvscsis_device_table[] = {

^ permalink raw reply	[flat|nested] 92+ messages in thread

* [PATCH 4.9 50/93] scsi: ibmvscsis: Synchronize cmds at tpg_enable_store time
  2017-03-20 17:50 [PATCH 4.9 00/93] 4.9.17-stable review Greg Kroah-Hartman
                   ` (44 preceding siblings ...)
  2017-03-20 17:51 ` [PATCH 4.9 49/93] scsi: ibmvscsis: Rearrange functions for future patches Greg Kroah-Hartman
@ 2017-03-20 17:51 ` Greg Kroah-Hartman
  2017-03-20 17:51 ` [PATCH 4.9 51/93] scsi: ibmvscsis: Synchronize cmds at remove time Greg Kroah-Hartman
                   ` (43 subsequent siblings)
  89 siblings, 0 replies; 92+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-20 17:51 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Michael Cyr, Bryant G. Ly,
	Steven Royer, Martin K. Petersen, Sasha Levin

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Michael Cyr <mikecyr@us.ibm.com>

[ Upstream commit c9b3379f60a83288a5e2f8ea75476460978689b0 ]

This patch changes the way the IBM vSCSI server driver manages its
Command/Response Queue (CRQ).  We used to register the CRQ with phyp at
probe time.  Now we wait until tpg_enable_store.  Similarly, when
tpg_enable_store is called to "disable" (i.e. the stored value is 0),
we unregister the queue with phyp.

One consquence to this is that we have no need for the PART_UP_WAIT_ENAB
state, since we can't get an Init Message from the client in our CRQ if
we're waiting to be enabled, since we haven't registered the queue yet.

Signed-off-by: Michael Cyr <mikecyr@us.ibm.com>
Signed-off-by: Bryant G. Ly <bryantly@linux.vnet.ibm.com>
Tested-by: Steven Royer <seroyer@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.c |  224 +++++--------------------------
 drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.h |    2 
 2 files changed, 38 insertions(+), 188 deletions(-)

--- a/drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.c
+++ b/drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.c
@@ -62,8 +62,6 @@ static long ibmvscsis_parse_command(stru
 
 static void ibmvscsis_adapter_idle(struct scsi_info *vscsi);
 
-static void ibmvscsis_reset_queue(struct scsi_info *vscsi, uint new_state);
-
 static void ibmvscsis_determine_resid(struct se_cmd *se_cmd,
 				      struct srp_rsp *rsp)
 {
@@ -418,7 +416,6 @@ static void ibmvscsis_disconnect(struct
 					       proc_work);
 	u16 new_state;
 	bool wait_idle = false;
-	long rc = ADAPT_SUCCESS;
 
 	spin_lock_bh(&vscsi->intr_lock);
 	new_state = vscsi->new_state;
@@ -471,30 +468,12 @@ static void ibmvscsis_disconnect(struct
 			vscsi->state = new_state;
 		break;
 
-	/*
-	 * If this is a transition into an error state.
-	 * a client is attempting to establish a connection
-	 * and has violated the RPA protocol.
-	 * There can be nothing pending on the adapter although
-	 * there can be requests in the command queue.
-	 */
 	case WAIT_ENABLED:
-	case PART_UP_WAIT_ENAB:
 		switch (new_state) {
+		/* should never happen */
 		case ERR_DISCONNECT:
-			vscsi->flags |= RESPONSE_Q_DOWN;
-			vscsi->state = new_state;
-			vscsi->flags &= ~(SCHEDULE_DISCONNECT |
-					  DISCONNECT_SCHEDULED);
-			ibmvscsis_free_command_q(vscsi);
-			break;
 		case ERR_DISCONNECT_RECONNECT:
-			ibmvscsis_reset_queue(vscsi, WAIT_ENABLED);
-			break;
-
-		/* should never happen */
 		case WAIT_IDLE:
-			rc = ERROR;
 			dev_err(&vscsi->dev, "disconnect: invalid state %d for WAIT_IDLE\n",
 				vscsi->state);
 			break;
@@ -631,7 +610,6 @@ static void ibmvscsis_post_disconnect(st
 			break;
 
 		case WAIT_ENABLED:
-		case PART_UP_WAIT_ENAB:
 		case WAIT_IDLE:
 		case WAIT_CONNECTION:
 		case CONNECTED:
@@ -676,7 +654,6 @@ static long ibmvscsis_handle_init_compl_
 	case SRP_PROCESSING:
 	case CONNECTED:
 	case WAIT_ENABLED:
-	case PART_UP_WAIT_ENAB:
 	default:
 		rc = ERROR;
 		dev_err(&vscsi->dev, "init_msg: invalid state %d to get init compl msg\n",
@@ -699,10 +676,6 @@ static long ibmvscsis_handle_init_msg(st
 	long rc = ADAPT_SUCCESS;
 
 	switch (vscsi->state) {
-	case WAIT_ENABLED:
-		vscsi->state = PART_UP_WAIT_ENAB;
-		break;
-
 	case WAIT_CONNECTION:
 		rc = ibmvscsis_send_init_message(vscsi, INIT_COMPLETE_MSG);
 		switch (rc) {
@@ -738,7 +711,7 @@ static long ibmvscsis_handle_init_msg(st
 	case UNCONFIGURING:
 		break;
 
-	case PART_UP_WAIT_ENAB:
+	case WAIT_ENABLED:
 	case CONNECTED:
 	case SRP_PROCESSING:
 	case WAIT_IDLE:
@@ -801,11 +774,10 @@ static long ibmvscsis_init_msg(struct sc
 /**
  * ibmvscsis_establish_new_q() - Establish new CRQ queue
  * @vscsi:	Pointer to our adapter structure
- * @new_state:	New state being established after resetting the queue
  *
  * Must be called with interrupt lock held.
  */
-static long ibmvscsis_establish_new_q(struct scsi_info *vscsi, uint new_state)
+static long ibmvscsis_establish_new_q(struct scsi_info *vscsi)
 {
 	long rc = ADAPT_SUCCESS;
 	uint format;
@@ -817,19 +789,19 @@ static long ibmvscsis_establish_new_q(st
 
 	rc = vio_enable_interrupts(vscsi->dma_dev);
 	if (rc) {
-		pr_warn("reset_queue: failed to enable interrupts, rc %ld\n",
+		pr_warn("establish_new_q: failed to enable interrupts, rc %ld\n",
 			rc);
 		return rc;
 	}
 
 	rc = ibmvscsis_check_init_msg(vscsi, &format);
 	if (rc) {
-		dev_err(&vscsi->dev, "reset_queue: check_init_msg failed, rc %ld\n",
+		dev_err(&vscsi->dev, "establish_new_q: check_init_msg failed, rc %ld\n",
 			rc);
 		return rc;
 	}
 
-	if (format == UNUSED_FORMAT && new_state == WAIT_CONNECTION) {
+	if (format == UNUSED_FORMAT) {
 		rc = ibmvscsis_send_init_message(vscsi, INIT_MSG);
 		switch (rc) {
 		case H_SUCCESS:
@@ -847,6 +819,8 @@ static long ibmvscsis_establish_new_q(st
 			rc = H_HARDWARE;
 			break;
 		}
+	} else if (format == INIT_MSG) {
+		rc = ibmvscsis_handle_init_msg(vscsi);
 	}
 
 	return rc;
@@ -855,7 +829,6 @@ static long ibmvscsis_establish_new_q(st
 /**
  * ibmvscsis_reset_queue() - Reset CRQ Queue
  * @vscsi:	Pointer to our adapter structure
- * @new_state:	New state to establish after resetting the queue
  *
  * This function calls h_free_q and then calls h_reg_q and does all
  * of the bookkeeping to get us back to where we can communicate.
@@ -872,7 +845,7 @@ static long ibmvscsis_establish_new_q(st
  * EXECUTION ENVIRONMENT:
  *	Process environment, called with interrupt lock held
  */
-static void ibmvscsis_reset_queue(struct scsi_info *vscsi, uint new_state)
+static void ibmvscsis_reset_queue(struct scsi_info *vscsi)
 {
 	int bytes;
 	long rc = ADAPT_SUCCESS;
@@ -885,19 +858,18 @@ static void ibmvscsis_reset_queue(struct
 		vscsi->rsp_q_timer.timer_pops = 0;
 		vscsi->debit = 0;
 		vscsi->credit = 0;
-		vscsi->state = new_state;
+		vscsi->state = WAIT_CONNECTION;
 		vio_enable_interrupts(vscsi->dma_dev);
 	} else {
 		rc = ibmvscsis_free_command_q(vscsi);
 		if (rc == ADAPT_SUCCESS) {
-			vscsi->state = new_state;
+			vscsi->state = WAIT_CONNECTION;
 
 			bytes = vscsi->cmd_q.size * PAGE_SIZE;
 			rc = h_reg_crq(vscsi->dds.unit_id,
 				       vscsi->cmd_q.crq_token, bytes);
 			if (rc == H_CLOSED || rc == H_SUCCESS) {
-				rc = ibmvscsis_establish_new_q(vscsi,
-							       new_state);
+				rc = ibmvscsis_establish_new_q(vscsi);
 			}
 
 			if (rc != ADAPT_SUCCESS) {
@@ -1016,10 +988,6 @@ static long ibmvscsis_trans_event(struct
 						   TRANS_EVENT));
 			break;
 
-		case PART_UP_WAIT_ENAB:
-			vscsi->state = WAIT_ENABLED;
-			break;
-
 		case SRP_PROCESSING:
 			if ((vscsi->debit > 0) ||
 			    !list_empty(&vscsi->schedule_q) ||
@@ -1220,15 +1188,18 @@ static void ibmvscsis_adapter_idle(struc
 
 	switch (vscsi->state) {
 	case ERR_DISCONNECT_RECONNECT:
-		ibmvscsis_reset_queue(vscsi, WAIT_CONNECTION);
+		ibmvscsis_reset_queue(vscsi);
 		pr_debug("adapter_idle, disc_rec: flags 0x%x\n", vscsi->flags);
 		break;
 
 	case ERR_DISCONNECT:
 		ibmvscsis_free_command_q(vscsi);
-		vscsi->flags &= ~DISCONNECT_SCHEDULED;
+		vscsi->flags &= ~(SCHEDULE_DISCONNECT | DISCONNECT_SCHEDULED);
 		vscsi->flags |= RESPONSE_Q_DOWN;
-		vscsi->state = ERR_DISCONNECTED;
+		if (vscsi->tport.enabled)
+			vscsi->state = ERR_DISCONNECTED;
+		else
+			vscsi->state = WAIT_ENABLED;
 		pr_debug("adapter_idle, disc: flags 0x%x, state 0x%hx\n",
 			 vscsi->flags, vscsi->state);
 		break;
@@ -1773,8 +1744,8 @@ static void ibmvscsis_send_messages(stru
 					be64_to_cpu(msg_hi),
 					be64_to_cpu(cmd->rsp.tag));
 
-			pr_debug("send_messages: tag 0x%llx, rc %ld\n",
-				 be64_to_cpu(cmd->rsp.tag), rc);
+			pr_debug("send_messages: cmd %p, tag 0x%llx, rc %ld\n",
+				 cmd, be64_to_cpu(cmd->rsp.tag), rc);
 
 			/* if all ok free up the command element resources */
 			if (rc == H_SUCCESS) {
@@ -2788,36 +2759,6 @@ static irqreturn_t ibmvscsis_interrupt(i
 }
 
 /**
- * ibmvscsis_check_q() - Helper function to Check Init Message Valid
- * @vscsi:	Pointer to our adapter structure
- *
- * Checks if a initialize message was queued by the initiatior
- * while the timing window was open.  This function is called from
- * probe after the CRQ is created and interrupts are enabled.
- * It would only be used by adapters who wait for some event before
- * completing the init handshake with the client.  For ibmvscsi, this
- * event is waiting for the port to be enabled.
- *
- * EXECUTION ENVIRONMENT:
- *	Process level only, interrupt lock held
- */
-static long ibmvscsis_check_q(struct scsi_info *vscsi)
-{
-	uint format;
-	long rc;
-
-	rc = ibmvscsis_check_init_msg(vscsi, &format);
-	if (rc)
-		ibmvscsis_post_disconnect(vscsi, ERR_DISCONNECT_RECONNECT, 0);
-	else if (format == UNUSED_FORMAT)
-		vscsi->state = WAIT_ENABLED;
-	else
-		vscsi->state = PART_UP_WAIT_ENAB;
-
-	return rc;
-}
-
-/**
  * ibmvscsis_enable_change_state() - Set new state based on enabled status
  * @vscsi:	Pointer to our adapter structure
  *
@@ -2828,77 +2769,19 @@ static long ibmvscsis_check_q(struct scs
  */
 static long ibmvscsis_enable_change_state(struct scsi_info *vscsi)
 {
+	int bytes;
 	long rc = ADAPT_SUCCESS;
 
-handle_state_change:
-	switch (vscsi->state) {
-	case WAIT_ENABLED:
-		rc = ibmvscsis_send_init_message(vscsi, INIT_MSG);
-		switch (rc) {
-		case H_SUCCESS:
-		case H_DROPPED:
-		case H_CLOSED:
-			vscsi->state =  WAIT_CONNECTION;
-			rc = ADAPT_SUCCESS;
-			break;
-
-		case H_PARAMETER:
-			break;
-
-		case H_HARDWARE:
-			break;
-
-		default:
-			vscsi->state = UNDEFINED;
-			rc = H_HARDWARE;
-			break;
-		}
-		break;
-	case PART_UP_WAIT_ENAB:
-		rc = ibmvscsis_send_init_message(vscsi, INIT_COMPLETE_MSG);
-		switch (rc) {
-		case H_SUCCESS:
-			vscsi->state = CONNECTED;
-			rc = ADAPT_SUCCESS;
-			break;
-
-		case H_DROPPED:
-		case H_CLOSED:
-			vscsi->state = WAIT_ENABLED;
-			goto handle_state_change;
-
-		case H_PARAMETER:
-			break;
-
-		case H_HARDWARE:
-			break;
-
-		default:
-			rc = H_HARDWARE;
-			break;
-		}
-		break;
-
-	case WAIT_CONNECTION:
-	case WAIT_IDLE:
-	case SRP_PROCESSING:
-	case CONNECTED:
-		rc = ADAPT_SUCCESS;
-		break;
-		/* should not be able to get here */
-	case UNCONFIGURING:
-		rc = ERROR;
-		vscsi->state = UNDEFINED;
-		break;
+	bytes = vscsi->cmd_q.size * PAGE_SIZE;
+	rc = h_reg_crq(vscsi->dds.unit_id, vscsi->cmd_q.crq_token, bytes);
+	if (rc == H_CLOSED || rc == H_SUCCESS) {
+		vscsi->state = WAIT_CONNECTION;
+		rc = ibmvscsis_establish_new_q(vscsi);
+	}
 
-		/* driver should never allow this to happen */
-	case ERR_DISCONNECT:
-	case ERR_DISCONNECT_RECONNECT:
-	default:
-		dev_err(&vscsi->dev, "in invalid state %d during enable_change_state\n",
-			vscsi->state);
-		rc = ADAPT_SUCCESS;
-		break;
+	if (rc != ADAPT_SUCCESS) {
+		vscsi->state = ERR_DISCONNECTED;
+		vscsi->flags |= RESPONSE_Q_DOWN;
 	}
 
 	return rc;
@@ -2918,7 +2801,6 @@ handle_state_change:
  */
 static long ibmvscsis_create_command_q(struct scsi_info *vscsi, int num_cmds)
 {
-	long rc = 0;
 	int pages;
 	struct vio_dev *vdev = vscsi->dma_dev;
 
@@ -2942,22 +2824,7 @@ static long ibmvscsis_create_command_q(s
 		return -ENOMEM;
 	}
 
-	rc =  h_reg_crq(vscsi->dds.unit_id, vscsi->cmd_q.crq_token, PAGE_SIZE);
-	if (rc) {
-		if (rc == H_CLOSED) {
-			vscsi->state = WAIT_ENABLED;
-			rc = 0;
-		} else {
-			dma_unmap_single(&vdev->dev, vscsi->cmd_q.crq_token,
-					 PAGE_SIZE, DMA_BIDIRECTIONAL);
-			free_page((unsigned long)vscsi->cmd_q.base_addr);
-			rc = -ENODEV;
-		}
-	} else {
-		vscsi->state = WAIT_ENABLED;
-	}
-
-	return rc;
+	return 0;
 }
 
 /**
@@ -3491,31 +3358,12 @@ static int ibmvscsis_probe(struct vio_de
 		goto destroy_WQ;
 	}
 
-	spin_lock_bh(&vscsi->intr_lock);
-	vio_enable_interrupts(vdev);
-	if (rc) {
-		dev_err(&vscsi->dev, "enabling interrupts failed, rc %d\n", rc);
-		rc = -ENODEV;
-		spin_unlock_bh(&vscsi->intr_lock);
-		goto free_irq;
-	}
-
-	if (ibmvscsis_check_q(vscsi)) {
-		rc = ERROR;
-		dev_err(&vscsi->dev, "probe: check_q failed, rc %d\n", rc);
-		spin_unlock_bh(&vscsi->intr_lock);
-		goto disable_interrupt;
-	}
-	spin_unlock_bh(&vscsi->intr_lock);
+	vscsi->state = WAIT_ENABLED;
 
 	dev_set_drvdata(&vdev->dev, vscsi);
 
 	return 0;
 
-disable_interrupt:
-	vio_disable_interrupts(vdev);
-free_irq:
-	free_irq(vdev->irq, vscsi);
 destroy_WQ:
 	destroy_workqueue(vscsi->work_q);
 unmap_buf:
@@ -3909,18 +3757,22 @@ static ssize_t ibmvscsis_tpg_enable_stor
 	}
 
 	if (tmp) {
-		tport->enabled = true;
 		spin_lock_bh(&vscsi->intr_lock);
+		tport->enabled = true;
 		lrc = ibmvscsis_enable_change_state(vscsi);
 		if (lrc)
 			pr_err("enable_change_state failed, rc %ld state %d\n",
 			       lrc, vscsi->state);
 		spin_unlock_bh(&vscsi->intr_lock);
 	} else {
+		spin_lock_bh(&vscsi->intr_lock);
 		tport->enabled = false;
+		/* This simulates the server going down */
+		ibmvscsis_post_disconnect(vscsi, ERR_DISCONNECT, 0);
+		spin_unlock_bh(&vscsi->intr_lock);
 	}
 
-	pr_debug("tpg_enable_store, state %d\n", vscsi->state);
+	pr_debug("tpg_enable_store, tmp %ld, state %d\n", tmp, vscsi->state);
 
 	return count;
 }
--- a/drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.h
+++ b/drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.h
@@ -204,8 +204,6 @@ struct scsi_info {
 	struct list_head waiting_rsp;
 #define NO_QUEUE                    0x00
 #define WAIT_ENABLED                0X01
-	/* driver has received an initialize command */
-#define PART_UP_WAIT_ENAB           0x02
 #define WAIT_CONNECTION             0x04
 	/* have established a connection */
 #define CONNECTED                   0x08

^ permalink raw reply	[flat|nested] 92+ messages in thread

* [PATCH 4.9 51/93] scsi: ibmvscsis: Synchronize cmds at remove time
  2017-03-20 17:50 [PATCH 4.9 00/93] 4.9.17-stable review Greg Kroah-Hartman
                   ` (45 preceding siblings ...)
  2017-03-20 17:51 ` [PATCH 4.9 50/93] scsi: ibmvscsis: Synchronize cmds at tpg_enable_store time Greg Kroah-Hartman
@ 2017-03-20 17:51 ` Greg Kroah-Hartman
  2017-03-20 17:51 ` [PATCH 4.9 52/93] x86/hyperv: Handle unknown NMIs on one CPU when unknown_nmi_panic Greg Kroah-Hartman
                   ` (42 subsequent siblings)
  89 siblings, 0 replies; 92+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-20 17:51 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Michael Cyr, Bryant G. Ly,
	Steven Royer, Martin K. Petersen, Sasha Levin

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Michael Cyr <mikecyr@us.ibm.com>

[ Upstream commit 8bf11557d44d00562360d370de8aa70ba89aa0d5 ]

This patch adds code to disconnect from the client, which will make sure
any outstanding commands have been completed, before continuing on with
the remove operation.

Signed-off-by: Michael Cyr <mikecyr@us.ibm.com>
Signed-off-by: Bryant G. Ly <bryantly@linux.vnet.ibm.com>
Tested-by: Steven Royer <seroyer@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.c |   39 +++++++++++++++++++++++++++----
 drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.h |    3 ++
 2 files changed, 37 insertions(+), 5 deletions(-)

--- a/drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.c
+++ b/drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.c
@@ -470,6 +470,18 @@ static void ibmvscsis_disconnect(struct
 
 	case WAIT_ENABLED:
 		switch (new_state) {
+		case UNCONFIGURING:
+			vscsi->state = new_state;
+			vscsi->flags |= RESPONSE_Q_DOWN;
+			vscsi->flags &= ~(SCHEDULE_DISCONNECT |
+					  DISCONNECT_SCHEDULED);
+			dma_rmb();
+			if (vscsi->flags & CFG_SLEEPING) {
+				vscsi->flags &= ~CFG_SLEEPING;
+				complete(&vscsi->unconfig);
+			}
+			break;
+
 		/* should never happen */
 		case ERR_DISCONNECT:
 		case ERR_DISCONNECT_RECONNECT:
@@ -482,6 +494,13 @@ static void ibmvscsis_disconnect(struct
 
 	case WAIT_IDLE:
 		switch (new_state) {
+		case UNCONFIGURING:
+			vscsi->flags |= RESPONSE_Q_DOWN;
+			vscsi->state = new_state;
+			vscsi->flags &= ~(SCHEDULE_DISCONNECT |
+					  DISCONNECT_SCHEDULED);
+			ibmvscsis_free_command_q(vscsi);
+			break;
 		case ERR_DISCONNECT:
 		case ERR_DISCONNECT_RECONNECT:
 			vscsi->state = new_state;
@@ -1187,6 +1206,15 @@ static void ibmvscsis_adapter_idle(struc
 		free_qs = true;
 
 	switch (vscsi->state) {
+	case UNCONFIGURING:
+		ibmvscsis_free_command_q(vscsi);
+		dma_rmb();
+		isync();
+		if (vscsi->flags & CFG_SLEEPING) {
+			vscsi->flags &= ~CFG_SLEEPING;
+			complete(&vscsi->unconfig);
+		}
+		break;
 	case ERR_DISCONNECT_RECONNECT:
 		ibmvscsis_reset_queue(vscsi);
 		pr_debug("adapter_idle, disc_rec: flags 0x%x\n", vscsi->flags);
@@ -3342,6 +3370,7 @@ static int ibmvscsis_probe(struct vio_de
 		     (unsigned long)vscsi);
 
 	init_completion(&vscsi->wait_idle);
+	init_completion(&vscsi->unconfig);
 
 	snprintf(wq_name, 24, "ibmvscsis%s", dev_name(&vdev->dev));
 	vscsi->work_q = create_workqueue(wq_name);
@@ -3397,10 +3426,11 @@ static int ibmvscsis_remove(struct vio_d
 
 	pr_debug("remove (%s)\n", dev_name(&vscsi->dma_dev->dev));
 
-	/*
-	 * TBD: Need to handle if there are commands on the waiting_rsp q
-	 *      Actually, can there still be cmds outstanding to tcm?
-	 */
+	spin_lock_bh(&vscsi->intr_lock);
+	ibmvscsis_post_disconnect(vscsi, UNCONFIGURING, 0);
+	vscsi->flags |= CFG_SLEEPING;
+	spin_unlock_bh(&vscsi->intr_lock);
+	wait_for_completion(&vscsi->unconfig);
 
 	vio_disable_interrupts(vdev);
 	free_irq(vdev->irq, vscsi);
@@ -3409,7 +3439,6 @@ static int ibmvscsis_remove(struct vio_d
 			 DMA_BIDIRECTIONAL);
 	kfree(vscsi->map_buf);
 	tasklet_kill(&vscsi->work_task);
-	ibmvscsis_unregister_command_q(vscsi);
 	ibmvscsis_destroy_command_q(vscsi);
 	ibmvscsis_freetimer(vscsi);
 	ibmvscsis_free_cmds(vscsi);
--- a/drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.h
+++ b/drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.h
@@ -257,6 +257,8 @@ struct scsi_info {
 #define SCHEDULE_DISCONNECT           0x00400
 	/* disconnect handler is scheduled */
 #define DISCONNECT_SCHEDULED          0x00800
+	/* remove function is sleeping */
+#define CFG_SLEEPING                  0x01000
 	u32 flags;
 	/* adapter lock */
 	spinlock_t intr_lock;
@@ -285,6 +287,7 @@ struct scsi_info {
 
 	struct workqueue_struct *work_q;
 	struct completion wait_idle;
+	struct completion unconfig;
 	struct device dev;
 	struct vio_dev *dma_dev;
 	struct srp_target target;

^ permalink raw reply	[flat|nested] 92+ messages in thread

* [PATCH 4.9 52/93] x86/hyperv: Handle unknown NMIs on one CPU when unknown_nmi_panic
  2017-03-20 17:50 [PATCH 4.9 00/93] 4.9.17-stable review Greg Kroah-Hartman
                   ` (46 preceding siblings ...)
  2017-03-20 17:51 ` [PATCH 4.9 51/93] scsi: ibmvscsis: Synchronize cmds at remove time Greg Kroah-Hartman
@ 2017-03-20 17:51 ` Greg Kroah-Hartman
  2017-03-20 17:51 ` [PATCH 4.9 53/93] PCI: Separate VF BAR updates from standard BAR updates Greg Kroah-Hartman
                   ` (41 subsequent siblings)
  89 siblings, 0 replies; 92+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-20 17:51 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Vitaly Kuznetsov, K. Y. Srinivasan,
	devel, Haiyang Zhang, Thomas Gleixner, Ingo Molnar, Sasha Levin

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Vitaly Kuznetsov <vkuznets@redhat.com>

[ Upstream commit 59107e2f48831daedc46973ce4988605ab066de3 ]

There is a feature in Hyper-V ('Debug-VM --InjectNonMaskableInterrupt')
which injects NMI to the guest. We may want to crash the guest and do kdump
on this NMI by enabling unknown_nmi_panic. To make kdump succeed we need to
allow the kdump kernel to re-establish VMBus connection so it will see
VMBus devices (storage, network,..).

To properly unload VMBus making it possible to start over during kdump we
need to do the following:

 - Send an 'unload' message to the hypervisor. This can be done on any CPU
   so we do this the crashing CPU.

 - Receive the 'unload finished' reply message. WS2012R2 delivers this
   message to the CPU which was used to establish VMBus connection during
   module load and this CPU may differ from the CPU sending 'unload'.

Receiving a VMBus message means the following:

 - There is a per-CPU slot in memory for one message. This slot can in
   theory be accessed by any CPU.

 - We get an interrupt on the CPU when a message was placed into the slot.

 - When we read the message we need to clear the slot and signal the fact
   to the hypervisor. In case there are more messages to this CPU pending
   the hypervisor will deliver the next message. The signaling is done by
   writing to an MSR so this can only be done on the appropriate CPU.

To avoid doing cross-CPU work on crash we have vmbus_wait_for_unload()
function which checks message slots for all CPUs in a loop waiting for the
'unload finished' messages. However, there is an issue which arises when
these conditions are met:

 - We're crashing on a CPU which is different from the one which was used
   to initially contact the hypervisor.

 - The CPU which was used for the initial contact is blocked with interrupts
   disabled and there is a message pending in the message slot.

In this case we won't be able to read the 'unload finished' message on the
crashing CPU. This is reproducible when we receive unknown NMIs on all CPUs
simultaneously: the first CPU entering panic() will proceed to crash and
all other CPUs will stop themselves with interrupts disabled.

The suggested solution is to handle unknown NMIs for Hyper-V guests on the
first CPU which gets them only. This will allow us to rely on VMBus
interrupt handler being able to receive the 'unload finish' message in
case it is delivered to a different CPU.

The issue is not reproducible on WS2016 as Debug-VM delivers NMI to the
boot CPU only, WS2012R2 and earlier Hyper-V versions are affected.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Acked-by: K. Y. Srinivasan <kys@microsoft.com>
Cc: devel@linuxdriverproject.org
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Link: http://lkml.kernel.org/r/20161202100720.28121-1-vkuznets@redhat.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/x86/kernel/cpu/mshyperv.c |   24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)

--- a/arch/x86/kernel/cpu/mshyperv.c
+++ b/arch/x86/kernel/cpu/mshyperv.c
@@ -31,6 +31,7 @@
 #include <asm/apic.h>
 #include <asm/timer.h>
 #include <asm/reboot.h>
+#include <asm/nmi.h>
 
 struct ms_hyperv_info ms_hyperv;
 EXPORT_SYMBOL_GPL(ms_hyperv);
@@ -158,6 +159,26 @@ static unsigned char hv_get_nmi_reason(v
 	return 0;
 }
 
+#ifdef CONFIG_X86_LOCAL_APIC
+/*
+ * Prior to WS2016 Debug-VM sends NMIs to all CPUs which makes
+ * it dificult to process CHANNELMSG_UNLOAD in case of crash. Handle
+ * unknown NMI on the first CPU which gets it.
+ */
+static int hv_nmi_unknown(unsigned int val, struct pt_regs *regs)
+{
+	static atomic_t nmi_cpu = ATOMIC_INIT(-1);
+
+	if (!unknown_nmi_panic)
+		return NMI_DONE;
+
+	if (atomic_cmpxchg(&nmi_cpu, -1, raw_smp_processor_id()) != -1)
+		return NMI_HANDLED;
+
+	return NMI_DONE;
+}
+#endif
+
 static void __init ms_hyperv_init_platform(void)
 {
 	/*
@@ -183,6 +204,9 @@ static void __init ms_hyperv_init_platfo
 		pr_info("HyperV: LAPIC Timer Frequency: %#x\n",
 			lapic_timer_frequency);
 	}
+
+	register_nmi_handler(NMI_UNKNOWN, hv_nmi_unknown, NMI_FLAG_FIRST,
+			     "hv_nmi_unknown");
 #endif
 
 	if (ms_hyperv.features & HV_X64_MSR_TIME_REF_COUNT_AVAILABLE)

^ permalink raw reply	[flat|nested] 92+ messages in thread

* [PATCH 4.9 53/93] PCI: Separate VF BAR updates from standard BAR updates
  2017-03-20 17:50 [PATCH 4.9 00/93] 4.9.17-stable review Greg Kroah-Hartman
                   ` (47 preceding siblings ...)
  2017-03-20 17:51 ` [PATCH 4.9 52/93] x86/hyperv: Handle unknown NMIs on one CPU when unknown_nmi_panic Greg Kroah-Hartman
@ 2017-03-20 17:51 ` Greg Kroah-Hartman
  2017-03-20 17:51 ` [PATCH 4.9 54/93] PCI: Remove pci_resource_bar() and pci_iov_resource_bar() Greg Kroah-Hartman
                   ` (40 subsequent siblings)
  89 siblings, 0 replies; 92+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-20 17:51 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Bjorn Helgaas, Gavin Shan, Sasha Levin

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Bjorn Helgaas <bhelgaas@google.com>

[ Upstream commit 6ffa2489c51da77564a0881a73765ea2169f955d ]

Previously pci_update_resource() used the same code path for updating
standard BARs and VF BARs in SR-IOV capabilities.

Split the VF BAR update into a new pci_iov_update_resource() internal
interface, which makes it simpler to compute the BAR address (we can get
rid of pci_resource_bar() and pci_iov_resource_bar()).

This patch:

  - Renames pci_update_resource() to pci_std_update_resource(),
  - Adds pci_iov_update_resource(),
  - Makes pci_update_resource() a wrapper that calls the appropriate one,

No functional change intended.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/pci/iov.c       |   50 ++++++++++++++++++++++++++++++++++++++++++++++++
 drivers/pci/pci.h       |    1 
 drivers/pci/setup-res.c |   13 ++++++++++--
 3 files changed, 62 insertions(+), 2 deletions(-)

--- a/drivers/pci/iov.c
+++ b/drivers/pci/iov.c
@@ -571,6 +571,56 @@ int pci_iov_resource_bar(struct pci_dev
 		4 * (resno - PCI_IOV_RESOURCES);
 }
 
+/**
+ * pci_iov_update_resource - update a VF BAR
+ * @dev: the PCI device
+ * @resno: the resource number
+ *
+ * Update a VF BAR in the SR-IOV capability of a PF.
+ */
+void pci_iov_update_resource(struct pci_dev *dev, int resno)
+{
+	struct pci_sriov *iov = dev->is_physfn ? dev->sriov : NULL;
+	struct resource *res = dev->resource + resno;
+	int vf_bar = resno - PCI_IOV_RESOURCES;
+	struct pci_bus_region region;
+	u32 new;
+	int reg;
+
+	/*
+	 * The generic pci_restore_bars() path calls this for all devices,
+	 * including VFs and non-SR-IOV devices.  If this is not a PF, we
+	 * have nothing to do.
+	 */
+	if (!iov)
+		return;
+
+	/*
+	 * Ignore unimplemented BARs, unused resource slots for 64-bit
+	 * BARs, and non-movable resources, e.g., those described via
+	 * Enhanced Allocation.
+	 */
+	if (!res->flags)
+		return;
+
+	if (res->flags & IORESOURCE_UNSET)
+		return;
+
+	if (res->flags & IORESOURCE_PCI_FIXED)
+		return;
+
+	pcibios_resource_to_bus(dev->bus, &region, res);
+	new = region.start;
+	new |= res->flags & ~PCI_BASE_ADDRESS_MEM_MASK;
+
+	reg = iov->pos + PCI_SRIOV_BAR + 4 * vf_bar;
+	pci_write_config_dword(dev, reg, new);
+	if (res->flags & IORESOURCE_MEM_64) {
+		new = region.start >> 16 >> 16;
+		pci_write_config_dword(dev, reg + 4, new);
+	}
+}
+
 resource_size_t __weak pcibios_iov_resource_alignment(struct pci_dev *dev,
 						      int resno)
 {
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -290,6 +290,7 @@ static inline void pci_restore_ats_state
 int pci_iov_init(struct pci_dev *dev);
 void pci_iov_release(struct pci_dev *dev);
 int pci_iov_resource_bar(struct pci_dev *dev, int resno);
+void pci_iov_update_resource(struct pci_dev *dev, int resno);
 resource_size_t pci_sriov_resource_alignment(struct pci_dev *dev, int resno);
 void pci_restore_iov_state(struct pci_dev *dev);
 int pci_iov_bus_range(struct pci_bus *bus);
--- a/drivers/pci/setup-res.c
+++ b/drivers/pci/setup-res.c
@@ -25,8 +25,7 @@
 #include <linux/slab.h>
 #include "pci.h"
 
-
-void pci_update_resource(struct pci_dev *dev, int resno)
+static void pci_std_update_resource(struct pci_dev *dev, int resno)
 {
 	struct pci_bus_region region;
 	bool disable;
@@ -110,6 +109,16 @@ void pci_update_resource(struct pci_dev
 		pci_write_config_word(dev, PCI_COMMAND, cmd);
 }
 
+void pci_update_resource(struct pci_dev *dev, int resno)
+{
+	if (resno <= PCI_ROM_RESOURCE)
+		pci_std_update_resource(dev, resno);
+#ifdef CONFIG_PCI_IOV
+	else if (resno >= PCI_IOV_RESOURCES && resno <= PCI_IOV_RESOURCE_END)
+		pci_iov_update_resource(dev, resno);
+#endif
+}
+
 int pci_claim_resource(struct pci_dev *dev, int resource)
 {
 	struct resource *res = &dev->resource[resource];

^ permalink raw reply	[flat|nested] 92+ messages in thread

* [PATCH 4.9 54/93] PCI: Remove pci_resource_bar() and pci_iov_resource_bar()
  2017-03-20 17:50 [PATCH 4.9 00/93] 4.9.17-stable review Greg Kroah-Hartman
                   ` (48 preceding siblings ...)
  2017-03-20 17:51 ` [PATCH 4.9 53/93] PCI: Separate VF BAR updates from standard BAR updates Greg Kroah-Hartman
@ 2017-03-20 17:51 ` Greg Kroah-Hartman
  2017-03-20 17:51 ` [PATCH 4.9 55/93] PCI: Add comments about ROM BAR updating Greg Kroah-Hartman
                   ` (39 subsequent siblings)
  89 siblings, 0 replies; 92+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-20 17:51 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Bjorn Helgaas, Gavin Shan, Sasha Levin

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Bjorn Helgaas <bhelgaas@google.com>

[ Upstream commit 286c2378aaccc7343ebf17ec6cd86567659caf70 ]

pci_std_update_resource() only deals with standard BARs, so we don't have
to worry about the complications of VF BARs in an SR-IOV capability.

Compute the BAR address inline and remove pci_resource_bar().  That makes
pci_iov_resource_bar() unused, so remove that as well.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/pci/iov.c       |   18 ------------------
 drivers/pci/pci.c       |   30 ------------------------------
 drivers/pci/pci.h       |    6 ------
 drivers/pci/setup-res.c |   13 +++++++------
 4 files changed, 7 insertions(+), 60 deletions(-)

--- a/drivers/pci/iov.c
+++ b/drivers/pci/iov.c
@@ -554,24 +554,6 @@ void pci_iov_release(struct pci_dev *dev
 }
 
 /**
- * pci_iov_resource_bar - get position of the SR-IOV BAR
- * @dev: the PCI device
- * @resno: the resource number
- *
- * Returns position of the BAR encapsulated in the SR-IOV capability.
- */
-int pci_iov_resource_bar(struct pci_dev *dev, int resno)
-{
-	if (resno < PCI_IOV_RESOURCES || resno > PCI_IOV_RESOURCE_END)
-		return 0;
-
-	BUG_ON(!dev->is_physfn);
-
-	return dev->sriov->pos + PCI_SRIOV_BAR +
-		4 * (resno - PCI_IOV_RESOURCES);
-}
-
-/**
  * pci_iov_update_resource - update a VF BAR
  * @dev: the PCI device
  * @resno: the resource number
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -4835,36 +4835,6 @@ int pci_select_bars(struct pci_dev *dev,
 }
 EXPORT_SYMBOL(pci_select_bars);
 
-/**
- * pci_resource_bar - get position of the BAR associated with a resource
- * @dev: the PCI device
- * @resno: the resource number
- * @type: the BAR type to be filled in
- *
- * Returns BAR position in config space, or 0 if the BAR is invalid.
- */
-int pci_resource_bar(struct pci_dev *dev, int resno, enum pci_bar_type *type)
-{
-	int reg;
-
-	if (resno < PCI_ROM_RESOURCE) {
-		*type = pci_bar_unknown;
-		return PCI_BASE_ADDRESS_0 + 4 * resno;
-	} else if (resno == PCI_ROM_RESOURCE) {
-		*type = pci_bar_mem32;
-		return dev->rom_base_reg;
-	} else if (resno < PCI_BRIDGE_RESOURCES) {
-		/* device specific resource */
-		*type = pci_bar_unknown;
-		reg = pci_iov_resource_bar(dev, resno);
-		if (reg)
-			return reg;
-	}
-
-	dev_err(&dev->dev, "BAR %d: invalid resource\n", resno);
-	return 0;
-}
-
 /* Some architectures require additional programming to enable VGA */
 static arch_set_vga_state_t arch_set_vga_state;
 
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -245,7 +245,6 @@ bool pci_bus_read_dev_vendor_id(struct p
 int pci_setup_device(struct pci_dev *dev);
 int __pci_read_base(struct pci_dev *dev, enum pci_bar_type type,
 		    struct resource *res, unsigned int reg);
-int pci_resource_bar(struct pci_dev *dev, int resno, enum pci_bar_type *type);
 void pci_configure_ari(struct pci_dev *dev);
 void __pci_bus_size_bridges(struct pci_bus *bus,
 			struct list_head *realloc_head);
@@ -289,7 +288,6 @@ static inline void pci_restore_ats_state
 #ifdef CONFIG_PCI_IOV
 int pci_iov_init(struct pci_dev *dev);
 void pci_iov_release(struct pci_dev *dev);
-int pci_iov_resource_bar(struct pci_dev *dev, int resno);
 void pci_iov_update_resource(struct pci_dev *dev, int resno);
 resource_size_t pci_sriov_resource_alignment(struct pci_dev *dev, int resno);
 void pci_restore_iov_state(struct pci_dev *dev);
@@ -304,10 +302,6 @@ static inline void pci_iov_release(struc
 
 {
 }
-static inline int pci_iov_resource_bar(struct pci_dev *dev, int resno)
-{
-	return 0;
-}
 static inline void pci_restore_iov_state(struct pci_dev *dev)
 {
 }
--- a/drivers/pci/setup-res.c
+++ b/drivers/pci/setup-res.c
@@ -32,7 +32,6 @@ static void pci_std_update_resource(stru
 	u16 cmd;
 	u32 new, check, mask;
 	int reg;
-	enum pci_bar_type type;
 	struct resource *res = dev->resource + resno;
 
 	if (dev->is_virtfn) {
@@ -66,14 +65,16 @@ static void pci_std_update_resource(stru
 	else
 		mask = (u32)PCI_BASE_ADDRESS_MEM_MASK;
 
-	reg = pci_resource_bar(dev, resno, &type);
-	if (!reg)
-		return;
-	if (type != pci_bar_unknown) {
+	if (resno < PCI_ROM_RESOURCE) {
+		reg = PCI_BASE_ADDRESS_0 + 4 * resno;
+	} else if (resno == PCI_ROM_RESOURCE) {
 		if (!(res->flags & IORESOURCE_ROM_ENABLE))
 			return;
+
+		reg = dev->rom_base_reg;
 		new |= PCI_ROM_ADDRESS_ENABLE;
-	}
+	} else
+		return;
 
 	/*
 	 * We can't update a 64-bit BAR atomically, so when possible,

^ permalink raw reply	[flat|nested] 92+ messages in thread

* [PATCH 4.9 55/93] PCI: Add comments about ROM BAR updating
  2017-03-20 17:50 [PATCH 4.9 00/93] 4.9.17-stable review Greg Kroah-Hartman
                   ` (49 preceding siblings ...)
  2017-03-20 17:51 ` [PATCH 4.9 54/93] PCI: Remove pci_resource_bar() and pci_iov_resource_bar() Greg Kroah-Hartman
@ 2017-03-20 17:51 ` Greg Kroah-Hartman
  2017-03-20 17:51 ` [PATCH 4.9 56/93] PCI: Decouple IORESOURCE_ROM_ENABLE and PCI_ROM_ADDRESS_ENABLE Greg Kroah-Hartman
                   ` (38 subsequent siblings)
  89 siblings, 0 replies; 92+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-20 17:51 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Bjorn Helgaas, Gavin Shan, Sasha Levin

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Bjorn Helgaas <bhelgaas@google.com>

[ Upstream commit 0b457dde3cf8b7c76a60f8e960f21bbd4abdc416 ]

pci_update_resource() updates a hardware BAR so its address matches the
kernel's struct resource UNLESS it's a disabled ROM BAR.  We only update
those when we enable the ROM.

It's not obvious from the code why ROM BARs should be handled specially.
Apparently there are Matrox devices with defective ROM BARs that read as
zero when disabled.  That means that if pci_enable_rom() reads the disabled
BAR, sets PCI_ROM_ADDRESS_ENABLE (without re-inserting the address), and
writes it back, it would enable the ROM at address zero.

Add comments and references to explain why we can't make the code look more
rational.

The code changes are from 755528c860b0 ("Ignore disabled ROM resources at
setup") and 8085ce084c0f ("[PATCH] Fix PCI ROM mapping").

Link: https://lkml.org/lkml/2005/8/30/138
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/pci/rom.c       |    5 +++++
 drivers/pci/setup-res.c |    6 ++++++
 2 files changed, 11 insertions(+)

--- a/drivers/pci/rom.c
+++ b/drivers/pci/rom.c
@@ -35,6 +35,11 @@ int pci_enable_rom(struct pci_dev *pdev)
 	if (res->flags & IORESOURCE_ROM_SHADOW)
 		return 0;
 
+	/*
+	 * Ideally pci_update_resource() would update the ROM BAR address,
+	 * and we would only set the enable bit here.  But apparently some
+	 * devices have buggy ROM BARs that read as zero when disabled.
+	 */
 	pcibios_resource_to_bus(pdev->bus, &region, res);
 	pci_read_config_dword(pdev, pdev->rom_base_reg, &rom_addr);
 	rom_addr &= ~PCI_ROM_ADDRESS_MASK;
--- a/drivers/pci/setup-res.c
+++ b/drivers/pci/setup-res.c
@@ -68,6 +68,12 @@ static void pci_std_update_resource(stru
 	if (resno < PCI_ROM_RESOURCE) {
 		reg = PCI_BASE_ADDRESS_0 + 4 * resno;
 	} else if (resno == PCI_ROM_RESOURCE) {
+
+		/*
+		 * Apparently some Matrox devices have ROM BARs that read
+		 * as zero when disabled, so don't update ROM BARs unless
+		 * they're enabled.  See https://lkml.org/lkml/2005/8/30/138.
+		 */
 		if (!(res->flags & IORESOURCE_ROM_ENABLE))
 			return;
 

^ permalink raw reply	[flat|nested] 92+ messages in thread

* [PATCH 4.9 56/93] PCI: Decouple IORESOURCE_ROM_ENABLE and PCI_ROM_ADDRESS_ENABLE
  2017-03-20 17:50 [PATCH 4.9 00/93] 4.9.17-stable review Greg Kroah-Hartman
                   ` (50 preceding siblings ...)
  2017-03-20 17:51 ` [PATCH 4.9 55/93] PCI: Add comments about ROM BAR updating Greg Kroah-Hartman
@ 2017-03-20 17:51 ` Greg Kroah-Hartman
  2017-03-20 17:51 ` [PATCH 4.9 57/93] PCI: Dont update VF BARs while VF memory space is enabled Greg Kroah-Hartman
                   ` (37 subsequent siblings)
  89 siblings, 0 replies; 92+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-20 17:51 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Bjorn Helgaas, Gavin Shan, Sasha Levin

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Bjorn Helgaas <bhelgaas@google.com>

[ Upstream commit 7a6d312b50e63f598f5b5914c4fd21878ac2b595 ]

Remove the assumption that IORESOURCE_ROM_ENABLE == PCI_ROM_ADDRESS_ENABLE.
PCI_ROM_ADDRESS_ENABLE is the ROM enable bit defined by the PCI spec, so if
we're reading or writing a BAR register value, that's what we should use.
IORESOURCE_ROM_ENABLE is a corresponding bit in struct resource flags.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/pci/probe.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -227,7 +227,8 @@ int __pci_read_base(struct pci_dev *dev,
 			mask64 = (u32)PCI_BASE_ADDRESS_MEM_MASK;
 		}
 	} else {
-		res->flags |= (l & IORESOURCE_ROM_ENABLE);
+		if (l & PCI_ROM_ADDRESS_ENABLE)
+			res->flags |= IORESOURCE_ROM_ENABLE;
 		l64 = l & PCI_ROM_ADDRESS_MASK;
 		sz64 = sz & PCI_ROM_ADDRESS_MASK;
 		mask64 = (u32)PCI_ROM_ADDRESS_MASK;

^ permalink raw reply	[flat|nested] 92+ messages in thread

* [PATCH 4.9 57/93] PCI: Dont update VF BARs while VF memory space is enabled
  2017-03-20 17:50 [PATCH 4.9 00/93] 4.9.17-stable review Greg Kroah-Hartman
                   ` (51 preceding siblings ...)
  2017-03-20 17:51 ` [PATCH 4.9 56/93] PCI: Decouple IORESOURCE_ROM_ENABLE and PCI_ROM_ADDRESS_ENABLE Greg Kroah-Hartman
@ 2017-03-20 17:51 ` Greg Kroah-Hartman
  2017-03-20 17:51 ` [PATCH 4.9 58/93] PCI: Update BARs using property bits appropriate for type Greg Kroah-Hartman
                   ` (36 subsequent siblings)
  89 siblings, 0 replies; 92+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-20 17:51 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Bjorn Helgaas, Gavin Shan, Sasha Levin

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Bjorn Helgaas <bhelgaas@google.com>

[ Upstream commit 546ba9f8f22f71b0202b6ba8967be5cc6dae4e21 ]

If we update a VF BAR while it's enabled, there are two potential problems:

  1) Any driver that's using the VF has a cached BAR value that is stale
     after the update, and

  2) We can't update 64-bit BARs atomically, so the intermediate state
     (new lower dword with old upper dword) may conflict with another
     device, and an access by a driver unrelated to the VF may cause a bus
     error.

Warn about attempts to update VF BARs while they are enabled.  This is a
programming error, so use dev_WARN() to get a backtrace.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/pci/iov.c |    8 ++++++++
 1 file changed, 8 insertions(+)

--- a/drivers/pci/iov.c
+++ b/drivers/pci/iov.c
@@ -566,6 +566,7 @@ void pci_iov_update_resource(struct pci_
 	struct resource *res = dev->resource + resno;
 	int vf_bar = resno - PCI_IOV_RESOURCES;
 	struct pci_bus_region region;
+	u16 cmd;
 	u32 new;
 	int reg;
 
@@ -577,6 +578,13 @@ void pci_iov_update_resource(struct pci_
 	if (!iov)
 		return;
 
+	pci_read_config_word(dev, iov->pos + PCI_SRIOV_CTRL, &cmd);
+	if ((cmd & PCI_SRIOV_CTRL_VFE) && (cmd & PCI_SRIOV_CTRL_MSE)) {
+		dev_WARN(&dev->dev, "can't update enabled VF BAR%d %pR\n",
+			 vf_bar, res);
+		return;
+	}
+
 	/*
 	 * Ignore unimplemented BARs, unused resource slots for 64-bit
 	 * BARs, and non-movable resources, e.g., those described via

^ permalink raw reply	[flat|nested] 92+ messages in thread

* [PATCH 4.9 58/93] PCI: Update BARs using property bits appropriate for type
  2017-03-20 17:50 [PATCH 4.9 00/93] 4.9.17-stable review Greg Kroah-Hartman
                   ` (52 preceding siblings ...)
  2017-03-20 17:51 ` [PATCH 4.9 57/93] PCI: Dont update VF BARs while VF memory space is enabled Greg Kroah-Hartman
@ 2017-03-20 17:51 ` Greg Kroah-Hartman
  2017-03-20 17:51 ` [PATCH 4.9 59/93] PCI: Ignore BAR updates on virtual functions Greg Kroah-Hartman
                   ` (35 subsequent siblings)
  89 siblings, 0 replies; 92+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-20 17:51 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Bjorn Helgaas, Sasha Levin

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Bjorn Helgaas <bhelgaas@google.com>

[ Upstream commit 45d004f4afefdd8d79916ee6d97a9ecd94bb1ffe ]

The BAR property bits (0-3 for memory BARs, 0-1 for I/O BARs) are supposed
to be read-only, but we do save them in res->flags and include them when
updating the BAR.

Mask the I/O property bits with ~PCI_BASE_ADDRESS_IO_MASK (0x3) instead of
PCI_REGION_FLAG_MASK (0xf) to make it obvious that we can't corrupt bits
2-3 of I/O addresses.

Use PCI_ROM_ADDRESS_MASK for ROM BARs.  This means we'll only check the top
21 bits (instead of the 28 bits we used to check) of a ROM BAR to see if
the update was successful.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/pci/setup-res.c |   11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

--- a/drivers/pci/setup-res.c
+++ b/drivers/pci/setup-res.c
@@ -58,12 +58,17 @@ static void pci_std_update_resource(stru
 		return;
 
 	pcibios_resource_to_bus(dev->bus, &region, res);
+	new = region.start;
 
-	new = region.start | (res->flags & PCI_REGION_FLAG_MASK);
-	if (res->flags & IORESOURCE_IO)
+	if (res->flags & IORESOURCE_IO) {
 		mask = (u32)PCI_BASE_ADDRESS_IO_MASK;
-	else
+		new |= res->flags & ~PCI_BASE_ADDRESS_IO_MASK;
+	} else if (resno == PCI_ROM_RESOURCE) {
+		mask = (u32)PCI_ROM_ADDRESS_MASK;
+	} else {
 		mask = (u32)PCI_BASE_ADDRESS_MEM_MASK;
+		new |= res->flags & ~PCI_BASE_ADDRESS_MEM_MASK;
+	}
 
 	if (resno < PCI_ROM_RESOURCE) {
 		reg = PCI_BASE_ADDRESS_0 + 4 * resno;

^ permalink raw reply	[flat|nested] 92+ messages in thread

* [PATCH 4.9 59/93] PCI: Ignore BAR updates on virtual functions
  2017-03-20 17:50 [PATCH 4.9 00/93] 4.9.17-stable review Greg Kroah-Hartman
                   ` (53 preceding siblings ...)
  2017-03-20 17:51 ` [PATCH 4.9 58/93] PCI: Update BARs using property bits appropriate for type Greg Kroah-Hartman
@ 2017-03-20 17:51 ` Greg Kroah-Hartman
  2017-03-20 17:51 ` [PATCH 4.9 60/93] PCI: Do any VF BAR updates before enabling the BARs Greg Kroah-Hartman
                   ` (34 subsequent siblings)
  89 siblings, 0 replies; 92+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-20 17:51 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Bjorn Helgaas, Gavin Shan, Sasha Levin

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Bjorn Helgaas <bhelgaas@google.com>

[ Upstream commit 63880b230a4af502c56dde3d4588634c70c66006 ]

VF BARs are read-only zero, so updating VF BARs will not have any effect.
See the SR-IOV spec r1.1, sec 3.4.1.11.

We already ignore these updates because of 70675e0b6a1a ("PCI: Don't try to
restore VF BARs"); this merely restructures it slightly to make it easier
to split updates for standard and SR-IOV BARs.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/pci/pci.c       |    4 ----
 drivers/pci/setup-res.c |    5 ++---
 2 files changed, 2 insertions(+), 7 deletions(-)

--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -564,10 +564,6 @@ static void pci_restore_bars(struct pci_
 {
 	int i;
 
-	/* Per SR-IOV spec 3.4.1.11, VF BARs are RO zero */
-	if (dev->is_virtfn)
-		return;
-
 	for (i = 0; i < PCI_BRIDGE_RESOURCES; i++)
 		pci_update_resource(dev, i);
 }
--- a/drivers/pci/setup-res.c
+++ b/drivers/pci/setup-res.c
@@ -34,10 +34,9 @@ static void pci_std_update_resource(stru
 	int reg;
 	struct resource *res = dev->resource + resno;
 
-	if (dev->is_virtfn) {
-		dev_warn(&dev->dev, "can't update VF BAR%d\n", resno);
+	/* Per SR-IOV spec 3.4.1.11, VF BARs are RO zero */
+	if (dev->is_virtfn)
 		return;
-	}
 
 	/*
 	 * Ignore resources for unimplemented BARs and unused resource slots

^ permalink raw reply	[flat|nested] 92+ messages in thread

* [PATCH 4.9 60/93] PCI: Do any VF BAR updates before enabling the BARs
  2017-03-20 17:50 [PATCH 4.9 00/93] 4.9.17-stable review Greg Kroah-Hartman
                   ` (54 preceding siblings ...)
  2017-03-20 17:51 ` [PATCH 4.9 59/93] PCI: Ignore BAR updates on virtual functions Greg Kroah-Hartman
@ 2017-03-20 17:51 ` Greg Kroah-Hartman
  2017-03-20 17:51 ` [PATCH 4.9 61/93] ibmveth: calculate gso_segs for large packets Greg Kroah-Hartman
                   ` (33 subsequent siblings)
  89 siblings, 0 replies; 92+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-20 17:51 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Carol Soto, Gavin Shan,
	Bjorn Helgaas, Sasha Levin

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Gavin Shan <gwshan@linux.vnet.ibm.com>

[ Upstream commit f40ec3c748c6912f6266c56a7f7992de61b255ed ]

Previously we enabled VFs and enable their memory space before calling
pcibios_sriov_enable().  But pcibios_sriov_enable() may update the VF BARs:
for example, on PPC PowerNV we may change them to manage the association of
VFs to PEs.

Because 64-bit BARs cannot be updated atomically, it's unsafe to update
them while they're enabled.  The half-updated state may conflict with other
devices in the system.

Call pcibios_sriov_enable() before enabling the VFs so any BAR updates
happen while the VF BARs are disabled.

[bhelgaas: changelog]
Tested-by: Carol Soto <clsoto@us.ibm.com>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/pci/iov.c |   14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

--- a/drivers/pci/iov.c
+++ b/drivers/pci/iov.c
@@ -306,13 +306,6 @@ static int sriov_enable(struct pci_dev *
 			return rc;
 	}
 
-	pci_iov_set_numvfs(dev, nr_virtfn);
-	iov->ctrl |= PCI_SRIOV_CTRL_VFE | PCI_SRIOV_CTRL_MSE;
-	pci_cfg_access_lock(dev);
-	pci_write_config_word(dev, iov->pos + PCI_SRIOV_CTRL, iov->ctrl);
-	msleep(100);
-	pci_cfg_access_unlock(dev);
-
 	iov->initial_VFs = initial;
 	if (nr_virtfn < initial)
 		initial = nr_virtfn;
@@ -323,6 +316,13 @@ static int sriov_enable(struct pci_dev *
 		goto err_pcibios;
 	}
 
+	pci_iov_set_numvfs(dev, nr_virtfn);
+	iov->ctrl |= PCI_SRIOV_CTRL_VFE | PCI_SRIOV_CTRL_MSE;
+	pci_cfg_access_lock(dev);
+	pci_write_config_word(dev, iov->pos + PCI_SRIOV_CTRL, iov->ctrl);
+	msleep(100);
+	pci_cfg_access_unlock(dev);
+
 	for (i = 0; i < initial; i++) {
 		rc = pci_iov_add_virtfn(dev, i, 0);
 		if (rc)

^ permalink raw reply	[flat|nested] 92+ messages in thread

* [PATCH 4.9 61/93] ibmveth: calculate gso_segs for large packets
  2017-03-20 17:50 [PATCH 4.9 00/93] 4.9.17-stable review Greg Kroah-Hartman
                   ` (55 preceding siblings ...)
  2017-03-20 17:51 ` [PATCH 4.9 60/93] PCI: Do any VF BAR updates before enabling the BARs Greg Kroah-Hartman
@ 2017-03-20 17:51 ` Greg Kroah-Hartman
  2017-03-20 17:51 ` [PATCH 4.9 62/93] Drivers: hv: ring_buffer: count on wrap around mappings in get_next_pkt_raw() (v2) Greg Kroah-Hartman
                   ` (32 subsequent siblings)
  89 siblings, 0 replies; 92+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-20 17:51 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Thomas Falcon,
	Marcelo Ricardo Leitner, Jonathan Maxwell, David S. Miller,
	Sasha Levin

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>

[ Upstream commit 94acf164dc8f1184e8d0737be7125134c2701dbe ]

Include calculations to compute the number of segments
that comprise an aggregated large packet.

Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Reviewed-by: Jonathan Maxwell <jmaxwell37@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/net/ethernet/ibm/ibmveth.c |   12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

--- a/drivers/net/ethernet/ibm/ibmveth.c
+++ b/drivers/net/ethernet/ibm/ibmveth.c
@@ -1181,7 +1181,9 @@ map_failed:
 
 static void ibmveth_rx_mss_helper(struct sk_buff *skb, u16 mss, int lrg_pkt)
 {
+	struct tcphdr *tcph;
 	int offset = 0;
+	int hdr_len;
 
 	/* only TCP packets will be aggregated */
 	if (skb->protocol == htons(ETH_P_IP)) {
@@ -1208,14 +1210,20 @@ static void ibmveth_rx_mss_helper(struct
 	/* if mss is not set through Large Packet bit/mss in rx buffer,
 	 * expect that the mss will be written to the tcp header checksum.
 	 */
+	tcph = (struct tcphdr *)(skb->data + offset);
 	if (lrg_pkt) {
 		skb_shinfo(skb)->gso_size = mss;
 	} else if (offset) {
-		struct tcphdr *tcph = (struct tcphdr *)(skb->data + offset);
-
 		skb_shinfo(skb)->gso_size = ntohs(tcph->check);
 		tcph->check = 0;
 	}
+
+	if (skb_shinfo(skb)->gso_size) {
+		hdr_len = offset + tcph->doff * 4;
+		skb_shinfo(skb)->gso_segs =
+				DIV_ROUND_UP(skb->len - hdr_len,
+					     skb_shinfo(skb)->gso_size);
+	}
 }
 
 static int ibmveth_poll(struct napi_struct *napi, int budget)

^ permalink raw reply	[flat|nested] 92+ messages in thread

* [PATCH 4.9 62/93] Drivers: hv: ring_buffer: count on wrap around mappings in get_next_pkt_raw() (v2)
  2017-03-20 17:50 [PATCH 4.9 00/93] 4.9.17-stable review Greg Kroah-Hartman
                   ` (56 preceding siblings ...)
  2017-03-20 17:51 ` [PATCH 4.9 61/93] ibmveth: calculate gso_segs for large packets Greg Kroah-Hartman
@ 2017-03-20 17:51 ` Greg Kroah-Hartman
  2017-03-20 17:51 ` [PATCH 4.9 63/93] vfio/spapr: Postpone allocation of userspace version of TCE table Greg Kroah-Hartman
                   ` (31 subsequent siblings)
  89 siblings, 0 replies; 92+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-20 17:51 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Vitaly Kuznetsov, K. Y. Srinivasan,
	Dexuan Cui, Sasha Levin

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Vitaly Kuznetsov <vkuznets@redhat.com>

[ Upstream commit fa32ff6576623616c1751562edaed8c164ca5199 ]

With wrap around mappings in place we can always provide drivers with
direct links to packets on the ring buffer, even when they wrap around.
Do the required updates to get_next_pkt_raw()/put_pkt_raw()

The first version of this commit was reverted (65a532f3d50a) to deal with
cross-tree merge issues which are (hopefully) resolved now.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Tested-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 include/linux/hyperv.h |   32 +++++++++++---------------------
 1 file changed, 11 insertions(+), 21 deletions(-)

--- a/include/linux/hyperv.h
+++ b/include/linux/hyperv.h
@@ -1548,31 +1548,23 @@ static inline struct vmpacket_descriptor
 get_next_pkt_raw(struct vmbus_channel *channel)
 {
 	struct hv_ring_buffer_info *ring_info = &channel->inbound;
-	u32 read_loc = ring_info->priv_read_index;
+	u32 priv_read_loc = ring_info->priv_read_index;
 	void *ring_buffer = hv_get_ring_buffer(ring_info);
-	struct vmpacket_descriptor *cur_desc;
-	u32 packetlen;
 	u32 dsize = ring_info->ring_datasize;
-	u32 delta = read_loc - ring_info->ring_buffer->read_index;
+	/*
+	 * delta is the difference between what is available to read and
+	 * what was already consumed in place. We commit read index after
+	 * the whole batch is processed.
+	 */
+	u32 delta = priv_read_loc >= ring_info->ring_buffer->read_index ?
+		priv_read_loc - ring_info->ring_buffer->read_index :
+		(dsize - ring_info->ring_buffer->read_index) + priv_read_loc;
 	u32 bytes_avail_toread = (hv_get_bytes_to_read(ring_info) - delta);
 
 	if (bytes_avail_toread < sizeof(struct vmpacket_descriptor))
 		return NULL;
 
-	if ((read_loc + sizeof(*cur_desc)) > dsize)
-		return NULL;
-
-	cur_desc = ring_buffer + read_loc;
-	packetlen = cur_desc->len8 << 3;
-
-	/*
-	 * If the packet under consideration is wrapping around,
-	 * return failure.
-	 */
-	if ((read_loc + packetlen + VMBUS_PKT_TRAILER) > (dsize - 1))
-		return NULL;
-
-	return cur_desc;
+	return ring_buffer + priv_read_loc;
 }
 
 /*
@@ -1584,16 +1576,14 @@ static inline void put_pkt_raw(struct vm
 				struct vmpacket_descriptor *desc)
 {
 	struct hv_ring_buffer_info *ring_info = &channel->inbound;
-	u32 read_loc = ring_info->priv_read_index;
 	u32 packetlen = desc->len8 << 3;
 	u32 dsize = ring_info->ring_datasize;
 
-	if ((read_loc + packetlen + VMBUS_PKT_TRAILER) > dsize)
-		BUG();
 	/*
 	 * Include the packet trailer.
 	 */
 	ring_info->priv_read_index += packetlen + VMBUS_PKT_TRAILER;
+	ring_info->priv_read_index %= dsize;
 }
 
 /*

^ permalink raw reply	[flat|nested] 92+ messages in thread

* [PATCH 4.9 63/93] vfio/spapr: Postpone allocation of userspace version of TCE table
  2017-03-20 17:50 [PATCH 4.9 00/93] 4.9.17-stable review Greg Kroah-Hartman
                   ` (57 preceding siblings ...)
  2017-03-20 17:51 ` [PATCH 4.9 62/93] Drivers: hv: ring_buffer: count on wrap around mappings in get_next_pkt_raw() (v2) Greg Kroah-Hartman
@ 2017-03-20 17:51 ` Greg Kroah-Hartman
  2017-03-20 17:51 ` [PATCH 4.9 64/93] powerpc/iommu: Pass mm_struct to init/cleanup helpers Greg Kroah-Hartman
                   ` (30 subsequent siblings)
  89 siblings, 0 replies; 92+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-20 17:51 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Alexey Kardashevskiy, David Gibson,
	Alex Williamson, Michael Ellerman, Sasha Levin

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Alexey Kardashevskiy <aik@ozlabs.ru>

[ Upstream commit 39701e56f5f16ea0cf8fc9e8472e645f8de91d23 ]

The iommu_table struct manages a hardware TCE table and a vmalloc'd
table with corresponding userspace addresses. Both are allocated when
the default DMA window is created and this happens when the very first
group is attached to a container.

As we are going to allow the userspace to configure container in one
memory context and pas container fd to another, we have to postpones
such allocations till a container fd is passed to the destination
user process so we would account locked memory limit against the actual
container user constrainsts.

This postpones the it_userspace array allocation till it is used first
time for mapping. The unmapping patch already checks if the array is
allocated.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/vfio/vfio_iommu_spapr_tce.c |   20 +++++++-------------
 1 file changed, 7 insertions(+), 13 deletions(-)

--- a/drivers/vfio/vfio_iommu_spapr_tce.c
+++ b/drivers/vfio/vfio_iommu_spapr_tce.c
@@ -509,6 +509,12 @@ static long tce_iommu_build_v2(struct tc
 	unsigned long hpa;
 	enum dma_data_direction dirtmp;
 
+	if (!tbl->it_userspace) {
+		ret = tce_iommu_userspace_view_alloc(tbl);
+		if (ret)
+			return ret;
+	}
+
 	for (i = 0; i < pages; ++i) {
 		struct mm_iommu_table_group_mem_t *mem = NULL;
 		unsigned long *pua = IOMMU_TABLE_USERSPACE_ENTRY(tbl,
@@ -582,15 +588,6 @@ static long tce_iommu_create_table(struc
 	WARN_ON(!ret && !(*ptbl)->it_ops->free);
 	WARN_ON(!ret && ((*ptbl)->it_allocated_size != table_size));
 
-	if (!ret && container->v2) {
-		ret = tce_iommu_userspace_view_alloc(*ptbl);
-		if (ret)
-			(*ptbl)->it_ops->free(*ptbl);
-	}
-
-	if (ret)
-		decrement_locked_vm(table_size >> PAGE_SHIFT);
-
 	return ret;
 }
 
@@ -1062,10 +1059,7 @@ static int tce_iommu_take_ownership(stru
 		if (!tbl || !tbl->it_map)
 			continue;
 
-		rc = tce_iommu_userspace_view_alloc(tbl);
-		if (!rc)
-			rc = iommu_take_ownership(tbl);
-
+		rc = iommu_take_ownership(tbl);
 		if (rc) {
 			for (j = 0; j < i; ++j)
 				iommu_release_ownership(

^ permalink raw reply	[flat|nested] 92+ messages in thread

* [PATCH 4.9 64/93] powerpc/iommu: Pass mm_struct to init/cleanup helpers
  2017-03-20 17:50 [PATCH 4.9 00/93] 4.9.17-stable review Greg Kroah-Hartman
                   ` (58 preceding siblings ...)
  2017-03-20 17:51 ` [PATCH 4.9 63/93] vfio/spapr: Postpone allocation of userspace version of TCE table Greg Kroah-Hartman
@ 2017-03-20 17:51 ` Greg Kroah-Hartman
  2017-03-20 17:51 ` [PATCH 4.9 65/93] powerpc/iommu: Stop using @current in mm_iommu_xxx Greg Kroah-Hartman
                   ` (29 subsequent siblings)
  89 siblings, 0 replies; 92+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-20 17:51 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Alexey Kardashevskiy, David Gibson,
	Michael Ellerman, Sasha Levin

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Alexey Kardashevskiy <aik@ozlabs.ru>

[ Upstream commit 88f54a3581eb9deaa3bd1aade40aef266d782385 ]

We are going to get rid of @current references in mmu_context_boos3s64.c
and cache mm_struct in the VFIO container. Since mm_context_t does not
have reference counting, we will be using mm_struct which does have
the reference counter.

This changes mm_iommu_init/mm_iommu_cleanup to receive mm_struct rather
than mm_context_t (which is embedded into mm).

This should not cause any behavioral change.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/powerpc/include/asm/mmu_context.h |    4 ++--
 arch/powerpc/kernel/setup-common.c     |    2 +-
 arch/powerpc/mm/mmu_context_book3s64.c |    4 ++--
 arch/powerpc/mm/mmu_context_iommu.c    |    9 +++++----
 4 files changed, 10 insertions(+), 9 deletions(-)

--- a/arch/powerpc/include/asm/mmu_context.h
+++ b/arch/powerpc/include/asm/mmu_context.h
@@ -23,8 +23,8 @@ extern bool mm_iommu_preregistered(void)
 extern long mm_iommu_get(unsigned long ua, unsigned long entries,
 		struct mm_iommu_table_group_mem_t **pmem);
 extern long mm_iommu_put(struct mm_iommu_table_group_mem_t *mem);
-extern void mm_iommu_init(mm_context_t *ctx);
-extern void mm_iommu_cleanup(mm_context_t *ctx);
+extern void mm_iommu_init(struct mm_struct *mm);
+extern void mm_iommu_cleanup(struct mm_struct *mm);
 extern struct mm_iommu_table_group_mem_t *mm_iommu_lookup(unsigned long ua,
 		unsigned long size);
 extern struct mm_iommu_table_group_mem_t *mm_iommu_find(unsigned long ua,
--- a/arch/powerpc/kernel/setup-common.c
+++ b/arch/powerpc/kernel/setup-common.c
@@ -915,7 +915,7 @@ void __init setup_arch(char **cmdline_p)
 	init_mm.context.pte_frag = NULL;
 #endif
 #ifdef CONFIG_SPAPR_TCE_IOMMU
-	mm_iommu_init(&init_mm.context);
+	mm_iommu_init(&init_mm);
 #endif
 	irqstack_early_init();
 	exc_lvl_early_init();
--- a/arch/powerpc/mm/mmu_context_book3s64.c
+++ b/arch/powerpc/mm/mmu_context_book3s64.c
@@ -115,7 +115,7 @@ int init_new_context(struct task_struct
 	mm->context.pte_frag = NULL;
 #endif
 #ifdef CONFIG_SPAPR_TCE_IOMMU
-	mm_iommu_init(&mm->context);
+	mm_iommu_init(mm);
 #endif
 	return 0;
 }
@@ -160,7 +160,7 @@ static inline void destroy_pagetable_pag
 void destroy_context(struct mm_struct *mm)
 {
 #ifdef CONFIG_SPAPR_TCE_IOMMU
-	mm_iommu_cleanup(&mm->context);
+	mm_iommu_cleanup(mm);
 #endif
 
 #ifdef CONFIG_PPC_ICSWX
--- a/arch/powerpc/mm/mmu_context_iommu.c
+++ b/arch/powerpc/mm/mmu_context_iommu.c
@@ -373,16 +373,17 @@ void mm_iommu_mapped_dec(struct mm_iommu
 }
 EXPORT_SYMBOL_GPL(mm_iommu_mapped_dec);
 
-void mm_iommu_init(mm_context_t *ctx)
+void mm_iommu_init(struct mm_struct *mm)
 {
-	INIT_LIST_HEAD_RCU(&ctx->iommu_group_mem_list);
+	INIT_LIST_HEAD_RCU(&mm->context.iommu_group_mem_list);
 }
 
-void mm_iommu_cleanup(mm_context_t *ctx)
+void mm_iommu_cleanup(struct mm_struct *mm)
 {
 	struct mm_iommu_table_group_mem_t *mem, *tmp;
 
-	list_for_each_entry_safe(mem, tmp, &ctx->iommu_group_mem_list, next) {
+	list_for_each_entry_safe(mem, tmp, &mm->context.iommu_group_mem_list,
+			next) {
 		list_del_rcu(&mem->next);
 		mm_iommu_do_free(mem);
 	}

^ permalink raw reply	[flat|nested] 92+ messages in thread

* [PATCH 4.9 65/93] powerpc/iommu: Stop using @current in mm_iommu_xxx
  2017-03-20 17:50 [PATCH 4.9 00/93] 4.9.17-stable review Greg Kroah-Hartman
                   ` (59 preceding siblings ...)
  2017-03-20 17:51 ` [PATCH 4.9 64/93] powerpc/iommu: Pass mm_struct to init/cleanup helpers Greg Kroah-Hartman
@ 2017-03-20 17:51 ` Greg Kroah-Hartman
  2017-03-20 17:51 ` [PATCH 4.9 66/93] vfio/spapr: Reference mm in tce_container Greg Kroah-Hartman
                   ` (28 subsequent siblings)
  89 siblings, 0 replies; 92+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-20 17:51 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Alexey Kardashevskiy, David Gibson,
	Alex Williamson, Michael Ellerman, Sasha Levin

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Alexey Kardashevskiy <aik@ozlabs.ru>

[ Upstream commit d7baee6901b34c4895eb78efdbf13a49079d7404 ]

This changes mm_iommu_xxx helpers to take mm_struct as a parameter
instead of getting it from @current which in some situations may
not have a valid reference to mm.

This changes helpers to receive @mm and moves all references to @current
to the caller, including checks for !current and !current->mm;
checks in mm_iommu_preregistered() are removed as there is no caller
yet.

This moves the mm_iommu_adjust_locked_vm() call to the caller as
it receives mm_iommu_table_group_mem_t but it needs mm.

This should cause no behavioral change.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/powerpc/include/asm/mmu_context.h |   16 ++++++-----
 arch/powerpc/mm/mmu_context_iommu.c    |   46 ++++++++++++---------------------
 drivers/vfio/vfio_iommu_spapr_tce.c    |   14 +++++++---
 3 files changed, 36 insertions(+), 40 deletions(-)

--- a/arch/powerpc/include/asm/mmu_context.h
+++ b/arch/powerpc/include/asm/mmu_context.h
@@ -19,16 +19,18 @@ extern void destroy_context(struct mm_st
 struct mm_iommu_table_group_mem_t;
 
 extern int isolate_lru_page(struct page *page);	/* from internal.h */
-extern bool mm_iommu_preregistered(void);
-extern long mm_iommu_get(unsigned long ua, unsigned long entries,
+extern bool mm_iommu_preregistered(struct mm_struct *mm);
+extern long mm_iommu_get(struct mm_struct *mm,
+		unsigned long ua, unsigned long entries,
 		struct mm_iommu_table_group_mem_t **pmem);
-extern long mm_iommu_put(struct mm_iommu_table_group_mem_t *mem);
+extern long mm_iommu_put(struct mm_struct *mm,
+		struct mm_iommu_table_group_mem_t *mem);
 extern void mm_iommu_init(struct mm_struct *mm);
 extern void mm_iommu_cleanup(struct mm_struct *mm);
-extern struct mm_iommu_table_group_mem_t *mm_iommu_lookup(unsigned long ua,
-		unsigned long size);
-extern struct mm_iommu_table_group_mem_t *mm_iommu_find(unsigned long ua,
-		unsigned long entries);
+extern struct mm_iommu_table_group_mem_t *mm_iommu_lookup(struct mm_struct *mm,
+		unsigned long ua, unsigned long size);
+extern struct mm_iommu_table_group_mem_t *mm_iommu_find(struct mm_struct *mm,
+		unsigned long ua, unsigned long entries);
 extern long mm_iommu_ua_to_hpa(struct mm_iommu_table_group_mem_t *mem,
 		unsigned long ua, unsigned long *hpa);
 extern long mm_iommu_mapped_inc(struct mm_iommu_table_group_mem_t *mem);
--- a/arch/powerpc/mm/mmu_context_iommu.c
+++ b/arch/powerpc/mm/mmu_context_iommu.c
@@ -56,7 +56,7 @@ static long mm_iommu_adjust_locked_vm(st
 	}
 
 	pr_debug("[%d] RLIMIT_MEMLOCK HASH64 %c%ld %ld/%ld\n",
-			current->pid,
+			current ? current->pid : 0,
 			incr ? '+' : '-',
 			npages << PAGE_SHIFT,
 			mm->locked_vm << PAGE_SHIFT,
@@ -66,12 +66,9 @@ static long mm_iommu_adjust_locked_vm(st
 	return ret;
 }
 
-bool mm_iommu_preregistered(void)
+bool mm_iommu_preregistered(struct mm_struct *mm)
 {
-	if (!current || !current->mm)
-		return false;
-
-	return !list_empty(&current->mm->context.iommu_group_mem_list);
+	return !list_empty(&mm->context.iommu_group_mem_list);
 }
 EXPORT_SYMBOL_GPL(mm_iommu_preregistered);
 
@@ -124,19 +121,16 @@ static int mm_iommu_move_page_from_cma(s
 	return 0;
 }
 
-long mm_iommu_get(unsigned long ua, unsigned long entries,
+long mm_iommu_get(struct mm_struct *mm, unsigned long ua, unsigned long entries,
 		struct mm_iommu_table_group_mem_t **pmem)
 {
 	struct mm_iommu_table_group_mem_t *mem;
 	long i, j, ret = 0, locked_entries = 0;
 	struct page *page = NULL;
 
-	if (!current || !current->mm)
-		return -ESRCH; /* process exited */
-
 	mutex_lock(&mem_list_mutex);
 
-	list_for_each_entry_rcu(mem, &current->mm->context.iommu_group_mem_list,
+	list_for_each_entry_rcu(mem, &mm->context.iommu_group_mem_list,
 			next) {
 		if ((mem->ua == ua) && (mem->entries == entries)) {
 			++mem->used;
@@ -154,7 +148,7 @@ long mm_iommu_get(unsigned long ua, unsi
 
 	}
 
-	ret = mm_iommu_adjust_locked_vm(current->mm, entries, true);
+	ret = mm_iommu_adjust_locked_vm(mm, entries, true);
 	if (ret)
 		goto unlock_exit;
 
@@ -215,11 +209,11 @@ populate:
 	mem->entries = entries;
 	*pmem = mem;
 
-	list_add_rcu(&mem->next, &current->mm->context.iommu_group_mem_list);
+	list_add_rcu(&mem->next, &mm->context.iommu_group_mem_list);
 
 unlock_exit:
 	if (locked_entries && ret)
-		mm_iommu_adjust_locked_vm(current->mm, locked_entries, false);
+		mm_iommu_adjust_locked_vm(mm, locked_entries, false);
 
 	mutex_unlock(&mem_list_mutex);
 
@@ -264,17 +258,13 @@ static void mm_iommu_free(struct rcu_hea
 static void mm_iommu_release(struct mm_iommu_table_group_mem_t *mem)
 {
 	list_del_rcu(&mem->next);
-	mm_iommu_adjust_locked_vm(current->mm, mem->entries, false);
 	call_rcu(&mem->rcu, mm_iommu_free);
 }
 
-long mm_iommu_put(struct mm_iommu_table_group_mem_t *mem)
+long mm_iommu_put(struct mm_struct *mm, struct mm_iommu_table_group_mem_t *mem)
 {
 	long ret = 0;
 
-	if (!current || !current->mm)
-		return -ESRCH; /* process exited */
-
 	mutex_lock(&mem_list_mutex);
 
 	if (mem->used == 0) {
@@ -297,6 +287,8 @@ long mm_iommu_put(struct mm_iommu_table_
 	/* @mapped became 0 so now mappings are disabled, release the region */
 	mm_iommu_release(mem);
 
+	mm_iommu_adjust_locked_vm(mm, mem->entries, false);
+
 unlock_exit:
 	mutex_unlock(&mem_list_mutex);
 
@@ -304,14 +296,12 @@ unlock_exit:
 }
 EXPORT_SYMBOL_GPL(mm_iommu_put);
 
-struct mm_iommu_table_group_mem_t *mm_iommu_lookup(unsigned long ua,
-		unsigned long size)
+struct mm_iommu_table_group_mem_t *mm_iommu_lookup(struct mm_struct *mm,
+		unsigned long ua, unsigned long size)
 {
 	struct mm_iommu_table_group_mem_t *mem, *ret = NULL;
 
-	list_for_each_entry_rcu(mem,
-			&current->mm->context.iommu_group_mem_list,
-			next) {
+	list_for_each_entry_rcu(mem, &mm->context.iommu_group_mem_list, next) {
 		if ((mem->ua <= ua) &&
 				(ua + size <= mem->ua +
 				 (mem->entries << PAGE_SHIFT))) {
@@ -324,14 +314,12 @@ struct mm_iommu_table_group_mem_t *mm_io
 }
 EXPORT_SYMBOL_GPL(mm_iommu_lookup);
 
-struct mm_iommu_table_group_mem_t *mm_iommu_find(unsigned long ua,
-		unsigned long entries)
+struct mm_iommu_table_group_mem_t *mm_iommu_find(struct mm_struct *mm,
+		unsigned long ua, unsigned long entries)
 {
 	struct mm_iommu_table_group_mem_t *mem, *ret = NULL;
 
-	list_for_each_entry_rcu(mem,
-			&current->mm->context.iommu_group_mem_list,
-			next) {
+	list_for_each_entry_rcu(mem, &mm->context.iommu_group_mem_list, next) {
 		if ((mem->ua == ua) && (mem->entries == entries)) {
 			ret = mem;
 			break;
--- a/drivers/vfio/vfio_iommu_spapr_tce.c
+++ b/drivers/vfio/vfio_iommu_spapr_tce.c
@@ -107,14 +107,17 @@ static long tce_iommu_unregister_pages(s
 {
 	struct mm_iommu_table_group_mem_t *mem;
 
+	if (!current || !current->mm)
+		return -ESRCH; /* process exited */
+
 	if ((vaddr & ~PAGE_MASK) || (size & ~PAGE_MASK))
 		return -EINVAL;
 
-	mem = mm_iommu_find(vaddr, size >> PAGE_SHIFT);
+	mem = mm_iommu_find(current->mm, vaddr, size >> PAGE_SHIFT);
 	if (!mem)
 		return -ENOENT;
 
-	return mm_iommu_put(mem);
+	return mm_iommu_put(current->mm, mem);
 }
 
 static long tce_iommu_register_pages(struct tce_container *container,
@@ -124,11 +127,14 @@ static long tce_iommu_register_pages(str
 	struct mm_iommu_table_group_mem_t *mem = NULL;
 	unsigned long entries = size >> PAGE_SHIFT;
 
+	if (!current || !current->mm)
+		return -ESRCH; /* process exited */
+
 	if ((vaddr & ~PAGE_MASK) || (size & ~PAGE_MASK) ||
 			((vaddr + size) < vaddr))
 		return -EINVAL;
 
-	ret = mm_iommu_get(vaddr, entries, &mem);
+	ret = mm_iommu_get(current->mm, vaddr, entries, &mem);
 	if (ret)
 		return ret;
 
@@ -375,7 +381,7 @@ static int tce_iommu_prereg_ua_to_hpa(un
 	long ret = 0;
 	struct mm_iommu_table_group_mem_t *mem;
 
-	mem = mm_iommu_lookup(tce, size);
+	mem = mm_iommu_lookup(current->mm, tce, size);
 	if (!mem)
 		return -EINVAL;
 

^ permalink raw reply	[flat|nested] 92+ messages in thread

* [PATCH 4.9 66/93] vfio/spapr: Reference mm in tce_container
  2017-03-20 17:50 [PATCH 4.9 00/93] 4.9.17-stable review Greg Kroah-Hartman
                   ` (60 preceding siblings ...)
  2017-03-20 17:51 ` [PATCH 4.9 65/93] powerpc/iommu: Stop using @current in mm_iommu_xxx Greg Kroah-Hartman
@ 2017-03-20 17:51 ` Greg Kroah-Hartman
  2017-03-20 17:51 ` [PATCH 4.9 67/93] powerpc/mm/iommu, vfio/spapr: Put pages on VFIO container shutdown Greg Kroah-Hartman
                   ` (27 subsequent siblings)
  89 siblings, 0 replies; 92+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-20 17:51 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Alexey Kardashevskiy,
	Alex Williamson, David Gibson, Michael Ellerman, Sasha Levin

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Alexey Kardashevskiy <aik@ozlabs.ru>

[ Upstream commit bc82d122ae4a0e9f971f13403995898fcfa0c09e ]

In some situations the userspace memory context may live longer than
the userspace process itself so if we need to do proper memory context
cleanup, we better have tce_container take a reference to mm_struct and
use it later when the process is gone (@current or @current->mm is NULL).

This references mm and stores the pointer in the container; this is done
in a new helper - tce_iommu_mm_set() - when one of the following happens:
- a container is enabled (IOMMU v1);
- a first attempt to pre-register memory is made (IOMMU v2);
- a DMA window is created (IOMMU v2).
The @mm stays referenced till the container is destroyed.

This replaces current->mm with container->mm everywhere except debug
prints.

This adds a check that current->mm is the same as the one stored in
the container to prevent userspace from making changes to a memory
context of other processes.

DMA map/unmap ioctls() do not check for @mm as they already check
for @enabled which is set after tce_iommu_mm_set() is called.

This does not reference a task as multiple threads within the same mm
are allowed to ioctl() to vfio and supposedly they will have same limits
and capabilities and if they do not, we'll just fail with no harm made.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/vfio/vfio_iommu_spapr_tce.c |  160 ++++++++++++++++++++++--------------
 1 file changed, 100 insertions(+), 60 deletions(-)

--- a/drivers/vfio/vfio_iommu_spapr_tce.c
+++ b/drivers/vfio/vfio_iommu_spapr_tce.c
@@ -31,49 +31,49 @@
 static void tce_iommu_detach_group(void *iommu_data,
 		struct iommu_group *iommu_group);
 
-static long try_increment_locked_vm(long npages)
+static long try_increment_locked_vm(struct mm_struct *mm, long npages)
 {
 	long ret = 0, locked, lock_limit;
 
-	if (!current || !current->mm)
-		return -ESRCH; /* process exited */
+	if (WARN_ON_ONCE(!mm))
+		return -EPERM;
 
 	if (!npages)
 		return 0;
 
-	down_write(&current->mm->mmap_sem);
-	locked = current->mm->locked_vm + npages;
+	down_write(&mm->mmap_sem);
+	locked = mm->locked_vm + npages;
 	lock_limit = rlimit(RLIMIT_MEMLOCK) >> PAGE_SHIFT;
 	if (locked > lock_limit && !capable(CAP_IPC_LOCK))
 		ret = -ENOMEM;
 	else
-		current->mm->locked_vm += npages;
+		mm->locked_vm += npages;
 
 	pr_debug("[%d] RLIMIT_MEMLOCK +%ld %ld/%ld%s\n", current->pid,
 			npages << PAGE_SHIFT,
-			current->mm->locked_vm << PAGE_SHIFT,
+			mm->locked_vm << PAGE_SHIFT,
 			rlimit(RLIMIT_MEMLOCK),
 			ret ? " - exceeded" : "");
 
-	up_write(&current->mm->mmap_sem);
+	up_write(&mm->mmap_sem);
 
 	return ret;
 }
 
-static void decrement_locked_vm(long npages)
+static void decrement_locked_vm(struct mm_struct *mm, long npages)
 {
-	if (!current || !current->mm || !npages)
-		return; /* process exited */
+	if (!mm || !npages)
+		return;
 
-	down_write(&current->mm->mmap_sem);
-	if (WARN_ON_ONCE(npages > current->mm->locked_vm))
-		npages = current->mm->locked_vm;
-	current->mm->locked_vm -= npages;
+	down_write(&mm->mmap_sem);
+	if (WARN_ON_ONCE(npages > mm->locked_vm))
+		npages = mm->locked_vm;
+	mm->locked_vm -= npages;
 	pr_debug("[%d] RLIMIT_MEMLOCK -%ld %ld/%ld\n", current->pid,
 			npages << PAGE_SHIFT,
-			current->mm->locked_vm << PAGE_SHIFT,
+			mm->locked_vm << PAGE_SHIFT,
 			rlimit(RLIMIT_MEMLOCK));
-	up_write(&current->mm->mmap_sem);
+	up_write(&mm->mmap_sem);
 }
 
 /*
@@ -98,26 +98,38 @@ struct tce_container {
 	bool enabled;
 	bool v2;
 	unsigned long locked_pages;
+	struct mm_struct *mm;
 	struct iommu_table *tables[IOMMU_TABLE_GROUP_MAX_TABLES];
 	struct list_head group_list;
 };
 
+static long tce_iommu_mm_set(struct tce_container *container)
+{
+	if (container->mm) {
+		if (container->mm == current->mm)
+			return 0;
+		return -EPERM;
+	}
+	BUG_ON(!current->mm);
+	container->mm = current->mm;
+	atomic_inc(&container->mm->mm_count);
+
+	return 0;
+}
+
 static long tce_iommu_unregister_pages(struct tce_container *container,
 		__u64 vaddr, __u64 size)
 {
 	struct mm_iommu_table_group_mem_t *mem;
 
-	if (!current || !current->mm)
-		return -ESRCH; /* process exited */
-
 	if ((vaddr & ~PAGE_MASK) || (size & ~PAGE_MASK))
 		return -EINVAL;
 
-	mem = mm_iommu_find(current->mm, vaddr, size >> PAGE_SHIFT);
+	mem = mm_iommu_find(container->mm, vaddr, size >> PAGE_SHIFT);
 	if (!mem)
 		return -ENOENT;
 
-	return mm_iommu_put(current->mm, mem);
+	return mm_iommu_put(container->mm, mem);
 }
 
 static long tce_iommu_register_pages(struct tce_container *container,
@@ -127,14 +139,11 @@ static long tce_iommu_register_pages(str
 	struct mm_iommu_table_group_mem_t *mem = NULL;
 	unsigned long entries = size >> PAGE_SHIFT;
 
-	if (!current || !current->mm)
-		return -ESRCH; /* process exited */
-
 	if ((vaddr & ~PAGE_MASK) || (size & ~PAGE_MASK) ||
 			((vaddr + size) < vaddr))
 		return -EINVAL;
 
-	ret = mm_iommu_get(current->mm, vaddr, entries, &mem);
+	ret = mm_iommu_get(container->mm, vaddr, entries, &mem);
 	if (ret)
 		return ret;
 
@@ -143,7 +152,8 @@ static long tce_iommu_register_pages(str
 	return 0;
 }
 
-static long tce_iommu_userspace_view_alloc(struct iommu_table *tbl)
+static long tce_iommu_userspace_view_alloc(struct iommu_table *tbl,
+		struct mm_struct *mm)
 {
 	unsigned long cb = _ALIGN_UP(sizeof(tbl->it_userspace[0]) *
 			tbl->it_size, PAGE_SIZE);
@@ -152,13 +162,13 @@ static long tce_iommu_userspace_view_all
 
 	BUG_ON(tbl->it_userspace);
 
-	ret = try_increment_locked_vm(cb >> PAGE_SHIFT);
+	ret = try_increment_locked_vm(mm, cb >> PAGE_SHIFT);
 	if (ret)
 		return ret;
 
 	uas = vzalloc(cb);
 	if (!uas) {
-		decrement_locked_vm(cb >> PAGE_SHIFT);
+		decrement_locked_vm(mm, cb >> PAGE_SHIFT);
 		return -ENOMEM;
 	}
 	tbl->it_userspace = uas;
@@ -166,7 +176,8 @@ static long tce_iommu_userspace_view_all
 	return 0;
 }
 
-static void tce_iommu_userspace_view_free(struct iommu_table *tbl)
+static void tce_iommu_userspace_view_free(struct iommu_table *tbl,
+		struct mm_struct *mm)
 {
 	unsigned long cb = _ALIGN_UP(sizeof(tbl->it_userspace[0]) *
 			tbl->it_size, PAGE_SIZE);
@@ -176,7 +187,7 @@ static void tce_iommu_userspace_view_fre
 
 	vfree(tbl->it_userspace);
 	tbl->it_userspace = NULL;
-	decrement_locked_vm(cb >> PAGE_SHIFT);
+	decrement_locked_vm(mm, cb >> PAGE_SHIFT);
 }
 
 static bool tce_page_is_contained(struct page *page, unsigned page_shift)
@@ -236,9 +247,6 @@ static int tce_iommu_enable(struct tce_c
 	struct iommu_table_group *table_group;
 	struct tce_iommu_group *tcegrp;
 
-	if (!current->mm)
-		return -ESRCH; /* process exited */
-
 	if (container->enabled)
 		return -EBUSY;
 
@@ -283,8 +291,12 @@ static int tce_iommu_enable(struct tce_c
 	if (!table_group->tce32_size)
 		return -EPERM;
 
+	ret = tce_iommu_mm_set(container);
+	if (ret)
+		return ret;
+
 	locked = table_group->tce32_size >> PAGE_SHIFT;
-	ret = try_increment_locked_vm(locked);
+	ret = try_increment_locked_vm(container->mm, locked);
 	if (ret)
 		return ret;
 
@@ -302,10 +314,8 @@ static void tce_iommu_disable(struct tce
 
 	container->enabled = false;
 
-	if (!current->mm)
-		return;
-
-	decrement_locked_vm(container->locked_pages);
+	BUG_ON(!container->mm);
+	decrement_locked_vm(container->mm, container->locked_pages);
 }
 
 static void *tce_iommu_open(unsigned long arg)
@@ -332,7 +342,8 @@ static void *tce_iommu_open(unsigned lon
 static int tce_iommu_clear(struct tce_container *container,
 		struct iommu_table *tbl,
 		unsigned long entry, unsigned long pages);
-static void tce_iommu_free_table(struct iommu_table *tbl);
+static void tce_iommu_free_table(struct tce_container *container,
+		struct iommu_table *tbl);
 
 static void tce_iommu_release(void *iommu_data)
 {
@@ -357,10 +368,12 @@ static void tce_iommu_release(void *iomm
 			continue;
 
 		tce_iommu_clear(container, tbl, tbl->it_offset, tbl->it_size);
-		tce_iommu_free_table(tbl);
+		tce_iommu_free_table(container, tbl);
 	}
 
 	tce_iommu_disable(container);
+	if (container->mm)
+		mmdrop(container->mm);
 	mutex_destroy(&container->lock);
 
 	kfree(container);
@@ -375,13 +388,14 @@ static void tce_iommu_unuse_page(struct
 	put_page(page);
 }
 
-static int tce_iommu_prereg_ua_to_hpa(unsigned long tce, unsigned long size,
+static int tce_iommu_prereg_ua_to_hpa(struct tce_container *container,
+		unsigned long tce, unsigned long size,
 		unsigned long *phpa, struct mm_iommu_table_group_mem_t **pmem)
 {
 	long ret = 0;
 	struct mm_iommu_table_group_mem_t *mem;
 
-	mem = mm_iommu_lookup(current->mm, tce, size);
+	mem = mm_iommu_lookup(container->mm, tce, size);
 	if (!mem)
 		return -EINVAL;
 
@@ -394,18 +408,18 @@ static int tce_iommu_prereg_ua_to_hpa(un
 	return 0;
 }
 
-static void tce_iommu_unuse_page_v2(struct iommu_table *tbl,
-		unsigned long entry)
+static void tce_iommu_unuse_page_v2(struct tce_container *container,
+		struct iommu_table *tbl, unsigned long entry)
 {
 	struct mm_iommu_table_group_mem_t *mem = NULL;
 	int ret;
 	unsigned long hpa = 0;
 	unsigned long *pua = IOMMU_TABLE_USERSPACE_ENTRY(tbl, entry);
 
-	if (!pua || !current || !current->mm)
+	if (!pua)
 		return;
 
-	ret = tce_iommu_prereg_ua_to_hpa(*pua, IOMMU_PAGE_SIZE(tbl),
+	ret = tce_iommu_prereg_ua_to_hpa(container, *pua, IOMMU_PAGE_SIZE(tbl),
 			&hpa, &mem);
 	if (ret)
 		pr_debug("%s: tce %lx at #%lx was not cached, ret=%d\n",
@@ -435,7 +449,7 @@ static int tce_iommu_clear(struct tce_co
 			continue;
 
 		if (container->v2) {
-			tce_iommu_unuse_page_v2(tbl, entry);
+			tce_iommu_unuse_page_v2(container, tbl, entry);
 			continue;
 		}
 
@@ -516,7 +530,7 @@ static long tce_iommu_build_v2(struct tc
 	enum dma_data_direction dirtmp;
 
 	if (!tbl->it_userspace) {
-		ret = tce_iommu_userspace_view_alloc(tbl);
+		ret = tce_iommu_userspace_view_alloc(tbl, container->mm);
 		if (ret)
 			return ret;
 	}
@@ -526,8 +540,8 @@ static long tce_iommu_build_v2(struct tc
 		unsigned long *pua = IOMMU_TABLE_USERSPACE_ENTRY(tbl,
 				entry + i);
 
-		ret = tce_iommu_prereg_ua_to_hpa(tce, IOMMU_PAGE_SIZE(tbl),
-				&hpa, &mem);
+		ret = tce_iommu_prereg_ua_to_hpa(container,
+				tce, IOMMU_PAGE_SIZE(tbl), &hpa, &mem);
 		if (ret)
 			break;
 
@@ -548,7 +562,7 @@ static long tce_iommu_build_v2(struct tc
 		ret = iommu_tce_xchg(tbl, entry + i, &hpa, &dirtmp);
 		if (ret) {
 			/* dirtmp cannot be DMA_NONE here */
-			tce_iommu_unuse_page_v2(tbl, entry + i);
+			tce_iommu_unuse_page_v2(container, tbl, entry + i);
 			pr_err("iommu_tce: %s failed ioba=%lx, tce=%lx, ret=%ld\n",
 					__func__, entry << tbl->it_page_shift,
 					tce, ret);
@@ -556,7 +570,7 @@ static long tce_iommu_build_v2(struct tc
 		}
 
 		if (dirtmp != DMA_NONE)
-			tce_iommu_unuse_page_v2(tbl, entry + i);
+			tce_iommu_unuse_page_v2(container, tbl, entry + i);
 
 		*pua = tce;
 
@@ -584,7 +598,7 @@ static long tce_iommu_create_table(struc
 	if (!table_size)
 		return -EINVAL;
 
-	ret = try_increment_locked_vm(table_size >> PAGE_SHIFT);
+	ret = try_increment_locked_vm(container->mm, table_size >> PAGE_SHIFT);
 	if (ret)
 		return ret;
 
@@ -597,13 +611,14 @@ static long tce_iommu_create_table(struc
 	return ret;
 }
 
-static void tce_iommu_free_table(struct iommu_table *tbl)
+static void tce_iommu_free_table(struct tce_container *container,
+		struct iommu_table *tbl)
 {
 	unsigned long pages = tbl->it_allocated_size >> PAGE_SHIFT;
 
-	tce_iommu_userspace_view_free(tbl);
+	tce_iommu_userspace_view_free(tbl, container->mm);
 	tbl->it_ops->free(tbl);
-	decrement_locked_vm(pages);
+	decrement_locked_vm(container->mm, pages);
 }
 
 static long tce_iommu_create_window(struct tce_container *container,
@@ -666,7 +681,7 @@ unset_exit:
 		table_group = iommu_group_get_iommudata(tcegrp->grp);
 		table_group->ops->unset_window(table_group, num);
 	}
-	tce_iommu_free_table(tbl);
+	tce_iommu_free_table(container, tbl);
 
 	return ret;
 }
@@ -704,7 +719,7 @@ static long tce_iommu_remove_window(stru
 
 	/* Free table */
 	tce_iommu_clear(container, tbl, tbl->it_offset, tbl->it_size);
-	tce_iommu_free_table(tbl);
+	tce_iommu_free_table(container, tbl);
 	container->tables[num] = NULL;
 
 	return 0;
@@ -730,7 +745,17 @@ static long tce_iommu_ioctl(void *iommu_
 		}
 
 		return (ret < 0) ? 0 : ret;
+	}
+
+	/*
+	 * Sanity check to prevent one userspace from manipulating
+	 * another userspace mm.
+	 */
+	BUG_ON(!container);
+	if (container->mm && container->mm != current->mm)
+		return -EPERM;
 
+	switch (cmd) {
 	case VFIO_IOMMU_SPAPR_TCE_GET_INFO: {
 		struct vfio_iommu_spapr_tce_info info;
 		struct tce_iommu_group *tcegrp;
@@ -891,6 +916,10 @@ static long tce_iommu_ioctl(void *iommu_
 		minsz = offsetofend(struct vfio_iommu_spapr_register_memory,
 				size);
 
+		ret = tce_iommu_mm_set(container);
+		if (ret)
+			return ret;
+
 		if (copy_from_user(&param, (void __user *)arg, minsz))
 			return -EFAULT;
 
@@ -914,6 +943,9 @@ static long tce_iommu_ioctl(void *iommu_
 		if (!container->v2)
 			break;
 
+		if (!container->mm)
+			return -EPERM;
+
 		minsz = offsetofend(struct vfio_iommu_spapr_register_memory,
 				size);
 
@@ -972,6 +1004,10 @@ static long tce_iommu_ioctl(void *iommu_
 		if (!container->v2)
 			break;
 
+		ret = tce_iommu_mm_set(container);
+		if (ret)
+			return ret;
+
 		if (!tce_groups_attached(container))
 			return -ENXIO;
 
@@ -1006,6 +1042,10 @@ static long tce_iommu_ioctl(void *iommu_
 		if (!container->v2)
 			break;
 
+		ret = tce_iommu_mm_set(container);
+		if (ret)
+			return ret;
+
 		if (!tce_groups_attached(container))
 			return -ENXIO;
 
@@ -1046,7 +1086,7 @@ static void tce_iommu_release_ownership(
 			continue;
 
 		tce_iommu_clear(container, tbl, tbl->it_offset, tbl->it_size);
-		tce_iommu_userspace_view_free(tbl);
+		tce_iommu_userspace_view_free(tbl, container->mm);
 		if (tbl->it_map)
 			iommu_release_ownership(tbl);
 

^ permalink raw reply	[flat|nested] 92+ messages in thread

* [PATCH 4.9 67/93] powerpc/mm/iommu, vfio/spapr: Put pages on VFIO container shutdown
  2017-03-20 17:50 [PATCH 4.9 00/93] 4.9.17-stable review Greg Kroah-Hartman
                   ` (61 preceding siblings ...)
  2017-03-20 17:51 ` [PATCH 4.9 66/93] vfio/spapr: Reference mm in tce_container Greg Kroah-Hartman
@ 2017-03-20 17:51 ` Greg Kroah-Hartman
  2017-03-20 17:51 ` [PATCH 4.9 68/93] vfio/spapr: Add a helper to create default DMA window Greg Kroah-Hartman
                   ` (26 subsequent siblings)
  89 siblings, 0 replies; 92+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-20 17:51 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Alexey Kardashevskiy,
	Nicholas Piggin, Alex Williamson, David Gibson, Michael Ellerman,
	Sasha Levin

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Alexey Kardashevskiy <aik@ozlabs.ru>

[ Upstream commit 4b6fad7097f883335b6d9627c883cb7f276d94c9 ]

At the moment the userspace tool is expected to request pinning of
the entire guest RAM when VFIO IOMMU SPAPR v2 driver is present.
When the userspace process finishes, all the pinned pages need to
be put; this is done as a part of the userspace memory context (MM)
destruction which happens on the very last mmdrop().

This approach has a problem that a MM of the userspace process
may live longer than the userspace process itself as kernel threads
use userspace process MMs which was runnning on a CPU where
the kernel thread was scheduled to. If this happened, the MM remains
referenced until this exact kernel thread wakes up again
and releases the very last reference to the MM, on an idle system this
can take even hours.

This moves preregistered regions tracking from MM to VFIO; insteads of
using mm_iommu_table_group_mem_t::used, tce_container::prereg_list is
added so each container releases regions which it has pre-registered.

This changes the userspace interface to return EBUSY if a memory
region is already registered in a container. However it should not
have any practical effect as the only userspace tool available now
does register memory region once per container anyway.

As tce_iommu_register_pages/tce_iommu_unregister_pages are called
under container->lock, this does not need additional locking.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/powerpc/mm/mmu_context_book3s64.c |    4 --
 arch/powerpc/mm/mmu_context_iommu.c    |   11 -----
 drivers/vfio/vfio_iommu_spapr_tce.c    |   61 ++++++++++++++++++++++++++++++++-
 3 files changed, 61 insertions(+), 15 deletions(-)

--- a/arch/powerpc/mm/mmu_context_book3s64.c
+++ b/arch/powerpc/mm/mmu_context_book3s64.c
@@ -156,13 +156,11 @@ static inline void destroy_pagetable_pag
 }
 #endif
 
-
 void destroy_context(struct mm_struct *mm)
 {
 #ifdef CONFIG_SPAPR_TCE_IOMMU
-	mm_iommu_cleanup(mm);
+	WARN_ON_ONCE(!list_empty(&mm->context.iommu_group_mem_list));
 #endif
-
 #ifdef CONFIG_PPC_ICSWX
 	drop_cop(mm->context.acop, mm);
 	kfree(mm->context.cop_lockp);
--- a/arch/powerpc/mm/mmu_context_iommu.c
+++ b/arch/powerpc/mm/mmu_context_iommu.c
@@ -365,14 +365,3 @@ void mm_iommu_init(struct mm_struct *mm)
 {
 	INIT_LIST_HEAD_RCU(&mm->context.iommu_group_mem_list);
 }
-
-void mm_iommu_cleanup(struct mm_struct *mm)
-{
-	struct mm_iommu_table_group_mem_t *mem, *tmp;
-
-	list_for_each_entry_safe(mem, tmp, &mm->context.iommu_group_mem_list,
-			next) {
-		list_del_rcu(&mem->next);
-		mm_iommu_do_free(mem);
-	}
-}
--- a/drivers/vfio/vfio_iommu_spapr_tce.c
+++ b/drivers/vfio/vfio_iommu_spapr_tce.c
@@ -89,6 +89,15 @@ struct tce_iommu_group {
 };
 
 /*
+ * A container needs to remember which preregistered region  it has
+ * referenced to do proper cleanup at the userspace process exit.
+ */
+struct tce_iommu_prereg {
+	struct list_head next;
+	struct mm_iommu_table_group_mem_t *mem;
+};
+
+/*
  * The container descriptor supports only a single group per container.
  * Required by the API as the container is not supplied with the IOMMU group
  * at the moment of initialization.
@@ -101,6 +110,7 @@ struct tce_container {
 	struct mm_struct *mm;
 	struct iommu_table *tables[IOMMU_TABLE_GROUP_MAX_TABLES];
 	struct list_head group_list;
+	struct list_head prereg_list;
 };
 
 static long tce_iommu_mm_set(struct tce_container *container)
@@ -117,10 +127,27 @@ static long tce_iommu_mm_set(struct tce_
 	return 0;
 }
 
+static long tce_iommu_prereg_free(struct tce_container *container,
+		struct tce_iommu_prereg *tcemem)
+{
+	long ret;
+
+	ret = mm_iommu_put(container->mm, tcemem->mem);
+	if (ret)
+		return ret;
+
+	list_del(&tcemem->next);
+	kfree(tcemem);
+
+	return 0;
+}
+
 static long tce_iommu_unregister_pages(struct tce_container *container,
 		__u64 vaddr, __u64 size)
 {
 	struct mm_iommu_table_group_mem_t *mem;
+	struct tce_iommu_prereg *tcemem;
+	bool found = false;
 
 	if ((vaddr & ~PAGE_MASK) || (size & ~PAGE_MASK))
 		return -EINVAL;
@@ -129,7 +156,17 @@ static long tce_iommu_unregister_pages(s
 	if (!mem)
 		return -ENOENT;
 
-	return mm_iommu_put(container->mm, mem);
+	list_for_each_entry(tcemem, &container->prereg_list, next) {
+		if (tcemem->mem == mem) {
+			found = true;
+			break;
+		}
+	}
+
+	if (!found)
+		return -ENOENT;
+
+	return tce_iommu_prereg_free(container, tcemem);
 }
 
 static long tce_iommu_register_pages(struct tce_container *container,
@@ -137,16 +174,29 @@ static long tce_iommu_register_pages(str
 {
 	long ret = 0;
 	struct mm_iommu_table_group_mem_t *mem = NULL;
+	struct tce_iommu_prereg *tcemem;
 	unsigned long entries = size >> PAGE_SHIFT;
 
 	if ((vaddr & ~PAGE_MASK) || (size & ~PAGE_MASK) ||
 			((vaddr + size) < vaddr))
 		return -EINVAL;
 
+	mem = mm_iommu_find(container->mm, vaddr, entries);
+	if (mem) {
+		list_for_each_entry(tcemem, &container->prereg_list, next) {
+			if (tcemem->mem == mem)
+				return -EBUSY;
+		}
+	}
+
 	ret = mm_iommu_get(container->mm, vaddr, entries, &mem);
 	if (ret)
 		return ret;
 
+	tcemem = kzalloc(sizeof(*tcemem), GFP_KERNEL);
+	tcemem->mem = mem;
+	list_add(&tcemem->next, &container->prereg_list);
+
 	container->enabled = true;
 
 	return 0;
@@ -333,6 +383,7 @@ static void *tce_iommu_open(unsigned lon
 
 	mutex_init(&container->lock);
 	INIT_LIST_HEAD_RCU(&container->group_list);
+	INIT_LIST_HEAD_RCU(&container->prereg_list);
 
 	container->v2 = arg == VFIO_SPAPR_TCE_v2_IOMMU;
 
@@ -371,6 +422,14 @@ static void tce_iommu_release(void *iomm
 		tce_iommu_free_table(container, tbl);
 	}
 
+	while (!list_empty(&container->prereg_list)) {
+		struct tce_iommu_prereg *tcemem;
+
+		tcemem = list_first_entry(&container->prereg_list,
+				struct tce_iommu_prereg, next);
+		WARN_ON_ONCE(tce_iommu_prereg_free(container, tcemem));
+	}
+
 	tce_iommu_disable(container);
 	if (container->mm)
 		mmdrop(container->mm);

^ permalink raw reply	[flat|nested] 92+ messages in thread

* [PATCH 4.9 68/93] vfio/spapr: Add a helper to create default DMA window
  2017-03-20 17:50 [PATCH 4.9 00/93] 4.9.17-stable review Greg Kroah-Hartman
                   ` (62 preceding siblings ...)
  2017-03-20 17:51 ` [PATCH 4.9 67/93] powerpc/mm/iommu, vfio/spapr: Put pages on VFIO container shutdown Greg Kroah-Hartman
@ 2017-03-20 17:51 ` Greg Kroah-Hartman
  2017-03-20 17:51 ` [PATCH 4.9 69/93] vfio/spapr: Postpone default window creation Greg Kroah-Hartman
                   ` (25 subsequent siblings)
  89 siblings, 0 replies; 92+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-20 17:51 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Alexey Kardashevskiy, David Gibson,
	Alex Williamson, Michael Ellerman, Sasha Levin

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Alexey Kardashevskiy <aik@ozlabs.ru>

[ Upstream commit 6f01cc692a16405235d5c34056455b182682123c ]

There is already a helper to create a DMA window which does allocate
a table and programs it to the IOMMU group. However
tce_iommu_take_ownership_ddw() did not use it and did these 2 calls
itself to simplify error path.

Since we are going to delay the default window creation till
the default window is accessed/removed or new window is added,
we need a helper to create a default window from all these cases.

This adds tce_iommu_create_default_window(). Since it relies on
a VFIO container to have at least one IOMMU group (for future use),
this changes tce_iommu_attach_group() to add a group to the container
first and then call the new helper.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/vfio/vfio_iommu_spapr_tce.c |   87 +++++++++++++++++-------------------
 1 file changed, 42 insertions(+), 45 deletions(-)

--- a/drivers/vfio/vfio_iommu_spapr_tce.c
+++ b/drivers/vfio/vfio_iommu_spapr_tce.c
@@ -784,6 +784,29 @@ static long tce_iommu_remove_window(stru
 	return 0;
 }
 
+static long tce_iommu_create_default_window(struct tce_container *container)
+{
+	long ret;
+	__u64 start_addr = 0;
+	struct tce_iommu_group *tcegrp;
+	struct iommu_table_group *table_group;
+
+	if (!tce_groups_attached(container))
+		return -ENODEV;
+
+	tcegrp = list_first_entry(&container->group_list,
+			struct tce_iommu_group, next);
+	table_group = iommu_group_get_iommudata(tcegrp->grp);
+	if (!table_group)
+		return -ENODEV;
+
+	ret = tce_iommu_create_window(container, IOMMU_PAGE_SHIFT_4K,
+			table_group->tce32_size, 1, &start_addr);
+	WARN_ON_ONCE(!ret && start_addr);
+
+	return ret;
+}
+
 static long tce_iommu_ioctl(void *iommu_data,
 				 unsigned int cmd, unsigned long arg)
 {
@@ -1199,9 +1222,6 @@ static void tce_iommu_release_ownership_
 static long tce_iommu_take_ownership_ddw(struct tce_container *container,
 		struct iommu_table_group *table_group)
 {
-	long i, ret = 0;
-	struct iommu_table *tbl = NULL;
-
 	if (!table_group->ops->create_table || !table_group->ops->set_window ||
 			!table_group->ops->release_ownership) {
 		WARN_ON_ONCE(1);
@@ -1210,47 +1230,7 @@ static long tce_iommu_take_ownership_ddw
 
 	table_group->ops->take_ownership(table_group);
 
-	/*
-	 * If it the first group attached, check if there is
-	 * a default DMA window and create one if none as
-	 * the userspace expects it to exist.
-	 */
-	if (!tce_groups_attached(container) && !container->tables[0]) {
-		ret = tce_iommu_create_table(container,
-				table_group,
-				0, /* window number */
-				IOMMU_PAGE_SHIFT_4K,
-				table_group->tce32_size,
-				1, /* default levels */
-				&tbl);
-		if (ret)
-			goto release_exit;
-		else
-			container->tables[0] = tbl;
-	}
-
-	/* Set all windows to the new group */
-	for (i = 0; i < IOMMU_TABLE_GROUP_MAX_TABLES; ++i) {
-		tbl = container->tables[i];
-
-		if (!tbl)
-			continue;
-
-		/* Set the default window to a new group */
-		ret = table_group->ops->set_window(table_group, i, tbl);
-		if (ret)
-			goto release_exit;
-	}
-
 	return 0;
-
-release_exit:
-	for (i = 0; i < IOMMU_TABLE_GROUP_MAX_TABLES; ++i)
-		table_group->ops->unset_window(table_group, i);
-
-	table_group->ops->release_ownership(table_group);
-
-	return ret;
 }
 
 static int tce_iommu_attach_group(void *iommu_data,
@@ -1260,6 +1240,7 @@ static int tce_iommu_attach_group(void *
 	struct tce_container *container = iommu_data;
 	struct iommu_table_group *table_group;
 	struct tce_iommu_group *tcegrp = NULL;
+	bool create_default_window = false;
 
 	mutex_lock(&container->lock);
 
@@ -1302,14 +1283,30 @@ static int tce_iommu_attach_group(void *
 	}
 
 	if (!table_group->ops || !table_group->ops->take_ownership ||
-			!table_group->ops->release_ownership)
+			!table_group->ops->release_ownership) {
 		ret = tce_iommu_take_ownership(container, table_group);
-	else
+	} else {
 		ret = tce_iommu_take_ownership_ddw(container, table_group);
+		if (!tce_groups_attached(container) && !container->tables[0])
+			create_default_window = true;
+	}
 
 	if (!ret) {
 		tcegrp->grp = iommu_group;
 		list_add(&tcegrp->next, &container->group_list);
+		/*
+		 * If it the first group attached, check if there is
+		 * a default DMA window and create one if none as
+		 * the userspace expects it to exist.
+		 */
+		if (create_default_window) {
+			ret = tce_iommu_create_default_window(container);
+			if (ret) {
+				list_del(&tcegrp->next);
+				tce_iommu_release_ownership_ddw(container,
+						table_group);
+			}
+		}
 	}
 
 unlock_exit:

^ permalink raw reply	[flat|nested] 92+ messages in thread

* [PATCH 4.9 69/93] vfio/spapr: Postpone default window creation
  2017-03-20 17:50 [PATCH 4.9 00/93] 4.9.17-stable review Greg Kroah-Hartman
                   ` (63 preceding siblings ...)
  2017-03-20 17:51 ` [PATCH 4.9 68/93] vfio/spapr: Add a helper to create default DMA window Greg Kroah-Hartman
@ 2017-03-20 17:51 ` Greg Kroah-Hartman
  2017-03-20 17:51 ` [PATCH 4.9 70/93] drm/nouveau/disp/gp102: fix cursor/overlay immediate channel indices Greg Kroah-Hartman
                   ` (24 subsequent siblings)
  89 siblings, 0 replies; 92+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-20 17:51 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Alexey Kardashevskiy, David Gibson,
	Alex Williamson, Michael Ellerman, Sasha Levin

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Alexey Kardashevskiy <aik@ozlabs.ru>

[ Upstream commit d9c728949ddc9de5734bf3b12ea906ca8a77f2a0 ]

We are going to allow the userspace to configure container in
one memory context and pass container fd to another so
we are postponing memory allocations accounted against
the locked memory limit. One of previous patches took care of
it_userspace.

At the moment we create the default DMA window when the first group is
attached to a container; this is done for the userspace which is not
DDW-aware but familiar with the SPAPR TCE IOMMU v2 in the part of memory
pre-registration - such client expects the default DMA window to exist.

This postpones the default DMA window allocation till one of
the folliwing happens:
1. first map/unmap request arrives;
2. new window is requested;
This adds noop for the case when the userspace requested removal
of the default window which has not been created yet.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/vfio/vfio_iommu_spapr_tce.c |   40 ++++++++++++++++++++++--------------
 1 file changed, 25 insertions(+), 15 deletions(-)

--- a/drivers/vfio/vfio_iommu_spapr_tce.c
+++ b/drivers/vfio/vfio_iommu_spapr_tce.c
@@ -106,6 +106,7 @@ struct tce_container {
 	struct mutex lock;
 	bool enabled;
 	bool v2;
+	bool def_window_pending;
 	unsigned long locked_pages;
 	struct mm_struct *mm;
 	struct iommu_table *tables[IOMMU_TABLE_GROUP_MAX_TABLES];
@@ -791,6 +792,9 @@ static long tce_iommu_create_default_win
 	struct tce_iommu_group *tcegrp;
 	struct iommu_table_group *table_group;
 
+	if (!container->def_window_pending)
+		return 0;
+
 	if (!tce_groups_attached(container))
 		return -ENODEV;
 
@@ -804,6 +808,9 @@ static long tce_iommu_create_default_win
 			table_group->tce32_size, 1, &start_addr);
 	WARN_ON_ONCE(!ret && start_addr);
 
+	if (!ret)
+		container->def_window_pending = false;
+
 	return ret;
 }
 
@@ -907,6 +914,10 @@ static long tce_iommu_ioctl(void *iommu_
 				VFIO_DMA_MAP_FLAG_WRITE))
 			return -EINVAL;
 
+		ret = tce_iommu_create_default_window(container);
+		if (ret)
+			return ret;
+
 		num = tce_iommu_find_table(container, param.iova, &tbl);
 		if (num < 0)
 			return -ENXIO;
@@ -970,6 +981,10 @@ static long tce_iommu_ioctl(void *iommu_
 		if (param.flags)
 			return -EINVAL;
 
+		ret = tce_iommu_create_default_window(container);
+		if (ret)
+			return ret;
+
 		num = tce_iommu_find_table(container, param.iova, &tbl);
 		if (num < 0)
 			return -ENXIO;
@@ -1107,6 +1122,10 @@ static long tce_iommu_ioctl(void *iommu_
 
 		mutex_lock(&container->lock);
 
+		ret = tce_iommu_create_default_window(container);
+		if (ret)
+			return ret;
+
 		ret = tce_iommu_create_window(container, create.page_shift,
 				create.window_size, create.levels,
 				&create.start_addr);
@@ -1143,6 +1162,11 @@ static long tce_iommu_ioctl(void *iommu_
 		if (remove.flags)
 			return -EINVAL;
 
+		if (container->def_window_pending && !remove.start_addr) {
+			container->def_window_pending = false;
+			return 0;
+		}
+
 		mutex_lock(&container->lock);
 
 		ret = tce_iommu_remove_window(container, remove.start_addr);
@@ -1240,7 +1264,6 @@ static int tce_iommu_attach_group(void *
 	struct tce_container *container = iommu_data;
 	struct iommu_table_group *table_group;
 	struct tce_iommu_group *tcegrp = NULL;
-	bool create_default_window = false;
 
 	mutex_lock(&container->lock);
 
@@ -1288,25 +1311,12 @@ static int tce_iommu_attach_group(void *
 	} else {
 		ret = tce_iommu_take_ownership_ddw(container, table_group);
 		if (!tce_groups_attached(container) && !container->tables[0])
-			create_default_window = true;
+			container->def_window_pending = true;
 	}
 
 	if (!ret) {
 		tcegrp->grp = iommu_group;
 		list_add(&tcegrp->next, &container->group_list);
-		/*
-		 * If it the first group attached, check if there is
-		 * a default DMA window and create one if none as
-		 * the userspace expects it to exist.
-		 */
-		if (create_default_window) {
-			ret = tce_iommu_create_default_window(container);
-			if (ret) {
-				list_del(&tcegrp->next);
-				tce_iommu_release_ownership_ddw(container,
-						table_group);
-			}
-		}
 	}
 
 unlock_exit:

^ permalink raw reply	[flat|nested] 92+ messages in thread

* [PATCH 4.9 70/93] drm/nouveau/disp/gp102: fix cursor/overlay immediate channel indices
  2017-03-20 17:50 [PATCH 4.9 00/93] 4.9.17-stable review Greg Kroah-Hartman
                   ` (64 preceding siblings ...)
  2017-03-20 17:51 ` [PATCH 4.9 69/93] vfio/spapr: Postpone default window creation Greg Kroah-Hartman
@ 2017-03-20 17:51 ` Greg Kroah-Hartman
  2017-03-20 17:51 ` [PATCH 4.9 71/93] drm/nouveau/disp/nv50-: split chid into chid.ctrl and chid.user Greg Kroah-Hartman
                   ` (23 subsequent siblings)
  89 siblings, 0 replies; 92+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-20 17:51 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Ben Skeggs, Sasha Levin

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Ben Skeggs <bskeggs@redhat.com>

[ Upstream commit e50fcff15fe120ef2103a9e18af6644235c2b14d ]

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/gpu/drm/nouveau/nvkm/engine/disp/Kbuild      |    2 +
 drivers/gpu/drm/nouveau/nvkm/engine/disp/channv50.h  |    2 +
 drivers/gpu/drm/nouveau/nvkm/engine/disp/cursgp102.c |   37 +++++++++++++++++++
 drivers/gpu/drm/nouveau/nvkm/engine/disp/oimmgp102.c |   37 +++++++++++++++++++
 drivers/gpu/drm/nouveau/nvkm/engine/disp/rootgp104.c |    4 +-
 5 files changed, 80 insertions(+), 2 deletions(-)
 create mode 100644 drivers/gpu/drm/nouveau/nvkm/engine/disp/cursgp102.c
 create mode 100644 drivers/gpu/drm/nouveau/nvkm/engine/disp/oimmgp102.c

--- a/drivers/gpu/drm/nouveau/nvkm/engine/disp/Kbuild
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/disp/Kbuild
@@ -95,9 +95,11 @@ nvkm-y += nvkm/engine/disp/cursg84.o
 nvkm-y += nvkm/engine/disp/cursgt215.o
 nvkm-y += nvkm/engine/disp/cursgf119.o
 nvkm-y += nvkm/engine/disp/cursgk104.o
+nvkm-y += nvkm/engine/disp/cursgp102.o
 
 nvkm-y += nvkm/engine/disp/oimmnv50.o
 nvkm-y += nvkm/engine/disp/oimmg84.o
 nvkm-y += nvkm/engine/disp/oimmgt215.o
 nvkm-y += nvkm/engine/disp/oimmgf119.o
 nvkm-y += nvkm/engine/disp/oimmgk104.o
+nvkm-y += nvkm/engine/disp/oimmgp102.o
--- a/drivers/gpu/drm/nouveau/nvkm/engine/disp/channv50.h
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/disp/channv50.h
@@ -114,6 +114,8 @@ extern const struct nv50_disp_pioc_oclas
 extern const struct nv50_disp_pioc_oclass gk104_disp_oimm_oclass;
 extern const struct nv50_disp_pioc_oclass gk104_disp_curs_oclass;
 
+extern const struct nv50_disp_pioc_oclass gp102_disp_oimm_oclass;
+extern const struct nv50_disp_pioc_oclass gp102_disp_curs_oclass;
 
 int nv50_disp_curs_new(const struct nv50_disp_chan_func *,
 		       const struct nv50_disp_chan_mthd *,
--- /dev/null
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/disp/cursgp102.c
@@ -0,0 +1,37 @@
+/*
+ * Copyright 2016 Red Hat Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ *
+ * Authors: Ben Skeggs <bskeggs@redhat.com>
+ */
+#include "channv50.h"
+#include "rootnv50.h"
+
+#include <nvif/class.h>
+
+const struct nv50_disp_pioc_oclass
+gp102_disp_curs_oclass = {
+	.base.oclass = GK104_DISP_CURSOR,
+	.base.minver = 0,
+	.base.maxver = 0,
+	.ctor = nv50_disp_curs_new,
+	.func = &gf119_disp_pioc_func,
+	.chid = { 13, 17 },
+};
--- /dev/null
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/disp/oimmgp102.c
@@ -0,0 +1,37 @@
+/*
+ * Copyright 2016 Red Hat Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ *
+ * Authors: Ben Skeggs <bskeggs@redhat.com>
+ */
+#include "channv50.h"
+#include "rootnv50.h"
+
+#include <nvif/class.h>
+
+const struct nv50_disp_pioc_oclass
+gp102_disp_oimm_oclass = {
+	.base.oclass = GK104_DISP_OVERLAY,
+	.base.minver = 0,
+	.base.maxver = 0,
+	.ctor = nv50_disp_oimm_new,
+	.func = &gf119_disp_pioc_func,
+	.chid = { 9, 13 },
+};
--- a/drivers/gpu/drm/nouveau/nvkm/engine/disp/rootgp104.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/disp/rootgp104.c
@@ -36,8 +36,8 @@ gp104_disp_root = {
 		&gp104_disp_ovly_oclass,
 	},
 	.pioc = {
-		&gk104_disp_oimm_oclass,
-		&gk104_disp_curs_oclass,
+		&gp102_disp_oimm_oclass,
+		&gp102_disp_curs_oclass,
 	},
 };
 

^ permalink raw reply	[flat|nested] 92+ messages in thread

* [PATCH 4.9 71/93] drm/nouveau/disp/nv50-: split chid into chid.ctrl and chid.user
  2017-03-20 17:50 [PATCH 4.9 00/93] 4.9.17-stable review Greg Kroah-Hartman
                   ` (65 preceding siblings ...)
  2017-03-20 17:51 ` [PATCH 4.9 70/93] drm/nouveau/disp/gp102: fix cursor/overlay immediate channel indices Greg Kroah-Hartman
@ 2017-03-20 17:51 ` Greg Kroah-Hartman
  2017-03-20 17:51 ` [PATCH 4.9 72/93] drm/nouveau/disp/nv50-: specify ctrl/user separately when constructing classes Greg Kroah-Hartman
                   ` (22 subsequent siblings)
  89 siblings, 0 replies; 92+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-20 17:51 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Ben Skeggs, Sasha Levin

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Ben Skeggs <bskeggs@redhat.com>

[ Upstream commit 4391d7f5c79a9fe6fa11cf6c160ca7f7bdb49d2a ]

GP102/GP104 make life difficult by redefining the channel indices for
some registers, but not others.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/gpu/drm/nouveau/nvkm/engine/disp/channv50.c  |   23 +++++----
 drivers/gpu/drm/nouveau/nvkm/engine/disp/channv50.h  |    6 ++
 drivers/gpu/drm/nouveau/nvkm/engine/disp/dmacgf119.c |   44 +++++++++----------
 drivers/gpu/drm/nouveau/nvkm/engine/disp/dmacgp104.c |   23 +++++----
 drivers/gpu/drm/nouveau/nvkm/engine/disp/dmacnv50.c  |   44 +++++++++----------
 drivers/gpu/drm/nouveau/nvkm/engine/disp/piocgf119.c |   28 ++++++------
 drivers/gpu/drm/nouveau/nvkm/engine/disp/piocnv50.c  |   30 ++++++------
 7 files changed, 106 insertions(+), 92 deletions(-)

--- a/drivers/gpu/drm/nouveau/nvkm/engine/disp/channv50.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/disp/channv50.c
@@ -82,7 +82,7 @@ nv50_disp_chan_mthd(struct nv50_disp_cha
 
 			if (mthd->addr) {
 				snprintf(cname_, sizeof(cname_), "%s %d",
-					 mthd->name, chan->chid);
+					 mthd->name, chan->chid.user);
 				cname = cname_;
 			}
 
@@ -139,7 +139,7 @@ nv50_disp_chan_uevent_ctor(struct nvkm_o
 	if (!(ret = nvif_unvers(ret, &data, &size, args->none))) {
 		notify->size  = sizeof(struct nvif_notify_uevent_rep);
 		notify->types = 1;
-		notify->index = chan->chid;
+		notify->index = chan->chid.user;
 		return 0;
 	}
 
@@ -159,7 +159,7 @@ nv50_disp_chan_rd32(struct nvkm_object *
 	struct nv50_disp_chan *chan = nv50_disp_chan(object);
 	struct nv50_disp *disp = chan->root->disp;
 	struct nvkm_device *device = disp->base.engine.subdev.device;
-	*data = nvkm_rd32(device, 0x640000 + (chan->chid * 0x1000) + addr);
+	*data = nvkm_rd32(device, 0x640000 + (chan->chid.user * 0x1000) + addr);
 	return 0;
 }
 
@@ -169,7 +169,7 @@ nv50_disp_chan_wr32(struct nvkm_object *
 	struct nv50_disp_chan *chan = nv50_disp_chan(object);
 	struct nv50_disp *disp = chan->root->disp;
 	struct nvkm_device *device = disp->base.engine.subdev.device;
-	nvkm_wr32(device, 0x640000 + (chan->chid * 0x1000) + addr, data);
+	nvkm_wr32(device, 0x640000 + (chan->chid.user * 0x1000) + addr, data);
 	return 0;
 }
 
@@ -196,7 +196,7 @@ nv50_disp_chan_map(struct nvkm_object *o
 	struct nv50_disp *disp = chan->root->disp;
 	struct nvkm_device *device = disp->base.engine.subdev.device;
 	*addr = device->func->resource_addr(device, 0) +
-		0x640000 + (chan->chid * 0x1000);
+		0x640000 + (chan->chid.user * 0x1000);
 	*size = 0x001000;
 	return 0;
 }
@@ -243,8 +243,8 @@ nv50_disp_chan_dtor(struct nvkm_object *
 {
 	struct nv50_disp_chan *chan = nv50_disp_chan(object);
 	struct nv50_disp *disp = chan->root->disp;
-	if (chan->chid >= 0)
-		disp->chan[chan->chid] = NULL;
+	if (chan->chid.user >= 0)
+		disp->chan[chan->chid.user] = NULL;
 	return chan->func->dtor ? chan->func->dtor(chan) : chan;
 }
 
@@ -273,14 +273,15 @@ nv50_disp_chan_ctor(const struct nv50_di
 	chan->func = func;
 	chan->mthd = mthd;
 	chan->root = root;
-	chan->chid = chid;
+	chan->chid.ctrl = chid;
+	chan->chid.user = chid;
 	chan->head = head;
 
-	if (disp->chan[chan->chid]) {
-		chan->chid = -1;
+	if (disp->chan[chan->chid.user]) {
+		chan->chid.user = -1;
 		return -EBUSY;
 	}
-	disp->chan[chan->chid] = chan;
+	disp->chan[chan->chid.user] = chan;
 	return 0;
 }
 
--- a/drivers/gpu/drm/nouveau/nvkm/engine/disp/channv50.h
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/disp/channv50.h
@@ -7,7 +7,11 @@ struct nv50_disp_chan {
 	const struct nv50_disp_chan_func *func;
 	const struct nv50_disp_chan_mthd *mthd;
 	struct nv50_disp_root *root;
-	int chid;
+
+	struct {
+		int ctrl;
+		int user;
+	} chid;
 	int head;
 
 	struct nvkm_object object;
--- a/drivers/gpu/drm/nouveau/nvkm/engine/disp/dmacgf119.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/disp/dmacgf119.c
@@ -32,8 +32,8 @@ gf119_disp_dmac_bind(struct nv50_disp_dm
 		     struct nvkm_object *object, u32 handle)
 {
 	return nvkm_ramht_insert(chan->base.root->ramht, object,
-				 chan->base.chid, -9, handle,
-				 chan->base.chid << 27 | 0x00000001);
+				 chan->base.chid.user, -9, handle,
+				 chan->base.chid.user << 27 | 0x00000001);
 }
 
 void
@@ -42,22 +42,23 @@ gf119_disp_dmac_fini(struct nv50_disp_dm
 	struct nv50_disp *disp = chan->base.root->disp;
 	struct nvkm_subdev *subdev = &disp->base.engine.subdev;
 	struct nvkm_device *device = subdev->device;
-	int chid = chan->base.chid;
+	int ctrl = chan->base.chid.ctrl;
+	int user = chan->base.chid.user;
 
 	/* deactivate channel */
-	nvkm_mask(device, 0x610490 + (chid * 0x0010), 0x00001010, 0x00001000);
-	nvkm_mask(device, 0x610490 + (chid * 0x0010), 0x00000003, 0x00000000);
+	nvkm_mask(device, 0x610490 + (ctrl * 0x0010), 0x00001010, 0x00001000);
+	nvkm_mask(device, 0x610490 + (ctrl * 0x0010), 0x00000003, 0x00000000);
 	if (nvkm_msec(device, 2000,
-		if (!(nvkm_rd32(device, 0x610490 + (chid * 0x10)) & 0x001e0000))
+		if (!(nvkm_rd32(device, 0x610490 + (ctrl * 0x10)) & 0x001e0000))
 			break;
 	) < 0) {
-		nvkm_error(subdev, "ch %d fini: %08x\n", chid,
-			   nvkm_rd32(device, 0x610490 + (chid * 0x10)));
+		nvkm_error(subdev, "ch %d fini: %08x\n", user,
+			   nvkm_rd32(device, 0x610490 + (ctrl * 0x10)));
 	}
 
 	/* disable error reporting and completion notification */
-	nvkm_mask(device, 0x610090, 0x00000001 << chid, 0x00000000);
-	nvkm_mask(device, 0x6100a0, 0x00000001 << chid, 0x00000000);
+	nvkm_mask(device, 0x610090, 0x00000001 << user, 0x00000000);
+	nvkm_mask(device, 0x6100a0, 0x00000001 << user, 0x00000000);
 }
 
 static int
@@ -66,26 +67,27 @@ gf119_disp_dmac_init(struct nv50_disp_dm
 	struct nv50_disp *disp = chan->base.root->disp;
 	struct nvkm_subdev *subdev = &disp->base.engine.subdev;
 	struct nvkm_device *device = subdev->device;
-	int chid = chan->base.chid;
+	int ctrl = chan->base.chid.ctrl;
+	int user = chan->base.chid.user;
 
 	/* enable error reporting */
-	nvkm_mask(device, 0x6100a0, 0x00000001 << chid, 0x00000001 << chid);
+	nvkm_mask(device, 0x6100a0, 0x00000001 << user, 0x00000001 << user);
 
 	/* initialise channel for dma command submission */
-	nvkm_wr32(device, 0x610494 + (chid * 0x0010), chan->push);
-	nvkm_wr32(device, 0x610498 + (chid * 0x0010), 0x00010000);
-	nvkm_wr32(device, 0x61049c + (chid * 0x0010), 0x00000001);
-	nvkm_mask(device, 0x610490 + (chid * 0x0010), 0x00000010, 0x00000010);
-	nvkm_wr32(device, 0x640000 + (chid * 0x1000), 0x00000000);
-	nvkm_wr32(device, 0x610490 + (chid * 0x0010), 0x00000013);
+	nvkm_wr32(device, 0x610494 + (ctrl * 0x0010), chan->push);
+	nvkm_wr32(device, 0x610498 + (ctrl * 0x0010), 0x00010000);
+	nvkm_wr32(device, 0x61049c + (ctrl * 0x0010), 0x00000001);
+	nvkm_mask(device, 0x610490 + (ctrl * 0x0010), 0x00000010, 0x00000010);
+	nvkm_wr32(device, 0x640000 + (ctrl * 0x1000), 0x00000000);
+	nvkm_wr32(device, 0x610490 + (ctrl * 0x0010), 0x00000013);
 
 	/* wait for it to go inactive */
 	if (nvkm_msec(device, 2000,
-		if (!(nvkm_rd32(device, 0x610490 + (chid * 0x10)) & 0x80000000))
+		if (!(nvkm_rd32(device, 0x610490 + (ctrl * 0x10)) & 0x80000000))
 			break;
 	) < 0) {
-		nvkm_error(subdev, "ch %d init: %08x\n", chid,
-			   nvkm_rd32(device, 0x610490 + (chid * 0x10)));
+		nvkm_error(subdev, "ch %d init: %08x\n", user,
+			   nvkm_rd32(device, 0x610490 + (ctrl * 0x10)));
 		return -EBUSY;
 	}
 
--- a/drivers/gpu/drm/nouveau/nvkm/engine/disp/dmacgp104.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/disp/dmacgp104.c
@@ -32,26 +32,27 @@ gp104_disp_dmac_init(struct nv50_disp_dm
 	struct nv50_disp *disp = chan->base.root->disp;
 	struct nvkm_subdev *subdev = &disp->base.engine.subdev;
 	struct nvkm_device *device = subdev->device;
-	int chid = chan->base.chid;
+	int ctrl = chan->base.chid.ctrl;
+	int user = chan->base.chid.user;
 
 	/* enable error reporting */
-	nvkm_mask(device, 0x6100a0, 0x00000001 << chid, 0x00000001 << chid);
+	nvkm_mask(device, 0x6100a0, 0x00000001 << user, 0x00000001 << user);
 
 	/* initialise channel for dma command submission */
-	nvkm_wr32(device, 0x611494 + (chid * 0x0010), chan->push);
-	nvkm_wr32(device, 0x611498 + (chid * 0x0010), 0x00010000);
-	nvkm_wr32(device, 0x61149c + (chid * 0x0010), 0x00000001);
-	nvkm_mask(device, 0x610490 + (chid * 0x0010), 0x00000010, 0x00000010);
-	nvkm_wr32(device, 0x640000 + (chid * 0x1000), 0x00000000);
-	nvkm_wr32(device, 0x610490 + (chid * 0x0010), 0x00000013);
+	nvkm_wr32(device, 0x611494 + (ctrl * 0x0010), chan->push);
+	nvkm_wr32(device, 0x611498 + (ctrl * 0x0010), 0x00010000);
+	nvkm_wr32(device, 0x61149c + (ctrl * 0x0010), 0x00000001);
+	nvkm_mask(device, 0x610490 + (ctrl * 0x0010), 0x00000010, 0x00000010);
+	nvkm_wr32(device, 0x640000 + (ctrl * 0x1000), 0x00000000);
+	nvkm_wr32(device, 0x610490 + (ctrl * 0x0010), 0x00000013);
 
 	/* wait for it to go inactive */
 	if (nvkm_msec(device, 2000,
-		if (!(nvkm_rd32(device, 0x610490 + (chid * 0x10)) & 0x80000000))
+		if (!(nvkm_rd32(device, 0x610490 + (ctrl * 0x10)) & 0x80000000))
 			break;
 	) < 0) {
-		nvkm_error(subdev, "ch %d init: %08x\n", chid,
-			   nvkm_rd32(device, 0x610490 + (chid * 0x10)));
+		nvkm_error(subdev, "ch %d init: %08x\n", user,
+			   nvkm_rd32(device, 0x610490 + (ctrl * 0x10)));
 		return -EBUSY;
 	}
 
--- a/drivers/gpu/drm/nouveau/nvkm/engine/disp/dmacnv50.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/disp/dmacnv50.c
@@ -179,9 +179,9 @@ nv50_disp_dmac_bind(struct nv50_disp_dma
 		    struct nvkm_object *object, u32 handle)
 {
 	return nvkm_ramht_insert(chan->base.root->ramht, object,
-				 chan->base.chid, -10, handle,
-				 chan->base.chid << 28 |
-				 chan->base.chid);
+				 chan->base.chid.user, -10, handle,
+				 chan->base.chid.user << 28 |
+				 chan->base.chid.user);
 }
 
 static void
@@ -190,21 +190,22 @@ nv50_disp_dmac_fini(struct nv50_disp_dma
 	struct nv50_disp *disp = chan->base.root->disp;
 	struct nvkm_subdev *subdev = &disp->base.engine.subdev;
 	struct nvkm_device *device = subdev->device;
-	int chid = chan->base.chid;
+	int ctrl = chan->base.chid.ctrl;
+	int user = chan->base.chid.user;
 
 	/* deactivate channel */
-	nvkm_mask(device, 0x610200 + (chid * 0x0010), 0x00001010, 0x00001000);
-	nvkm_mask(device, 0x610200 + (chid * 0x0010), 0x00000003, 0x00000000);
+	nvkm_mask(device, 0x610200 + (ctrl * 0x0010), 0x00001010, 0x00001000);
+	nvkm_mask(device, 0x610200 + (ctrl * 0x0010), 0x00000003, 0x00000000);
 	if (nvkm_msec(device, 2000,
-		if (!(nvkm_rd32(device, 0x610200 + (chid * 0x10)) & 0x001e0000))
+		if (!(nvkm_rd32(device, 0x610200 + (ctrl * 0x10)) & 0x001e0000))
 			break;
 	) < 0) {
-		nvkm_error(subdev, "ch %d fini timeout, %08x\n", chid,
-			   nvkm_rd32(device, 0x610200 + (chid * 0x10)));
+		nvkm_error(subdev, "ch %d fini timeout, %08x\n", user,
+			   nvkm_rd32(device, 0x610200 + (ctrl * 0x10)));
 	}
 
 	/* disable error reporting and completion notifications */
-	nvkm_mask(device, 0x610028, 0x00010001 << chid, 0x00000000 << chid);
+	nvkm_mask(device, 0x610028, 0x00010001 << user, 0x00000000 << user);
 }
 
 static int
@@ -213,26 +214,27 @@ nv50_disp_dmac_init(struct nv50_disp_dma
 	struct nv50_disp *disp = chan->base.root->disp;
 	struct nvkm_subdev *subdev = &disp->base.engine.subdev;
 	struct nvkm_device *device = subdev->device;
-	int chid = chan->base.chid;
+	int ctrl = chan->base.chid.ctrl;
+	int user = chan->base.chid.user;
 
 	/* enable error reporting */
-	nvkm_mask(device, 0x610028, 0x00010000 << chid, 0x00010000 << chid);
+	nvkm_mask(device, 0x610028, 0x00010000 << user, 0x00010000 << user);
 
 	/* initialise channel for dma command submission */
-	nvkm_wr32(device, 0x610204 + (chid * 0x0010), chan->push);
-	nvkm_wr32(device, 0x610208 + (chid * 0x0010), 0x00010000);
-	nvkm_wr32(device, 0x61020c + (chid * 0x0010), chid);
-	nvkm_mask(device, 0x610200 + (chid * 0x0010), 0x00000010, 0x00000010);
-	nvkm_wr32(device, 0x640000 + (chid * 0x1000), 0x00000000);
-	nvkm_wr32(device, 0x610200 + (chid * 0x0010), 0x00000013);
+	nvkm_wr32(device, 0x610204 + (ctrl * 0x0010), chan->push);
+	nvkm_wr32(device, 0x610208 + (ctrl * 0x0010), 0x00010000);
+	nvkm_wr32(device, 0x61020c + (ctrl * 0x0010), ctrl);
+	nvkm_mask(device, 0x610200 + (ctrl * 0x0010), 0x00000010, 0x00000010);
+	nvkm_wr32(device, 0x640000 + (ctrl * 0x1000), 0x00000000);
+	nvkm_wr32(device, 0x610200 + (ctrl * 0x0010), 0x00000013);
 
 	/* wait for it to go inactive */
 	if (nvkm_msec(device, 2000,
-		if (!(nvkm_rd32(device, 0x610200 + (chid * 0x10)) & 0x80000000))
+		if (!(nvkm_rd32(device, 0x610200 + (ctrl * 0x10)) & 0x80000000))
 			break;
 	) < 0) {
-		nvkm_error(subdev, "ch %d init timeout, %08x\n", chid,
-			   nvkm_rd32(device, 0x610200 + (chid * 0x10)));
+		nvkm_error(subdev, "ch %d init timeout, %08x\n", user,
+			   nvkm_rd32(device, 0x610200 + (ctrl * 0x10)));
 		return -EBUSY;
 	}
 
--- a/drivers/gpu/drm/nouveau/nvkm/engine/disp/piocgf119.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/disp/piocgf119.c
@@ -32,20 +32,21 @@ gf119_disp_pioc_fini(struct nv50_disp_ch
 	struct nv50_disp *disp = chan->root->disp;
 	struct nvkm_subdev *subdev = &disp->base.engine.subdev;
 	struct nvkm_device *device = subdev->device;
-	int chid = chan->chid;
+	int ctrl = chan->chid.ctrl;
+	int user = chan->chid.user;
 
-	nvkm_mask(device, 0x610490 + (chid * 0x10), 0x00000001, 0x00000000);
+	nvkm_mask(device, 0x610490 + (ctrl * 0x10), 0x00000001, 0x00000000);
 	if (nvkm_msec(device, 2000,
-		if (!(nvkm_rd32(device, 0x610490 + (chid * 0x10)) & 0x00030000))
+		if (!(nvkm_rd32(device, 0x610490 + (ctrl * 0x10)) & 0x00030000))
 			break;
 	) < 0) {
-		nvkm_error(subdev, "ch %d fini: %08x\n", chid,
-			   nvkm_rd32(device, 0x610490 + (chid * 0x10)));
+		nvkm_error(subdev, "ch %d fini: %08x\n", user,
+			   nvkm_rd32(device, 0x610490 + (ctrl * 0x10)));
 	}
 
 	/* disable error reporting and completion notification */
-	nvkm_mask(device, 0x610090, 0x00000001 << chid, 0x00000000);
-	nvkm_mask(device, 0x6100a0, 0x00000001 << chid, 0x00000000);
+	nvkm_mask(device, 0x610090, 0x00000001 << user, 0x00000000);
+	nvkm_mask(device, 0x6100a0, 0x00000001 << user, 0x00000000);
 }
 
 static int
@@ -54,20 +55,21 @@ gf119_disp_pioc_init(struct nv50_disp_ch
 	struct nv50_disp *disp = chan->root->disp;
 	struct nvkm_subdev *subdev = &disp->base.engine.subdev;
 	struct nvkm_device *device = subdev->device;
-	int chid = chan->chid;
+	int ctrl = chan->chid.ctrl;
+	int user = chan->chid.user;
 
 	/* enable error reporting */
-	nvkm_mask(device, 0x6100a0, 0x00000001 << chid, 0x00000001 << chid);
+	nvkm_mask(device, 0x6100a0, 0x00000001 << user, 0x00000001 << user);
 
 	/* activate channel */
-	nvkm_wr32(device, 0x610490 + (chid * 0x10), 0x00000001);
+	nvkm_wr32(device, 0x610490 + (ctrl * 0x10), 0x00000001);
 	if (nvkm_msec(device, 2000,
-		u32 tmp = nvkm_rd32(device, 0x610490 + (chid * 0x10));
+		u32 tmp = nvkm_rd32(device, 0x610490 + (ctrl * 0x10));
 		if ((tmp & 0x00030000) == 0x00010000)
 			break;
 	) < 0) {
-		nvkm_error(subdev, "ch %d init: %08x\n", chid,
-			   nvkm_rd32(device, 0x610490 + (chid * 0x10)));
+		nvkm_error(subdev, "ch %d init: %08x\n", user,
+			   nvkm_rd32(device, 0x610490 + (ctrl * 0x10)));
 		return -EBUSY;
 	}
 
--- a/drivers/gpu/drm/nouveau/nvkm/engine/disp/piocnv50.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/disp/piocnv50.c
@@ -32,15 +32,16 @@ nv50_disp_pioc_fini(struct nv50_disp_cha
 	struct nv50_disp *disp = chan->root->disp;
 	struct nvkm_subdev *subdev = &disp->base.engine.subdev;
 	struct nvkm_device *device = subdev->device;
-	int chid = chan->chid;
+	int ctrl = chan->chid.ctrl;
+	int user = chan->chid.user;
 
-	nvkm_mask(device, 0x610200 + (chid * 0x10), 0x00000001, 0x00000000);
+	nvkm_mask(device, 0x610200 + (ctrl * 0x10), 0x00000001, 0x00000000);
 	if (nvkm_msec(device, 2000,
-		if (!(nvkm_rd32(device, 0x610200 + (chid * 0x10)) & 0x00030000))
+		if (!(nvkm_rd32(device, 0x610200 + (ctrl * 0x10)) & 0x00030000))
 			break;
 	) < 0) {
-		nvkm_error(subdev, "ch %d timeout: %08x\n", chid,
-			   nvkm_rd32(device, 0x610200 + (chid * 0x10)));
+		nvkm_error(subdev, "ch %d timeout: %08x\n", user,
+			   nvkm_rd32(device, 0x610200 + (ctrl * 0x10)));
 	}
 }
 
@@ -50,26 +51,27 @@ nv50_disp_pioc_init(struct nv50_disp_cha
 	struct nv50_disp *disp = chan->root->disp;
 	struct nvkm_subdev *subdev = &disp->base.engine.subdev;
 	struct nvkm_device *device = subdev->device;
-	int chid = chan->chid;
+	int ctrl = chan->chid.ctrl;
+	int user = chan->chid.user;
 
-	nvkm_wr32(device, 0x610200 + (chid * 0x10), 0x00002000);
+	nvkm_wr32(device, 0x610200 + (ctrl * 0x10), 0x00002000);
 	if (nvkm_msec(device, 2000,
-		if (!(nvkm_rd32(device, 0x610200 + (chid * 0x10)) & 0x00030000))
+		if (!(nvkm_rd32(device, 0x610200 + (ctrl * 0x10)) & 0x00030000))
 			break;
 	) < 0) {
-		nvkm_error(subdev, "ch %d timeout0: %08x\n", chid,
-			   nvkm_rd32(device, 0x610200 + (chid * 0x10)));
+		nvkm_error(subdev, "ch %d timeout0: %08x\n", user,
+			   nvkm_rd32(device, 0x610200 + (ctrl * 0x10)));
 		return -EBUSY;
 	}
 
-	nvkm_wr32(device, 0x610200 + (chid * 0x10), 0x00000001);
+	nvkm_wr32(device, 0x610200 + (ctrl * 0x10), 0x00000001);
 	if (nvkm_msec(device, 2000,
-		u32 tmp = nvkm_rd32(device, 0x610200 + (chid * 0x10));
+		u32 tmp = nvkm_rd32(device, 0x610200 + (ctrl * 0x10));
 		if ((tmp & 0x00030000) == 0x00010000)
 			break;
 	) < 0) {
-		nvkm_error(subdev, "ch %d timeout1: %08x\n", chid,
-			   nvkm_rd32(device, 0x610200 + (chid * 0x10)));
+		nvkm_error(subdev, "ch %d timeout1: %08x\n", user,
+			   nvkm_rd32(device, 0x610200 + (ctrl * 0x10)));
 		return -EBUSY;
 	}
 

^ permalink raw reply	[flat|nested] 92+ messages in thread

* [PATCH 4.9 72/93] drm/nouveau/disp/nv50-: specify ctrl/user separately when constructing classes
  2017-03-20 17:50 [PATCH 4.9 00/93] 4.9.17-stable review Greg Kroah-Hartman
                   ` (66 preceding siblings ...)
  2017-03-20 17:51 ` [PATCH 4.9 71/93] drm/nouveau/disp/nv50-: split chid into chid.ctrl and chid.user Greg Kroah-Hartman
@ 2017-03-20 17:51 ` Greg Kroah-Hartman
  2017-03-20 17:51 ` [PATCH 4.9 73/93] block: allow WRITE_SAME commands with the SG_IO ioctl Greg Kroah-Hartman
                   ` (21 subsequent siblings)
  89 siblings, 0 replies; 92+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-20 17:51 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Ben Skeggs, Sasha Levin

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Ben Skeggs <bskeggs@redhat.com>

[ Upstream commit 2a32b9b1866a2ee9f01fbf2a48d99012f0120739 ]

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/gpu/drm/nouveau/nvkm/engine/disp/channv50.c  |   11 ++++++-----
 drivers/gpu/drm/nouveau/nvkm/engine/disp/channv50.h  |   15 +++++++++------
 drivers/gpu/drm/nouveau/nvkm/engine/disp/cursg84.c   |    2 +-
 drivers/gpu/drm/nouveau/nvkm/engine/disp/cursgf119.c |    2 +-
 drivers/gpu/drm/nouveau/nvkm/engine/disp/cursgk104.c |    2 +-
 drivers/gpu/drm/nouveau/nvkm/engine/disp/cursgt215.c |    2 +-
 drivers/gpu/drm/nouveau/nvkm/engine/disp/cursnv50.c  |    6 +++---
 drivers/gpu/drm/nouveau/nvkm/engine/disp/dmacnv50.c  |    2 +-
 drivers/gpu/drm/nouveau/nvkm/engine/disp/oimmg84.c   |    2 +-
 drivers/gpu/drm/nouveau/nvkm/engine/disp/oimmgf119.c |    2 +-
 drivers/gpu/drm/nouveau/nvkm/engine/disp/oimmgk104.c |    2 +-
 drivers/gpu/drm/nouveau/nvkm/engine/disp/oimmgt215.c |    2 +-
 drivers/gpu/drm/nouveau/nvkm/engine/disp/oimmnv50.c  |    6 +++---
 drivers/gpu/drm/nouveau/nvkm/engine/disp/rootnv50.c  |    4 ++--
 14 files changed, 32 insertions(+), 28 deletions(-)

--- a/drivers/gpu/drm/nouveau/nvkm/engine/disp/channv50.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/disp/channv50.c
@@ -263,7 +263,7 @@ nv50_disp_chan = {
 int
 nv50_disp_chan_ctor(const struct nv50_disp_chan_func *func,
 		    const struct nv50_disp_chan_mthd *mthd,
-		    struct nv50_disp_root *root, int chid, int head,
+		    struct nv50_disp_root *root, int ctrl, int user, int head,
 		    const struct nvkm_oclass *oclass,
 		    struct nv50_disp_chan *chan)
 {
@@ -273,8 +273,8 @@ nv50_disp_chan_ctor(const struct nv50_di
 	chan->func = func;
 	chan->mthd = mthd;
 	chan->root = root;
-	chan->chid.ctrl = chid;
-	chan->chid.user = chid;
+	chan->chid.ctrl = ctrl;
+	chan->chid.user = user;
 	chan->head = head;
 
 	if (disp->chan[chan->chid.user]) {
@@ -288,7 +288,7 @@ nv50_disp_chan_ctor(const struct nv50_di
 int
 nv50_disp_chan_new_(const struct nv50_disp_chan_func *func,
 		    const struct nv50_disp_chan_mthd *mthd,
-		    struct nv50_disp_root *root, int chid, int head,
+		    struct nv50_disp_root *root, int ctrl, int user, int head,
 		    const struct nvkm_oclass *oclass,
 		    struct nvkm_object **pobject)
 {
@@ -298,5 +298,6 @@ nv50_disp_chan_new_(const struct nv50_di
 		return -ENOMEM;
 	*pobject = &chan->object;
 
-	return nv50_disp_chan_ctor(func, mthd, root, chid, head, oclass, chan);
+	return nv50_disp_chan_ctor(func, mthd, root, ctrl, user,
+				   head, oclass, chan);
 }
--- a/drivers/gpu/drm/nouveau/nvkm/engine/disp/channv50.h
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/disp/channv50.h
@@ -29,11 +29,11 @@ struct nv50_disp_chan_func {
 
 int nv50_disp_chan_ctor(const struct nv50_disp_chan_func *,
 			const struct nv50_disp_chan_mthd *,
-			struct nv50_disp_root *, int chid, int head,
+			struct nv50_disp_root *, int ctrl, int user, int head,
 			const struct nvkm_oclass *, struct nv50_disp_chan *);
 int nv50_disp_chan_new_(const struct nv50_disp_chan_func *,
 			const struct nv50_disp_chan_mthd *,
-			struct nv50_disp_root *, int chid, int head,
+			struct nv50_disp_root *, int ctrl, int user, int head,
 			const struct nvkm_oclass *, struct nvkm_object **);
 
 extern const struct nv50_disp_chan_func nv50_disp_pioc_func;
@@ -94,13 +94,16 @@ extern const struct nv50_disp_chan_mthd
 struct nv50_disp_pioc_oclass {
 	int (*ctor)(const struct nv50_disp_chan_func *,
 		    const struct nv50_disp_chan_mthd *,
-		    struct nv50_disp_root *, int chid,
+		    struct nv50_disp_root *, int ctrl, int user,
 		    const struct nvkm_oclass *, void *data, u32 size,
 		    struct nvkm_object **);
 	struct nvkm_sclass base;
 	const struct nv50_disp_chan_func *func;
 	const struct nv50_disp_chan_mthd *mthd;
-	int chid;
+	struct {
+		int ctrl;
+		int user;
+	} chid;
 };
 
 extern const struct nv50_disp_pioc_oclass nv50_disp_oimm_oclass;
@@ -123,12 +126,12 @@ extern const struct nv50_disp_pioc_oclas
 
 int nv50_disp_curs_new(const struct nv50_disp_chan_func *,
 		       const struct nv50_disp_chan_mthd *,
-		       struct nv50_disp_root *, int chid,
+		       struct nv50_disp_root *, int ctrl, int user,
 		       const struct nvkm_oclass *, void *data, u32 size,
 		       struct nvkm_object **);
 int nv50_disp_oimm_new(const struct nv50_disp_chan_func *,
 		       const struct nv50_disp_chan_mthd *,
-		       struct nv50_disp_root *, int chid,
+		       struct nv50_disp_root *, int ctrl, int user,
 		       const struct nvkm_oclass *, void *data, u32 size,
 		       struct nvkm_object **);
 #endif
--- a/drivers/gpu/drm/nouveau/nvkm/engine/disp/cursg84.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/disp/cursg84.c
@@ -33,5 +33,5 @@ g84_disp_curs_oclass = {
 	.base.maxver = 0,
 	.ctor = nv50_disp_curs_new,
 	.func = &nv50_disp_pioc_func,
-	.chid = 7,
+	.chid = { 7, 7 },
 };
--- a/drivers/gpu/drm/nouveau/nvkm/engine/disp/cursgf119.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/disp/cursgf119.c
@@ -33,5 +33,5 @@ gf119_disp_curs_oclass = {
 	.base.maxver = 0,
 	.ctor = nv50_disp_curs_new,
 	.func = &gf119_disp_pioc_func,
-	.chid = 13,
+	.chid = { 13, 13 },
 };
--- a/drivers/gpu/drm/nouveau/nvkm/engine/disp/cursgk104.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/disp/cursgk104.c
@@ -33,5 +33,5 @@ gk104_disp_curs_oclass = {
 	.base.maxver = 0,
 	.ctor = nv50_disp_curs_new,
 	.func = &gf119_disp_pioc_func,
-	.chid = 13,
+	.chid = { 13, 13 },
 };
--- a/drivers/gpu/drm/nouveau/nvkm/engine/disp/cursgt215.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/disp/cursgt215.c
@@ -33,5 +33,5 @@ gt215_disp_curs_oclass = {
 	.base.maxver = 0,
 	.ctor = nv50_disp_curs_new,
 	.func = &nv50_disp_pioc_func,
-	.chid = 7,
+	.chid = { 7, 7 },
 };
--- a/drivers/gpu/drm/nouveau/nvkm/engine/disp/cursnv50.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/disp/cursnv50.c
@@ -33,7 +33,7 @@
 int
 nv50_disp_curs_new(const struct nv50_disp_chan_func *func,
 		   const struct nv50_disp_chan_mthd *mthd,
-		   struct nv50_disp_root *root, int chid,
+		   struct nv50_disp_root *root, int ctrl, int user,
 		   const struct nvkm_oclass *oclass, void *data, u32 size,
 		   struct nvkm_object **pobject)
 {
@@ -54,7 +54,7 @@ nv50_disp_curs_new(const struct nv50_dis
 	} else
 		return ret;
 
-	return nv50_disp_chan_new_(func, mthd, root, chid + head,
+	return nv50_disp_chan_new_(func, mthd, root, ctrl + head, user + head,
 				   head, oclass, pobject);
 }
 
@@ -65,5 +65,5 @@ nv50_disp_curs_oclass = {
 	.base.maxver = 0,
 	.ctor = nv50_disp_curs_new,
 	.func = &nv50_disp_pioc_func,
-	.chid = 7,
+	.chid = { 7, 7 },
 };
--- a/drivers/gpu/drm/nouveau/nvkm/engine/disp/dmacnv50.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/disp/dmacnv50.c
@@ -149,7 +149,7 @@ nv50_disp_dmac_new_(const struct nv50_di
 	chan->func = func;
 
 	ret = nv50_disp_chan_ctor(&nv50_disp_dmac_func_, mthd, root,
-				  chid, head, oclass, &chan->base);
+				  chid, chid, head, oclass, &chan->base);
 	if (ret)
 		return ret;
 
--- a/drivers/gpu/drm/nouveau/nvkm/engine/disp/oimmg84.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/disp/oimmg84.c
@@ -33,5 +33,5 @@ g84_disp_oimm_oclass = {
 	.base.maxver = 0,
 	.ctor = nv50_disp_oimm_new,
 	.func = &nv50_disp_pioc_func,
-	.chid = 5,
+	.chid = { 5, 5 },
 };
--- a/drivers/gpu/drm/nouveau/nvkm/engine/disp/oimmgf119.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/disp/oimmgf119.c
@@ -33,5 +33,5 @@ gf119_disp_oimm_oclass = {
 	.base.maxver = 0,
 	.ctor = nv50_disp_oimm_new,
 	.func = &gf119_disp_pioc_func,
-	.chid = 9,
+	.chid = { 9, 9 },
 };
--- a/drivers/gpu/drm/nouveau/nvkm/engine/disp/oimmgk104.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/disp/oimmgk104.c
@@ -33,5 +33,5 @@ gk104_disp_oimm_oclass = {
 	.base.maxver = 0,
 	.ctor = nv50_disp_oimm_new,
 	.func = &gf119_disp_pioc_func,
-	.chid = 9,
+	.chid = { 9, 9 },
 };
--- a/drivers/gpu/drm/nouveau/nvkm/engine/disp/oimmgt215.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/disp/oimmgt215.c
@@ -33,5 +33,5 @@ gt215_disp_oimm_oclass = {
 	.base.maxver = 0,
 	.ctor = nv50_disp_oimm_new,
 	.func = &nv50_disp_pioc_func,
-	.chid = 5,
+	.chid = { 5, 5 },
 };
--- a/drivers/gpu/drm/nouveau/nvkm/engine/disp/oimmnv50.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/disp/oimmnv50.c
@@ -33,7 +33,7 @@
 int
 nv50_disp_oimm_new(const struct nv50_disp_chan_func *func,
 		   const struct nv50_disp_chan_mthd *mthd,
-		   struct nv50_disp_root *root, int chid,
+		   struct nv50_disp_root *root, int ctrl, int user,
 		   const struct nvkm_oclass *oclass, void *data, u32 size,
 		   struct nvkm_object **pobject)
 {
@@ -54,7 +54,7 @@ nv50_disp_oimm_new(const struct nv50_dis
 	} else
 		return ret;
 
-	return nv50_disp_chan_new_(func, mthd, root, chid + head,
+	return nv50_disp_chan_new_(func, mthd, root, ctrl + head, user + head,
 				   head, oclass, pobject);
 }
 
@@ -65,5 +65,5 @@ nv50_disp_oimm_oclass = {
 	.base.maxver = 0,
 	.ctor = nv50_disp_oimm_new,
 	.func = &nv50_disp_pioc_func,
-	.chid = 5,
+	.chid = { 5, 5 },
 };
--- a/drivers/gpu/drm/nouveau/nvkm/engine/disp/rootnv50.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/disp/rootnv50.c
@@ -207,8 +207,8 @@ nv50_disp_root_pioc_new_(const struct nv
 {
 	const struct nv50_disp_pioc_oclass *sclass = oclass->priv;
 	struct nv50_disp_root *root = nv50_disp_root(oclass->parent);
-	return sclass->ctor(sclass->func, sclass->mthd, root, sclass->chid,
-			    oclass, data, size, pobject);
+	return sclass->ctor(sclass->func, sclass->mthd, root, sclass->chid.ctrl,
+			    sclass->chid.user, oclass, data, size, pobject);
 }
 
 static int

^ permalink raw reply	[flat|nested] 92+ messages in thread

* [PATCH 4.9 73/93] block: allow WRITE_SAME commands with the SG_IO ioctl
  2017-03-20 17:50 [PATCH 4.9 00/93] 4.9.17-stable review Greg Kroah-Hartman
                   ` (67 preceding siblings ...)
  2017-03-20 17:51 ` [PATCH 4.9 72/93] drm/nouveau/disp/nv50-: specify ctrl/user separately when constructing classes Greg Kroah-Hartman
@ 2017-03-20 17:51 ` Greg Kroah-Hartman
  2017-03-20 17:51 ` [PATCH 4.9 74/93] s390/zcrypt: Introduce CEX6 toleration Greg Kroah-Hartman
                   ` (20 subsequent siblings)
  89 siblings, 0 replies; 92+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-20 17:51 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Mauricio Faria de Oliveira,
	Brahadambal Srinivasan, Manjunatha H R, Christoph Hellwig,
	Jens Axboe, Sasha Levin

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com>

[ Upstream commit 25cdb64510644f3e854d502d69c73f21c6df88a9 ]

The WRITE_SAME commands are not present in the blk_default_cmd_filter
write_ok list, and thus are failed with -EPERM when the SG_IO ioctl()
is executed without CAP_SYS_RAWIO capability (e.g., unprivileged users).
[ sg_io() -> blk_fill_sghdr_rq() > blk_verify_command() -> -EPERM ]

The problem can be reproduced with the sg_write_same command

  # sg_write_same --num 1 --xferlen 512 /dev/sda
  #

  # capsh --drop=cap_sys_rawio -- -c \
    'sg_write_same --num 1 --xferlen 512 /dev/sda'
    Write same: pass through os error: Operation not permitted
  #

For comparison, the WRITE_VERIFY command does not observe this problem,
since it is in that list:

  # capsh --drop=cap_sys_rawio -- -c \
    'sg_write_verify --num 1 --ilen 512 --lba 0 /dev/sda'
  #

So, this patch adds the WRITE_SAME commands to the list, in order
for the SG_IO ioctl to finish successfully:

  # capsh --drop=cap_sys_rawio -- -c \
    'sg_write_same --num 1 --xferlen 512 /dev/sda'
  #

That case happens to be exercised by QEMU KVM guests with 'scsi-block' devices
(qemu "-device scsi-block" [1], libvirt "<disk type='block' device='lun'>" [2]),
which employs the SG_IO ioctl() and runs as an unprivileged user (libvirt-qemu).

In that scenario, when a filesystem (e.g., ext4) performs its zero-out calls,
which are translated to write-same calls in the guest kernel, and then into
SG_IO ioctls to the host kernel, SCSI I/O errors may be observed in the guest:

  [...] sd 0:0:0:0: [sda] tag#0 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
  [...] sd 0:0:0:0: [sda] tag#0 Sense Key : Aborted Command [current]
  [...] sd 0:0:0:0: [sda] tag#0 Add. Sense: I/O process terminated
  [...] sd 0:0:0:0: [sda] tag#0 CDB: Write Same(10) 41 00 01 04 e0 78 00 00 08 00
  [...] blk_update_request: I/O error, dev sda, sector 17096824

Links:
[1] http://git.qemu.org/?p=qemu.git;a=commit;h=336a6915bc7089fb20fea4ba99972ad9a97c5f52
[2] https://libvirt.org/formatdomain.html#elementsDisks (see 'disk' -> 'device')

Signed-off-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com>
Signed-off-by: Brahadambal Srinivasan <latha@linux.vnet.ibm.com>
Reported-by: Manjunatha H R <manjuhr1@in.ibm.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 block/scsi_ioctl.c |    3 +++
 1 file changed, 3 insertions(+)

--- a/block/scsi_ioctl.c
+++ b/block/scsi_ioctl.c
@@ -182,6 +182,9 @@ static void blk_set_cmd_filter_defaults(
 	__set_bit(WRITE_16, filter->write_ok);
 	__set_bit(WRITE_LONG, filter->write_ok);
 	__set_bit(WRITE_LONG_2, filter->write_ok);
+	__set_bit(WRITE_SAME, filter->write_ok);
+	__set_bit(WRITE_SAME_16, filter->write_ok);
+	__set_bit(WRITE_SAME_32, filter->write_ok);
 	__set_bit(ERASE, filter->write_ok);
 	__set_bit(GPCMD_MODE_SELECT_10, filter->write_ok);
 	__set_bit(MODE_SELECT, filter->write_ok);

^ permalink raw reply	[flat|nested] 92+ messages in thread

* [PATCH 4.9 74/93] s390/zcrypt: Introduce CEX6 toleration
  2017-03-20 17:50 [PATCH 4.9 00/93] 4.9.17-stable review Greg Kroah-Hartman
                   ` (68 preceding siblings ...)
  2017-03-20 17:51 ` [PATCH 4.9 73/93] block: allow WRITE_SAME commands with the SG_IO ioctl Greg Kroah-Hartman
@ 2017-03-20 17:51 ` Greg Kroah-Hartman
  2017-03-20 17:51 ` [PATCH 4.9 75/93] [media] uvcvideo: uvc_scan_fallback() for webcams with broken chain Greg Kroah-Hartman
                   ` (19 subsequent siblings)
  89 siblings, 0 replies; 92+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-20 17:51 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Harald Freudenberger,
	Martin Schwidefsky, Sasha Levin

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Harald Freudenberger <freude@linux.vnet.ibm.com>

[ Upstream commit b3e8652bcbfa04807e44708d4d0c8cdad39c9215 ]

Signed-off-by: Harald Freudenberger <freude@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/s390/crypto/ap_bus.c |    3 +++
 drivers/s390/crypto/ap_bus.h |    1 +
 2 files changed, 4 insertions(+)

--- a/drivers/s390/crypto/ap_bus.c
+++ b/drivers/s390/crypto/ap_bus.c
@@ -1712,6 +1712,9 @@ static void ap_scan_bus(struct work_stru
 		ap_dev->queue_depth = queue_depth;
 		ap_dev->raw_hwtype = device_type;
 		ap_dev->device_type = device_type;
+		/* CEX6 toleration: map to CEX5 */
+		if (device_type == AP_DEVICE_TYPE_CEX6)
+			ap_dev->device_type = AP_DEVICE_TYPE_CEX5;
 		ap_dev->functions = device_functions;
 		spin_lock_init(&ap_dev->lock);
 		INIT_LIST_HEAD(&ap_dev->pendingq);
--- a/drivers/s390/crypto/ap_bus.h
+++ b/drivers/s390/crypto/ap_bus.h
@@ -105,6 +105,7 @@ static inline int ap_test_bit(unsigned i
 #define AP_DEVICE_TYPE_CEX3C	9
 #define AP_DEVICE_TYPE_CEX4	10
 #define AP_DEVICE_TYPE_CEX5	11
+#define AP_DEVICE_TYPE_CEX6	12
 
 /*
  * Known function facilities

^ permalink raw reply	[flat|nested] 92+ messages in thread

* [PATCH 4.9 75/93] [media] uvcvideo: uvc_scan_fallback() for webcams with broken chain
  2017-03-20 17:50 [PATCH 4.9 00/93] 4.9.17-stable review Greg Kroah-Hartman
                   ` (69 preceding siblings ...)
  2017-03-20 17:51 ` [PATCH 4.9 74/93] s390/zcrypt: Introduce CEX6 toleration Greg Kroah-Hartman
@ 2017-03-20 17:51 ` Greg Kroah-Hartman
  2017-03-20 17:51 ` [PATCH 4.9 76/93] slub: move synchronize_sched out of slab_mutex on shrink Greg Kroah-Hartman
                   ` (18 subsequent siblings)
  89 siblings, 0 replies; 92+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-20 17:51 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Henrik Ingo, Laurent Pinchart,
	Mauro Carvalho Chehab, Sasha Levin

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Henrik Ingo <henrik.ingo@avoinelama.fi>

[ Upstream commit e950267ab802c8558f1100eafd4087fd039ad634 ]

Some devices have invalid baSourceID references, causing uvc_scan_chain()
to fail, but if we just take the entities we can find and put them
together in the most sensible chain we can think of, turns out they do
work anyway. Note: This heuristic assumes there is a single chain.

At the time of writing, devices known to have such a broken chain are
  - Acer Integrated Camera (5986:055a)
  - Realtek rtl157a7 (0bda:57a7)

Signed-off-by: Henrik Ingo <henrik.ingo@avoinelama.fi>
Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/media/usb/uvc/uvc_driver.c |  118 +++++++++++++++++++++++++++++++++++--
 1 file changed, 112 insertions(+), 6 deletions(-)

--- a/drivers/media/usb/uvc/uvc_driver.c
+++ b/drivers/media/usb/uvc/uvc_driver.c
@@ -1595,6 +1595,114 @@ static const char *uvc_print_chain(struc
 	return buffer;
 }
 
+static struct uvc_video_chain *uvc_alloc_chain(struct uvc_device *dev)
+{
+	struct uvc_video_chain *chain;
+
+	chain = kzalloc(sizeof(*chain), GFP_KERNEL);
+	if (chain == NULL)
+		return NULL;
+
+	INIT_LIST_HEAD(&chain->entities);
+	mutex_init(&chain->ctrl_mutex);
+	chain->dev = dev;
+	v4l2_prio_init(&chain->prio);
+
+	return chain;
+}
+
+/*
+ * Fallback heuristic for devices that don't connect units and terminals in a
+ * valid chain.
+ *
+ * Some devices have invalid baSourceID references, causing uvc_scan_chain()
+ * to fail, but if we just take the entities we can find and put them together
+ * in the most sensible chain we can think of, turns out they do work anyway.
+ * Note: This heuristic assumes there is a single chain.
+ *
+ * At the time of writing, devices known to have such a broken chain are
+ *  - Acer Integrated Camera (5986:055a)
+ *  - Realtek rtl157a7 (0bda:57a7)
+ */
+static int uvc_scan_fallback(struct uvc_device *dev)
+{
+	struct uvc_video_chain *chain;
+	struct uvc_entity *iterm = NULL;
+	struct uvc_entity *oterm = NULL;
+	struct uvc_entity *entity;
+	struct uvc_entity *prev;
+
+	/*
+	 * Start by locating the input and output terminals. We only support
+	 * devices with exactly one of each for now.
+	 */
+	list_for_each_entry(entity, &dev->entities, list) {
+		if (UVC_ENTITY_IS_ITERM(entity)) {
+			if (iterm)
+				return -EINVAL;
+			iterm = entity;
+		}
+
+		if (UVC_ENTITY_IS_OTERM(entity)) {
+			if (oterm)
+				return -EINVAL;
+			oterm = entity;
+		}
+	}
+
+	if (iterm == NULL || oterm == NULL)
+		return -EINVAL;
+
+	/* Allocate the chain and fill it. */
+	chain = uvc_alloc_chain(dev);
+	if (chain == NULL)
+		return -ENOMEM;
+
+	if (uvc_scan_chain_entity(chain, oterm) < 0)
+		goto error;
+
+	prev = oterm;
+
+	/*
+	 * Add all Processing and Extension Units with two pads. The order
+	 * doesn't matter much, use reverse list traversal to connect units in
+	 * UVC descriptor order as we build the chain from output to input. This
+	 * leads to units appearing in the order meant by the manufacturer for
+	 * the cameras known to require this heuristic.
+	 */
+	list_for_each_entry_reverse(entity, &dev->entities, list) {
+		if (entity->type != UVC_VC_PROCESSING_UNIT &&
+		    entity->type != UVC_VC_EXTENSION_UNIT)
+			continue;
+
+		if (entity->num_pads != 2)
+			continue;
+
+		if (uvc_scan_chain_entity(chain, entity) < 0)
+			goto error;
+
+		prev->baSourceID[0] = entity->id;
+		prev = entity;
+	}
+
+	if (uvc_scan_chain_entity(chain, iterm) < 0)
+		goto error;
+
+	prev->baSourceID[0] = iterm->id;
+
+	list_add_tail(&chain->list, &dev->chains);
+
+	uvc_trace(UVC_TRACE_PROBE,
+		  "Found a video chain by fallback heuristic (%s).\n",
+		  uvc_print_chain(chain));
+
+	return 0;
+
+error:
+	kfree(chain);
+	return -EINVAL;
+}
+
 /*
  * Scan the device for video chains and register video devices.
  *
@@ -1617,15 +1725,10 @@ static int uvc_scan_device(struct uvc_de
 		if (term->chain.next || term->chain.prev)
 			continue;
 
-		chain = kzalloc(sizeof(*chain), GFP_KERNEL);
+		chain = uvc_alloc_chain(dev);
 		if (chain == NULL)
 			return -ENOMEM;
 
-		INIT_LIST_HEAD(&chain->entities);
-		mutex_init(&chain->ctrl_mutex);
-		chain->dev = dev;
-		v4l2_prio_init(&chain->prio);
-
 		term->flags |= UVC_ENTITY_FLAG_DEFAULT;
 
 		if (uvc_scan_chain(chain, term) < 0) {
@@ -1639,6 +1742,9 @@ static int uvc_scan_device(struct uvc_de
 		list_add_tail(&chain->list, &dev->chains);
 	}
 
+	if (list_empty(&dev->chains))
+		uvc_scan_fallback(dev);
+
 	if (list_empty(&dev->chains)) {
 		uvc_printk(KERN_INFO, "No valid video chain found.\n");
 		return -1;

^ permalink raw reply	[flat|nested] 92+ messages in thread

* [PATCH 4.9 76/93] slub: move synchronize_sched out of slab_mutex on shrink
  2017-03-20 17:50 [PATCH 4.9 00/93] 4.9.17-stable review Greg Kroah-Hartman
                   ` (70 preceding siblings ...)
  2017-03-20 17:51 ` [PATCH 4.9 75/93] [media] uvcvideo: uvc_scan_fallback() for webcams with broken chain Greg Kroah-Hartman
@ 2017-03-20 17:51 ` Greg Kroah-Hartman
  2017-03-20 17:51 ` [PATCH 4.9 77/93] ACPI / blacklist: add _REV quirks for Dell Precision 5520 and 3520 Greg Kroah-Hartman
                   ` (17 subsequent siblings)
  89 siblings, 0 replies; 92+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-20 17:51 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Vladimir Davydov, Doug Smythies,
	Joonsoo Kim, Christoph Lameter, David Rientjes, Johannes Weiner,
	Michal Hocko, Pekka Enberg, Andrew Morton, Linus Torvalds,
	Sasha Levin

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Vladimir Davydov <vdavydov.dev@gmail.com>

[ Upstream commit 89e364db71fb5e7fc8d93228152abfa67daf35fa ]

synchronize_sched() is a heavy operation and calling it per each cache
owned by a memory cgroup being destroyed may take quite some time.  What
is worse, it's currently called under the slab_mutex, stalling all works
doing cache creation/destruction.

Actually, there isn't much point in calling synchronize_sched() for each
cache - it's enough to call it just once - after setting cpu_partial for
all caches and before shrinking them.  This way, we can also move it out
of the slab_mutex, which we have to hold for iterating over the slab
cache list.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=172991
Link: http://lkml.kernel.org/r/0a10d71ecae3db00fb4421bcd3f82bcc911f4be4.1475329751.git.vdavydov.dev@gmail.com
Signed-off-by: Vladimir Davydov <vdavydov.dev@gmail.com>
Reported-by: Doug Smythies <dsmythies@telus.net>
Acked-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Pekka Enberg <penberg@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 mm/slab.c        |    4 ++--
 mm/slab.h        |    2 +-
 mm/slab_common.c |   27 +++++++++++++++++++++++++--
 mm/slob.c        |    2 +-
 mm/slub.c        |   19 ++-----------------
 5 files changed, 31 insertions(+), 23 deletions(-)

--- a/mm/slab.c
+++ b/mm/slab.c
@@ -2332,7 +2332,7 @@ out:
 	return nr_freed;
 }
 
-int __kmem_cache_shrink(struct kmem_cache *cachep, bool deactivate)
+int __kmem_cache_shrink(struct kmem_cache *cachep)
 {
 	int ret = 0;
 	int node;
@@ -2352,7 +2352,7 @@ int __kmem_cache_shrink(struct kmem_cach
 
 int __kmem_cache_shutdown(struct kmem_cache *cachep)
 {
-	return __kmem_cache_shrink(cachep, false);
+	return __kmem_cache_shrink(cachep);
 }
 
 void __kmem_cache_release(struct kmem_cache *cachep)
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -146,7 +146,7 @@ static inline unsigned long kmem_cache_f
 
 int __kmem_cache_shutdown(struct kmem_cache *);
 void __kmem_cache_release(struct kmem_cache *);
-int __kmem_cache_shrink(struct kmem_cache *, bool);
+int __kmem_cache_shrink(struct kmem_cache *);
 void slab_kmem_cache_release(struct kmem_cache *);
 
 struct seq_file;
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -573,6 +573,29 @@ void memcg_deactivate_kmem_caches(struct
 	get_online_cpus();
 	get_online_mems();
 
+#ifdef CONFIG_SLUB
+	/*
+	 * In case of SLUB, we need to disable empty slab caching to
+	 * avoid pinning the offline memory cgroup by freeable kmem
+	 * pages charged to it. SLAB doesn't need this, as it
+	 * periodically purges unused slabs.
+	 */
+	mutex_lock(&slab_mutex);
+	list_for_each_entry(s, &slab_caches, list) {
+		c = is_root_cache(s) ? cache_from_memcg_idx(s, idx) : NULL;
+		if (c) {
+			c->cpu_partial = 0;
+			c->min_partial = 0;
+		}
+	}
+	mutex_unlock(&slab_mutex);
+	/*
+	 * kmem_cache->cpu_partial is checked locklessly (see
+	 * put_cpu_partial()). Make sure the change is visible.
+	 */
+	synchronize_sched();
+#endif
+
 	mutex_lock(&slab_mutex);
 	list_for_each_entry(s, &slab_caches, list) {
 		if (!is_root_cache(s))
@@ -584,7 +607,7 @@ void memcg_deactivate_kmem_caches(struct
 		if (!c)
 			continue;
 
-		__kmem_cache_shrink(c, true);
+		__kmem_cache_shrink(c);
 		arr->entries[idx] = NULL;
 	}
 	mutex_unlock(&slab_mutex);
@@ -755,7 +778,7 @@ int kmem_cache_shrink(struct kmem_cache
 	get_online_cpus();
 	get_online_mems();
 	kasan_cache_shrink(cachep);
-	ret = __kmem_cache_shrink(cachep, false);
+	ret = __kmem_cache_shrink(cachep);
 	put_online_mems();
 	put_online_cpus();
 	return ret;
--- a/mm/slob.c
+++ b/mm/slob.c
@@ -634,7 +634,7 @@ void __kmem_cache_release(struct kmem_ca
 {
 }
 
-int __kmem_cache_shrink(struct kmem_cache *d, bool deactivate)
+int __kmem_cache_shrink(struct kmem_cache *d)
 {
 	return 0;
 }
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -3887,7 +3887,7 @@ EXPORT_SYMBOL(kfree);
  * being allocated from last increasing the chance that the last objects
  * are freed in them.
  */
-int __kmem_cache_shrink(struct kmem_cache *s, bool deactivate)
+int __kmem_cache_shrink(struct kmem_cache *s)
 {
 	int node;
 	int i;
@@ -3899,21 +3899,6 @@ int __kmem_cache_shrink(struct kmem_cach
 	unsigned long flags;
 	int ret = 0;
 
-	if (deactivate) {
-		/*
-		 * Disable empty slabs caching. Used to avoid pinning offline
-		 * memory cgroups by kmem pages that can be freed.
-		 */
-		s->cpu_partial = 0;
-		s->min_partial = 0;
-
-		/*
-		 * s->cpu_partial is checked locklessly (see put_cpu_partial),
-		 * so we have to make sure the change is visible.
-		 */
-		synchronize_sched();
-	}
-
 	flush_all(s);
 	for_each_kmem_cache_node(s, node, n) {
 		INIT_LIST_HEAD(&discard);
@@ -3970,7 +3955,7 @@ static int slab_mem_going_offline_callba
 
 	mutex_lock(&slab_mutex);
 	list_for_each_entry(s, &slab_caches, list)
-		__kmem_cache_shrink(s, false);
+		__kmem_cache_shrink(s);
 	mutex_unlock(&slab_mutex);
 
 	return 0;

^ permalink raw reply	[flat|nested] 92+ messages in thread

* [PATCH 4.9 77/93] ACPI / blacklist: add _REV quirks for Dell Precision 5520 and 3520
  2017-03-20 17:50 [PATCH 4.9 00/93] 4.9.17-stable review Greg Kroah-Hartman
                   ` (71 preceding siblings ...)
  2017-03-20 17:51 ` [PATCH 4.9 76/93] slub: move synchronize_sched out of slab_mutex on shrink Greg Kroah-Hartman
@ 2017-03-20 17:51 ` Greg Kroah-Hartman
  2017-03-20 17:51 ` [PATCH 4.9 78/93] ACPI / blacklist: Make Dell Latitude 3350 ethernet work Greg Kroah-Hartman
                   ` (16 subsequent siblings)
  89 siblings, 0 replies; 92+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-20 17:51 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Alex Hung, Rafael J. Wysocki, Sasha Levin

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Alex Hung <alex.hung@canonical.com>

[ Upstream commit 9523b9bf6dceef6b0215e90b2348cd646597f796 ]

Precision 5520 and 3520 either hang at login and during suspend or reboot.

It turns out that that adding them to acpi_rev_dmi_table[] helps to work
around those issues.

Signed-off-by: Alex Hung <alex.hung@canonical.com>
[ rjw: Changelog ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/acpi/blacklist.c |   16 ++++++++++++++++
 1 file changed, 16 insertions(+)

--- a/drivers/acpi/blacklist.c
+++ b/drivers/acpi/blacklist.c
@@ -160,6 +160,22 @@ static struct dmi_system_id acpi_rev_dmi
 		      DMI_MATCH(DMI_PRODUCT_NAME, "XPS 13 9343"),
 		},
 	},
+	{
+	 .callback = dmi_enable_rev_override,
+	 .ident = "DELL Precision 5520",
+	 .matches = {
+		      DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc."),
+		      DMI_MATCH(DMI_PRODUCT_NAME, "Precision 5520"),
+		},
+	},
+	{
+	 .callback = dmi_enable_rev_override,
+	 .ident = "DELL Precision 3520",
+	 .matches = {
+		      DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc."),
+		      DMI_MATCH(DMI_PRODUCT_NAME, "Precision 3520"),
+		},
+	},
 #endif
 	{}
 };

^ permalink raw reply	[flat|nested] 92+ messages in thread

* [PATCH 4.9 78/93] ACPI / blacklist: Make Dell Latitude 3350 ethernet work
  2017-03-20 17:50 [PATCH 4.9 00/93] 4.9.17-stable review Greg Kroah-Hartman
                   ` (72 preceding siblings ...)
  2017-03-20 17:51 ` [PATCH 4.9 77/93] ACPI / blacklist: add _REV quirks for Dell Precision 5520 and 3520 Greg Kroah-Hartman
@ 2017-03-20 17:51 ` Greg Kroah-Hartman
  2017-03-20 17:51 ` [PATCH 4.9 79/93] serial: 8250_pci: Detach low-level driver during PCI error recovery Greg Kroah-Hartman
                   ` (15 subsequent siblings)
  89 siblings, 0 replies; 92+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-20 17:51 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Michael Pobega, Rafael J. Wysocki,
	Sasha Levin

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Michael Pobega <mpobega@neverware.com>

[ Upstream commit 708f5dcc21ae9b35f395865fc154b0105baf4de4 ]

The Dell Latitude 3350's ethernet card attempts to use a reserved
IRQ (18), resulting in ACPI being unable to enable the ethernet.

Adding it to acpi_rev_dmi_table[] helps to work around this problem.

Signed-off-by: Michael Pobega <mpobega@neverware.com>
[ rjw: Changelog ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/acpi/blacklist.c |   12 ++++++++++++
 1 file changed, 12 insertions(+)

--- a/drivers/acpi/blacklist.c
+++ b/drivers/acpi/blacklist.c
@@ -176,6 +176,18 @@ static struct dmi_system_id acpi_rev_dmi
 		      DMI_MATCH(DMI_PRODUCT_NAME, "Precision 3520"),
 		},
 	},
+	/*
+	 * Resolves a quirk with the Dell Latitude 3350 that
+	 * causes the ethernet adapter to not function.
+	 */
+	{
+	 .callback = dmi_enable_rev_override,
+	 .ident = "DELL Latitude 3350",
+	 .matches = {
+		      DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc."),
+		      DMI_MATCH(DMI_PRODUCT_NAME, "Latitude 3350"),
+		},
+	},
 #endif
 	{}
 };

^ permalink raw reply	[flat|nested] 92+ messages in thread

* [PATCH 4.9 79/93] serial: 8250_pci: Detach low-level driver during PCI error recovery
  2017-03-20 17:50 [PATCH 4.9 00/93] 4.9.17-stable review Greg Kroah-Hartman
                   ` (73 preceding siblings ...)
  2017-03-20 17:51 ` [PATCH 4.9 78/93] ACPI / blacklist: Make Dell Latitude 3350 ethernet work Greg Kroah-Hartman
@ 2017-03-20 17:51 ` Greg Kroah-Hartman
  2017-03-20 17:51 ` [PATCH 4.9 80/93] usb: gadget: udc: atmel: remove memory leak Greg Kroah-Hartman
                   ` (14 subsequent siblings)
  89 siblings, 0 replies; 92+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-20 17:51 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Gabriel Krisman Bertazi, Sasha Levin

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Gabriel Krisman Bertazi <krisman@linux.vnet.ibm.com>

[ Upstream commit f209fa03fc9d131b3108c2e4936181eabab87416 ]

During a PCI error recovery, like the ones provoked by EEH in the ppc64
platform, all IO to the device must be blocked while the recovery is
completed.  Current 8250_pci implementation only suspends the port
instead of detaching it, which doesn't prevent incoming accesses like
TIOCMGET and TIOCMSET calls from reaching the device.  Those end up
racing with the EEH recovery, crashing it.  Similar races were also
observed when opening the device and when shutting it down during
recovery.

This patch implements a more robust IO blockage for the 8250_pci
recovery by unregistering the port at the beginning of the procedure and
re-adding it afterwards.  Since the port is detached from the uart
layer, we can be sure that no request will make through to the device
during recovery.  This is similar to the solution used by the JSM serial
driver.

I thank Peter Hurley <peter@hurleysoftware.com> for valuable input on
this one over one year ago.

Signed-off-by: Gabriel Krisman Bertazi <krisman@linux.vnet.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/tty/serial/8250/8250_pci.c |   23 +++++++++++++++++++----
 1 file changed, 19 insertions(+), 4 deletions(-)

--- a/drivers/tty/serial/8250/8250_pci.c
+++ b/drivers/tty/serial/8250/8250_pci.c
@@ -52,6 +52,7 @@ struct serial_private {
 	struct pci_dev		*dev;
 	unsigned int		nr;
 	struct pci_serial_quirk	*quirk;
+	const struct pciserial_board *board;
 	int			line[0];
 };
 
@@ -3871,6 +3872,7 @@ pciserial_init_ports(struct pci_dev *dev
 		}
 	}
 	priv->nr = i;
+	priv->board = board;
 	return priv;
 
 err_deinit:
@@ -3881,7 +3883,7 @@ err_out:
 }
 EXPORT_SYMBOL_GPL(pciserial_init_ports);
 
-void pciserial_remove_ports(struct serial_private *priv)
+void pciserial_detach_ports(struct serial_private *priv)
 {
 	struct pci_serial_quirk *quirk;
 	int i;
@@ -3895,7 +3897,11 @@ void pciserial_remove_ports(struct seria
 	quirk = find_quirk(priv->dev);
 	if (quirk->exit)
 		quirk->exit(priv->dev);
+}
 
+void pciserial_remove_ports(struct serial_private *priv)
+{
+	pciserial_detach_ports(priv);
 	kfree(priv);
 }
 EXPORT_SYMBOL_GPL(pciserial_remove_ports);
@@ -5590,7 +5596,7 @@ static pci_ers_result_t serial8250_io_er
 		return PCI_ERS_RESULT_DISCONNECT;
 
 	if (priv)
-		pciserial_suspend_ports(priv);
+		pciserial_detach_ports(priv);
 
 	pci_disable_device(dev);
 
@@ -5615,9 +5621,18 @@ static pci_ers_result_t serial8250_io_sl
 static void serial8250_io_resume(struct pci_dev *dev)
 {
 	struct serial_private *priv = pci_get_drvdata(dev);
+	const struct pciserial_board *board;
 
-	if (priv)
-		pciserial_resume_ports(priv);
+	if (!priv)
+		return;
+
+	board = priv->board;
+	kfree(priv);
+	priv = pciserial_init_ports(dev, board);
+
+	if (!IS_ERR(priv)) {
+		pci_set_drvdata(dev, priv);
+	}
 }
 
 static const struct pci_error_handlers serial8250_err_handler = {

^ permalink raw reply	[flat|nested] 92+ messages in thread

* [PATCH 4.9 80/93] usb: gadget: udc: atmel: remove memory leak
  2017-03-20 17:50 [PATCH 4.9 00/93] 4.9.17-stable review Greg Kroah-Hartman
                   ` (74 preceding siblings ...)
  2017-03-20 17:51 ` [PATCH 4.9 79/93] serial: 8250_pci: Detach low-level driver during PCI error recovery Greg Kroah-Hartman
@ 2017-03-20 17:51 ` Greg Kroah-Hartman
  2017-03-20 17:51 ` [PATCH 4.9 82/93] clk: bcm2835: Fix ->fixed_divider of pllh_aux Greg Kroah-Hartman
                   ` (13 subsequent siblings)
  89 siblings, 0 replies; 92+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-20 17:51 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Alexandre Belloni, Felipe Balbi, Sasha Levin

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Alexandre Belloni <alexandre.belloni@free-electrons.com>

[ Upstream commit 32856eea7bf75dfb99b955ada6e147f553a11366 ]

Commit bbe097f092b0 ("usb: gadget: udc: atmel: fix endpoint name")
introduced a memory leak when unbinding the driver. The endpoint names
would not be freed. Solve that by including the name as a string in struct
usba_ep so it is freed when the endpoint is.

Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/usb/gadget/udc/atmel_usba_udc.c |    3 ++-
 drivers/usb/gadget/udc/atmel_usba_udc.h |    1 +
 2 files changed, 3 insertions(+), 1 deletion(-)

--- a/drivers/usb/gadget/udc/atmel_usba_udc.c
+++ b/drivers/usb/gadget/udc/atmel_usba_udc.c
@@ -1978,7 +1978,8 @@ static struct usba_ep * atmel_udc_of_ini
 			dev_err(&pdev->dev, "of_probe: name error(%d)\n", ret);
 			goto err;
 		}
-		ep->ep.name = kasprintf(GFP_KERNEL, "ep%d", ep->index);
+		sprintf(ep->name, "ep%d", ep->index);
+		ep->ep.name = ep->name;
 
 		ep->ep_regs = udc->regs + USBA_EPT_BASE(i);
 		ep->dma_regs = udc->regs + USBA_DMA_BASE(i);
--- a/drivers/usb/gadget/udc/atmel_usba_udc.h
+++ b/drivers/usb/gadget/udc/atmel_usba_udc.h
@@ -280,6 +280,7 @@ struct usba_ep {
 	void __iomem				*ep_regs;
 	void __iomem				*dma_regs;
 	void __iomem				*fifo;
+	char					name[8];
 	struct usb_ep				ep;
 	struct usba_udc				*udc;
 

^ permalink raw reply	[flat|nested] 92+ messages in thread

* [PATCH 4.9 82/93] clk: bcm2835: Fix ->fixed_divider of pllh_aux
  2017-03-20 17:50 [PATCH 4.9 00/93] 4.9.17-stable review Greg Kroah-Hartman
                   ` (75 preceding siblings ...)
  2017-03-20 17:51 ` [PATCH 4.9 80/93] usb: gadget: udc: atmel: remove memory leak Greg Kroah-Hartman
@ 2017-03-20 17:51 ` Greg Kroah-Hartman
  2017-03-20 17:51 ` [PATCH 4.9 83/93] drm/vc4: Fix race between page flip completion event and clean-up Greg Kroah-Hartman
                   ` (12 subsequent siblings)
  89 siblings, 0 replies; 92+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-20 17:51 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Boris Brezillon, Eric Anholt,
	Stephen Boyd, Amit Pundir

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Boris Brezillon <boris.brezillon@free-electrons.com>

commit f2a46926aba1f0c33944901d2420a6a887455ddc upstream.

There is no fixed divider on pllh_aux.

Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Cc: Amit Pundir <amit.pundir@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/clk/bcm/clk-bcm2835.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/clk/bcm/clk-bcm2835.c
+++ b/drivers/clk/bcm/clk-bcm2835.c
@@ -1598,7 +1598,7 @@ static const struct bcm2835_clk_desc clk
 		.a2w_reg = A2W_PLLH_AUX,
 		.load_mask = CM_PLLH_LOADAUX,
 		.hold_mask = 0,
-		.fixed_divider = 10),
+		.fixed_divider = 1),
 	[BCM2835_PLLH_PIX]	= REGISTER_PLL_DIV(
 		.name = "pllh_pix",
 		.source_pll = "pllh",

^ permalink raw reply	[flat|nested] 92+ messages in thread

* [PATCH 4.9 83/93] drm/vc4: Fix race between page flip completion event and clean-up
  2017-03-20 17:50 [PATCH 4.9 00/93] 4.9.17-stable review Greg Kroah-Hartman
                   ` (76 preceding siblings ...)
  2017-03-20 17:51 ` [PATCH 4.9 82/93] clk: bcm2835: Fix ->fixed_divider of pllh_aux Greg Kroah-Hartman
@ 2017-03-20 17:51 ` Greg Kroah-Hartman
  2017-03-20 17:51 ` [PATCH 4.9 84/93] drm/vc4: Fix ->clock_select setting for the VEC encoder Greg Kroah-Hartman
                   ` (11 subsequent siblings)
  89 siblings, 0 replies; 92+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-20 17:51 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Derek Foreman, Eric Anholt,
	Daniel Stone, Amit Pundir

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Derek Foreman <derekf@osg.samsung.com>

commit 26fc78f6fef39b9d7a15def5e7e9826ff68303f4 upstream.

There was a small window where a userspace program could submit
a pageflip after receiving a pageflip completion event yet still
receive EBUSY.

Signed-off-by: Derek Foreman <derekf@osg.samsung.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Cc: Amit Pundir <amit.pundir@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/gpu/drm/vc4/vc4_crtc.c |    8 ++++++++
 drivers/gpu/drm/vc4/vc4_drv.h  |    1 +
 drivers/gpu/drm/vc4/vc4_kms.c  |   33 +++++++++++++++++++++++++--------
 3 files changed, 34 insertions(+), 8 deletions(-)

--- a/drivers/gpu/drm/vc4/vc4_crtc.c
+++ b/drivers/gpu/drm/vc4/vc4_crtc.c
@@ -669,6 +669,14 @@ void vc4_disable_vblank(struct drm_devic
 	CRTC_WRITE(PV_INTEN, 0);
 }
 
+/* Must be called with the event lock held */
+bool vc4_event_pending(struct drm_crtc *crtc)
+{
+	struct vc4_crtc *vc4_crtc = to_vc4_crtc(crtc);
+
+	return !!vc4_crtc->event;
+}
+
 static void vc4_crtc_handle_page_flip(struct vc4_crtc *vc4_crtc)
 {
 	struct drm_crtc *crtc = &vc4_crtc->base;
--- a/drivers/gpu/drm/vc4/vc4_drv.h
+++ b/drivers/gpu/drm/vc4/vc4_drv.h
@@ -440,6 +440,7 @@ int vc4_bo_stats_debugfs(struct seq_file
 extern struct platform_driver vc4_crtc_driver;
 int vc4_enable_vblank(struct drm_device *dev, unsigned int crtc_id);
 void vc4_disable_vblank(struct drm_device *dev, unsigned int crtc_id);
+bool vc4_event_pending(struct drm_crtc *crtc);
 int vc4_crtc_debugfs_regs(struct seq_file *m, void *arg);
 int vc4_crtc_get_scanoutpos(struct drm_device *dev, unsigned int crtc_id,
 			    unsigned int flags, int *vpos, int *hpos,
--- a/drivers/gpu/drm/vc4/vc4_kms.c
+++ b/drivers/gpu/drm/vc4/vc4_kms.c
@@ -119,17 +119,34 @@ static int vc4_atomic_commit(struct drm_
 
 	/* Make sure that any outstanding modesets have finished. */
 	if (nonblock) {
-		ret = down_trylock(&vc4->async_modeset);
-		if (ret) {
+		struct drm_crtc *crtc;
+		struct drm_crtc_state *crtc_state;
+		unsigned long flags;
+		bool busy = false;
+
+		/*
+		 * If there's an undispatched event to send then we're
+		 * obviously still busy.  If there isn't, then we can
+		 * unconditionally wait for the semaphore because it
+		 * shouldn't be contended (for long).
+		 *
+		 * This is to prevent a race where queuing a new flip
+		 * from userspace immediately on receipt of an event
+		 * beats our clean-up and returns EBUSY.
+		 */
+		spin_lock_irqsave(&dev->event_lock, flags);
+		for_each_crtc_in_state(state, crtc, crtc_state, i)
+			busy |= vc4_event_pending(crtc);
+		spin_unlock_irqrestore(&dev->event_lock, flags);
+		if (busy) {
 			kfree(c);
 			return -EBUSY;
 		}
-	} else {
-		ret = down_interruptible(&vc4->async_modeset);
-		if (ret) {
-			kfree(c);
-			return ret;
-		}
+	}
+	ret = down_interruptible(&vc4->async_modeset);
+	if (ret) {
+		kfree(c);
+		return ret;
 	}
 
 	ret = drm_atomic_helper_prepare_planes(dev, state);

^ permalink raw reply	[flat|nested] 92+ messages in thread

* [PATCH 4.9 84/93] drm/vc4: Fix ->clock_select setting for the VEC encoder
  2017-03-20 17:50 [PATCH 4.9 00/93] 4.9.17-stable review Greg Kroah-Hartman
                   ` (77 preceding siblings ...)
  2017-03-20 17:51 ` [PATCH 4.9 83/93] drm/vc4: Fix race between page flip completion event and clean-up Greg Kroah-Hartman
@ 2017-03-20 17:51 ` Greg Kroah-Hartman
  2017-03-20 17:52 ` [PATCH 4.9 85/93] arm64: KVM: VHE: Clear HCR_TGE when invalidating guest TLBs Greg Kroah-Hartman
                   ` (10 subsequent siblings)
  89 siblings, 0 replies; 92+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-20 17:51 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Boris Brezillon, Eric Anholt, Amit Pundir

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Boris Brezillon <boris.brezillon@free-electrons.com>

commit ab8df60e3a3b68420d0d4477c5f07c00fbfb078b upstream.

PV_CONTROL_CLK_SELECT_VEC is actually 2 and not 0. Fix the definition and
rework the vc4_set_crtc_possible_masks() to cover the full range of the
PV_CONTROL_CLK_SELECT field.

Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
Cc: Amit Pundir <amit.pundir@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


---
 drivers/gpu/drm/vc4/vc4_crtc.c |   36 ++++++++++++++++++++++--------------
 drivers/gpu/drm/vc4/vc4_drv.h  |    1 +
 drivers/gpu/drm/vc4/vc4_regs.h |    3 ++-
 3 files changed, 25 insertions(+), 15 deletions(-)

--- a/drivers/gpu/drm/vc4/vc4_crtc.c
+++ b/drivers/gpu/drm/vc4/vc4_crtc.c
@@ -83,8 +83,7 @@ struct vc4_crtc_data {
 	/* Which channel of the HVS this pixelvalve sources from. */
 	int hvs_channel;
 
-	enum vc4_encoder_type encoder0_type;
-	enum vc4_encoder_type encoder1_type;
+	enum vc4_encoder_type encoder_types[4];
 };
 
 #define CRTC_WRITE(offset, val) writel(val, vc4_crtc->regs + (offset))
@@ -867,20 +866,26 @@ static const struct drm_crtc_helper_func
 
 static const struct vc4_crtc_data pv0_data = {
 	.hvs_channel = 0,
-	.encoder0_type = VC4_ENCODER_TYPE_DSI0,
-	.encoder1_type = VC4_ENCODER_TYPE_DPI,
+	.encoder_types = {
+		[PV_CONTROL_CLK_SELECT_DSI] = VC4_ENCODER_TYPE_DSI0,
+		[PV_CONTROL_CLK_SELECT_DPI_SMI_HDMI] = VC4_ENCODER_TYPE_DPI,
+	},
 };
 
 static const struct vc4_crtc_data pv1_data = {
 	.hvs_channel = 2,
-	.encoder0_type = VC4_ENCODER_TYPE_DSI1,
-	.encoder1_type = VC4_ENCODER_TYPE_SMI,
+	.encoder_types = {
+		[PV_CONTROL_CLK_SELECT_DSI] = VC4_ENCODER_TYPE_DSI1,
+		[PV_CONTROL_CLK_SELECT_DPI_SMI_HDMI] = VC4_ENCODER_TYPE_SMI,
+	},
 };
 
 static const struct vc4_crtc_data pv2_data = {
 	.hvs_channel = 1,
-	.encoder0_type = VC4_ENCODER_TYPE_VEC,
-	.encoder1_type = VC4_ENCODER_TYPE_HDMI,
+	.encoder_types = {
+		[PV_CONTROL_CLK_SELECT_DPI_SMI_HDMI] = VC4_ENCODER_TYPE_HDMI,
+		[PV_CONTROL_CLK_SELECT_VEC] = VC4_ENCODER_TYPE_VEC,
+	},
 };
 
 static const struct of_device_id vc4_crtc_dt_match[] = {
@@ -894,17 +899,20 @@ static void vc4_set_crtc_possible_masks(
 					struct drm_crtc *crtc)
 {
 	struct vc4_crtc *vc4_crtc = to_vc4_crtc(crtc);
+	const struct vc4_crtc_data *crtc_data = vc4_crtc->data;
+	const enum vc4_encoder_type *encoder_types = crtc_data->encoder_types;
 	struct drm_encoder *encoder;
 
 	drm_for_each_encoder(encoder, drm) {
 		struct vc4_encoder *vc4_encoder = to_vc4_encoder(encoder);
+		int i;
 
-		if (vc4_encoder->type == vc4_crtc->data->encoder0_type) {
-			vc4_encoder->clock_select = 0;
-			encoder->possible_crtcs |= drm_crtc_mask(crtc);
-		} else if (vc4_encoder->type == vc4_crtc->data->encoder1_type) {
-			vc4_encoder->clock_select = 1;
-			encoder->possible_crtcs |= drm_crtc_mask(crtc);
+		for (i = 0; i < ARRAY_SIZE(crtc_data->encoder_types); i++) {
+			if (vc4_encoder->type == encoder_types[i]) {
+				vc4_encoder->clock_select = i;
+				encoder->possible_crtcs |= drm_crtc_mask(crtc);
+				break;
+			}
 		}
 	}
 }
--- a/drivers/gpu/drm/vc4/vc4_drv.h
+++ b/drivers/gpu/drm/vc4/vc4_drv.h
@@ -194,6 +194,7 @@ to_vc4_plane(struct drm_plane *plane)
 }
 
 enum vc4_encoder_type {
+	VC4_ENCODER_TYPE_NONE,
 	VC4_ENCODER_TYPE_HDMI,
 	VC4_ENCODER_TYPE_VEC,
 	VC4_ENCODER_TYPE_DSI0,
--- a/drivers/gpu/drm/vc4/vc4_regs.h
+++ b/drivers/gpu/drm/vc4/vc4_regs.h
@@ -177,8 +177,9 @@
 # define PV_CONTROL_WAIT_HSTART			BIT(12)
 # define PV_CONTROL_PIXEL_REP_MASK		VC4_MASK(5, 4)
 # define PV_CONTROL_PIXEL_REP_SHIFT		4
-# define PV_CONTROL_CLK_SELECT_DSI_VEC		0
+# define PV_CONTROL_CLK_SELECT_DSI		0
 # define PV_CONTROL_CLK_SELECT_DPI_SMI_HDMI	1
+# define PV_CONTROL_CLK_SELECT_VEC		2
 # define PV_CONTROL_CLK_SELECT_MASK		VC4_MASK(3, 2)
 # define PV_CONTROL_CLK_SELECT_SHIFT		2
 # define PV_CONTROL_FIFO_CLR			BIT(1)

^ permalink raw reply	[flat|nested] 92+ messages in thread

* [PATCH 4.9 85/93] arm64: KVM: VHE: Clear HCR_TGE when invalidating guest TLBs
  2017-03-20 17:50 [PATCH 4.9 00/93] 4.9.17-stable review Greg Kroah-Hartman
                   ` (78 preceding siblings ...)
  2017-03-20 17:51 ` [PATCH 4.9 84/93] drm/vc4: Fix ->clock_select setting for the VEC encoder Greg Kroah-Hartman
@ 2017-03-20 17:52 ` Greg Kroah-Hartman
  2017-03-20 17:52 ` [PATCH 4.9 86/93] irqchip/gicv3-its: Add workaround for QDF2400 ITS erratum 0065 Greg Kroah-Hartman
                   ` (9 subsequent siblings)
  89 siblings, 0 replies; 92+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-20 17:52 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Tomasz Nowicki, Christoffer Dall,
	Marc Zyngier

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Marc Zyngier <marc.zyngier@arm.com>

commit 68925176296a8b995e503349200e256674bfe5ac upstream.

When invalidating guest TLBs, special care must be taken to
actually shoot the guest TLBs and not the host ones if we're
running on a VHE system.  This is controlled by the HCR_EL2.TGE
bit, which we forget to clear before invalidating TLBs.

Address the issue by introducing two wrappers (__tlb_switch_to_guest
and __tlb_switch_to_host) that take care of both the VTTBR_EL2
and HCR_EL2.TGE switching.

Reported-by: Tomasz Nowicki <tnowicki@caviumnetworks.com>
Tested-by: Tomasz Nowicki <tnowicki@caviumnetworks.com>
Reviewed-by: Christoffer Dall <cdall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/arm64/kvm/hyp/tlb.c |   64 ++++++++++++++++++++++++++++++++++++++++-------
 1 file changed, 55 insertions(+), 9 deletions(-)

--- a/arch/arm64/kvm/hyp/tlb.c
+++ b/arch/arm64/kvm/hyp/tlb.c
@@ -17,14 +17,62 @@
 
 #include <asm/kvm_hyp.h>
 
+static void __hyp_text __tlb_switch_to_guest_vhe(struct kvm *kvm)
+{
+	u64 val;
+
+	/*
+	 * With VHE enabled, we have HCR_EL2.{E2H,TGE} = {1,1}, and
+	 * most TLB operations target EL2/EL0. In order to affect the
+	 * guest TLBs (EL1/EL0), we need to change one of these two
+	 * bits. Changing E2H is impossible (goodbye TTBR1_EL2), so
+	 * let's flip TGE before executing the TLB operation.
+	 */
+	write_sysreg(kvm->arch.vttbr, vttbr_el2);
+	val = read_sysreg(hcr_el2);
+	val &= ~HCR_TGE;
+	write_sysreg(val, hcr_el2);
+	isb();
+}
+
+static void __hyp_text __tlb_switch_to_guest_nvhe(struct kvm *kvm)
+{
+	write_sysreg(kvm->arch.vttbr, vttbr_el2);
+	isb();
+}
+
+static hyp_alternate_select(__tlb_switch_to_guest,
+			    __tlb_switch_to_guest_nvhe,
+			    __tlb_switch_to_guest_vhe,
+			    ARM64_HAS_VIRT_HOST_EXTN);
+
+static void __hyp_text __tlb_switch_to_host_vhe(struct kvm *kvm)
+{
+	/*
+	 * We're done with the TLB operation, let's restore the host's
+	 * view of HCR_EL2.
+	 */
+	write_sysreg(0, vttbr_el2);
+	write_sysreg(HCR_HOST_VHE_FLAGS, hcr_el2);
+}
+
+static void __hyp_text __tlb_switch_to_host_nvhe(struct kvm *kvm)
+{
+	write_sysreg(0, vttbr_el2);
+}
+
+static hyp_alternate_select(__tlb_switch_to_host,
+			    __tlb_switch_to_host_nvhe,
+			    __tlb_switch_to_host_vhe,
+			    ARM64_HAS_VIRT_HOST_EXTN);
+
 void __hyp_text __kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa)
 {
 	dsb(ishst);
 
 	/* Switch to requested VMID */
 	kvm = kern_hyp_va(kvm);
-	write_sysreg(kvm->arch.vttbr, vttbr_el2);
-	isb();
+	__tlb_switch_to_guest()(kvm);
 
 	/*
 	 * We could do so much better if we had the VA as well.
@@ -45,7 +93,7 @@ void __hyp_text __kvm_tlb_flush_vmid_ipa
 	dsb(ish);
 	isb();
 
-	write_sysreg(0, vttbr_el2);
+	__tlb_switch_to_host()(kvm);
 }
 
 void __hyp_text __kvm_tlb_flush_vmid(struct kvm *kvm)
@@ -54,14 +102,13 @@ void __hyp_text __kvm_tlb_flush_vmid(str
 
 	/* Switch to requested VMID */
 	kvm = kern_hyp_va(kvm);
-	write_sysreg(kvm->arch.vttbr, vttbr_el2);
-	isb();
+	__tlb_switch_to_guest()(kvm);
 
 	asm volatile("tlbi vmalls12e1is" : : );
 	dsb(ish);
 	isb();
 
-	write_sysreg(0, vttbr_el2);
+	__tlb_switch_to_host()(kvm);
 }
 
 void __hyp_text __kvm_tlb_flush_local_vmid(struct kvm_vcpu *vcpu)
@@ -69,14 +116,13 @@ void __hyp_text __kvm_tlb_flush_local_vm
 	struct kvm *kvm = kern_hyp_va(kern_hyp_va(vcpu)->kvm);
 
 	/* Switch to requested VMID */
-	write_sysreg(kvm->arch.vttbr, vttbr_el2);
-	isb();
+	__tlb_switch_to_guest()(kvm);
 
 	asm volatile("tlbi vmalle1" : : );
 	dsb(nsh);
 	isb();
 
-	write_sysreg(0, vttbr_el2);
+	__tlb_switch_to_host()(kvm);
 }
 
 void __hyp_text __kvm_flush_vm_context(void)

^ permalink raw reply	[flat|nested] 92+ messages in thread

* [PATCH 4.9 86/93] irqchip/gicv3-its: Add workaround for QDF2400 ITS erratum 0065
  2017-03-20 17:50 [PATCH 4.9 00/93] 4.9.17-stable review Greg Kroah-Hartman
                   ` (79 preceding siblings ...)
  2017-03-20 17:52 ` [PATCH 4.9 85/93] arm64: KVM: VHE: Clear HCR_TGE when invalidating guest TLBs Greg Kroah-Hartman
@ 2017-03-20 17:52 ` Greg Kroah-Hartman
  2017-03-20 17:52 ` [PATCH 4.9 87/93] x86/tsc: Fix ART for TSC_KNOWN_FREQ Greg Kroah-Hartman
                   ` (8 subsequent siblings)
  89 siblings, 0 replies; 92+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-20 17:52 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Shanker Donthineni, Marc Zyngier

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Shanker Donthineni <shankerd@codeaurora.org>

commit 90922a2d03d84de36bf8a9979d62580102f31a92 upstream.

On Qualcomm Datacenter Technologies QDF2400 SoCs, the ITS hardware
implementation uses 16Bytes for Interrupt Translation Entry (ITE),
but reports an incorrect value of 8Bytes in GITS_TYPER.ITTE_size.

It might cause kernel memory corruption depending on the number
of MSI(x) that are configured and the amount of memory that has
been allocated for ITEs in its_create_device().

This patch fixes the potential memory corruption by setting the
correct ITE size to 16Bytes.

Cc: stable@vger.kernel.org
Signed-off-by: Shanker Donthineni <shankerd@codeaurora.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 Documentation/arm64/silicon-errata.txt |   44 +++++++++++++++++----------------
 arch/arm64/Kconfig                     |   10 +++++++
 drivers/irqchip/irq-gic-v3-its.c       |   16 ++++++++++++
 3 files changed, 49 insertions(+), 21 deletions(-)

--- a/Documentation/arm64/silicon-errata.txt
+++ b/Documentation/arm64/silicon-errata.txt
@@ -42,24 +42,26 @@ file acts as a registry of software work
 will be updated when new workarounds are committed and backported to
 stable kernels.
 
-| Implementor    | Component       | Erratum ID      | Kconfig                 |
-+----------------+-----------------+-----------------+-------------------------+
-| ARM            | Cortex-A53      | #826319         | ARM64_ERRATUM_826319    |
-| ARM            | Cortex-A53      | #827319         | ARM64_ERRATUM_827319    |
-| ARM            | Cortex-A53      | #824069         | ARM64_ERRATUM_824069    |
-| ARM            | Cortex-A53      | #819472         | ARM64_ERRATUM_819472    |
-| ARM            | Cortex-A53      | #845719         | ARM64_ERRATUM_845719    |
-| ARM            | Cortex-A53      | #843419         | ARM64_ERRATUM_843419    |
-| ARM            | Cortex-A57      | #832075         | ARM64_ERRATUM_832075    |
-| ARM            | Cortex-A57      | #852523         | N/A                     |
-| ARM            | Cortex-A57      | #834220         | ARM64_ERRATUM_834220    |
-| ARM            | Cortex-A72      | #853709         | N/A                     |
-| ARM            | MMU-500         | #841119,#826419 | N/A                     |
-|                |                 |                 |                         |
-| Cavium         | ThunderX ITS    | #22375, #24313  | CAVIUM_ERRATUM_22375    |
-| Cavium         | ThunderX ITS    | #23144          | CAVIUM_ERRATUM_23144    |
-| Cavium         | ThunderX GICv3  | #23154          | CAVIUM_ERRATUM_23154    |
-| Cavium         | ThunderX Core   | #27456          | CAVIUM_ERRATUM_27456    |
-| Cavium         | ThunderX SMMUv2 | #27704          | N/A		       |
-|                |                 |                 |                         |
-| Freescale/NXP  | LS2080A/LS1043A | A-008585        | FSL_ERRATUM_A008585     |
+| Implementor    | Component       | Erratum ID      | Kconfig                     |
++----------------+-----------------+-----------------+-----------------------------+
+| ARM            | Cortex-A53      | #826319         | ARM64_ERRATUM_826319        |
+| ARM            | Cortex-A53      | #827319         | ARM64_ERRATUM_827319        |
+| ARM            | Cortex-A53      | #824069         | ARM64_ERRATUM_824069        |
+| ARM            | Cortex-A53      | #819472         | ARM64_ERRATUM_819472        |
+| ARM            | Cortex-A53      | #845719         | ARM64_ERRATUM_845719        |
+| ARM            | Cortex-A53      | #843419         | ARM64_ERRATUM_843419        |
+| ARM            | Cortex-A57      | #832075         | ARM64_ERRATUM_832075        |
+| ARM            | Cortex-A57      | #852523         | N/A                         |
+| ARM            | Cortex-A57      | #834220         | ARM64_ERRATUM_834220        |
+| ARM            | Cortex-A72      | #853709         | N/A                         |
+| ARM            | MMU-500         | #841119,#826419 | N/A                         |
+|                |                 |                 |                             |
+| Cavium         | ThunderX ITS    | #22375, #24313  | CAVIUM_ERRATUM_22375        |
+| Cavium         | ThunderX ITS    | #23144          | CAVIUM_ERRATUM_23144        |
+| Cavium         | ThunderX GICv3  | #23154          | CAVIUM_ERRATUM_23154        |
+| Cavium         | ThunderX Core   | #27456          | CAVIUM_ERRATUM_27456        |
+| Cavium         | ThunderX SMMUv2 | #27704          | N/A                         |
+|                |                 |                 |                             |
+| Freescale/NXP  | LS2080A/LS1043A | A-008585        | FSL_ERRATUM_A008585         |
+|                |                 |                 |                             |
+| Qualcomm Tech. | QDF2400 ITS     | E0065           | QCOM_QDF2400_ERRATUM_0065   |
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -474,6 +474,16 @@ config CAVIUM_ERRATUM_27456
 
 	  If unsure, say Y.
 
+config QCOM_QDF2400_ERRATUM_0065
+	bool "QDF2400 E0065: Incorrect GITS_TYPER.ITT_Entry_size"
+	default y
+	help
+	  On Qualcomm Datacenter Technologies QDF2400 SoC, ITS hardware reports
+	  ITE size incorrectly. The GITS_TYPER.ITT_Entry_size field should have
+	  been indicated as 16Bytes (0xf), not 8Bytes (0x7).
+
+	  If unsure, say Y.
+
 endmenu
 
 
--- a/drivers/irqchip/irq-gic-v3-its.c
+++ b/drivers/irqchip/irq-gic-v3-its.c
@@ -1598,6 +1598,14 @@ static void __maybe_unused its_enable_qu
 	its->flags |= ITS_FLAGS_WORKAROUND_CAVIUM_23144;
 }
 
+static void __maybe_unused its_enable_quirk_qdf2400_e0065(void *data)
+{
+	struct its_node *its = data;
+
+	/* On QDF2400, the size of the ITE is 16Bytes */
+	its->ite_size = 16;
+}
+
 static const struct gic_quirk its_quirks[] = {
 #ifdef CONFIG_CAVIUM_ERRATUM_22375
 	{
@@ -1615,6 +1623,14 @@ static const struct gic_quirk its_quirks
 		.init	= its_enable_quirk_cavium_23144,
 	},
 #endif
+#ifdef CONFIG_QCOM_QDF2400_ERRATUM_0065
+	{
+		.desc	= "ITS: QDF2400 erratum 0065",
+		.iidr	= 0x00001070, /* QDF2400 ITS rev 1.x */
+		.mask	= 0xffffffff,
+		.init	= its_enable_quirk_qdf2400_e0065,
+	},
+#endif
 	{
 	}
 };

^ permalink raw reply	[flat|nested] 92+ messages in thread

* [PATCH 4.9 87/93] x86/tsc: Fix ART for TSC_KNOWN_FREQ
  2017-03-20 17:50 [PATCH 4.9 00/93] 4.9.17-stable review Greg Kroah-Hartman
                   ` (80 preceding siblings ...)
  2017-03-20 17:52 ` [PATCH 4.9 86/93] irqchip/gicv3-its: Add workaround for QDF2400 ITS erratum 0065 Greg Kroah-Hartman
@ 2017-03-20 17:52 ` Greg Kroah-Hartman
  2017-03-20 17:52   ` Greg Kroah-Hartman
                   ` (7 subsequent siblings)
  89 siblings, 0 replies; 92+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-20 17:52 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Prusty, Subhransu S,
	Peter Zijlstra (Intel),
	christopher.s.hall, kevin.b.stanton, john.stultz, akataria,
	Thomas Gleixner

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Peter Zijlstra <peterz@infradead.org>

commit 44fee88cea43d3c2cac962e0439cb10a3cabff6d upstream.

Subhransu reported that convert_art_to_tsc() isn't working for him.

The ART to TSC relation is only set up for systems which use the refined
TSC calibration. Systems with known TSC frequency (available via CPUID 15)
are not using the refined calibration and therefor the ART to TSC relation
is never established.

Add the setup to the known frequency init path which skips ART
calibration. The init code needs to be duplicated as for systems which use
refined calibration the ART setup must be delayed until calibration has
been done.

The problem has been there since the ART support was introdduced, but only
detected now because Subhransu tested the first time on hardware which has
TSC frequency enumerated via CPUID 15.

Note for stable: The conditional has changed from TSC_RELIABLE to
     	 	 TSC_KNOWN_FREQUENCY.

[ tglx: Rewrote changelog and identified the proper 'Fixes' commit ]

Fixes: f9677e0f8308 ("x86/tsc: Always Running Timer (ART) correlated clocksource")
Reported-by: "Prusty, Subhransu S" <subhransu.s.prusty@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: stable@vger.kernel.org
Cc: christopher.s.hall@intel.com
Cc: kevin.b.stanton@intel.com
Cc: john.stultz@linaro.org
Cc: akataria@vmware.com
Link: http://lkml.kernel.org/r/20170313145712.GI3312@twins.programming.kicks-ass.net
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/kernel/tsc.c |    2 ++
 1 file changed, 2 insertions(+)

--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -1287,6 +1287,8 @@ static int __init init_tsc_clocksource(v
 	 * exporting a reliable TSC.
 	 */
 	if (boot_cpu_has(X86_FEATURE_TSC_RELIABLE)) {
+		if (boot_cpu_has(X86_FEATURE_ART))
+			art_related_clocksource = &clocksource_tsc;
 		clocksource_register_khz(&clocksource_tsc, tsc_khz);
 		return 0;
 	}

^ permalink raw reply	[flat|nested] 92+ messages in thread

* [PATCH 4.9 88/93] x86/kasan: Fix boot with KASAN=y and PROFILE_ANNOTATED_BRANCHES=y
  2017-03-20 17:50 [PATCH 4.9 00/93] 4.9.17-stable review Greg Kroah-Hartman
@ 2017-03-20 17:52   ` Greg Kroah-Hartman
  2017-03-20 17:50 ` [PATCH 4.9 02/93] net/mlx5e: Do not reduce LRO WQE size when not using build_skb Greg Kroah-Hartman
                     ` (88 subsequent siblings)
  89 siblings, 0 replies; 92+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-20 17:52 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Fengguang Wu, Andrey Ryabinin,
	kasan-dev, Alexander Potapenko, Andrew Morton, lkp,
	Dmitry Vyukov, Thomas Gleixner

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Andrey Ryabinin <aryabinin@virtuozzo.com>

commit be3606ff739d1c1be36389f8737c577ad87e1f57 upstream.

The kernel doesn't boot with both PROFILE_ANNOTATED_BRANCHES=y and KASAN=y
options selected. With branch profiling enabled we end up calling
ftrace_likely_update() before kasan_early_init(). ftrace_likely_update() is
built with KASAN instrumentation, so calling it before kasan has been
initialized leads to crash.

Use DISABLE_BRANCH_PROFILING define to make sure that we don't call
ftrace_likely_update() from early code before kasan_early_init().

Fixes: ef7f0d6a6ca8 ("x86_64: add KASan support")
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: kasan-dev@googlegroups.com
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: lkp@01.org
Cc: Dmitry Vyukov <dvyukov@google.com>
Link: http://lkml.kernel.org/r/20170313163337.1704-1-aryabinin@virtuozzo.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/kernel/head64.c    |    1 +
 arch/x86/mm/kasan_init_64.c |    1 +
 2 files changed, 2 insertions(+)

--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -4,6 +4,7 @@
  *  Copyright (C) 2000 Andrea Arcangeli <andrea@suse.de> SuSE
  */
 
+#define DISABLE_BRANCH_PROFILING
 #include <linux/init.h>
 #include <linux/linkage.h>
 #include <linux/types.h>
--- a/arch/x86/mm/kasan_init_64.c
+++ b/arch/x86/mm/kasan_init_64.c
@@ -1,3 +1,4 @@
+#define DISABLE_BRANCH_PROFILING
 #define pr_fmt(fmt) "kasan: " fmt
 #include <linux/bootmem.h>
 #include <linux/kasan.h>

^ permalink raw reply	[flat|nested] 92+ messages in thread

* [PATCH 4.9 88/93] x86/kasan: Fix boot with KASAN=y and PROFILE_ANNOTATED_BRANCHES=y
@ 2017-03-20 17:52   ` Greg Kroah-Hartman
  0 siblings, 0 replies; 92+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-20 17:52 UTC (permalink / raw)
  To: lkp

[-- Attachment #1: Type: text/plain, Size: 1826 bytes --]

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Andrey Ryabinin <aryabinin@virtuozzo.com>

commit be3606ff739d1c1be36389f8737c577ad87e1f57 upstream.

The kernel doesn't boot with both PROFILE_ANNOTATED_BRANCHES=y and KASAN=y
options selected. With branch profiling enabled we end up calling
ftrace_likely_update() before kasan_early_init(). ftrace_likely_update() is
built with KASAN instrumentation, so calling it before kasan has been
initialized leads to crash.

Use DISABLE_BRANCH_PROFILING define to make sure that we don't call
ftrace_likely_update() from early code before kasan_early_init().

Fixes: ef7f0d6a6ca8 ("x86_64: add KASan support")
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: kasan-dev(a)googlegroups.com
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: lkp(a)01.org
Cc: Dmitry Vyukov <dvyukov@google.com>
Link: http://lkml.kernel.org/r/20170313163337.1704-1-aryabinin(a)virtuozzo.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/kernel/head64.c    |    1 +
 arch/x86/mm/kasan_init_64.c |    1 +
 2 files changed, 2 insertions(+)

--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -4,6 +4,7 @@
  *  Copyright (C) 2000 Andrea Arcangeli <andrea@suse.de> SuSE
  */
 
+#define DISABLE_BRANCH_PROFILING
 #include <linux/init.h>
 #include <linux/linkage.h>
 #include <linux/types.h>
--- a/arch/x86/mm/kasan_init_64.c
+++ b/arch/x86/mm/kasan_init_64.c
@@ -1,3 +1,4 @@
+#define DISABLE_BRANCH_PROFILING
 #define pr_fmt(fmt) "kasan: " fmt
 #include <linux/bootmem.h>
 #include <linux/kasan.h>



^ permalink raw reply	[flat|nested] 92+ messages in thread

* [PATCH 4.9 89/93] x86/perf: Fix CR4.PCE propagation to use active_mm instead of mm
  2017-03-20 17:50 [PATCH 4.9 00/93] 4.9.17-stable review Greg Kroah-Hartman
                   ` (82 preceding siblings ...)
  2017-03-20 17:52   ` Greg Kroah-Hartman
@ 2017-03-20 17:52 ` Greg Kroah-Hartman
  2017-03-20 17:52 ` [PATCH 4.9 90/93] futex: Fix potential use-after-free in FUTEX_REQUEUE_PI Greg Kroah-Hartman
                   ` (5 subsequent siblings)
  89 siblings, 0 replies; 92+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-20 17:52 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Andy Lutomirski, Alexander Shishkin,
	Arnaldo Carvalho de Melo, Borislav Petkov, H. Peter Anvin,
	Jiri Olsa, Linus Torvalds, Peter Zijlstra, Stephane Eranian,
	Thomas Gleixner, Ingo Molnar

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Andy Lutomirski <luto@kernel.org>

commit 5dc855d44c2ad960a86f593c60461f1ae1566b6d upstream.

If one thread mmaps a perf event while another thread in the same mm
is in some context where active_mm != mm (which can happen in the
scheduler, for example), refresh_pce() would write the wrong value
to CR4.PCE.  This broke some PAPI tests.

Reported-and-tested-by: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 7911d3f7af14 ("perf/x86: Only allow rdpmc if a perf_event is mapped")
Link: http://lkml.kernel.org/r/0c5b38a76ea50e405f9abe07a13dfaef87c173a1.1489694270.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/events/core.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -2096,8 +2096,8 @@ static int x86_pmu_event_init(struct per
 
 static void refresh_pce(void *ignored)
 {
-	if (current->mm)
-		load_mm_cr4(current->mm);
+	if (current->active_mm)
+		load_mm_cr4(current->active_mm);
 }
 
 static void x86_pmu_event_mapped(struct perf_event *event)

^ permalink raw reply	[flat|nested] 92+ messages in thread

* [PATCH 4.9 90/93] futex: Fix potential use-after-free in FUTEX_REQUEUE_PI
  2017-03-20 17:50 [PATCH 4.9 00/93] 4.9.17-stable review Greg Kroah-Hartman
                   ` (83 preceding siblings ...)
  2017-03-20 17:52 ` [PATCH 4.9 89/93] x86/perf: Fix CR4.PCE propagation to use active_mm instead of mm Greg Kroah-Hartman
@ 2017-03-20 17:52 ` Greg Kroah-Hartman
  2017-03-20 17:52 ` [PATCH 4.9 91/93] futex: Add missing error handling to FUTEX_REQUEUE_PI Greg Kroah-Hartman
                   ` (4 subsequent siblings)
  89 siblings, 0 replies; 92+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-20 17:52 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Dmitry Vyukov, Peter Zijlstra (Intel),
	Darren Hart, juri.lelli, bigeasy, xlpang, rostedt,
	mathieu.desnoyers, jdesfossez, dvhart, bristot, Thomas Gleixner

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Peter Zijlstra <peterz@infradead.org>

commit c236c8e95a3d395b0494e7108f0d41cf36ec107c upstream.

While working on the futex code, I stumbled over this potential
use-after-free scenario. Dmitry triggered it later with syzkaller.

pi_mutex is a pointer into pi_state, which we drop the reference on in
unqueue_me_pi(). So any access to that pointer after that is bad.

Since other sites already do rt_mutex_unlock() with hb->lock held, see
for example futex_lock_pi(), simply move the unlock before
unqueue_me_pi().

Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Darren Hart <dvhart@linux.intel.com>
Cc: juri.lelli@arm.com
Cc: bigeasy@linutronix.de
Cc: xlpang@redhat.com
Cc: rostedt@goodmis.org
Cc: mathieu.desnoyers@efficios.com
Cc: jdesfossez@efficios.com
Cc: dvhart@infradead.org
Cc: bristot@redhat.com
Link: http://lkml.kernel.org/r/20170304093558.801744246@infradead.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 kernel/futex.c |   20 +++++++++++---------
 1 file changed, 11 insertions(+), 9 deletions(-)

--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -2813,7 +2813,6 @@ static int futex_wait_requeue_pi(u32 __u
 {
 	struct hrtimer_sleeper timeout, *to = NULL;
 	struct rt_mutex_waiter rt_waiter;
-	struct rt_mutex *pi_mutex = NULL;
 	struct futex_hash_bucket *hb;
 	union futex_key key2 = FUTEX_KEY_INIT;
 	struct futex_q q = futex_q_init;
@@ -2905,6 +2904,8 @@ static int futex_wait_requeue_pi(u32 __u
 			spin_unlock(q.lock_ptr);
 		}
 	} else {
+		struct rt_mutex *pi_mutex;
+
 		/*
 		 * We have been woken up by futex_unlock_pi(), a timeout, or a
 		 * signal.  futex_unlock_pi() will not destroy the lock_ptr nor
@@ -2928,18 +2929,19 @@ static int futex_wait_requeue_pi(u32 __u
 		if (res)
 			ret = (res < 0) ? res : 0;
 
+		/*
+		 * If fixup_pi_state_owner() faulted and was unable to handle
+		 * the fault, unlock the rt_mutex and return the fault to
+		 * userspace.
+		 */
+		if (ret && rt_mutex_owner(pi_mutex) == current)
+			rt_mutex_unlock(pi_mutex);
+
 		/* Unqueue and drop the lock. */
 		unqueue_me_pi(&q);
 	}
 
-	/*
-	 * If fixup_pi_state_owner() faulted and was unable to handle the
-	 * fault, unlock the rt_mutex and return the fault to userspace.
-	 */
-	if (ret == -EFAULT) {
-		if (pi_mutex && rt_mutex_owner(pi_mutex) == current)
-			rt_mutex_unlock(pi_mutex);
-	} else if (ret == -EINTR) {
+	if (ret == -EINTR) {
 		/*
 		 * We've already been requeued, but cannot restart by calling
 		 * futex_lock_pi() directly. We could restart this syscall, but

^ permalink raw reply	[flat|nested] 92+ messages in thread

* [PATCH 4.9 91/93] futex: Add missing error handling to FUTEX_REQUEUE_PI
  2017-03-20 17:50 [PATCH 4.9 00/93] 4.9.17-stable review Greg Kroah-Hartman
                   ` (84 preceding siblings ...)
  2017-03-20 17:52 ` [PATCH 4.9 90/93] futex: Fix potential use-after-free in FUTEX_REQUEUE_PI Greg Kroah-Hartman
@ 2017-03-20 17:52 ` Greg Kroah-Hartman
  2017-03-20 17:52 ` [PATCH 4.9 92/93] locking/rwsem: Fix down_write_killable() for CONFIG_RWSEM_GENERIC_SPINLOCK=y Greg Kroah-Hartman
                   ` (3 subsequent siblings)
  89 siblings, 0 replies; 92+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-20 17:52 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Thomas Gleixner,
	Peter Zijlstra (Intel),
	Darren Hart, juri.lelli, bigeasy, xlpang, rostedt,
	mathieu.desnoyers, jdesfossez, dvhart, bristot

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Peter Zijlstra <peterz@infradead.org>

commit 9bbb25afeb182502ca4f2c4f3f88af0681b34cae upstream.

Thomas spotted that fixup_pi_state_owner() can return errors and we
fail to unlock the rt_mutex in that case.

Reported-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Darren Hart <dvhart@linux.intel.com>
Cc: juri.lelli@arm.com
Cc: bigeasy@linutronix.de
Cc: xlpang@redhat.com
Cc: rostedt@goodmis.org
Cc: mathieu.desnoyers@efficios.com
Cc: jdesfossez@efficios.com
Cc: dvhart@infradead.org
Cc: bristot@redhat.com
Link: http://lkml.kernel.org/r/20170304093558.867401760@infradead.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 kernel/futex.c |    2 ++
 1 file changed, 2 insertions(+)

--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -2896,6 +2896,8 @@ static int futex_wait_requeue_pi(u32 __u
 		if (q.pi_state && (q.pi_state->owner != current)) {
 			spin_lock(q.lock_ptr);
 			ret = fixup_pi_state_owner(uaddr2, &q, current);
+			if (ret && rt_mutex_owner(&q.pi_state->pi_mutex) == current)
+				rt_mutex_unlock(&q.pi_state->pi_mutex);
 			/*
 			 * Drop the reference to the pi state which
 			 * the requeue_pi() code acquired for us.

^ permalink raw reply	[flat|nested] 92+ messages in thread

* [PATCH 4.9 92/93] locking/rwsem: Fix down_write_killable() for CONFIG_RWSEM_GENERIC_SPINLOCK=y
  2017-03-20 17:50 [PATCH 4.9 00/93] 4.9.17-stable review Greg Kroah-Hartman
                   ` (85 preceding siblings ...)
  2017-03-20 17:52 ` [PATCH 4.9 91/93] futex: Add missing error handling to FUTEX_REQUEUE_PI Greg Kroah-Hartman
@ 2017-03-20 17:52 ` Greg Kroah-Hartman
  2017-03-20 17:52 ` [PATCH 4.9 93/93] crypto: powerpc - Fix initialisation of crc32c context Greg Kroah-Hartman
                   ` (2 subsequent siblings)
  89 siblings, 0 replies; 92+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-20 17:52 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Niklas Cassel, Peter Zijlstra (Intel),
	mhocko, Andrew Morton, Linus Torvalds, Niklas Cassel,
	Paul E. McKenney, Thomas Gleixner, Ingo Molnar

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Niklas Cassel <niklas.cassel@axis.com>

commit 17fcbd590d0c3e35bd9646e2215f86586378bc42 upstream.

We hang if SIGKILL has been sent, but the task is stuck in down_read()
(after do_exit()), even though no task is doing down_write() on the
rwsem in question:

  INFO: task libupnp:21868 blocked for more than 120 seconds.
  libupnp         D    0 21868      1 0x08100008
  ...
  Call Trace:
  __schedule()
  schedule()
  __down_read()
  do_exit()
  do_group_exit()
  __wake_up_parent()

This bug has already been fixed for CONFIG_RWSEM_XCHGADD_ALGORITHM=y in
the following commit:

 04cafed7fc19 ("locking/rwsem: Fix down_write_killable()")

... however, this bug also exists for CONFIG_RWSEM_GENERIC_SPINLOCK=y.

Signed-off-by: Niklas Cassel <niklas.cassel@axis.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: <mhocko@suse.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Niklas Cassel <niklass@axis.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: d47996082f52 ("locking/rwsem: Introduce basis for down_write_killable()")
Link: http://lkml.kernel.org/r/1487981873-12649-1-git-send-email-niklass@axis.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 kernel/locking/rwsem-spinlock.c |   15 ++++++++++-----
 1 file changed, 10 insertions(+), 5 deletions(-)

--- a/kernel/locking/rwsem-spinlock.c
+++ b/kernel/locking/rwsem-spinlock.c
@@ -216,10 +216,8 @@ int __sched __down_write_common(struct r
 		 */
 		if (sem->count == 0)
 			break;
-		if (signal_pending_state(state, current)) {
-			ret = -EINTR;
-			goto out;
-		}
+		if (signal_pending_state(state, current))
+			goto out_nolock;
 		set_task_state(tsk, state);
 		raw_spin_unlock_irqrestore(&sem->wait_lock, flags);
 		schedule();
@@ -227,12 +225,19 @@ int __sched __down_write_common(struct r
 	}
 	/* got the lock */
 	sem->count = -1;
-out:
 	list_del(&waiter.list);
 
 	raw_spin_unlock_irqrestore(&sem->wait_lock, flags);
 
 	return ret;
+
+out_nolock:
+	list_del(&waiter.list);
+	if (!list_empty(&sem->wait_list))
+		__rwsem_do_wake(sem, 1);
+	raw_spin_unlock_irqrestore(&sem->wait_lock, flags);
+
+	return -EINTR;
 }
 
 void __sched __down_write(struct rw_semaphore *sem)

^ permalink raw reply	[flat|nested] 92+ messages in thread

* [PATCH 4.9 93/93] crypto: powerpc - Fix initialisation of crc32c context
  2017-03-20 17:50 [PATCH 4.9 00/93] 4.9.17-stable review Greg Kroah-Hartman
                   ` (86 preceding siblings ...)
  2017-03-20 17:52 ` [PATCH 4.9 92/93] locking/rwsem: Fix down_write_killable() for CONFIG_RWSEM_GENERIC_SPINLOCK=y Greg Kroah-Hartman
@ 2017-03-20 17:52 ` Greg Kroah-Hartman
  2017-03-21  0:12 ` [PATCH 4.9 00/93] 4.9.17-stable review Shuah Khan
  2017-03-21  2:13 ` Guenter Roeck
  89 siblings, 0 replies; 92+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-20 17:52 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Anton Blanchard, Daniel Axtens, Herbert Xu

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Daniel Axtens <dja@axtens.net>

commit aa2be9b3d6d2d699e9ca7cbfc00867c80e5da213 upstream.

Turning on crypto self-tests on a POWER8 shows:

    alg: hash: Test 1 failed for crc32c-vpmsum
    00000000: ff ff ff ff

Comparing the code with the Intel CRC32c implementation on which
ours is based shows that we are doing an init with 0, not ~0
as CRC32c requires.

This probably wasn't caught because btrfs does its own weird
open-coded initialisation.

Initialise our internal context to ~0 on init.

This makes the self-tests pass, and btrfs continues to work.

Fixes: 6dd7a82cc54e ("crypto: powerpc - Add POWER8 optimised crc32c")
Cc: Anton Blanchard <anton@samba.org>
Signed-off-by: Daniel Axtens <dja@axtens.net>
Acked-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/powerpc/crypto/crc32c-vpmsum_glue.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/arch/powerpc/crypto/crc32c-vpmsum_glue.c
+++ b/arch/powerpc/crypto/crc32c-vpmsum_glue.c
@@ -52,7 +52,7 @@ static int crc32c_vpmsum_cra_init(struct
 {
 	u32 *key = crypto_tfm_ctx(tfm);
 
-	*key = 0;
+	*key = ~0;
 
 	return 0;
 }

^ permalink raw reply	[flat|nested] 92+ messages in thread

* Re: [PATCH 4.9 00/93] 4.9.17-stable review
  2017-03-20 17:50 [PATCH 4.9 00/93] 4.9.17-stable review Greg Kroah-Hartman
                   ` (87 preceding siblings ...)
  2017-03-20 17:52 ` [PATCH 4.9 93/93] crypto: powerpc - Fix initialisation of crc32c context Greg Kroah-Hartman
@ 2017-03-21  0:12 ` Shuah Khan
  2017-03-21  2:13 ` Guenter Roeck
  89 siblings, 0 replies; 92+ messages in thread
From: Shuah Khan @ 2017-03-21  0:12 UTC (permalink / raw)
  To: Greg Kroah-Hartman, linux-kernel
  Cc: torvalds, akpm, linux, patches, ben.hutchings, stable, Shuah Khan

On 03/20/2017 11:50 AM, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 4.9.17 release.
> There are 93 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
> 
> Responses should be made by Wed Mar 22 17:47:16 UTC 2017.
> Anything received after that time might be too late.
> 
> The whole patch series can be found in one patch at:
> 	kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.9.17-rc1.gz
> or in the git tree and branch at:
>   git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.9.y
> and the diffstat can be found below.
> 
> thanks,
> 
> greg k-h
> 

Compiled and booted on my test system. No dmesg regressions.

thanks,
-- Shuah

^ permalink raw reply	[flat|nested] 92+ messages in thread

* Re: [PATCH 4.9 00/93] 4.9.17-stable review
  2017-03-20 17:50 [PATCH 4.9 00/93] 4.9.17-stable review Greg Kroah-Hartman
                   ` (88 preceding siblings ...)
  2017-03-21  0:12 ` [PATCH 4.9 00/93] 4.9.17-stable review Shuah Khan
@ 2017-03-21  2:13 ` Guenter Roeck
  89 siblings, 0 replies; 92+ messages in thread
From: Guenter Roeck @ 2017-03-21  2:13 UTC (permalink / raw)
  To: Greg Kroah-Hartman, linux-kernel
  Cc: torvalds, akpm, shuahkh, patches, ben.hutchings, stable

On 03/20/2017 10:50 AM, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 4.9.17 release.
> There are 93 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Wed Mar 22 17:47:16 UTC 2017.
> Anything received after that time might be too late.
>

Build results:
	total: 149 pass: 149 fail: 0
Qemu test results:
	total: 122 pass: 122 fail: 0

Details are available at http://kerneltests.org/builders.

Guenter

^ permalink raw reply	[flat|nested] 92+ messages in thread

end of thread, other threads:[~2017-03-21  2:13 UTC | newest]

Thread overview: 92+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-03-20 17:50 [PATCH 4.9 00/93] 4.9.17-stable review Greg Kroah-Hartman
2017-03-20 17:50 ` [PATCH 4.9 01/93] net/mlx5e: Register/unregister vport representors on interface attach/detach Greg Kroah-Hartman
2017-03-20 17:50 ` [PATCH 4.9 02/93] net/mlx5e: Do not reduce LRO WQE size when not using build_skb Greg Kroah-Hartman
2017-03-20 17:50 ` [PATCH 4.9 03/93] net/mlx5e: Fix wrong CQE decompression Greg Kroah-Hartman
2017-03-20 17:50 ` [PATCH 4.9 04/93] vxlan: correctly validate VXLAN ID against VXLAN_N_VID Greg Kroah-Hartman
2017-03-20 17:50 ` [PATCH 4.9 05/93] vti6: return GRE_KEY for vti6 Greg Kroah-Hartman
2017-03-20 17:50 ` [PATCH 4.9 06/93] vxlan: dont allow overwrite of config src addr Greg Kroah-Hartman
2017-03-20 17:50 ` [PATCH 4.9 07/93] ipv4: mask tos for input route Greg Kroah-Hartman
2017-03-20 17:50 ` [PATCH 4.9 08/93] net sched actions: decrement module reference count after table flush Greg Kroah-Hartman
2017-03-20 17:50 ` [PATCH 4.9 10/93] net: phy: Avoid deadlock during phy_error() Greg Kroah-Hartman
2017-03-20 17:50 ` [PATCH 4.9 11/93] vxlan: lock RCU on TX path Greg Kroah-Hartman
2017-03-20 17:50 ` [PATCH 4.9 12/93] geneve: " Greg Kroah-Hartman
2017-03-20 17:50 ` [PATCH 4.9 13/93] mlxsw: spectrum_router: Avoid potential packets loss Greg Kroah-Hartman
2017-03-20 17:50 ` [PATCH 4.9 14/93] tcp/dccp: block BH for SYN processing Greg Kroah-Hartman
2017-03-20 17:50 ` [PATCH 4.9 15/93] net: bridge: allow IPv6 when multicast flood is disabled Greg Kroah-Hartman
2017-03-20 17:50 ` [PATCH 4.9 16/93] net: dont call strlen() on the user buffer in packet_bind_spkt() Greg Kroah-Hartman
2017-03-20 17:50 ` [PATCH 4.9 17/93] net: net_enable_timestamp() can be called from irq contexts Greg Kroah-Hartman
2017-03-20 17:50 ` [PATCH 4.9 18/93] ipv6: orphan skbs in reassembly unit Greg Kroah-Hartman
2017-03-20 17:50 ` [PATCH 4.9 19/93] dccp: Unlock sock before calling sk_free() Greg Kroah-Hartman
2017-03-20 17:50 ` [PATCH 4.9 20/93] strparser: destroy workqueue on module exit Greg Kroah-Hartman
2017-03-20 17:50 ` [PATCH 4.9 21/93] tcp: fix various issues for sockets morphing to listen state Greg Kroah-Hartman
2017-03-20 17:50 ` [PATCH 4.9 22/93] net: fix socket refcounting in skb_complete_wifi_ack() Greg Kroah-Hartman
2017-03-20 17:50 ` [PATCH 4.9 23/93] net: fix socket refcounting in skb_complete_tx_timestamp() Greg Kroah-Hartman
2017-03-20 17:50 ` [PATCH 4.9 24/93] net/sched: act_skbmod: remove unneeded rcu_read_unlock in tcf_skbmod_dump Greg Kroah-Hartman
2017-03-20 17:51 ` [PATCH 4.9 25/93] dccp: fix use-after-free in dccp_feat_activate_values Greg Kroah-Hartman
2017-03-20 17:51 ` [PATCH 4.9 26/93] vrf: Fix use-after-free in vrf_xmit Greg Kroah-Hartman
2017-03-20 17:51 ` [PATCH 4.9 27/93] net/tunnel: set inner protocol in network gro hooks Greg Kroah-Hartman
2017-03-20 17:51 ` [PATCH 4.9 28/93] uapi: fix linux/packet_diag.h userspace compilation error Greg Kroah-Hartman
2017-03-20 17:51 ` [PATCH 4.9 30/93] mpls: Send route delete notifications when router module is unloaded Greg Kroah-Hartman
2017-03-20 17:51 ` [PATCH 4.9 31/93] mpls: Do not decrement alive counter for unregister events Greg Kroah-Hartman
2017-03-20 17:51 ` [PATCH 4.9 32/93] ipv6: make ECMP route replacement less greedy Greg Kroah-Hartman
2017-03-20 17:51 ` [PATCH 4.9 33/93] ipv6: avoid write to a possibly cloned skb Greg Kroah-Hartman
2017-03-20 17:51 ` [PATCH 4.9 34/93] bridge: drop netfilter fake rtable unconditionally Greg Kroah-Hartman
2017-03-20 17:51 ` [PATCH 4.9 37/93] dccp: fix memory leak during tear-down of unsuccessful connection request Greg Kroah-Hartman
2017-03-20 17:51 ` [PATCH 4.9 38/93] bpf: Detect identical PTR_TO_MAP_VALUE_OR_NULL registers Greg Kroah-Hartman
2017-03-20 17:51 ` [PATCH 4.9 39/93] bpf: fix state equivalence Greg Kroah-Hartman
2017-03-20 17:51 ` [PATCH 4.9 40/93] bpf: fix regression on verifier pruning wrt map lookups Greg Kroah-Hartman
2017-03-20 17:51 ` [PATCH 4.9 41/93] bpf: fix mark_reg_unknown_value for spilled regs on map value marking Greg Kroah-Hartman
2017-03-20 17:51 ` [PATCH 4.9 42/93] dmaengine: iota: ioat_alloc_chan_resources should not perform sleeping allocations Greg Kroah-Hartman
2017-03-20 17:51 ` [PATCH 4.9 43/93] xen: do not re-use pirq number cached in pci device msi msg data Greg Kroah-Hartman
2017-03-20 17:51 ` [PATCH 4.9 44/93] igb: Workaround for igb i210 firmware issue Greg Kroah-Hartman
2017-03-20 17:51 ` [PATCH 4.9 45/93] igb: add i211 to i210 PHY workaround Greg Kroah-Hartman
2017-03-20 17:51 ` [PATCH 4.9 46/93] scsi: ibmvscsis: Issues from Dan Carpenter/Smatch Greg Kroah-Hartman
2017-03-20 17:51 ` [PATCH 4.9 47/93] scsi: ibmvscsis: Return correct partition name/# to client Greg Kroah-Hartman
2017-03-20 17:51 ` [PATCH 4.9 48/93] scsi: ibmvscsis: Clean up properly if target_submit_cmd/tmr fails Greg Kroah-Hartman
2017-03-20 17:51 ` [PATCH 4.9 49/93] scsi: ibmvscsis: Rearrange functions for future patches Greg Kroah-Hartman
2017-03-20 17:51 ` [PATCH 4.9 50/93] scsi: ibmvscsis: Synchronize cmds at tpg_enable_store time Greg Kroah-Hartman
2017-03-20 17:51 ` [PATCH 4.9 51/93] scsi: ibmvscsis: Synchronize cmds at remove time Greg Kroah-Hartman
2017-03-20 17:51 ` [PATCH 4.9 52/93] x86/hyperv: Handle unknown NMIs on one CPU when unknown_nmi_panic Greg Kroah-Hartman
2017-03-20 17:51 ` [PATCH 4.9 53/93] PCI: Separate VF BAR updates from standard BAR updates Greg Kroah-Hartman
2017-03-20 17:51 ` [PATCH 4.9 54/93] PCI: Remove pci_resource_bar() and pci_iov_resource_bar() Greg Kroah-Hartman
2017-03-20 17:51 ` [PATCH 4.9 55/93] PCI: Add comments about ROM BAR updating Greg Kroah-Hartman
2017-03-20 17:51 ` [PATCH 4.9 56/93] PCI: Decouple IORESOURCE_ROM_ENABLE and PCI_ROM_ADDRESS_ENABLE Greg Kroah-Hartman
2017-03-20 17:51 ` [PATCH 4.9 57/93] PCI: Dont update VF BARs while VF memory space is enabled Greg Kroah-Hartman
2017-03-20 17:51 ` [PATCH 4.9 58/93] PCI: Update BARs using property bits appropriate for type Greg Kroah-Hartman
2017-03-20 17:51 ` [PATCH 4.9 59/93] PCI: Ignore BAR updates on virtual functions Greg Kroah-Hartman
2017-03-20 17:51 ` [PATCH 4.9 60/93] PCI: Do any VF BAR updates before enabling the BARs Greg Kroah-Hartman
2017-03-20 17:51 ` [PATCH 4.9 61/93] ibmveth: calculate gso_segs for large packets Greg Kroah-Hartman
2017-03-20 17:51 ` [PATCH 4.9 62/93] Drivers: hv: ring_buffer: count on wrap around mappings in get_next_pkt_raw() (v2) Greg Kroah-Hartman
2017-03-20 17:51 ` [PATCH 4.9 63/93] vfio/spapr: Postpone allocation of userspace version of TCE table Greg Kroah-Hartman
2017-03-20 17:51 ` [PATCH 4.9 64/93] powerpc/iommu: Pass mm_struct to init/cleanup helpers Greg Kroah-Hartman
2017-03-20 17:51 ` [PATCH 4.9 65/93] powerpc/iommu: Stop using @current in mm_iommu_xxx Greg Kroah-Hartman
2017-03-20 17:51 ` [PATCH 4.9 66/93] vfio/spapr: Reference mm in tce_container Greg Kroah-Hartman
2017-03-20 17:51 ` [PATCH 4.9 67/93] powerpc/mm/iommu, vfio/spapr: Put pages on VFIO container shutdown Greg Kroah-Hartman
2017-03-20 17:51 ` [PATCH 4.9 68/93] vfio/spapr: Add a helper to create default DMA window Greg Kroah-Hartman
2017-03-20 17:51 ` [PATCH 4.9 69/93] vfio/spapr: Postpone default window creation Greg Kroah-Hartman
2017-03-20 17:51 ` [PATCH 4.9 70/93] drm/nouveau/disp/gp102: fix cursor/overlay immediate channel indices Greg Kroah-Hartman
2017-03-20 17:51 ` [PATCH 4.9 71/93] drm/nouveau/disp/nv50-: split chid into chid.ctrl and chid.user Greg Kroah-Hartman
2017-03-20 17:51 ` [PATCH 4.9 72/93] drm/nouveau/disp/nv50-: specify ctrl/user separately when constructing classes Greg Kroah-Hartman
2017-03-20 17:51 ` [PATCH 4.9 73/93] block: allow WRITE_SAME commands with the SG_IO ioctl Greg Kroah-Hartman
2017-03-20 17:51 ` [PATCH 4.9 74/93] s390/zcrypt: Introduce CEX6 toleration Greg Kroah-Hartman
2017-03-20 17:51 ` [PATCH 4.9 75/93] [media] uvcvideo: uvc_scan_fallback() for webcams with broken chain Greg Kroah-Hartman
2017-03-20 17:51 ` [PATCH 4.9 76/93] slub: move synchronize_sched out of slab_mutex on shrink Greg Kroah-Hartman
2017-03-20 17:51 ` [PATCH 4.9 77/93] ACPI / blacklist: add _REV quirks for Dell Precision 5520 and 3520 Greg Kroah-Hartman
2017-03-20 17:51 ` [PATCH 4.9 78/93] ACPI / blacklist: Make Dell Latitude 3350 ethernet work Greg Kroah-Hartman
2017-03-20 17:51 ` [PATCH 4.9 79/93] serial: 8250_pci: Detach low-level driver during PCI error recovery Greg Kroah-Hartman
2017-03-20 17:51 ` [PATCH 4.9 80/93] usb: gadget: udc: atmel: remove memory leak Greg Kroah-Hartman
2017-03-20 17:51 ` [PATCH 4.9 82/93] clk: bcm2835: Fix ->fixed_divider of pllh_aux Greg Kroah-Hartman
2017-03-20 17:51 ` [PATCH 4.9 83/93] drm/vc4: Fix race between page flip completion event and clean-up Greg Kroah-Hartman
2017-03-20 17:51 ` [PATCH 4.9 84/93] drm/vc4: Fix ->clock_select setting for the VEC encoder Greg Kroah-Hartman
2017-03-20 17:52 ` [PATCH 4.9 85/93] arm64: KVM: VHE: Clear HCR_TGE when invalidating guest TLBs Greg Kroah-Hartman
2017-03-20 17:52 ` [PATCH 4.9 86/93] irqchip/gicv3-its: Add workaround for QDF2400 ITS erratum 0065 Greg Kroah-Hartman
2017-03-20 17:52 ` [PATCH 4.9 87/93] x86/tsc: Fix ART for TSC_KNOWN_FREQ Greg Kroah-Hartman
2017-03-20 17:52 ` [PATCH 4.9 88/93] x86/kasan: Fix boot with KASAN=y and PROFILE_ANNOTATED_BRANCHES=y Greg Kroah-Hartman
2017-03-20 17:52   ` Greg Kroah-Hartman
2017-03-20 17:52 ` [PATCH 4.9 89/93] x86/perf: Fix CR4.PCE propagation to use active_mm instead of mm Greg Kroah-Hartman
2017-03-20 17:52 ` [PATCH 4.9 90/93] futex: Fix potential use-after-free in FUTEX_REQUEUE_PI Greg Kroah-Hartman
2017-03-20 17:52 ` [PATCH 4.9 91/93] futex: Add missing error handling to FUTEX_REQUEUE_PI Greg Kroah-Hartman
2017-03-20 17:52 ` [PATCH 4.9 92/93] locking/rwsem: Fix down_write_killable() for CONFIG_RWSEM_GENERIC_SPINLOCK=y Greg Kroah-Hartman
2017-03-20 17:52 ` [PATCH 4.9 93/93] crypto: powerpc - Fix initialisation of crc32c context Greg Kroah-Hartman
2017-03-21  0:12 ` [PATCH 4.9 00/93] 4.9.17-stable review Shuah Khan
2017-03-21  2:13 ` Guenter Roeck

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.