* [PATCH 4.4 00/76] 4.4.58-stable review
@ 2017-03-28 12:29 Greg Kroah-Hartman
  2017-03-28 12:29 ` [PATCH 4.4 01/76] net/openvswitch: Set the ipv6 source tunnel key address attribute correctly Greg Kroah-Hartman
                   ` (75 more replies)
  0 siblings, 76 replies; 106+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-28 12:29 UTC
  To: linux-kernel
  Cc: Greg Kroah-Hartman, torvalds, akpm, linux, shuahkh, patches,
	ben.hutchings, stable

This is the start of the stable review cycle for the 4.4.58 release.
There are 76 patches in this series, all will be posted as a response
to this one.  If anyone has any issues with these being applied, please
let me know.

Responses should be made by Thu Mar 30 12:25:40 UTC 2017.
Anything received after that time might be too late.

The whole patch series can be found in one patch at:
	kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.4.58-rc1.gz
or in the git tree and branch at:
  git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.4.y
and the diffstat can be found below.

thanks,

greg k-h

-------------
Pseudo-Shortlog of commits:

Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    Linux 4.4.58-rc1

Jiri Slaby <jslaby@suse.cz>
    crypto: algif_hash - avoid zero-sized array

Takashi Iwai <tiwai@suse.de>
    fbcon: Fix vc attr at deinit

Sumit Semwal <sumit.semwal@linaro.org>
    serial: 8250_pci: Detach low-level driver during PCI error recovery

Sumit Semwal <sumit.semwal@linaro.org>
    ACPI / blacklist: Make Dell Latitude 3350 ethernet work

Sumit Semwal <sumit.semwal@linaro.org>
    ACPI / blacklist: add _REV quirks for Dell Precision 5520 and 3520

Sumit Semwal <sumit.semwal@linaro.org>
    uvcvideo: uvc_scan_fallback() for webcams with broken chain

Sumit Semwal <sumit.semwal@linaro.org>
    s390/zcrypt: Introduce CEX6 toleration

Sumit Semwal <sumit.semwal@linaro.org>
    block: allow WRITE_SAME commands with the SG_IO ioctl

Sumit Semwal <sumit.semwal@linaro.org>
    vfio/spapr: Postpone allocation of userspace version of TCE table

Sumit Semwal <sumit.semwal@linaro.org>
    PCI: Do any VF BAR updates before enabling the BARs

Sumit Semwal <sumit.semwal@linaro.org>
    PCI: Ignore BAR updates on virtual functions

Sumit Semwal <sumit.semwal@linaro.org>
    PCI: Update BARs using property bits appropriate for type

Sumit Semwal <sumit.semwal@linaro.org>
    PCI: Don't update VF BARs while VF memory space is enabled

Sumit Semwal <sumit.semwal@linaro.org>
    PCI: Decouple IORESOURCE_ROM_ENABLE and PCI_ROM_ADDRESS_ENABLE

Sumit Semwal <sumit.semwal@linaro.org>
    PCI: Add comments about ROM BAR updating

Sumit Semwal <sumit.semwal@linaro.org>
    PCI: Remove pci_resource_bar() and pci_iov_resource_bar()

Sumit Semwal <sumit.semwal@linaro.org>
    PCI: Separate VF BAR updates from standard BAR updates

Sumit Semwal <sumit.semwal@linaro.org>
    x86/hyperv: Handle unknown NMIs on one CPU when unknown_nmi_panic

Sumit Semwal <sumit.semwal@linaro.org>
    igb: add i211 to i210 PHY workaround

Sumit Semwal <sumit.semwal@linaro.org>
    igb: Workaround for igb i210 firmware issue

Sumit Semwal <sumit.semwal@linaro.org>
    xen: do not re-use pirq number cached in pci device msi msg data

Darrick J. Wong <darrick.wong@oracle.com>
    xfs: clear _XBF_PAGES from buffers when readahead page

Johan Hovold <johan@kernel.org>
    USB: usbtmc: add missing endpoint sanity check

Johannes Berg <johannes.berg@intel.com>
    nl80211: fix dumpit error path RTNL deadlocks

Eric Sandeen <sandeen@sandeen.net>
    xfs: fix up xfs_swap_extent_forks inline extent handling

Darrick J. Wong <darrick.wong@oracle.com>
    xfs: don't allow di_size with high bit set

Ilya Dryomov <idryomov@gmail.com>
    libceph: don't set weight to IN when OSD is destroyed

Tomasz Majchrzak <tomasz.majchrzak@intel.com>
    raid10: increment write counter after bio is split

Ilya Dryomov <idryomov@gmail.com>
    libceph: force GFP_NOIO for socket allocations

Viresh Kumar <viresh.kumar@linaro.org>
    cpufreq: Restore policy min/max limits on CPU online

Nicolas Ferre <nicolas.ferre@atmel.com>
    ARM: dts: at91: sama5d2: add dma properties to UART nodes

Nicolas Ferre <nicolas.ferre@microchip.com>
    ARM: at91: pm: cpu_idle: switch DDR to power-down mode

Koos Vriezen <koos.vriezen@gmail.com>
    iommu/vt-d: Fix NULL pointer dereference in device_to_iommu

Ankur Arora <ankur.a.arora@oracle.com>
    xen/acpi: upload PM state from init-domain to Xen

Adrian Hunter <adrian.hunter@intel.com>
    mmc: sdhci: Do not disable interrupts while waiting for clock

Eric Biggers <ebiggers@google.com>
    ext4: mark inode dirty after converting inline directory

Sudip Mukherjee <sudipm.mukherjee@gmail.com>
    parport: fix attempt to write duplicate procfiles

Song Hongyan <hongyan.song@intel.com>
    iio: hid-sensor-trigger: Change get poll value function order to avoid sensor properties losing after resume from S3

Michael Engl <michael.engl@wjw-solutions.com>
    iio: adc: ti_am335x_adc: fix fifo overrun recovery

Johan Hovold <johan@kernel.org>
    mmc: ushc: fix NULL-deref at probe

Johan Hovold <johan@kernel.org>
    uwb: hwa-rc: fix NULL-deref at probe

Johan Hovold <johan@kernel.org>
    uwb: i1480-dfu: fix NULL-deref at probe

Guenter Roeck <linux@roeck-us.net>
    usb: hub: Fix crash after failure to read BOS descriptor

Bin Liu <b-liu@ti.com>
    usb: musb: cppi41: don't check early-TX-interrupt for Isoch transfer

Johan Hovold <johan@kernel.org>
    USB: wusbcore: fix NULL-deref at probe

Johan Hovold <johan@kernel.org>
    USB: idmouse: fix NULL-deref at probe

Johan Hovold <johan@kernel.org>
    USB: lvtest: fix NULL-deref at probe

Johan Hovold <johan@kernel.org>
    USB: uss720: fix NULL-deref at probe

Samuel Thibault <samuel.thibault@ens-lyon.org>
    usb-core: Add LINEAR_FRAME_INTR_BINTERVAL USB quirk

Roger Quadros <rogerq@ti.com>
    usb: gadget: f_uvc: Fix SuperSpeed companion descriptor's wBytesPerInterval

Oliver Neukum <oneukum@suse.com>
    ACM gadget: fix endianness in notifications

Bjørn Mork <bjorn@mork.no>
    USB: serial: qcserial: add Dell DW5811e

Dan Williams <dcbw@redhat.com>
    USB: serial: option: add Quectel UC15, UC20, EC21, and EC25 modems

Hui Wang <hui.wang@canonical.com>
    ALSA: hda - Adding a group of pin definition to fix headset problem

Takashi Iwai <tiwai@suse.de>
    ALSA: ctxfi: Fix the incorrect check of dma_set_mask() call

Takashi Iwai <tiwai@suse.de>
    ALSA: seq: Fix racy cell insertions during snd_seq_pool_done()

Johan Hovold <johan@kernel.org>
    Input: sur40 - validate number of endpoints before using them

Johan Hovold <johan@kernel.org>
    Input: kbtab - validate number of endpoints before using them

Johan Hovold <johan@kernel.org>
    Input: cm109 - validate number of endpoints before using them

Johan Hovold <johan@kernel.org>
    Input: yealink - validate number of endpoints before using them

Johan Hovold <johan@kernel.org>
    Input: hanwang - validate number of endpoints before using them

Johan Hovold <johan@kernel.org>
    Input: ims-pcu - validate number of endpoints before using them

Johan Hovold <johan@kernel.org>
    Input: iforce - validate number of endpoints before using them

Kai-Heng Feng <kai.heng.feng@canonical.com>
    Input: i8042 - add noloop quirk for Dell Embedded Box PC 3000

Matjaz Hegedic <matjaz.hegedic@gmail.com>
    Input: elan_i2c - add ASUS EeeBook X205TA special touchpad fw

Eric Dumazet <edumazet@google.com>
    tcp: initialize icsk_ack.lrcvtime at session start time

Daniel Borkmann <daniel@iogearbox.net>
    socket, bpf: fix sk_filter use after free in sk_clone_lock

Eric Dumazet <edumazet@google.com>
    ipv4: provide stronger user input validation in nl_fib_input()

Doug Berger <opendmb@gmail.com>
    net: bcmgenet: remove bcmgenet_internal_phy_setup()

Gal Pressman <galp@mellanox.com>
    net/mlx5e: Count LRO packets correctly

Maor Gottlieb <maorg@mellanox.com>
    net/mlx5: Increase number of max QPs in default profile

Andrey Ulanov <andreyu@google.com>
    net: unix: properly re-increment inflight counter of GC discarded candidates

Lendacky, Thomas <Thomas.Lendacky@amd.com>
    amd-xgbe: Fix jumbo MTU processing on newer hardware

Eric Dumazet <edumazet@google.com>
    net: properly release sk_frag.page

Florian Fainelli <f.fainelli@gmail.com>
    net: bcmgenet: Do not suspend PHY if Wake-on-LAN is enabled

Or Gerlitz <ogerlitz@mellanox.com>
    net/openvswitch: Set the ipv6 source tunnel key address attribute correctly


-------------

Diffstat:

 Makefile                                           |   4 +-
 arch/arm/boot/dts/sama5d2.dtsi                     |  35 ++++++
 arch/arm/mach-at91/pm.c                            |  18 ++-
 arch/x86/kernel/cpu/mshyperv.c                     |  24 ++++
 arch/x86/pci/xen.c                                 |  23 ++--
 block/scsi_ioctl.c                                 |   3 +
 crypto/algif_hash.c                                |   2 +-
 drivers/acpi/blacklist.c                           |  28 +++++
 drivers/cpufreq/cpufreq.c                          |   3 +
 drivers/iio/adc/ti_am335x_adc.c                    |  13 ++-
 .../iio/common/hid-sensors/hid-sensor-trigger.c    |   6 +-
 drivers/input/joystick/iforce/iforce-usb.c         |   3 +
 drivers/input/misc/cm109.c                         |   4 +
 drivers/input/misc/ims-pcu.c                       |   4 +
 drivers/input/misc/yealink.c                       |   4 +
 drivers/input/mouse/elan_i2c_core.c                |  20 ++--
 drivers/input/serio/i8042-x86ia64io.h              |   7 ++
 drivers/input/tablet/hanwang.c                     |   3 +
 drivers/input/tablet/kbtab.c                       |   3 +
 drivers/input/touchscreen/sur40.c                  |   3 +
 drivers/iommu/intel-iommu.c                        |   2 +-
 drivers/md/raid10.c                                |   4 +-
 drivers/media/usb/uvc/uvc_driver.c                 | 118 +++++++++++++++++++-
 drivers/mmc/host/sdhci.c                           |   4 +-
 drivers/mmc/host/ushc.c                            |   3 +
 drivers/net/ethernet/amd/xgbe/xgbe-common.h        |   6 +-
 drivers/net/ethernet/amd/xgbe/xgbe-dev.c           |  20 ++--
 drivers/net/ethernet/amd/xgbe/xgbe-drv.c           | 102 ++++++++++-------
 drivers/net/ethernet/broadcom/genet/bcmgenet.c     |   6 +-
 drivers/net/ethernet/broadcom/genet/bcmmii.c       |  15 ---
 drivers/net/ethernet/intel/igb/e1000_phy.c         |   4 +
 drivers/net/ethernet/mellanox/mlx5/core/en_rx.c    |   4 +
 drivers/net/ethernet/mellanox/mlx5/core/main.c     |   2 +-
 drivers/parport/share.c                            |   6 +-
 drivers/pci/iov.c                                  |  70 +++++++++---
 drivers/pci/pci.c                                  |  34 ------
 drivers/pci/pci.h                                  |   7 +-
 drivers/pci/probe.c                                |   3 +-
 drivers/pci/rom.c                                  |   5 +
 drivers/pci/setup-res.c                            |  48 +++++---
 drivers/s390/crypto/ap_bus.c                       |   3 +
 drivers/s390/crypto/ap_bus.h                       |   1 +
 drivers/tty/serial/8250/8250_pci.c                 |  23 +++-
 drivers/usb/class/usbtmc.c                         |   9 +-
 drivers/usb/core/config.c                          |  10 ++
 drivers/usb/core/hub.c                             |   2 +-
 drivers/usb/core/quirks.c                          |   8 ++
 drivers/usb/gadget/function/f_acm.c                |   4 +-
 drivers/usb/gadget/function/f_uvc.c                |   2 +-
 drivers/usb/misc/idmouse.c                         |   3 +
 drivers/usb/misc/lvstest.c                         |   4 +
 drivers/usb/misc/uss720.c                          |   5 +
 drivers/usb/musb/musb_cppi41.c                     |  23 +++-
 drivers/usb/serial/option.c                        |  17 ++-
 drivers/usb/serial/qcserial.c                      |   2 +
 drivers/usb/wusbcore/wa-hc.c                       |   3 +
 drivers/uwb/hwa-rc.c                               |   3 +
 drivers/uwb/i1480/dfu/usb.c                        |   3 +
 drivers/vfio/vfio_iommu_spapr_tce.c                |  20 ++--
 drivers/video/console/fbcon.c                      |  67 +++++++-----
 drivers/xen/xen-acpi-processor.c                   |  34 ++++--
 fs/ext4/inline.c                                   |   5 +-
 fs/xfs/libxfs/xfs_inode_buf.c                      |   8 ++
 fs/xfs/xfs_bmap_util.c                             |   7 +-
 fs/xfs/xfs_buf.c                                   |   1 +
 include/linux/usb/quirks.h                         |   6 +
 net/ceph/messenger.c                               |   6 +
 net/ceph/osdmap.c                                  |   1 -
 net/core/sock.c                                    |  16 ++-
 net/ipv4/fib_frontend.c                            |   3 +-
 net/ipv4/tcp_input.c                               |   2 +-
 net/ipv4/tcp_minisocks.c                           |   1 +
 net/openvswitch/flow_netlink.c                     |   2 +-
 net/unix/garbage.c                                 |  17 +--
 net/wireless/nl80211.c                             | 121 +++++++++------------
 sound/core/seq/seq_clientmgr.c                     |   1 +
 sound/core/seq/seq_fifo.c                          |   3 +
 sound/core/seq/seq_memory.c                        |  17 ++-
 sound/core/seq/seq_memory.h                        |   1 +
 sound/pci/ctxfi/cthw20k1.c                         |   2 +-
 sound/pci/hda/patch_realtek.c                      |   2 +
 81 files changed, 804 insertions(+), 337 deletions(-)


* [PATCH 4.4 01/76] net/openvswitch: Set the ipv6 source tunnel key address attribute correctly
  2017-03-28 12:29 [PATCH 4.4 00/76] 4.4.58-stable review Greg Kroah-Hartman
@ 2017-03-28 12:29 ` Greg Kroah-Hartman
  2017-03-28 12:29 ` [PATCH 4.4 02/76] net: bcmgenet: Do not suspend PHY if Wake-on-LAN is enabled Greg Kroah-Hartman
                   ` (74 subsequent siblings)
  75 siblings, 0 replies; 106+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-28 12:29 UTC
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Or Gerlitz, Paul Blakey, Jiri Benc,
	Joe Stringer, David S. Miller

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Or Gerlitz <ogerlitz@mellanox.com>


[ Upstream commit 3d20f1f7bd575d147ffa75621fa560eea0aec690 ]

When dealing with ipv6 source tunnel key address attribute
(OVS_TUNNEL_KEY_ATTR_IPV6_SRC) we are wrongly setting the tunnel
dst ip, fix that.

Fixes: 6b26ba3a7d95 ('openvswitch: netlink attributes for IPv6 tunneling')
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reported-by: Paul Blakey <paulb@mellanox.com>
Acked-by: Jiri Benc <jbenc@redhat.com>
Acked-by: Joe Stringer <joe@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/openvswitch/flow_netlink.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/net/openvswitch/flow_netlink.c
+++ b/net/openvswitch/flow_netlink.c
@@ -588,7 +588,7 @@ static int ip_tun_from_nlattr(const stru
 			ipv4 = true;
 			break;
 		case OVS_TUNNEL_KEY_ATTR_IPV6_SRC:
-			SW_FLOW_KEY_PUT(match, tun_key.u.ipv6.dst,
+			SW_FLOW_KEY_PUT(match, tun_key.u.ipv6.src,
 					nla_get_in6_addr(a), is_mask);
 			ipv6 = true;
 			break;


* [PATCH 4.4 02/76] net: bcmgenet: Do not suspend PHY if Wake-on-LAN is enabled
  2017-03-28 12:29 [PATCH 4.4 00/76] 4.4.58-stable review Greg Kroah-Hartman
  2017-03-28 12:29 ` [PATCH 4.4 01/76] net/openvswitch: Set the ipv6 source tunnel key address attribute correctly Greg Kroah-Hartman
@ 2017-03-28 12:29 ` Greg Kroah-Hartman
  2017-03-28 12:30 ` [PATCH 4.4 03/76] net: properly release sk_frag.page Greg Kroah-Hartman
                   ` (73 subsequent siblings)
  75 siblings, 0 replies; 106+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-28 12:29 UTC
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Florian Fainelli, David S. Miller

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Florian Fainelli <f.fainelli@gmail.com>


[ Upstream commit 5371bbf4b295eea334ed453efa286afa2c3ccff3 ]

Suspending the PHY would be putting it in a low power state where it
may no longer allow us to do Wake-on-LAN.

Fixes: cc013fb48898 ("net: bcmgenet: correctly suspend and resume PHY device")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/net/ethernet/broadcom/genet/bcmgenet.c |    6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

--- a/drivers/net/ethernet/broadcom/genet/bcmgenet.c
+++ b/drivers/net/ethernet/broadcom/genet/bcmgenet.c
@@ -3495,7 +3495,8 @@ static int bcmgenet_suspend(struct devic
 
 	bcmgenet_netif_stop(dev);
 
-	phy_suspend(priv->phydev);
+	if (!device_may_wakeup(d))
+		phy_suspend(priv->phydev);
 
 	netif_device_detach(dev);
 
@@ -3592,7 +3593,8 @@ static int bcmgenet_resume(struct device
 
 	netif_device_attach(dev);
 
-	phy_resume(priv->phydev);
+	if (!device_may_wakeup(d))
+		phy_resume(priv->phydev);
 
 	if (priv->eee.eee_enabled)
 		bcmgenet_eee_enable_set(dev, true);


* [PATCH 4.4 03/76] net: properly release sk_frag.page
  2017-03-28 12:29 [PATCH 4.4 00/76] 4.4.58-stable review Greg Kroah-Hartman
  2017-03-28 12:29 ` [PATCH 4.4 01/76] net/openvswitch: Set the ipv6 source tunnel key address attribute correctly Greg Kroah-Hartman
  2017-03-28 12:29 ` [PATCH 4.4 02/76] net: bcmgenet: Do not suspend PHY if Wake-on-LAN is enabled Greg Kroah-Hartman
@ 2017-03-28 12:30 ` Greg Kroah-Hartman
  2017-03-28 12:30 ` [PATCH 4.4 04/76] amd-xgbe: Fix jumbo MTU processing on newer hardware Greg Kroah-Hartman
                   ` (72 subsequent siblings)
  75 siblings, 0 replies; 106+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-28 12:30 UTC
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Eric Dumazet, David S. Miller

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Eric Dumazet <edumazet@google.com>


[ Upstream commit 22a0e18eac7a9e986fec76c60fa4a2926d1291e2 ]

I mistakenly added the code to release sk->sk_frag in
sk_common_release() instead of sk_destruct().

TCP sockets using sk->sk_allocation == GFP_ATOMIC do not call
sk_common_release() at close time, thus leaking one (order-3) page.

iSCSI is using such sockets.
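
As a rough user-space illustration of why the release has to live in the
final teardown path, here is a toy model. It is only a sketch: struct
sock_model, model_destruct(), model_common_release() and close_socket()
are hypothetical names, and malloc()/free() stand in for the page
reference. It assumes, as described above, that every socket reaches the
destruct step while only some go through the common release step.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Toy stand-in for a socket; frag_page plays the role of sk->sk_frag.page. */
struct sock_model {
	void *frag_page;
};

/* Drop the cached page if one is held (the block the patch moves). */
static void frag_release(struct sock_model *sk)
{
	if (sk->frag_page) {
		free(sk->frag_page);
		sk->frag_page = NULL;
	}
}

/* Final teardown: in this model every socket ends up here. */
static void model_destruct(struct sock_model *sk, bool fixed)
{
	if (fixed)
		frag_release(sk);
}

/* Optional release path: only some sockets go through it. */
static void model_common_release(struct sock_model *sk, bool fixed)
{
	if (!fixed)
		frag_release(sk);
	model_destruct(sk, fixed);
}

/* Close one socket and return 1 if its page would have leaked, else 0. */
static int close_socket(bool uses_common_release, bool fixed)
{
	struct sock_model sk = { .frag_page = malloc(1) };
	int leaked;

	assert(sk.frag_page != NULL);

	if (uses_common_release)
		model_common_release(&sk, fixed);
	else
		model_destruct(&sk, fixed);

	leaked = sk.frag_page != NULL;
	frag_release(&sk);	/* tidy up so the model itself does not leak */
	return leaked;
}
```

In this model only close_socket(false, false) reports a leak, which
corresponds to a socket that skips the common release path before the fix.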

Fixes: 5640f7685831 ("net: use a per task frag allocator")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/core/sock.c |   10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -1459,6 +1459,11 @@ void sk_destruct(struct sock *sk)
 		pr_debug("%s: optmem leakage (%d bytes) detected\n",
 			 __func__, atomic_read(&sk->sk_omem_alloc));
 
+	if (sk->sk_frag.page) {
+		put_page(sk->sk_frag.page);
+		sk->sk_frag.page = NULL;
+	}
+
 	if (sk->sk_peer_cred)
 		put_cred(sk->sk_peer_cred);
 	put_pid(sk->sk_peer_pid);
@@ -2691,11 +2696,6 @@ void sk_common_release(struct sock *sk)
 
 	sk_refcnt_debug_release(sk);
 
-	if (sk->sk_frag.page) {
-		put_page(sk->sk_frag.page);
-		sk->sk_frag.page = NULL;
-	}
-
 	sock_put(sk);
 }
 EXPORT_SYMBOL(sk_common_release);


* [PATCH 4.4 04/76] amd-xgbe: Fix jumbo MTU processing on newer hardware
  2017-03-28 12:29 [PATCH 4.4 00/76] 4.4.58-stable review Greg Kroah-Hartman
                   ` (2 preceding siblings ...)
  2017-03-28 12:30 ` [PATCH 4.4 03/76] net: properly release sk_frag.page Greg Kroah-Hartman
@ 2017-03-28 12:30 ` Greg Kroah-Hartman
  2017-03-28 12:30 ` [PATCH 4.4 05/76] net: unix: properly re-increment inflight counter of GC discarded candidates Greg Kroah-Hartman
                   ` (71 subsequent siblings)
  75 siblings, 0 replies; 106+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-28 12:30 UTC
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Tom Lendacky, David S. Miller

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: "Lendacky, Thomas" <Thomas.Lendacky@amd.com>


[ Upstream commit 622c36f143fc9566ba49d7cec994c2da1182d9e2 ]

Newer hardware does not provide a cumulative payload length when multiple
descriptors are needed to handle the data. Once the MTU increases beyond
the size that can be handled by a single descriptor, the SKB does not get
built properly by the driver.

The driver will now calculate the size of the data buffers used by the
hardware.  The first buffer of the first descriptor is for packet headers
or packet headers and data when the headers can't be split. Subsequent
descriptors in a multi-descriptor chain will not use the first buffer. The
second buffer is used by all the descriptors in the chain for payload data.
Based on whether the driver is processing the first, intermediate, or last
descriptor it can calculate the buffer usage and build the SKB properly.
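
The buffer accounting described above can be sketched in user space as
follows. This is a simplified model, not the driver code: struct
desc_model and its fields are hypothetical stand-ins for the per-descriptor
state, and the model assumes the total packet length is only known on the
last descriptor.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified view of one receive descriptor. */
struct desc_model {
	bool first;		/* first descriptor of the packet */
	bool last;		/* last descriptor of the packet */
	unsigned int hdr_len;	/* split-header length, 0 if not split */
	unsigned int pkt_len;	/* total packet length, valid only if last */
	unsigned int buf1_size;	/* capacity of the first (header) buffer */
	unsigned int buf2_size;	/* capacity of the second (payload) buffer */
};

/* Bytes used in the first buffer of this descriptor. */
static unsigned int buf1_len(const struct desc_model *d)
{
	/* Only the first descriptor uses the first buffer. */
	if (!d->first)
		return 0;

	/* Split header: the first buffer holds exactly the header. */
	if (d->hdr_len)
		return d->hdr_len;

	/* No split header and more descriptors follow: buffer is full. */
	if (!d->last)
		return d->buf1_size;

	/* Single-descriptor packet: bounded by the packet length. */
	return d->pkt_len < d->buf1_size ? d->pkt_len : d->buf1_size;
}

/* Bytes used in the second buffer, given the length accumulated so far. */
static unsigned int buf2_len(const struct desc_model *d, unsigned int len)
{
	/* Every descriptor but the last fills its payload buffer. */
	if (!d->last)
		return d->buf2_size;

	/* Last descriptor: whatever remains of the packet. */
	assert(d->pkt_len >= len);
	return d->pkt_len - len;
}

/* Walk a descriptor chain and return the reconstructed packet length. */
static unsigned int chain_len(const struct desc_model *chain, unsigned int n)
{
	unsigned int len = 0;

	for (unsigned int i = 0; i < n; i++) {
		len += buf1_len(&chain[i]);
		len += buf2_len(&chain[i], len);
	}
	return len;
}
```

With a 64-byte split header, 4096-byte payload buffers and a 9000-byte
packet spread over three descriptors, the per-descriptor contributions
come out as 64 + 4096, 4096 and 744 bytes.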

Tested and verified on both old and new hardware.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/net/ethernet/amd/xgbe/xgbe-common.h |    6 +
 drivers/net/ethernet/amd/xgbe/xgbe-dev.c    |   20 +++--
 drivers/net/ethernet/amd/xgbe/xgbe-drv.c    |  102 +++++++++++++++++-----------
 3 files changed, 78 insertions(+), 50 deletions(-)

--- a/drivers/net/ethernet/amd/xgbe/xgbe-common.h
+++ b/drivers/net/ethernet/amd/xgbe/xgbe-common.h
@@ -913,8 +913,8 @@
 #define RX_PACKET_ATTRIBUTES_CSUM_DONE_WIDTH	1
 #define RX_PACKET_ATTRIBUTES_VLAN_CTAG_INDEX	1
 #define RX_PACKET_ATTRIBUTES_VLAN_CTAG_WIDTH	1
-#define RX_PACKET_ATTRIBUTES_INCOMPLETE_INDEX	2
-#define RX_PACKET_ATTRIBUTES_INCOMPLETE_WIDTH	1
+#define RX_PACKET_ATTRIBUTES_LAST_INDEX		2
+#define RX_PACKET_ATTRIBUTES_LAST_WIDTH		1
 #define RX_PACKET_ATTRIBUTES_CONTEXT_NEXT_INDEX	3
 #define RX_PACKET_ATTRIBUTES_CONTEXT_NEXT_WIDTH	1
 #define RX_PACKET_ATTRIBUTES_CONTEXT_INDEX	4
@@ -923,6 +923,8 @@
 #define RX_PACKET_ATTRIBUTES_RX_TSTAMP_WIDTH	1
 #define RX_PACKET_ATTRIBUTES_RSS_HASH_INDEX	6
 #define RX_PACKET_ATTRIBUTES_RSS_HASH_WIDTH	1
+#define RX_PACKET_ATTRIBUTES_FIRST_INDEX	7
+#define RX_PACKET_ATTRIBUTES_FIRST_WIDTH	1
 
 #define RX_NORMAL_DESC0_OVT_INDEX		0
 #define RX_NORMAL_DESC0_OVT_WIDTH		16
--- a/drivers/net/ethernet/amd/xgbe/xgbe-dev.c
+++ b/drivers/net/ethernet/amd/xgbe/xgbe-dev.c
@@ -1658,10 +1658,15 @@ static int xgbe_dev_read(struct xgbe_cha
 
 	/* Get the header length */
 	if (XGMAC_GET_BITS_LE(rdesc->desc3, RX_NORMAL_DESC3, FD)) {
+		XGMAC_SET_BITS(packet->attributes, RX_PACKET_ATTRIBUTES,
+			       FIRST, 1);
 		rdata->rx.hdr_len = XGMAC_GET_BITS_LE(rdesc->desc2,
 						      RX_NORMAL_DESC2, HL);
 		if (rdata->rx.hdr_len)
 			pdata->ext_stats.rx_split_header_packets++;
+	} else {
+		XGMAC_SET_BITS(packet->attributes, RX_PACKET_ATTRIBUTES,
+			       FIRST, 0);
 	}
 
 	/* Get the RSS hash */
@@ -1684,19 +1689,16 @@ static int xgbe_dev_read(struct xgbe_cha
 		}
 	}
 
-	/* Get the packet length */
-	rdata->rx.len = XGMAC_GET_BITS_LE(rdesc->desc3, RX_NORMAL_DESC3, PL);
-
-	if (!XGMAC_GET_BITS_LE(rdesc->desc3, RX_NORMAL_DESC3, LD)) {
-		/* Not all the data has been transferred for this packet */
-		XGMAC_SET_BITS(packet->attributes, RX_PACKET_ATTRIBUTES,
-			       INCOMPLETE, 1);
+	/* Not all the data has been transferred for this packet */
+	if (!XGMAC_GET_BITS_LE(rdesc->desc3, RX_NORMAL_DESC3, LD))
 		return 0;
-	}
 
 	/* This is the last of the data for this packet */
 	XGMAC_SET_BITS(packet->attributes, RX_PACKET_ATTRIBUTES,
-		       INCOMPLETE, 0);
+		       LAST, 1);
+
+	/* Get the packet length */
+	rdata->rx.len = XGMAC_GET_BITS_LE(rdesc->desc3, RX_NORMAL_DESC3, PL);
 
 	/* Set checksum done indicator as appropriate */
 	if (netdev->features & NETIF_F_RXCSUM)
--- a/drivers/net/ethernet/amd/xgbe/xgbe-drv.c
+++ b/drivers/net/ethernet/amd/xgbe/xgbe-drv.c
@@ -1760,13 +1760,12 @@ static struct sk_buff *xgbe_create_skb(s
 {
 	struct sk_buff *skb;
 	u8 *packet;
-	unsigned int copy_len;
 
 	skb = napi_alloc_skb(napi, rdata->rx.hdr.dma_len);
 	if (!skb)
 		return NULL;
 
-	/* Start with the header buffer which may contain just the header
+	/* Pull in the header buffer which may contain just the header
 	 * or the header plus data
 	 */
 	dma_sync_single_range_for_cpu(pdata->dev, rdata->rx.hdr.dma_base,
@@ -1775,30 +1774,49 @@ static struct sk_buff *xgbe_create_skb(s
 
 	packet = page_address(rdata->rx.hdr.pa.pages) +
 		 rdata->rx.hdr.pa.pages_offset;
-	copy_len = (rdata->rx.hdr_len) ? rdata->rx.hdr_len : len;
-	copy_len = min(rdata->rx.hdr.dma_len, copy_len);
-	skb_copy_to_linear_data(skb, packet, copy_len);
-	skb_put(skb, copy_len);
-
-	len -= copy_len;
-	if (len) {
-		/* Add the remaining data as a frag */
-		dma_sync_single_range_for_cpu(pdata->dev,
-					      rdata->rx.buf.dma_base,
-					      rdata->rx.buf.dma_off,
-					      rdata->rx.buf.dma_len,
-					      DMA_FROM_DEVICE);
-
-		skb_add_rx_frag(skb, skb_shinfo(skb)->nr_frags,
-				rdata->rx.buf.pa.pages,
-				rdata->rx.buf.pa.pages_offset,
-				len, rdata->rx.buf.dma_len);
-		rdata->rx.buf.pa.pages = NULL;
-	}
+	skb_copy_to_linear_data(skb, packet, len);
+	skb_put(skb, len);
 
 	return skb;
 }
 
+static unsigned int xgbe_rx_buf1_len(struct xgbe_ring_data *rdata,
+				     struct xgbe_packet_data *packet)
+{
+	/* Always zero if not the first descriptor */
+	if (!XGMAC_GET_BITS(packet->attributes, RX_PACKET_ATTRIBUTES, FIRST))
+		return 0;
+
+	/* First descriptor with split header, return header length */
+	if (rdata->rx.hdr_len)
+		return rdata->rx.hdr_len;
+
+	/* First descriptor but not the last descriptor and no split header,
+	 * so the full buffer was used
+	 */
+	if (!XGMAC_GET_BITS(packet->attributes, RX_PACKET_ATTRIBUTES, LAST))
+		return rdata->rx.hdr.dma_len;
+
+	/* First descriptor and last descriptor and no split header, so
+	 * calculate how much of the buffer was used
+	 */
+	return min_t(unsigned int, rdata->rx.hdr.dma_len, rdata->rx.len);
+}
+
+static unsigned int xgbe_rx_buf2_len(struct xgbe_ring_data *rdata,
+				     struct xgbe_packet_data *packet,
+				     unsigned int len)
+{
+	/* Always the full buffer if not the last descriptor */
+	if (!XGMAC_GET_BITS(packet->attributes, RX_PACKET_ATTRIBUTES, LAST))
+		return rdata->rx.buf.dma_len;
+
+	/* Last descriptor so calculate how much of the buffer was used
+	 * for the last bit of data
+	 */
+	return rdata->rx.len - len;
+}
+
 static int xgbe_tx_poll(struct xgbe_channel *channel)
 {
 	struct xgbe_prv_data *pdata = channel->pdata;
@@ -1881,8 +1899,8 @@ static int xgbe_rx_poll(struct xgbe_chan
 	struct napi_struct *napi;
 	struct sk_buff *skb;
 	struct skb_shared_hwtstamps *hwtstamps;
-	unsigned int incomplete, error, context_next, context;
-	unsigned int len, rdesc_len, max_len;
+	unsigned int last, error, context_next, context;
+	unsigned int len, buf1_len, buf2_len, max_len;
 	unsigned int received = 0;
 	int packet_count = 0;
 
@@ -1892,7 +1910,7 @@ static int xgbe_rx_poll(struct xgbe_chan
 	if (!ring)
 		return 0;
 
-	incomplete = 0;
+	last = 0;
 	context_next = 0;
 
 	napi = (pdata->per_channel_irq) ? &channel->napi : &pdata->napi;
@@ -1926,9 +1944,8 @@ read_again:
 		received++;
 		ring->cur++;
 
-		incomplete = XGMAC_GET_BITS(packet->attributes,
-					    RX_PACKET_ATTRIBUTES,
-					    INCOMPLETE);
+		last = XGMAC_GET_BITS(packet->attributes, RX_PACKET_ATTRIBUTES,
+				      LAST);
 		context_next = XGMAC_GET_BITS(packet->attributes,
 					      RX_PACKET_ATTRIBUTES,
 					      CONTEXT_NEXT);
@@ -1937,7 +1954,7 @@ read_again:
 					 CONTEXT);
 
 		/* Earlier error, just drain the remaining data */
-		if ((incomplete || context_next) && error)
+		if ((!last || context_next) && error)
 			goto read_again;
 
 		if (error || packet->errors) {
@@ -1949,16 +1966,22 @@ read_again:
 		}
 
 		if (!context) {
-			/* Length is cumulative, get this descriptor's length */
-			rdesc_len = rdata->rx.len - len;
-			len += rdesc_len;
+			/* Get the data length in the descriptor buffers */
+			buf1_len = xgbe_rx_buf1_len(rdata, packet);
+			len += buf1_len;
+			buf2_len = xgbe_rx_buf2_len(rdata, packet, len);
+			len += buf2_len;
 
-			if (rdesc_len && !skb) {
+			if (!skb) {
 				skb = xgbe_create_skb(pdata, napi, rdata,
-						      rdesc_len);
-				if (!skb)
+						      buf1_len);
+				if (!skb) {
 					error = 1;
-			} else if (rdesc_len) {
+					goto skip_data;
+				}
+			}
+
+			if (buf2_len) {
 				dma_sync_single_range_for_cpu(pdata->dev,
 							rdata->rx.buf.dma_base,
 							rdata->rx.buf.dma_off,
@@ -1968,13 +1991,14 @@ read_again:
 				skb_add_rx_frag(skb, skb_shinfo(skb)->nr_frags,
 						rdata->rx.buf.pa.pages,
 						rdata->rx.buf.pa.pages_offset,
-						rdesc_len,
+						buf2_len,
 						rdata->rx.buf.dma_len);
 				rdata->rx.buf.pa.pages = NULL;
 			}
 		}
 
-		if (incomplete || context_next)
+skip_data:
+		if (!last || context_next)
 			goto read_again;
 
 		if (!skb)
@@ -2033,7 +2057,7 @@ next_packet:
 	}
 
 	/* Check if we need to save state before leaving */
-	if (received && (incomplete || context_next)) {
+	if (received && (!last || context_next)) {
 		rdata = XGBE_GET_DESC_DATA(ring, ring->cur);
 		rdata->state_saved = 1;
 		rdata->state.skb = skb;


* [PATCH 4.4 05/76] net: unix: properly re-increment inflight counter of GC discarded candidates
  2017-03-28 12:29 [PATCH 4.4 00/76] 4.4.58-stable review Greg Kroah-Hartman
                   ` (3 preceding siblings ...)
  2017-03-28 12:30 ` [PATCH 4.4 04/76] amd-xgbe: Fix jumbo MTU processing on newer hardware Greg Kroah-Hartman
@ 2017-03-28 12:30 ` Greg Kroah-Hartman
  2017-03-28 12:30 ` [PATCH 4.4 06/76] net/mlx5: Increase number of max QPs in default profile Greg Kroah-Hartman
                   ` (70 subsequent siblings)
  75 siblings, 0 replies; 106+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-28 12:30 UTC
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Andrey Ulanov, Dmitry Vyukov,
	David S. Miller

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Andrey Ulanov <andreyu@google.com>


[ Upstream commit 7df9c24625b9981779afb8fcdbe2bb4765e61147 ]

Dmitry has reported that a BUG_ON() condition in unix_notinflight()
may be triggered by simple code that forwards a unix socket in an
SCM_RIGHTS message.
That is caused by an incorrect unix socket GC implementation in unix_gc().

The GC first collects a list of candidates, then (a) decrements their
"children's" inflight counter, (b) checks which inflight counters are
now 0, and then (c) increments all inflight counters back.
(a) and (c) are done by calling scan_children() with inc_inflight or
dec_inflight as the second argument.

Commit 6209344f5a37 ("net: unix: fix inflight counting bug in garbage
collector") changed scan_children() such that it no longer considers
sockets that do not have the UNIX_GC_CANDIDATE flag. It also added a block
of code that unsets this flag _before_ invoking
scan_children(, dec_inflight, ). This may lead to incorrect inflight
counters for some sockets.

This change fixes this bug by changing the order of operations:
UNIX_GC_CANDIDATE is now unset only after all inflight counters are
restored to the original state.
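
The effect of the ordering can be seen in a toy model reduced to a single
counter. This is only a sketch under stated assumptions: struct sock_model,
scan() and run_gc_pass() are hypothetical names, and the real unix_gc()
works on linked lists of sockets rather than a one-element array. The one
property the model keeps is that the scan skips any socket whose candidate
flag is not set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy socket: candidate stands in for the UNIX_GC_CANDIDATE flag. */
struct sock_model {
	bool candidate;
	long inflight;
};

/* Adjust the counter of every child that is still flagged as a candidate. */
static void scan(struct sock_model **children, size_t n, long delta)
{
	for (size_t i = 0; i < n; i++)
		if (children[i]->candidate)
			children[i]->inflight += delta;
}

/*
 * Run one simplified GC pass over a single child and return its inflight
 * counter afterwards. With clear_flag_early set, the flag is dropped
 * between the decrement and the increment, so the increment is skipped.
 */
static long run_gc_pass(struct sock_model *child, bool clear_flag_early)
{
	struct sock_model *children[] = { child };

	assert(child != NULL);

	child->candidate = true;
	scan(children, 1, -1);		/* step (a): decrement */

	if (clear_flag_early)
		child->candidate = false;

	scan(children, 1, +1);		/* step (c): restore */
	child->candidate = false;	/* flag cleared only at the end */

	return child->inflight;
}
```

Starting from an inflight count of 1, the early-clear ordering leaves the
counter at 0, while clearing the flag after the restore leaves it at 1.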

  kernel BUG at net/unix/garbage.c:149!
  RIP: 0010:[<ffffffff8717ebf4>]  [<ffffffff8717ebf4>]
  unix_notinflight+0x3b4/0x490 net/unix/garbage.c:149
  Call Trace:
   [<ffffffff8716cfbf>] unix_detach_fds.isra.19+0xff/0x170 net/unix/af_unix.c:1487
   [<ffffffff8716f6a9>] unix_destruct_scm+0xf9/0x210 net/unix/af_unix.c:1496
   [<ffffffff86a90a01>] skb_release_head_state+0x101/0x200 net/core/skbuff.c:655
   [<ffffffff86a9808a>] skb_release_all+0x1a/0x60 net/core/skbuff.c:668
   [<ffffffff86a980ea>] __kfree_skb+0x1a/0x30 net/core/skbuff.c:684
   [<ffffffff86a98284>] kfree_skb+0x184/0x570 net/core/skbuff.c:705
   [<ffffffff871789d5>] unix_release_sock+0x5b5/0xbd0 net/unix/af_unix.c:559
   [<ffffffff87179039>] unix_release+0x49/0x90 net/unix/af_unix.c:836
   [<ffffffff86a694b2>] sock_release+0x92/0x1f0 net/socket.c:570
   [<ffffffff86a6962b>] sock_close+0x1b/0x20 net/socket.c:1017
   [<ffffffff81a76b8e>] __fput+0x34e/0x910 fs/file_table.c:208
   [<ffffffff81a771da>] ____fput+0x1a/0x20 fs/file_table.c:244
   [<ffffffff81483ab0>] task_work_run+0x1a0/0x280 kernel/task_work.c:116
   [<     inline     >] exit_task_work include/linux/task_work.h:21
   [<ffffffff8141287a>] do_exit+0x183a/0x2640 kernel/exit.c:828
   [<ffffffff8141383e>] do_group_exit+0x14e/0x420 kernel/exit.c:931
   [<ffffffff814429d3>] get_signal+0x663/0x1880 kernel/signal.c:2307
   [<ffffffff81239b45>] do_signal+0xc5/0x2190 arch/x86/kernel/signal.c:807
   [<ffffffff8100666a>] exit_to_usermode_loop+0x1ea/0x2d0
  arch/x86/entry/common.c:156
   [<     inline     >] prepare_exit_to_usermode arch/x86/entry/common.c:190
   [<ffffffff81009693>] syscall_return_slowpath+0x4d3/0x570
  arch/x86/entry/common.c:259
   [<ffffffff881478e6>] entry_SYSCALL_64_fastpath+0xc4/0xc6

Link: https://lkml.org/lkml/2017/3/6/252
Signed-off-by: Andrey Ulanov <andreyu@google.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Fixes: 6209344 ("net: unix: fix inflight counting bug in garbage collector")
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/unix/garbage.c |   17 +++++++++--------
 1 file changed, 9 insertions(+), 8 deletions(-)

--- a/net/unix/garbage.c
+++ b/net/unix/garbage.c
@@ -146,6 +146,7 @@ void unix_notinflight(struct user_struct
 	if (s) {
 		struct unix_sock *u = unix_sk(s);
 
+		BUG_ON(!atomic_long_read(&u->inflight));
 		BUG_ON(list_empty(&u->link));
 
 		if (atomic_long_dec_and_test(&u->inflight))
@@ -341,6 +342,14 @@ void unix_gc(void)
 	}
 	list_del(&cursor);
 
+	/* Now gc_candidates contains only garbage.  Restore original
+	 * inflight counters for these as well, and remove the skbuffs
+	 * which are creating the cycle(s).
+	 */
+	skb_queue_head_init(&hitlist);
+	list_for_each_entry(u, &gc_candidates, link)
+		scan_children(&u->sk, inc_inflight, &hitlist);
+
 	/* not_cycle_list contains those sockets which do not make up a
 	 * cycle.  Restore these to the inflight list.
 	 */
@@ -350,14 +359,6 @@ void unix_gc(void)
 		list_move_tail(&u->link, &gc_inflight_list);
 	}
 
-	/* Now gc_candidates contains only garbage.  Restore original
-	 * inflight counters for these as well, and remove the skbuffs
-	 * which are creating the cycle(s).
-	 */
-	skb_queue_head_init(&hitlist);
-	list_for_each_entry(u, &gc_candidates, link)
-	scan_children(&u->sk, inc_inflight, &hitlist);
-
 	spin_unlock(&unix_gc_lock);
 
 	/* Here we are. Hitlist is filled. Die. */


* [PATCH 4.4 06/76] net/mlx5: Increase number of max QPs in default profile
  2017-03-28 12:29 [PATCH 4.4 00/76] 4.4.58-stable review Greg Kroah-Hartman
                   ` (4 preceding siblings ...)
  2017-03-28 12:30 ` [PATCH 4.4 05/76] net: unix: properly re-increment inflight counter of GC discarded candidates Greg Kroah-Hartman
@ 2017-03-28 12:30 ` Greg Kroah-Hartman
  2017-03-28 12:30 ` [PATCH 4.4 07/76] net/mlx5e: Count LRO packets correctly Greg Kroah-Hartman
                   ` (69 subsequent siblings)
  75 siblings, 0 replies; 106+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-28 12:30 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Maor Gottlieb, Saeed Mahameed,
	David S. Miller

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Maor Gottlieb <maorg@mellanox.com>


[ Upstream commit 5f40b4ed975c26016cf41953b7510fe90718e21c ]

With ConnectX-4 sharing SRQs from the same space as QPs, we hit a
limit preventing some applications from allocating the QPs they need.
Double the size to 256K.
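
As a worked check of the arithmetic, the limit is stored as a base-2
logarithm, so raising log_max_qp by one doubles the number of QPs.
The helper name max_qps() below is hypothetical and only illustrates
the computation.

```c
#include <assert.h>

/* Hypothetical helper: number of QPs allowed for a given log2 limit. */
static unsigned long max_qps(unsigned int log_max_qp)
{
	assert(log_max_qp < 32);	/* keep the shift well defined */
	return 1UL << log_max_qp;
}
```

max_qps(17) is 131072 (128K) and max_qps(18) is 262144 (256K).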

Fixes: e126ba97dba9e ('mlx5: Add driver for Mellanox Connect-IB adapters')
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/net/ethernet/mellanox/mlx5/core/main.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/net/ethernet/mellanox/mlx5/core/main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c
@@ -85,7 +85,7 @@ static struct mlx5_profile profile[] = {
 	[2] = {
 		.mask		= MLX5_PROF_MASK_QP_SIZE |
 				  MLX5_PROF_MASK_MR_CACHE,
-		.log_max_qp	= 17,
+		.log_max_qp	= 18,
 		.mr_cache[0]	= {
 			.size	= 500,
 			.limit	= 250


* [PATCH 4.4 07/76] net/mlx5e: Count LRO packets correctly
  2017-03-28 12:29 [PATCH 4.4 00/76] 4.4.58-stable review Greg Kroah-Hartman
                   ` (5 preceding siblings ...)
  2017-03-28 12:30 ` [PATCH 4.4 06/76] net/mlx5: Increase number of max QPs in default profile Greg Kroah-Hartman
@ 2017-03-28 12:30 ` Greg Kroah-Hartman
  2017-03-28 12:30 ` [PATCH 4.4 08/76] net: bcmgenet: remove bcmgenet_internal_phy_setup() Greg Kroah-Hartman
                   ` (68 subsequent siblings)
  75 siblings, 0 replies; 106+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-28 12:30 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Gal Pressman, kernel-team,
	Saeed Mahameed, Alexei Starovoitov, David S. Miller

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Gal Pressman <galp@mellanox.com>


[ Upstream commit 8ab7e2ae15d84ba758b2c8c6f4075722e9bd2a08 ]

The RX packet statistics ('rx_packets' counter) used to count an LRO
packet as one, even though it contains multiple segments.
This patch will increment the counter by the number of segments, and
align the driver with the behavior of other drivers in the stack.
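
The counting rule can be sketched in isolation as follows.  This is a
simplified model, not the driver code: struct rx_stats and
account_rx() are hypothetical names, and the model assumes, as the
comment in the patch states, that every completion has already been
counted once as a regular packet.

```c
#include <assert.h>

/* Hypothetical, reduced stand-in for the per-queue RX statistics. */
struct rx_stats {
	unsigned long packets;
	unsigned long lro_packets;
};

/* Account one completion that covers lro_num_seg segments
 * (1 for a non-LRO packet).
 */
static void account_rx(struct rx_stats *s, unsigned int lro_num_seg)
{
	assert(lro_num_seg >= 1);

	s->packets++;	/* every completion counts as one regular packet */
	if (lro_num_seg > 1) {
		/* add the remaining segments; one is already counted */
		s->packets += lro_num_seg - 1;
		s->lro_packets++;
	}
}
```

With one plain packet followed by one LRO completion of four
segments, packets ends up at 5 and lro_packets at 1.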

Note that no information is lost in this patch, because the
'rx_lro_packets' counter still exists.

Before, ethtool showed:
$ ethtool -S ens6 | egrep "rx_packets|rx_lro_packets"
     rx_packets: 435277
     rx_lro_packets: 35847
     rx_packets_phy: 1935066

Now, we will see the more logical statistics:
$ ethtool -S ens6 | egrep "rx_packets|rx_lro_packets"
     rx_packets: 1935066
     rx_lro_packets: 35847
     rx_packets_phy: 1935066

Fixes: e586b3b0baee ("net/mlx5: Ethernet Datapath files")
Signed-off-by: Gal Pressman <galp@mellanox.com>
Cc: kernel-team@fb.com
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/net/ethernet/mellanox/mlx5/core/en_rx.c |    4 ++++
 1 file changed, 4 insertions(+)

--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
@@ -197,6 +197,10 @@ static inline void mlx5e_build_rx_skb(st
 	if (lro_num_seg > 1) {
 		mlx5e_lro_update_hdr(skb, cqe);
 		skb_shinfo(skb)->gso_size = DIV_ROUND_UP(cqe_bcnt, lro_num_seg);
+		/* Subtract one since we already counted this as one
+		 * "regular" packet in mlx5e_complete_rx_cqe()
+		 */
+		rq->stats.packets += lro_num_seg - 1;
 		rq->stats.lro_packets++;
 		rq->stats.lro_bytes += cqe_bcnt;
 	}


* [PATCH 4.4 08/76] net: bcmgenet: remove bcmgenet_internal_phy_setup()
  2017-03-28 12:29 [PATCH 4.4 00/76] 4.4.58-stable review Greg Kroah-Hartman
                   ` (6 preceding siblings ...)
  2017-03-28 12:30 ` [PATCH 4.4 07/76] net/mlx5e: Count LRO packets correctly Greg Kroah-Hartman
@ 2017-03-28 12:30 ` Greg Kroah-Hartman
  2017-03-28 12:30 ` [PATCH 4.4 09/76] ipv4: provide stronger user input validation in nl_fib_input() Greg Kroah-Hartman
                   ` (67 subsequent siblings)
  75 siblings, 0 replies; 106+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-28 12:30 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Doug Berger, Florian Fainelli,
	David S. Miller

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Doug Berger <opendmb@gmail.com>


[ Upstream commit 31739eae738ccbe8b9d627c3f2251017ca03f4d2 ]

Commit 6ac3ce8295e6 ("net: bcmgenet: Remove excessive PHY reset")
removed the bcmgenet_mii_reset() function from bcmgenet_power_up() and
bcmgenet_internal_phy_setup() functions.  In so doing it broke the reset
of the internal PHY devices used by the GENETv1-GENETv3 which required
this reset before the UniMAC was enabled.  It also broke the internal
GPHY devices used by the GENETv4 because the config_init that installed
the AFE workaround was no longer occurring after the reset of the GPHY
performed by bcmgenet_phy_power_set() in bcmgenet_internal_phy_setup().
In addition the code in bcmgenet_internal_phy_setup() related to the
"enable APD" comment goes with the bcmgenet_mii_reset() so it should
have also been removed.

Commit bd4060a6108b ("net: bcmgenet: Power on integrated GPHY in
bcmgenet_power_up()") moved the bcmgenet_phy_power_set() call to the
bcmgenet_power_up() function, but failed to remove it from the
bcmgenet_internal_phy_setup() function.  Had it done so, the
bcmgenet_internal_phy_setup() function would have been empty and could
have been removed at that time.

Commit 5dbebbb44a6a ("net: bcmgenet: Software reset EPHY after power on")
was submitted to correct the functional problems introduced by
commit 6ac3ce8295e6 ("net: bcmgenet: Remove excessive PHY reset"). It
was included in v4.4 and made available on 4.3-stable. Unfortunately,
it didn't fully revert the commit because this bcmgenet_mii_reset()
doesn't apply the soft reset to the internal GPHY used by GENETv4 like
the previous one did. This prevents the restoration of the AFE
workarounds for internal GPHY devices after the bcmgenet_phy_power_set()
in bcmgenet_internal_phy_setup().

This commit takes the alternate approach of removing the unnecessary
bcmgenet_internal_phy_setup() function which shouldn't have been in v4.3
so that when bcmgenet_mii_reset() was restored it should have only gone
into bcmgenet_power_up().  This will avoid the problems while also
removing the redundancy (and hopefully some of the confusion).

Fixes: 6ac3ce8295e6 ("net: bcmgenet: Remove excessive PHY reset")
Signed-off-by: Doug Berger <opendmb@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/net/ethernet/broadcom/genet/bcmmii.c |   15 ---------------
 1 file changed, 15 deletions(-)

--- a/drivers/net/ethernet/broadcom/genet/bcmmii.c
+++ b/drivers/net/ethernet/broadcom/genet/bcmmii.c
@@ -220,20 +220,6 @@ void bcmgenet_phy_power_set(struct net_d
 	udelay(60);
 }
 
-static void bcmgenet_internal_phy_setup(struct net_device *dev)
-{
-	struct bcmgenet_priv *priv = netdev_priv(dev);
-	u32 reg;
-
-	/* Power up PHY */
-	bcmgenet_phy_power_set(dev, true);
-	/* enable APD */
-	reg = bcmgenet_ext_readl(priv, EXT_EXT_PWR_MGMT);
-	reg |= EXT_PWR_DN_EN_LD;
-	bcmgenet_ext_writel(priv, reg, EXT_EXT_PWR_MGMT);
-	bcmgenet_mii_reset(dev);
-}
-
 static void bcmgenet_moca_phy_setup(struct bcmgenet_priv *priv)
 {
 	u32 reg;
@@ -281,7 +267,6 @@ int bcmgenet_mii_config(struct net_devic
 
 		if (priv->internal_phy) {
 			phy_name = "internal PHY";
-			bcmgenet_internal_phy_setup(dev);
 		} else if (priv->phy_interface == PHY_INTERFACE_MODE_MOCA) {
 			phy_name = "MoCA";
 			bcmgenet_moca_phy_setup(priv);


* [PATCH 4.4 09/76] ipv4: provide stronger user input validation in nl_fib_input()
  2017-03-28 12:29 [PATCH 4.4 00/76] 4.4.58-stable review Greg Kroah-Hartman
                   ` (7 preceding siblings ...)
  2017-03-28 12:30 ` [PATCH 4.4 08/76] net: bcmgenet: remove bcmgenet_internal_phy_setup() Greg Kroah-Hartman
@ 2017-03-28 12:30 ` Greg Kroah-Hartman
  2017-03-28 12:30 ` [PATCH 4.4 10/76] socket, bpf: fix sk_filter use after free in sk_clone_lock Greg Kroah-Hartman
                   ` (66 subsequent siblings)
  75 siblings, 0 replies; 106+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-28 12:30 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Alexander Potapenko, Eric Dumazet,
	David S. Miller

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Eric Dumazet <edumazet@google.com>


[ Upstream commit c64c0b3cac4c5b8cb093727d2c19743ea3965c0b ]

Alexander reported a KMSAN splat caused by reads of an uninitialized
field (tb_id_in) from a user-provided struct fib_result_nl.

It turns out that the nl_fib_input() sanity tests on user input are a
bit wrong:

A user can pretend nlh->nlmsg_len is big enough, but provide too small
a buffer at sendmsg() time.
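
The intent of the stricter check can be sketched in userspace.  The
following is an illustration under stated assumptions, not the kernel
code: fib_request_len_ok() and its payload_size parameter are
hypothetical, and NLMSG_SPACE() from <linux/netlink.h> is used as the
userspace counterpart of the kernel's nlmsg_total_size().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <linux/netlink.h>

/* Hypothetical helper: decide whether a received buffer of buf_len
 * bytes can safely be read as a netlink header followed by
 * payload_size bytes of payload.
 */
static bool fib_request_len_ok(const void *buf, size_t buf_len,
			       size_t payload_size)
{
	struct nlmsghdr nlh;

	assert(buf != NULL);

	/* 1. The bytes actually received must cover header and payload. */
	if (buf_len < NLMSG_SPACE(payload_size))
		return false;

	memcpy(&nlh, buf, sizeof(nlh));	/* safe: buf_len >= NLMSG_HDRLEN */

	/* 2. The header must not claim more than was received. */
	if (buf_len < nlh.nlmsg_len)
		return false;

	/* 3. The claimed length must itself cover header and payload.
	 * The first comparison is an extra guard added by this sketch
	 * so that the subtraction cannot wrap.
	 */
	if (nlh.nlmsg_len < (size_t)NLMSG_HDRLEN ||
	    nlh.nlmsg_len - (size_t)NLMSG_HDRLEN < payload_size)
		return false;

	return true;
}
```

The point of step 1 is that the length of the data that really
arrived is checked against the full expected size before anything in
the header is trusted.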

Reported-by: Alexander Potapenko <glider@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/ipv4/fib_frontend.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- a/net/ipv4/fib_frontend.c
+++ b/net/ipv4/fib_frontend.c
@@ -1080,7 +1080,8 @@ static void nl_fib_input(struct sk_buff
 
 	net = sock_net(skb->sk);
 	nlh = nlmsg_hdr(skb);
-	if (skb->len < NLMSG_HDRLEN || skb->len < nlh->nlmsg_len ||
+	if (skb->len < nlmsg_total_size(sizeof(*frn)) ||
+	    skb->len < nlh->nlmsg_len ||
 	    nlmsg_len(nlh) < sizeof(*frn))
 		return;
 


* [PATCH 4.4 10/76] socket, bpf: fix sk_filter use after free in sk_clone_lock
  2017-03-28 12:29 [PATCH 4.4 00/76] 4.4.58-stable review Greg Kroah-Hartman
                   ` (8 preceding siblings ...)
  2017-03-28 12:30 ` [PATCH 4.4 09/76] ipv4: provide stronger user input validation in nl_fib_input() Greg Kroah-Hartman
@ 2017-03-28 12:30 ` Greg Kroah-Hartman
  2017-03-28 12:30 ` [PATCH 4.4 11/76] tcp: initialize icsk_ack.lrcvtime at session start time Greg Kroah-Hartman
                   ` (65 subsequent siblings)
  75 siblings, 0 replies; 106+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-28 12:30 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Daniel Borkmann, Alexei Starovoitov,
	David S. Miller

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Daniel Borkmann <daniel@iogearbox.net>


[ Upstream commit a97e50cc4cb67e1e7bff56f6b41cda62ca832336 ]

In sk_clone_lock(), we create a new socket and inherit most of the
parent's members via sock_copy() which memcpy()'s various sections.
Now, in case the parent socket had a BPF socket filter attached,
then newsk->sk_filter points to the same instance as the original
sk->sk_filter.

sk_filter_charge() is then called on the newsk->sk_filter to take a
reference and should that fail due to hitting max optmem, we bail
out and release the newsk instance.

The issue is that commit 278571baca2a ("net: filter: simplify socket
charging") wrongly combined the dismantle path with the failure path
of xfrm_sk_clone_policy(). This means, even when charging failed, we
call sk_free_unlock_clone() on the newsk, which then still points to
the same sk_filter as the original sk.

Thus, sk_free_unlock_clone() calls into __sk_destruct() eventually
where it tests for present sk_filter and calls sk_filter_uncharge()
on it, which potentially lets sk_omem_alloc wrap around and releases
the eBPF prog and sk_filter structure from the (still intact) parent.

Fix it by making sure that when sk_filter_charge() failed, we reset
newsk->sk_filter back to NULL before passing to sk_free_unlock_clone(),
so that we don't mess with the parent's sk_filter.

Only if xfrm_sk_clone_policy() fails did we reach the point where
either the parent's filter was NULL, and as a result newsk's as well,
or where we previously had a successful sk_filter_charge(). Thus, for
that case, we do need sk_filter_uncharge() to release the previously
taken reference on sk_filter.
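
The reference imbalance can be shown with a toy model.  Everything
below is hypothetical and heavily simplified (a plain integer stands
in for the filter's reference and memory accounting); it only
illustrates why the clone's filter pointer must be cleared when the
charge never happened.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct filter_model { int refcnt; };
struct sock_model { struct filter_model *filter; };

/* Take a reference for a clone; fails when the modelled limit is hit. */
static bool charge(struct filter_model *f, bool limit_hit)
{
	if (limit_hit)
		return false;
	f->refcnt++;
	return true;
}

/* Teardown drops one reference whenever a filter pointer is still set. */
static void destroy(struct sock_model *sk)
{
	if (sk->filter)
		sk->filter->refcnt--;
}

/* Clone `parent`, let the charge fail, tear the clone down, and
 * return the filter's reference count afterwards.
 */
static int failed_clone_refcnt(struct sock_model *parent,
			       bool reset_on_failure)
{
	struct sock_model clone = *parent;	/* raw copy shares the filter */

	assert(parent->filter != NULL);
	if (!charge(clone.filter, true) && reset_on_failure)
		clone.filter = NULL;	/* never charged, so never uncharge */
	destroy(&clone);
	return parent->filter->refcnt;
}
```

Starting from a count of 1 held by the parent, the count stays at 1
when the pointer is reset and drops to 0 when it is not, which in the
model corresponds to the parent losing its filter.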

Fixes: 278571baca2a ("net: filter: simplify socket charging")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/core/sock.c |    6 ++++++
 1 file changed, 6 insertions(+)

--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -1557,6 +1557,12 @@ struct sock *sk_clone_lock(const struct
 			is_charged = sk_filter_charge(newsk, filter);
 
 		if (unlikely(!is_charged || xfrm_sk_clone_policy(newsk, sk))) {
+			/* We need to make sure that we don't uncharge the new
+			 * socket if we couldn't charge it in the first place
+			 * as otherwise we uncharge the parent's filter.
+			 */
+			if (!is_charged)
+				RCU_INIT_POINTER(newsk->sk_filter, NULL);
 			/* It is still raw copy of parent, so invalidate
 			 * destructor and make plain sk_free() */
 			newsk->sk_destruct = NULL;


* [PATCH 4.4 11/76] tcp: initialize icsk_ack.lrcvtime at session start time
  2017-03-28 12:29 [PATCH 4.4 00/76] 4.4.58-stable review Greg Kroah-Hartman
                   ` (9 preceding siblings ...)
  2017-03-28 12:30 ` [PATCH 4.4 10/76] socket, bpf: fix sk_filter use after free in sk_clone_lock Greg Kroah-Hartman
@ 2017-03-28 12:30 ` Greg Kroah-Hartman
  2017-03-28 12:30 ` [PATCH 4.4 12/76] Input: elan_i2c - add ASUS EeeBook X205TA special touchpad fw Greg Kroah-Hartman
                   ` (64 subsequent siblings)
  75 siblings, 0 replies; 106+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-28 12:30 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Eric Dumazet, Neal Cardwell, David S. Miller

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Eric Dumazet <edumazet@google.com>


[ Upstream commit 15bb7745e94a665caf42bfaabf0ce062845b533b ]

icsk_ack.lrcvtime has a 0 value at socket creation time.

tcpi_last_data_recv can have a bogus value if no payload is ever received.

This patch initializes icsk_ack.lrcvtime for active sessions
in tcp_finish_connect(), and for passive sessions in
tcp_create_openreq_child().

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/ipv4/tcp_input.c     |    2 +-
 net/ipv4/tcp_minisocks.c |    1 +
 2 files changed, 2 insertions(+), 1 deletion(-)

--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -5435,6 +5435,7 @@ void tcp_finish_connect(struct sock *sk,
 	struct inet_connection_sock *icsk = inet_csk(sk);
 
 	tcp_set_state(sk, TCP_ESTABLISHED);
+	icsk->icsk_ack.lrcvtime = tcp_time_stamp;
 
 	if (skb) {
 		icsk->icsk_af_ops->sk_rx_dst_set(sk, skb);
@@ -5647,7 +5648,6 @@ static int tcp_rcv_synsent_state_process
 			 * to stand against the temptation 8)     --ANK
 			 */
 			inet_csk_schedule_ack(sk);
-			icsk->icsk_ack.lrcvtime = tcp_time_stamp;
 			tcp_enter_quickack_mode(sk);
 			inet_csk_reset_xmit_timer(sk, ICSK_TIME_DACK,
 						  TCP_DELACK_MAX, TCP_RTO_MAX);
--- a/net/ipv4/tcp_minisocks.c
+++ b/net/ipv4/tcp_minisocks.c
@@ -472,6 +472,7 @@ struct sock *tcp_create_openreq_child(co
 		newtp->mdev_us = jiffies_to_usecs(TCP_TIMEOUT_INIT);
 		newtp->rtt_min[0].rtt = ~0U;
 		newicsk->icsk_rto = TCP_TIMEOUT_INIT;
+		newicsk->icsk_ack.lrcvtime = tcp_time_stamp;
 
 		newtp->packets_out = 0;
 		newtp->retrans_out = 0;


* [PATCH 4.4 12/76] Input: elan_i2c - add ASUS EeeBook X205TA special touchpad fw
  2017-03-28 12:29 [PATCH 4.4 00/76] 4.4.58-stable review Greg Kroah-Hartman
                   ` (10 preceding siblings ...)
  2017-03-28 12:30 ` [PATCH 4.4 11/76] tcp: initialize icsk_ack.lrcvtime at session start time Greg Kroah-Hartman
@ 2017-03-28 12:30 ` Greg Kroah-Hartman
  2017-03-28 12:30 ` [PATCH 4.4 13/76] Input: i8042 - add noloop quirk for Dell Embedded Box PC 3000 Greg Kroah-Hartman
                   ` (63 subsequent siblings)
  75 siblings, 0 replies; 106+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-28 12:30 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Matjaz Hegedic, KT Liao, Dmitry Torokhov

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Matjaz Hegedic <matjaz.hegedic@gmail.com>

commit 92ef6f97a66e580189a41a132d0f8a9f78d6ddce upstream.

EeeBook X205TA is yet another ASUS device with a special touchpad
firmware that needs to be accounted for during initialization, or
else the touchpad will go into an invalid state upon suspend/resume.
Adding the appropriate ic_type and product_id check fixes the problem.

Signed-off-by: Matjaz Hegedic <matjaz.hegedic@gmail.com>
Acked-by: KT Liao <kt.liao@emc.com.tw>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/input/mouse/elan_i2c_core.c |   20 +++++++++++---------
 1 file changed, 11 insertions(+), 9 deletions(-)

--- a/drivers/input/mouse/elan_i2c_core.c
+++ b/drivers/input/mouse/elan_i2c_core.c
@@ -218,17 +218,19 @@ static int elan_query_product(struct ela
 
 static int elan_check_ASUS_special_fw(struct elan_tp_data *data)
 {
-	if (data->ic_type != 0x0E)
-		return false;
-
-	switch (data->product_id) {
-	case 0x05 ... 0x07:
-	case 0x09:
-	case 0x13:
+	if (data->ic_type == 0x0E) {
+		switch (data->product_id) {
+		case 0x05 ... 0x07:
+		case 0x09:
+		case 0x13:
+			return true;
+		}
+	} else if (data->ic_type == 0x08 && data->product_id == 0x26) {
+		/* ASUS EeeBook X205TA */
 		return true;
-	default:
-		return false;
 	}
+
+	return false;
 }
 
 static int __elan_initialize(struct elan_tp_data *data)


* [PATCH 4.4 13/76] Input: i8042 - add noloop quirk for Dell Embedded Box PC 3000
  2017-03-28 12:29 [PATCH 4.4 00/76] 4.4.58-stable review Greg Kroah-Hartman
                   ` (11 preceding siblings ...)
  2017-03-28 12:30 ` [PATCH 4.4 12/76] Input: elan_i2c - add ASUS EeeBook X205TA special touchpad fw Greg Kroah-Hartman
@ 2017-03-28 12:30 ` Greg Kroah-Hartman
  2017-03-28 12:30 ` [PATCH 4.4 14/76] Input: iforce - validate number of endpoints before using them Greg Kroah-Hartman
                   ` (62 subsequent siblings)
  75 siblings, 0 replies; 106+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-28 12:30 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Kai-Heng Feng, Marcos Paulo de Souza,
	Dmitry Torokhov

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Kai-Heng Feng <kai.heng.feng@canonical.com>

commit 45838660e34d90db8d4f7cbc8fd66e8aff79f4fe upstream.

The aux port is not detected without the noloop quirk, so an external
PS/2 mouse does not work.

With this quirk, the PS/2 mouse works.

BugLink: https://bugs.launchpad.net/bugs/1591053
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Reviewed-by: Marcos Paulo de Souza <marcos.souza.org@gmail.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/input/serio/i8042-x86ia64io.h |    7 +++++++
 1 file changed, 7 insertions(+)

--- a/drivers/input/serio/i8042-x86ia64io.h
+++ b/drivers/input/serio/i8042-x86ia64io.h
@@ -120,6 +120,13 @@ static const struct dmi_system_id __init
 		},
 	},
 	{
+		/* Dell Embedded Box PC 3000 */
+		.matches = {
+			DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc."),
+			DMI_MATCH(DMI_PRODUCT_NAME, "Embedded Box PC 3000"),
+		},
+	},
+	{
 		/* OQO Model 01 */
 		.matches = {
 			DMI_MATCH(DMI_SYS_VENDOR, "OQO"),


* [PATCH 4.4 14/76] Input: iforce - validate number of endpoints before using them
  2017-03-28 12:29 [PATCH 4.4 00/76] 4.4.58-stable review Greg Kroah-Hartman
                   ` (12 preceding siblings ...)
  2017-03-28 12:30 ` [PATCH 4.4 13/76] Input: i8042 - add noloop quirk for Dell Embedded Box PC 3000 Greg Kroah-Hartman
@ 2017-03-28 12:30 ` Greg Kroah-Hartman
  2017-03-28 12:30 ` [PATCH 4.4 15/76] Input: ims-pcu " Greg Kroah-Hartman
                   ` (61 subsequent siblings)
  75 siblings, 0 replies; 106+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-28 12:30 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Johan Hovold, Dmitry Torokhov

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Johan Hovold <johan@kernel.org>

commit 59cf8bed44a79ec42303151dd014fdb6434254bb upstream.

Make sure to check the number of endpoints to avoid dereferencing a
NULL-pointer or accessing memory that lies beyond the end of the endpoint
array should a malicious device lack the expected endpoints.
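
The pattern shared by this and the following endpoint-validation
patches can be sketched outside the kernel.  The types and the
get_endpoints() helper below are hypothetical stand-ins for the USB
core structures; the sketch only shows the order of operations: check
the advertised count first, index the array afterwards.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, reduced stand-ins for the USB descriptor structures. */
struct endpoint_model { unsigned char address; };

struct altsetting_model {
	unsigned char num_endpoints;		/* role of desc.bNumEndpoints */
	const struct endpoint_model *endpoint;	/* num_endpoints entries */
};

/* Fetch the first `needed` endpoints into out[], or fail with -ENODEV
 * before touching the array when the device reports too few.
 */
static int get_endpoints(const struct altsetting_model *alt,
			 unsigned int needed,
			 const struct endpoint_model **out)
{
	unsigned int i;

	assert(alt != NULL && out != NULL);

	if (alt->num_endpoints < needed)
		return -ENODEV;

	for (i = 0; i < needed; i++)
		out[i] = &alt->endpoint[i];
	return 0;
}
```

A device that advertises one endpoint is rejected when two are
required, and a device with no endpoints never has its (possibly
NULL) array dereferenced.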

Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/input/joystick/iforce/iforce-usb.c |    3 +++
 1 file changed, 3 insertions(+)

--- a/drivers/input/joystick/iforce/iforce-usb.c
+++ b/drivers/input/joystick/iforce/iforce-usb.c
@@ -141,6 +141,9 @@ static int iforce_usb_probe(struct usb_i
 
 	interface = intf->cur_altsetting;
 
+	if (interface->desc.bNumEndpoints < 2)
+		return -ENODEV;
+
 	epirq = &interface->endpoint[0].desc;
 	epout = &interface->endpoint[1].desc;
 


* [PATCH 4.4 15/76] Input: ims-pcu - validate number of endpoints before using them
  2017-03-28 12:29 [PATCH 4.4 00/76] 4.4.58-stable review Greg Kroah-Hartman
                   ` (13 preceding siblings ...)
  2017-03-28 12:30 ` [PATCH 4.4 14/76] Input: iforce - validate number of endpoints before using them Greg Kroah-Hartman
@ 2017-03-28 12:30 ` Greg Kroah-Hartman
  2017-03-28 12:30 ` [PATCH 4.4 16/76] Input: hanwang " Greg Kroah-Hartman
                   ` (60 subsequent siblings)
  75 siblings, 0 replies; 106+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-28 12:30 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Johan Hovold, Dmitry Torokhov

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Johan Hovold <johan@kernel.org>

commit 1916d319271664241b7aa0cd2b05e32bdb310ce9 upstream.

Make sure to check the number of endpoints to avoid dereferencing a
NULL-pointer should a malicious device lack control-interface endpoints.

Fixes: 628329d52474 ("Input: add IMS Passenger Control Unit driver")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/input/misc/ims-pcu.c |    4 ++++
 1 file changed, 4 insertions(+)

--- a/drivers/input/misc/ims-pcu.c
+++ b/drivers/input/misc/ims-pcu.c
@@ -1667,6 +1667,10 @@ static int ims_pcu_parse_cdc_data(struct
 		return -EINVAL;
 
 	alt = pcu->ctrl_intf->cur_altsetting;
+
+	if (alt->desc.bNumEndpoints < 1)
+		return -ENODEV;
+
 	pcu->ep_ctrl = &alt->endpoint[0].desc;
 	pcu->max_ctrl_size = usb_endpoint_maxp(pcu->ep_ctrl);
 


* [PATCH 4.4 16/76] Input: hanwang - validate number of endpoints before using them
  2017-03-28 12:29 [PATCH 4.4 00/76] 4.4.58-stable review Greg Kroah-Hartman
                   ` (14 preceding siblings ...)
  2017-03-28 12:30 ` [PATCH 4.4 15/76] Input: ims-pcu " Greg Kroah-Hartman
@ 2017-03-28 12:30 ` Greg Kroah-Hartman
  2017-03-28 12:30 ` [PATCH 4.4 17/76] Input: yealink " Greg Kroah-Hartman
                   ` (59 subsequent siblings)
  75 siblings, 0 replies; 106+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-28 12:30 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Johan Hovold, Dmitry Torokhov

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Johan Hovold <johan@kernel.org>

commit ba340d7b83703768ce566f53f857543359aa1b98 upstream.

Make sure to check the number of endpoints to avoid dereferencing a
NULL-pointer should a malicious device lack endpoints.

Fixes: bba5394ad3bd ("Input: add support for Hanwang tablets")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/input/tablet/hanwang.c |    3 +++
 1 file changed, 3 insertions(+)

--- a/drivers/input/tablet/hanwang.c
+++ b/drivers/input/tablet/hanwang.c
@@ -340,6 +340,9 @@ static int hanwang_probe(struct usb_inte
 	int error;
 	int i;
 
+	if (intf->cur_altsetting->desc.bNumEndpoints < 1)
+		return -ENODEV;
+
 	hanwang = kzalloc(sizeof(struct hanwang), GFP_KERNEL);
 	input_dev = input_allocate_device();
 	if (!hanwang || !input_dev) {


* [PATCH 4.4 17/76] Input: yealink - validate number of endpoints before using them
  2017-03-28 12:29 [PATCH 4.4 00/76] 4.4.58-stable review Greg Kroah-Hartman
                   ` (15 preceding siblings ...)
  2017-03-28 12:30 ` [PATCH 4.4 16/76] Input: hanwang " Greg Kroah-Hartman
@ 2017-03-28 12:30 ` Greg Kroah-Hartman
  2017-03-28 12:30 ` [PATCH 4.4 18/76] Input: cm109 " Greg Kroah-Hartman
                   ` (58 subsequent siblings)
  75 siblings, 0 replies; 106+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-28 12:30 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Johan Hovold, Dmitry Torokhov

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Johan Hovold <johan@kernel.org>

commit 5cc4a1a9f5c179795c8a1f2b0f4361829d6a070e upstream.

Make sure to check the number of endpoints to avoid dereferencing a
NULL-pointer should a malicious device lack endpoints.

Fixes: aca951a22a1d ("[PATCH] input-driver-yealink-P1K-usb-phone")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/input/misc/yealink.c |    4 ++++
 1 file changed, 4 insertions(+)

--- a/drivers/input/misc/yealink.c
+++ b/drivers/input/misc/yealink.c
@@ -875,6 +875,10 @@ static int usb_probe(struct usb_interfac
 	int ret, pipe, i;
 
 	interface = intf->cur_altsetting;
+
+	if (interface->desc.bNumEndpoints < 1)
+		return -ENODEV;
+
 	endpoint = &interface->endpoint[0].desc;
 	if (!usb_endpoint_is_int_in(endpoint))
 		return -ENODEV;


* [PATCH 4.4 18/76] Input: cm109 - validate number of endpoints before using them
  2017-03-28 12:29 [PATCH 4.4 00/76] 4.4.58-stable review Greg Kroah-Hartman
                   ` (16 preceding siblings ...)
  2017-03-28 12:30 ` [PATCH 4.4 17/76] Input: yealink " Greg Kroah-Hartman
@ 2017-03-28 12:30 ` Greg Kroah-Hartman
  2017-03-28 12:30 ` [PATCH 4.4 19/76] Input: kbtab " Greg Kroah-Hartman
                   ` (57 subsequent siblings)
  75 siblings, 0 replies; 106+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-28 12:30 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Johan Hovold, Dmitry Torokhov

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Johan Hovold <johan@kernel.org>

commit ac2ee9ba953afe88f7a673e1c0c839227b1d7891 upstream.

Make sure to check the number of endpoints to avoid dereferencing a
NULL-pointer should a malicious device lack endpoints.

Fixes: c04148f915e5 ("Input: add driver for USB VoIP phones with CM109...")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/input/misc/cm109.c |    4 ++++
 1 file changed, 4 insertions(+)

--- a/drivers/input/misc/cm109.c
+++ b/drivers/input/misc/cm109.c
@@ -675,6 +675,10 @@ static int cm109_usb_probe(struct usb_in
 	int error = -ENOMEM;
 
 	interface = intf->cur_altsetting;
+
+	if (interface->desc.bNumEndpoints < 1)
+		return -ENODEV;
+
 	endpoint = &interface->endpoint[0].desc;
 
 	if (!usb_endpoint_is_int_in(endpoint))


* [PATCH 4.4 19/76] Input: kbtab - validate number of endpoints before using them
  2017-03-28 12:29 [PATCH 4.4 00/76] 4.4.58-stable review Greg Kroah-Hartman
                   ` (17 preceding siblings ...)
  2017-03-28 12:30 ` [PATCH 4.4 18/76] Input: cm109 " Greg Kroah-Hartman
@ 2017-03-28 12:30 ` Greg Kroah-Hartman
  2017-03-28 12:30 ` [PATCH 4.4 20/76] Input: sur40 " Greg Kroah-Hartman
                   ` (56 subsequent siblings)
  75 siblings, 0 replies; 106+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-28 12:30 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Johan Hovold, Dmitry Torokhov

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Johan Hovold <johan@kernel.org>

commit cb1b494663e037253337623bf1ef2df727883cb7 upstream.

Make sure to check the number of endpoints to avoid dereferencing a
NULL-pointer should a malicious device lack endpoints.

Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/input/tablet/kbtab.c |    3 +++
 1 file changed, 3 insertions(+)

--- a/drivers/input/tablet/kbtab.c
+++ b/drivers/input/tablet/kbtab.c
@@ -122,6 +122,9 @@ static int kbtab_probe(struct usb_interf
 	struct input_dev *input_dev;
 	int error = -ENOMEM;
 
+	if (intf->cur_altsetting->desc.bNumEndpoints < 1)
+		return -ENODEV;
+
 	kbtab = kzalloc(sizeof(struct kbtab), GFP_KERNEL);
 	input_dev = input_allocate_device();
 	if (!kbtab || !input_dev)


* [PATCH 4.4 20/76] Input: sur40 - validate number of endpoints before using them
  2017-03-28 12:29 [PATCH 4.4 00/76] 4.4.58-stable review Greg Kroah-Hartman
                   ` (18 preceding siblings ...)
  2017-03-28 12:30 ` [PATCH 4.4 19/76] Input: kbtab " Greg Kroah-Hartman
@ 2017-03-28 12:30 ` Greg Kroah-Hartman
  2017-03-28 12:30 ` [PATCH 4.4 21/76] ALSA: seq: Fix racy cell insertions during snd_seq_pool_done() Greg Kroah-Hartman
                   ` (55 subsequent siblings)
  75 siblings, 0 replies; 106+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-28 12:30 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Johan Hovold, Dmitry Torokhov

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Johan Hovold <johan@kernel.org>

commit 92461f5d723037530c1f36cce93640770037812c upstream.

Make sure to check the number of endpoints to avoid dereferencing a
NULL-pointer or accessing memory that lies beyond the end of the endpoint
array should a malicious device lack the expected endpoints.

Fixes: bdb5c57f209c ("Input: add sur40 driver for Samsung SUR40... ")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/input/touchscreen/sur40.c |    3 +++
 1 file changed, 3 insertions(+)

--- a/drivers/input/touchscreen/sur40.c
+++ b/drivers/input/touchscreen/sur40.c
@@ -500,6 +500,9 @@ static int sur40_probe(struct usb_interf
 	if (iface_desc->desc.bInterfaceClass != 0xFF)
 		return -ENODEV;
 
+	if (iface_desc->desc.bNumEndpoints < 5)
+		return -ENODEV;
+
 	/* Use endpoint #4 (0x86). */
 	endpoint = &iface_desc->endpoint[4].desc;
 	if (endpoint->bEndpointAddress != TOUCH_ENDPOINT)

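The endpoint-count checks in patches 17/76 through 20/76 above all follow one rule: array index idx of the endpoint array may only be used when bNumEndpoints is greater than idx (so index 4 in sur40 needs at least 5 endpoints). A minimal userspace sketch of that rule follows. The struct and function names are hypothetical stand-ins for illustration, not the kernel's usb_host_interface API.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the kernel's USB descriptor types. */
struct ep_desc {
	uint8_t bEndpointAddress;
};

struct host_iface {
	uint8_t bNumEndpoints;          /* count reported by the device */
	const struct ep_desc *endpoint; /* bNumEndpoints entries, or NULL if none */
};

/*
 * Return the descriptor at array index idx, or NULL when the device
 * reported too few endpoints for that index to be valid.
 */
static const struct ep_desc *iface_endpoint(const struct host_iface *iface,
					    unsigned int idx)
{
	if (iface->bNumEndpoints <= idx)
		return NULL;
	return &iface->endpoint[idx];
}
```

In the real drivers the check is written inline and the probe function returns -ENODEV instead of handing back NULL; the helper form here only makes the bound explicit.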

* [PATCH 4.4 21/76] ALSA: seq: Fix racy cell insertions during snd_seq_pool_done()
  2017-03-28 12:29 [PATCH 4.4 00/76] 4.4.58-stable review Greg Kroah-Hartman
                   ` (19 preceding siblings ...)
  2017-03-28 12:30 ` [PATCH 4.4 20/76] Input: sur40 " Greg Kroah-Hartman
@ 2017-03-28 12:30 ` Greg Kroah-Hartman
  2017-03-28 12:30 ` [PATCH 4.4 22/76] ALSA: ctxfi: Fix the incorrect check of dma_set_mask() call Greg Kroah-Hartman
                   ` (54 subsequent siblings)
  75 siblings, 0 replies; 106+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-28 12:30 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Takashi Iwai

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Takashi Iwai <tiwai@suse.de>

commit c520ff3d03f0b5db7146d9beed6373ad5d2a5e0e upstream.

When snd_seq_pool_done() is called, it sets the closing flag to
refuse further cell insertions.  But snd_seq_pool_done() itself
doesn't clear the cells; it just waits until all cells are cleared by
the caller side.  That is, it's racy, and this leads to an endless
stall, as syzkaller spotted.

This patch addresses the race by splitting the setup of the pool->closing
flag out of snd_seq_pool_done() into snd_seq_pool_mark_closing(), and
calling it properly before snd_seq_pool_done().

BugLink: http://lkml.kernel.org/r/CACT4Y+aqqy8bZA1fFieifNxR2fAfFQQABcBHj801+u5ePV0URw@mail.gmail.com
Reported-and-tested-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 sound/core/seq/seq_clientmgr.c |    1 +
 sound/core/seq/seq_fifo.c      |    3 +++
 sound/core/seq/seq_memory.c    |   17 +++++++++++++----
 sound/core/seq/seq_memory.h    |    1 +
 4 files changed, 18 insertions(+), 4 deletions(-)

--- a/sound/core/seq/seq_clientmgr.c
+++ b/sound/core/seq/seq_clientmgr.c
@@ -1921,6 +1921,7 @@ static int snd_seq_ioctl_set_client_pool
 	     info.output_pool != client->pool->size)) {
 		if (snd_seq_write_pool_allocated(client)) {
 			/* remove all existing cells */
+			snd_seq_pool_mark_closing(client->pool);
 			snd_seq_queue_client_leave_cells(client->number);
 			snd_seq_pool_done(client->pool);
 		}
--- a/sound/core/seq/seq_fifo.c
+++ b/sound/core/seq/seq_fifo.c
@@ -70,6 +70,9 @@ void snd_seq_fifo_delete(struct snd_seq_
 		return;
 	*fifo = NULL;
 
+	if (f->pool)
+		snd_seq_pool_mark_closing(f->pool);
+
 	snd_seq_fifo_clear(f);
 
 	/* wake up clients if any */
--- a/sound/core/seq/seq_memory.c
+++ b/sound/core/seq/seq_memory.c
@@ -414,6 +414,18 @@ int snd_seq_pool_init(struct snd_seq_poo
 	return 0;
 }
 
+/* refuse the further insertion to the pool */
+void snd_seq_pool_mark_closing(struct snd_seq_pool *pool)
+{
+	unsigned long flags;
+
+	if (snd_BUG_ON(!pool))
+		return;
+	spin_lock_irqsave(&pool->lock, flags);
+	pool->closing = 1;
+	spin_unlock_irqrestore(&pool->lock, flags);
+}
+
 /* remove events */
 int snd_seq_pool_done(struct snd_seq_pool *pool)
 {
@@ -424,10 +436,6 @@ int snd_seq_pool_done(struct snd_seq_poo
 		return -EINVAL;
 
 	/* wait for closing all threads */
-	spin_lock_irqsave(&pool->lock, flags);
-	pool->closing = 1;
-	spin_unlock_irqrestore(&pool->lock, flags);
-
 	if (waitqueue_active(&pool->output_sleep))
 		wake_up(&pool->output_sleep);
 
@@ -484,6 +492,7 @@ int snd_seq_pool_delete(struct snd_seq_p
 	*ppool = NULL;
 	if (pool == NULL)
 		return 0;
+	snd_seq_pool_mark_closing(pool);
 	snd_seq_pool_done(pool);
 	kfree(pool);
 	return 0;
--- a/sound/core/seq/seq_memory.h
+++ b/sound/core/seq/seq_memory.h
@@ -84,6 +84,7 @@ static inline int snd_seq_total_cells(st
 int snd_seq_pool_init(struct snd_seq_pool *pool);
 
 /* done pool - free events */
+void snd_seq_pool_mark_closing(struct snd_seq_pool *pool);
 int snd_seq_pool_done(struct snd_seq_pool *pool);
 
 /* create pool */

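The ordering problem fixed by patch 21/76 above can be illustrated with a small single-threaded model. This is only a sketch: the struct and helpers are hypothetical, locking is omitted, and a "racing" insertion is simulated by calling the insert function at the point where another thread could run.

```c
#include <stdbool.h>

/* Hypothetical model of a cell pool; not the ALSA data structure. */
struct pool_model {
	bool closing; /* once set, new insertions are refused */
	int in_use;   /* cells currently held by callers */
};

/* Try to take a cell; fails once the pool is marked as closing. */
static bool pool_insert(struct pool_model *p)
{
	if (p->closing)
		return false;
	p->in_use++;
	return true;
}

/* Refuse further insertions from now on. */
static void pool_mark_closing(struct pool_model *p)
{
	p->closing = true;
}

/* Caller-side cleanup: release every cell that is currently held. */
static void pool_clear_cells(struct pool_model *p)
{
	p->in_use = 0;
}

/* The wait in "done" can only finish once no cells remain in use. */
static bool pool_done_can_finish(const struct pool_model *p)
{
	return p->in_use == 0;
}
```

If the flag is set only after the cells were cleared, an insertion that slips in between leaves in_use non-zero and the wait never ends. Marking the pool as closing first means a late insertion is refused and the wait can complete.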

* [PATCH 4.4 22/76] ALSA: ctxfi: Fix the incorrect check of dma_set_mask() call
  2017-03-28 12:29 [PATCH 4.4 00/76] 4.4.58-stable review Greg Kroah-Hartman
                   ` (20 preceding siblings ...)
  2017-03-28 12:30 ` [PATCH 4.4 21/76] ALSA: seq: Fix racy cell insertions during snd_seq_pool_done() Greg Kroah-Hartman
@ 2017-03-28 12:30 ` Greg Kroah-Hartman
  2017-03-28 12:30 ` [PATCH 4.4 23/76] ALSA: hda - Adding a group of pin definition to fix headset problem Greg Kroah-Hartman
                   ` (53 subsequent siblings)
  75 siblings, 0 replies; 106+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-28 12:30 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Takashi Iwai

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Takashi Iwai <tiwai@suse.de>

commit f363a06642f28caaa78cb6446bbad90c73fe183c upstream.

In the commit [15c75b09f8d1: ALSA: ctxfi: Fallback DMA mask to 32bit],
I forgot to put "!" at the dma_set_mask() call check in cthw20k1.c (while
cthw20k2.c is OK).  This patch fixes that obvious bug.

(As a side note: although the original commit was completely wrong,
 it still works on most machines, as it sets the DMA mask to 32bit
 in the end.  So the bug severity is low.)

Fixes: 15c75b09f8d1 ("ALSA: ctxfi: Fallback DMA mask to 32bit")
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 sound/pci/ctxfi/cthw20k1.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/sound/pci/ctxfi/cthw20k1.c
+++ b/sound/pci/ctxfi/cthw20k1.c
@@ -1905,7 +1905,7 @@ static int hw_card_start(struct hw *hw)
 		return err;
 
 	/* Set DMA transfer mask */
-	if (dma_set_mask(&pci->dev, DMA_BIT_MASK(dma_bits))) {
+	if (!dma_set_mask(&pci->dev, DMA_BIT_MASK(dma_bits))) {
 		dma_set_coherent_mask(&pci->dev, DMA_BIT_MASK(dma_bits));
 	} else {
 		dma_set_mask(&pci->dev, DMA_BIT_MASK(32));

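The fix in patch 22/76 above hinges on the return convention of dma_set_mask(): 0 means success, so the success branch must be guarded by "!". The sketch below models that convention in userspace. model_set_mask() and pick_dma_bits() are hypothetical stand-ins, and the idea of a device that supports masks up to max_bits is an assumption made for illustration.

```c
#include <errno.h>

/*
 * Hypothetical stand-in that follows the kernel convention:
 * 0 on success, a negative errno value on failure. The modelled
 * device supports DMA masks up to max_bits wide.
 */
static int model_set_mask(unsigned int max_bits, unsigned int bits)
{
	return bits <= max_bits ? 0 : -EIO;
}

/* Use the wanted mask width if it can be set, else fall back to 32 bits. */
static unsigned int pick_dma_bits(unsigned int max_bits, unsigned int wanted)
{
	if (!model_set_mask(max_bits, wanted)) /* 0 == success */
		return wanted;
	return 32;
}
```

Without the "!", the two branches swap, so a successful call would take the 32-bit fallback path.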

* [PATCH 4.4 23/76] ALSA: hda - Adding a group of pin definition to fix headset problem
  2017-03-28 12:29 [PATCH 4.4 00/76] 4.4.58-stable review Greg Kroah-Hartman
                   ` (21 preceding siblings ...)
  2017-03-28 12:30 ` [PATCH 4.4 22/76] ALSA: ctxfi: Fix the incorrect check of dma_set_mask() call Greg Kroah-Hartman
@ 2017-03-28 12:30 ` Greg Kroah-Hartman
  2017-03-28 12:30 ` [PATCH 4.4 24/76] USB: serial: option: add Quectel UC15, UC20, EC21, and EC25 modems Greg Kroah-Hartman
                   ` (52 subsequent siblings)
  75 siblings, 0 replies; 106+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-28 12:30 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Hui Wang, Takashi Iwai

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Hui Wang <hui.wang@canonical.com>

commit 3f307834e695f59dac4337a40316bdecfb9d0508 upstream.

A new Dell laptop needs ALC269_FIXUP_DELL1_MIC_NO_PRESENCE applied to
fix the headset problem, and the pin definition of this machine is not
in the pin quirk table yet, so add it to the table.

Signed-off-by: Hui Wang <hui.wang@canonical.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 sound/pci/hda/patch_realtek.c |    2 ++
 1 file changed, 2 insertions(+)

--- a/sound/pci/hda/patch_realtek.c
+++ b/sound/pci/hda/patch_realtek.c
@@ -6040,6 +6040,8 @@ static const struct snd_hda_pin_quirk al
 		ALC295_STANDARD_PINS,
 		{0x17, 0x21014040},
 		{0x18, 0x21a19050}),
+	SND_HDA_PIN_QUIRK(0x10ec0295, 0x1028, "Dell", ALC269_FIXUP_DELL1_MIC_NO_PRESENCE,
+		ALC295_STANDARD_PINS),
 	SND_HDA_PIN_QUIRK(0x10ec0298, 0x1028, "Dell", ALC298_FIXUP_DELL1_MIC_NO_PRESENCE,
 		ALC298_STANDARD_PINS,
 		{0x17, 0x90170110}),


* [PATCH 4.4 24/76] USB: serial: option: add Quectel UC15, UC20, EC21, and EC25 modems
  2017-03-28 12:29 [PATCH 4.4 00/76] 4.4.58-stable review Greg Kroah-Hartman
                   ` (22 preceding siblings ...)
  2017-03-28 12:30 ` [PATCH 4.4 23/76] ALSA: hda - Adding a group of pin definition to fix headset problem Greg Kroah-Hartman
@ 2017-03-28 12:30 ` Greg Kroah-Hartman
  2017-03-28 12:30 ` [PATCH 4.4 27/76] usb: gadget: f_uvc: Fix SuperSpeed companion descriptors wBytesPerInterval Greg Kroah-Hartman
                   ` (51 subsequent siblings)
  75 siblings, 0 replies; 106+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-28 12:30 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Dan Williams, Johan Hovold

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Dan Williams <dcbw@redhat.com>

commit 6e9f44eaaef0df7b846e9316fa9ca72a02025d44 upstream.

Add Quectel UC15, UC20, EC21, and EC25.  The EC20 is handled by
qcserial due to a USB VID/PID conflict with an existing Acer
device.

Signed-off-by: Dan Williams <dcbw@redhat.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/usb/serial/option.c |   17 ++++++++++++++++-
 1 file changed, 16 insertions(+), 1 deletion(-)

--- a/drivers/usb/serial/option.c
+++ b/drivers/usb/serial/option.c
@@ -233,6 +233,14 @@ static void option_instat_callback(struc
 #define BANDRICH_PRODUCT_1012			0x1012
 
 #define QUALCOMM_VENDOR_ID			0x05C6
+/* These Quectel products use Qualcomm's vendor ID */
+#define QUECTEL_PRODUCT_UC20			0x9003
+#define QUECTEL_PRODUCT_UC15			0x9090
+
+#define QUECTEL_VENDOR_ID			0x2c7c
+/* These Quectel products use Quectel's vendor ID */
+#define QUECTEL_PRODUCT_EC21			0x0121
+#define QUECTEL_PRODUCT_EC25			0x0125
 
 #define CMOTECH_VENDOR_ID			0x16d8
 #define CMOTECH_PRODUCT_6001			0x6001
@@ -1161,7 +1169,14 @@ static const struct usb_device_id option
 	{ USB_DEVICE(QUALCOMM_VENDOR_ID, 0x6613)}, /* Onda H600/ZTE MF330 */
 	{ USB_DEVICE(QUALCOMM_VENDOR_ID, 0x0023)}, /* ONYX 3G device */
 	{ USB_DEVICE(QUALCOMM_VENDOR_ID, 0x9000)}, /* SIMCom SIM5218 */
-	{ USB_DEVICE(QUALCOMM_VENDOR_ID, 0x9003), /* Quectel UC20 */
+	/* Quectel products using Qualcomm vendor ID */
+	{ USB_DEVICE(QUALCOMM_VENDOR_ID, QUECTEL_PRODUCT_UC15)},
+	{ USB_DEVICE(QUALCOMM_VENDOR_ID, QUECTEL_PRODUCT_UC20),
+	  .driver_info = (kernel_ulong_t)&net_intf4_blacklist },
+	/* Quectel products using Quectel vendor ID */
+	{ USB_DEVICE(QUECTEL_VENDOR_ID, QUECTEL_PRODUCT_EC21),
+	  .driver_info = (kernel_ulong_t)&net_intf4_blacklist },
+	{ USB_DEVICE(QUECTEL_VENDOR_ID, QUECTEL_PRODUCT_EC25),
 	  .driver_info = (kernel_ulong_t)&net_intf4_blacklist },
 	{ USB_DEVICE(CMOTECH_VENDOR_ID, CMOTECH_PRODUCT_6001) },
 	{ USB_DEVICE(CMOTECH_VENDOR_ID, CMOTECH_PRODUCT_CMU_300) },


* [PATCH 4.4 27/76] usb: gadget: f_uvc: Fix SuperSpeed companion descriptors wBytesPerInterval
  2017-03-28 12:29 [PATCH 4.4 00/76] 4.4.58-stable review Greg Kroah-Hartman
                   ` (23 preceding siblings ...)
  2017-03-28 12:30 ` [PATCH 4.4 24/76] USB: serial: option: add Quectel UC15, UC20, EC21, and EC25 modems Greg Kroah-Hartman
@ 2017-03-28 12:30 ` Greg Kroah-Hartman
  2017-03-28 12:30 ` [PATCH 4.4 28/76] usb-core: Add LINEAR_FRAME_INTR_BINTERVAL USB quirk Greg Kroah-Hartman
                   ` (50 subsequent siblings)
  75 siblings, 0 replies; 106+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-28 12:30 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Laurent Pinchart, Roger Quadros,
	Felipe Balbi

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Roger Quadros <rogerq@ti.com>

commit 09424c50b7dff40cb30011c09114404a4656e023 upstream.

The streaming_maxburst module parameter is zero-based (0..15),
so we must add 1 when using it in the wBytesPerInterval
calculation for the SuperSpeed companion descriptor.

Without this, the host uvcvideo driver will always see the wrong
wBytesPerInterval for a SuperSpeed uvc gadget and may not find
a suitable video interface endpoint.
E.g. in the streaming_maxburst = 0 case it will always
fail, as wBytesPerInterval was evaluating to 0.

Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Roger Quadros <rogerq@ti.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/usb/gadget/function/f_uvc.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/usb/gadget/function/f_uvc.c
+++ b/drivers/usb/gadget/function/f_uvc.c
@@ -625,7 +625,7 @@ uvc_function_bind(struct usb_configurati
 	uvc_ss_streaming_comp.bMaxBurst = opts->streaming_maxburst;
 	uvc_ss_streaming_comp.wBytesPerInterval =
 		cpu_to_le16(max_packet_size * max_packet_mult *
-			    opts->streaming_maxburst);
+			    (opts->streaming_maxburst + 1));
 
 	/* Allocate endpoints. */
 	ep = usb_ep_autoconfig(cdev->gadget, &uvc_control_ep);

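The arithmetic behind patch 27/76 above can be checked with a small worked example. The function name is hypothetical, the result is kept as a plain 32-bit host-endian value (the kernel code additionally converts with cpu_to_le16()), and the sample values 1024 and 1 for max_packet_size and max_packet_mult are assumptions chosen for illustration.

```c
#include <stdint.h>

/*
 * Bytes per service interval for a SuperSpeed endpoint, with
 * streaming_maxburst given zero-based (0..15), so the number of
 * packets per burst is streaming_maxburst + 1.
 */
static uint32_t ss_bytes_per_interval(uint32_t max_packet_size,
				      uint32_t max_packet_mult,
				      uint32_t streaming_maxburst)
{
	return max_packet_size * max_packet_mult * (streaming_maxburst + 1);
}
```

With streaming_maxburst = 0 the old expression multiplied by 0 and produced 0 bytes per interval; with the "+ 1" the same inputs give one full packet per interval.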

* [PATCH 4.4 28/76] usb-core: Add LINEAR_FRAME_INTR_BINTERVAL USB quirk
  2017-03-28 12:29 [PATCH 4.4 00/76] 4.4.58-stable review Greg Kroah-Hartman
                   ` (24 preceding siblings ...)
  2017-03-28 12:30 ` [PATCH 4.4 27/76] usb: gadget: f_uvc: Fix SuperSpeed companion descriptors wBytesPerInterval Greg Kroah-Hartman
@ 2017-03-28 12:30 ` Greg Kroah-Hartman
  2017-03-28 12:30 ` [PATCH 4.4 29/76] USB: uss720: fix NULL-deref at probe Greg Kroah-Hartman
                   ` (49 subsequent siblings)
  75 siblings, 0 replies; 106+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-28 12:30 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Samuel Thibault, Alan Stern

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Samuel Thibault <samuel.thibault@ens-lyon.org>

commit 3243367b209faed5c320a4e5f9a565ee2a2ba958 upstream.

Some USB 2.0 devices erroneously report millisecond values in
bInterval. The generic config code manages to catch most of them,
but in some cases that is not enough.

The case at stake here is a USB 2.0 braille device, which wants to
announce 10ms and thus sets bInterval to 10, but with the USB 2.0
computation that yields 64ms.  It happens that one can type fast
enough to reach this interval and overflow the device buffers,
leading to problematic latencies.  The generic config code does not
catch this case because the 64ms is considered a sane enough value.

This change thus adds a USB_QUIRK_LINEAR_FRAME_INTR_BINTERVAL quirk
to mark devices which actually report milliseconds in bInterval,
and marks Vario Ultra devices as needing it.

Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/usb/core/config.c  |   10 ++++++++++
 drivers/usb/core/quirks.c  |    8 ++++++++
 include/linux/usb/quirks.h |    6 ++++++
 3 files changed, 24 insertions(+)

--- a/drivers/usb/core/config.c
+++ b/drivers/usb/core/config.c
@@ -246,6 +246,16 @@ static int usb_parse_endpoint(struct dev
 
 			/*
 			 * Adjust bInterval for quirked devices.
+			 */
+			/*
+			 * This quirk fixes bIntervals reported in ms.
+			 */
+			if (to_usb_device(ddev)->quirks &
+				USB_QUIRK_LINEAR_FRAME_INTR_BINTERVAL) {
+				n = clamp(fls(d->bInterval) + 3, i, j);
+				i = j = n;
+			}
+			/*
 			 * This quirk fixes bIntervals reported in
 			 * linear microframes.
 			 */
--- a/drivers/usb/core/quirks.c
+++ b/drivers/usb/core/quirks.c
@@ -170,6 +170,14 @@ static const struct usb_device_id usb_qu
 	/* M-Systems Flash Disk Pioneers */
 	{ USB_DEVICE(0x08ec, 0x1000), .driver_info = USB_QUIRK_RESET_RESUME },
 
+	/* Baum Vario Ultra */
+	{ USB_DEVICE(0x0904, 0x6101), .driver_info =
+			USB_QUIRK_LINEAR_FRAME_INTR_BINTERVAL },
+	{ USB_DEVICE(0x0904, 0x6102), .driver_info =
+			USB_QUIRK_LINEAR_FRAME_INTR_BINTERVAL },
+	{ USB_DEVICE(0x0904, 0x6103), .driver_info =
+			USB_QUIRK_LINEAR_FRAME_INTR_BINTERVAL },
+
 	/* Keytouch QWERTY Panel keyboard */
 	{ USB_DEVICE(0x0926, 0x3333), .driver_info =
 			USB_QUIRK_CONFIG_INTF_STRINGS },
--- a/include/linux/usb/quirks.h
+++ b/include/linux/usb/quirks.h
@@ -50,4 +50,10 @@
 /* device can't handle Link Power Management */
 #define USB_QUIRK_NO_LPM			BIT(10)
 
+/*
+ * Device reports its bInterval as linear frames instead of the
+ * USB 2.0 calculation.
+ */
+#define USB_QUIRK_LINEAR_FRAME_INTR_BINTERVAL	BIT(11)
+
 #endif /* __LINUX_USB_QUIRKS_H */

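The conversion added by patch 28/76 above, clamp(fls(d->bInterval) + 3, i, j), can be worked through for the 10ms case from the commit message. The sketch below is a userspace approximation: fls_u32() and clamp_int() are portable stand-ins for the kernel's fls() and clamp(), and the bounds 1 and 16 are an assumption about the values i and j hold for high-speed interrupt endpoints at that point.

```c
#include <stdint.h>

/* Stand-in for the kernel's fls(): 1-based index of the highest set bit, 0 for x == 0. */
static int fls_u32(uint32_t x)
{
	int n = 0;

	while (x) {
		n++;
		x >>= 1;
	}
	return n;
}

static int clamp_int(int v, int lo, int hi)
{
	return v < lo ? lo : (v > hi ? hi : v);
}

/*
 * Exponent chosen by the quirk for a bInterval given in milliseconds.
 * One millisecond is 8 microframes (2^3), hence the "+ 3".
 */
static int quirk_interval_exponent(uint8_t binterval_ms)
{
	return clamp_int(fls_u32(binterval_ms) + 3, 1, 16);
}

/* Polling period in microframes (125 us each) for a USB 2.0 exponent n. */
static uint32_t exponent_to_uframes(int n)
{
	return UINT32_C(1) << (n - 1);
}
```

For bInterval = 10 the quirk yields exponent 7, i.e. 64 microframes or 8ms, which is the largest power of two not exceeding the intended 10ms. Read literally as a USB 2.0 exponent, 10 means 512 microframes, which is the 64ms described in the commit message.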

* [PATCH 4.4 29/76] USB: uss720: fix NULL-deref at probe
  2017-03-28 12:29 [PATCH 4.4 00/76] 4.4.58-stable review Greg Kroah-Hartman
                   ` (25 preceding siblings ...)
  2017-03-28 12:30 ` [PATCH 4.4 28/76] usb-core: Add LINEAR_FRAME_INTR_BINTERVAL USB quirk Greg Kroah-Hartman
@ 2017-03-28 12:30 ` Greg Kroah-Hartman
  2017-03-28 12:30 ` [PATCH 4.4 30/76] USB: lvtest: " Greg Kroah-Hartman
                   ` (48 subsequent siblings)
  75 siblings, 0 replies; 106+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-28 12:30 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Johan Hovold

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Johan Hovold <johan@kernel.org>

commit f259ca3eed6e4b79ac3d5c5c9fb259fb46e86217 upstream.

Make sure to check the number of endpoints to avoid dereferencing a
NULL-pointer or accessing memory beyond the endpoint array should a
malicious device lack the expected endpoints.

Note that the endpoint access that causes the NULL-deref is currently
only used for debugging purposes during probe so the oops only happens
when dynamic debugging is enabled. This means the driver could be
rewritten to continue to accept devices with only two endpoints, should
such devices exist.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/usb/misc/uss720.c |    5 +++++
 1 file changed, 5 insertions(+)

--- a/drivers/usb/misc/uss720.c
+++ b/drivers/usb/misc/uss720.c
@@ -711,6 +711,11 @@ static int uss720_probe(struct usb_inter
 
 	interface = intf->cur_altsetting;
 
+	if (interface->desc.bNumEndpoints < 3) {
+		usb_put_dev(usbdev);
+		return -ENODEV;
+	}
+
 	/*
 	 * Allocate parport interface 
 	 */


* [PATCH 4.4 30/76] USB: lvtest: fix NULL-deref at probe
  2017-03-28 12:29 [PATCH 4.4 00/76] 4.4.58-stable review Greg Kroah-Hartman
                   ` (26 preceding siblings ...)
  2017-03-28 12:30 ` [PATCH 4.4 29/76] USB: uss720: fix NULL-deref at probe Greg Kroah-Hartman
@ 2017-03-28 12:30 ` Greg Kroah-Hartman
  2017-03-28 12:30 ` [PATCH 4.4 31/76] USB: idmouse: " Greg Kroah-Hartman
                   ` (47 subsequent siblings)
  75 siblings, 0 replies; 106+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-28 12:30 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Pratyush Anand, Johan Hovold

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Johan Hovold <johan@kernel.org>

commit 1dc56c52d2484be09c7398a5207d6b11a4256be9 upstream.

Make sure to check the number of endpoints to avoid dereferencing a
NULL-pointer should the probed device lack endpoints.

Note that this driver does not bind to any devices by default.

Fixes: ce21bfe603b3 ("USB: Add LVS Test device driver")
Cc: Pratyush Anand <pratyush.anand@gmail.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/usb/misc/lvstest.c |    4 ++++
 1 file changed, 4 insertions(+)

--- a/drivers/usb/misc/lvstest.c
+++ b/drivers/usb/misc/lvstest.c
@@ -370,6 +370,10 @@ static int lvs_rh_probe(struct usb_inter
 
 	hdev = interface_to_usbdev(intf);
 	desc = intf->cur_altsetting;
+
+	if (desc->desc.bNumEndpoints < 1)
+		return -ENODEV;
+
 	endpoint = &desc->endpoint[0].desc;
 
 	/* valid only for SS root hub */


* [PATCH 4.4 31/76] USB: idmouse: fix NULL-deref at probe
  2017-03-28 12:29 [PATCH 4.4 00/76] 4.4.58-stable review Greg Kroah-Hartman
                   ` (27 preceding siblings ...)
  2017-03-28 12:30 ` [PATCH 4.4 30/76] USB: lvtest: " Greg Kroah-Hartman
@ 2017-03-28 12:30 ` Greg Kroah-Hartman
  2017-03-28 12:30 ` [PATCH 4.4 32/76] USB: wusbcore: " Greg Kroah-Hartman
                   ` (46 subsequent siblings)
  75 siblings, 0 replies; 106+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-28 12:30 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Johan Hovold

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Johan Hovold <johan@kernel.org>

commit b0addd3fa6bcd119be9428996d5d4522479ab240 upstream.

Make sure to check the number of endpoints to avoid dereferencing a
NULL-pointer should a malicious device lack endpoints.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/usb/misc/idmouse.c |    3 +++
 1 file changed, 3 insertions(+)

--- a/drivers/usb/misc/idmouse.c
+++ b/drivers/usb/misc/idmouse.c
@@ -346,6 +346,9 @@ static int idmouse_probe(struct usb_inte
 	if (iface_desc->desc.bInterfaceClass != 0x0A)
 		return -ENODEV;
 
+	if (iface_desc->desc.bNumEndpoints < 1)
+		return -ENODEV;
+
 	/* allocate memory for our device state and initialize it */
 	dev = kzalloc(sizeof(*dev), GFP_KERNEL);
 	if (dev == NULL)


* [PATCH 4.4 32/76] USB: wusbcore: fix NULL-deref at probe
  2017-03-28 12:29 [PATCH 4.4 00/76] 4.4.58-stable review Greg Kroah-Hartman
                   ` (28 preceding siblings ...)
  2017-03-28 12:30 ` [PATCH 4.4 31/76] USB: idmouse: " Greg Kroah-Hartman
@ 2017-03-28 12:30 ` Greg Kroah-Hartman
  2017-03-28 12:30 ` [PATCH 4.4 33/76] usb: musb: cppi41: dont check early-TX-interrupt for Isoch transfer Greg Kroah-Hartman
                   ` (45 subsequent siblings)
  75 siblings, 0 replies; 106+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-28 12:30 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Inaky Perez-Gonzalez, David Vrabel,
	Johan Hovold

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Johan Hovold <johan@kernel.org>

commit 03ace948a4eb89d1cf51c06afdfc41ebca5fdb27 upstream.

Make sure to check the number of endpoints to avoid dereferencing a
NULL-pointer or accessing memory beyond the endpoint array should a
malicious device lack the expected endpoints.

This specifically fixes the NULL-pointer dereference when probing HWA HC
devices.

Fixes: df3654236e31 ("wusb: add the Wire Adapter (WA) core")
Cc: Inaky Perez-Gonzalez <inaky.perez-gonzalez@intel.com>
Cc: David Vrabel <david.vrabel@csr.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/usb/wusbcore/wa-hc.c |    3 +++
 1 file changed, 3 insertions(+)

--- a/drivers/usb/wusbcore/wa-hc.c
+++ b/drivers/usb/wusbcore/wa-hc.c
@@ -39,6 +39,9 @@ int wa_create(struct wahc *wa, struct us
 	int result;
 	struct device *dev = &iface->dev;
 
+	if (iface->cur_altsetting->desc.bNumEndpoints < 3)
+		return -ENODEV;
+
 	result = wa_rpipes_create(wa);
 	if (result < 0)
 		goto error_rpipes_create;


* [PATCH 4.4 33/76] usb: musb: cppi41: dont check early-TX-interrupt for Isoch transfer
  2017-03-28 12:29 [PATCH 4.4 00/76] 4.4.58-stable review Greg Kroah-Hartman
                   ` (29 preceding siblings ...)
  2017-03-28 12:30 ` [PATCH 4.4 32/76] USB: wusbcore: " Greg Kroah-Hartman
@ 2017-03-28 12:30 ` Greg Kroah-Hartman
  2017-03-28 12:30 ` [PATCH 4.4 34/76] usb: hub: Fix crash after failure to read BOS descriptor Greg Kroah-Hartman
                   ` (44 subsequent siblings)
  75 siblings, 0 replies; 106+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-28 12:30 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Alexandre Bailon,
	Sebastian Andrzej Siewior, Bin Liu

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Bin Liu <b-liu@ti.com>

commit 0090114d336a9604aa2d90bc83f20f7cd121b76c upstream.

The CPPI 4.1 driver polls a register to work around the premature TX
interrupt issue, but this causes audio playback underruns when triggered in
Isoch transfers.

Isoch doesn't do back-to-back transfers, so the TX should be done by the
time the next transfer is scheduled. So skip this polling workaround for
Isoch transfers.

Fixes: a655f481d83d6 ("usb: musb: musb_cppi41: handle pre-mature TX complete interrupt")
Reported-by: Alexandre Bailon <abailon@baylibre.com>
Acked-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Tested-by: Alexandre Bailon <abailon@baylibre.com>
Signed-off-by: Bin Liu <b-liu@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/usb/musb/musb_cppi41.c |   23 +++++++++++++++++++++--
 1 file changed, 21 insertions(+), 2 deletions(-)

--- a/drivers/usb/musb/musb_cppi41.c
+++ b/drivers/usb/musb/musb_cppi41.c
@@ -250,8 +250,27 @@ static void cppi41_dma_callback(void *pr
 			transferred < cppi41_channel->packet_sz)
 		cppi41_channel->prog_len = 0;
 
-	if (cppi41_channel->is_tx)
-		empty = musb_is_tx_fifo_empty(hw_ep);
+	if (cppi41_channel->is_tx) {
+		u8 type;
+
+		if (is_host_active(musb))
+			type = hw_ep->out_qh->type;
+		else
+			type = hw_ep->ep_in.type;
+
+		if (type == USB_ENDPOINT_XFER_ISOC)
+			/*
+			 * Don't use the early-TX-interrupt workaround below
+			 * for Isoch transfter. Since Isoch are periodic
+			 * transfer, by the time the next transfer is
+			 * scheduled, the current one should be done already.
+			 *
+			 * This avoids audio playback underrun issue.
+			 */
+			empty = true;
+		else
+			empty = musb_is_tx_fifo_empty(hw_ep);
+	}
 
 	if (!cppi41_channel->is_tx || empty) {
 		cppi41_trans_done(cppi41_channel);


* [PATCH 4.4 34/76] usb: hub: Fix crash after failure to read BOS descriptor
  2017-03-28 12:29 [PATCH 4.4 00/76] 4.4.58-stable review Greg Kroah-Hartman
                   ` (30 preceding siblings ...)
  2017-03-28 12:30 ` [PATCH 4.4 33/76] usb: musb: cppi41: dont check early-TX-interrupt for Isoch transfer Greg Kroah-Hartman
@ 2017-03-28 12:30 ` Greg Kroah-Hartman
  2017-03-28 12:30 ` [PATCH 4.4 35/76] uwb: i1480-dfu: fix NULL-deref at probe Greg Kroah-Hartman
                   ` (43 subsequent siblings)
  75 siblings, 0 replies; 106+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-28 12:30 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Mathias Nyman, Guenter Roeck

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Guenter Roeck <linux@roeck-us.net>

commit 7b2db29fbb4e766fcd02207eb2e2087170bd6ebc upstream.

If usb_get_bos_descriptor() returns an error, udev->bos will be NULL.
Nevertheless, it is dereferenced unconditionally in
hub_set_initial_usb2_lpm_policy() if usb2_hw_lpm_capable is set.
This results in a crash.

usb 5-1: unable to get BOS descriptor
...
Unable to handle kernel NULL pointer dereference at virtual address 00000008
pgd = ffffffc00165f000
[00000008] *pgd=000000000174f003, *pud=000000000174f003,
		*pmd=0000000001750003, *pte=00e8000001751713
Internal error: Oops: 96000005 [#1] PREEMPT SMP
Modules linked in: uinput uvcvideo videobuf2_vmalloc cmac [ ... ]
CPU: 5 PID: 3353 Comm: kworker/5:3 Tainted: G    B 4.4.52 #480
Hardware name: Google Kevin (DT)
Workqueue: events driver_set_config_work
task: ffffffc0c3690000 ti: ffffffc0ae9a8000 task.ti: ffffffc0ae9a8000
PC is at hub_port_init+0xc3c/0xd10
LR is at hub_port_init+0xc3c/0xd10
...
Call trace:
[<ffffffc0007fbbfc>] hub_port_init+0xc3c/0xd10
[<ffffffc0007fbe2c>] usb_reset_and_verify_device+0x15c/0x82c
[<ffffffc0007fc5e0>] usb_reset_device+0xe4/0x298
[<ffffffbffc0e3fcc>] rtl8152_probe+0x84/0x9b0 [r8152]
[<ffffffc00080ca8c>] usb_probe_interface+0x244/0x2f8
[<ffffffc000774a24>] driver_probe_device+0x180/0x3b4
[<ffffffc000774e48>] __device_attach_driver+0xb4/0xe0
[<ffffffc000772168>] bus_for_each_drv+0xb4/0xe4
[<ffffffc0007747ec>] __device_attach+0xd0/0x158
[<ffffffc000775080>] device_initial_probe+0x24/0x30
[<ffffffc0007739d4>] bus_probe_device+0x50/0xe4
[<ffffffc000770bd0>] device_add+0x414/0x738
[<ffffffc000809fe8>] usb_set_configuration+0x89c/0x914
[<ffffffc00080a120>] driver_set_config_work+0xc0/0xf0
[<ffffffc000249bb8>] process_one_work+0x390/0x6b8
[<ffffffc00024abcc>] worker_thread+0x480/0x610
[<ffffffc000251a80>] kthread+0x164/0x178
[<ffffffc0002045d0>] ret_from_fork+0x10/0x40

Since we don't know anything about LPM capabilities without BOS descriptor,
don't attempt to enable LPM if it is not available.

Fixes: 890dae886721 ("xhci: Enable LPM support only for hardwired ...")
Cc: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Acked-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/usb/core/hub.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/usb/core/hub.c
+++ b/drivers/usb/core/hub.c
@@ -4199,7 +4199,7 @@ static void hub_set_initial_usb2_lpm_pol
 	struct usb_hub *hub = usb_hub_to_struct_hub(udev->parent);
 	int connect_type = USB_PORT_CONNECT_TYPE_UNKNOWN;
 
-	if (!udev->usb2_hw_lpm_capable)
+	if (!udev->usb2_hw_lpm_capable || !udev->bos)
 		return;
 
 	if (hub)
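
For readers who want to see the guard outside the kernel, here is a minimal
userspace sketch of the same early-return pattern. The struct and function
names (bos_model, udev_model, set_initial_lpm_policy) are hypothetical
stand-ins, not kernel APIs; only the shape of the condition is taken from
the hunk above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct bos_model {
	bool has_ext_cap;	/* models a usable capability inside the BOS */
};

struct udev_model {
	bool usb2_hw_lpm_capable;
	struct bos_model *bos;	/* NULL when the BOS descriptor could not be read */
	bool lpm_allowed;
};

/* Returns true if the policy code ran, false if the device was skipped. */
bool set_initial_lpm_policy(struct udev_model *udev)
{
	assert(udev != NULL);

	/* Same shape as the patched guard: bail out before touching bos. */
	if (!udev->usb2_hw_lpm_capable || !udev->bos)
		return false;

	/* Only reached with a valid bos pointer. */
	if (udev->bos->has_ext_cap)
		udev->lpm_allowed = true;
	return true;
}
```

With bos set to NULL the function returns before any dereference, which is
the behaviour the one-line fix is after.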


* [PATCH 4.4 35/76] uwb: i1480-dfu: fix NULL-deref at probe
  2017-03-28 12:29 [PATCH 4.4 00/76] 4.4.58-stable review Greg Kroah-Hartman
                   ` (31 preceding siblings ...)
  2017-03-28 12:30 ` [PATCH 4.4 34/76] usb: hub: Fix crash after failure to read BOS descriptor Greg Kroah-Hartman
@ 2017-03-28 12:30 ` Greg Kroah-Hartman
  2017-03-28 12:30 ` [PATCH 4.4 36/76] uwb: hwa-rc: " Greg Kroah-Hartman
                   ` (42 subsequent siblings)
  75 siblings, 0 replies; 106+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-28 12:30 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Inaky Perez-Gonzalez, David Vrabel,
	Johan Hovold

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Johan Hovold <johan@kernel.org>

commit 4ce362711d78a4999011add3115b8f4b0bc25e8c upstream.

Make sure to check the number of endpoints to avoid dereferencing a
NULL-pointer should a malicious device lack endpoints.

Note that the dereference happens in the cmd and wait_init_done
callbacks which are called during probe.

Fixes: 1ba47da52712 ("uwb: add the i1480 DFU driver")
Cc: Inaky Perez-Gonzalez <inaky.perez-gonzalez@intel.com>
Cc: David Vrabel <david.vrabel@csr.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/uwb/i1480/dfu/usb.c |    3 +++
 1 file changed, 3 insertions(+)

--- a/drivers/uwb/i1480/dfu/usb.c
+++ b/drivers/uwb/i1480/dfu/usb.c
@@ -362,6 +362,9 @@ int i1480_usb_probe(struct usb_interface
 				 result);
 	}
 
+	if (iface->cur_altsetting->desc.bNumEndpoints < 1)
+		return -ENODEV;
+
 	result = -ENOMEM;
 	i1480_usb = kzalloc(sizeof(*i1480_usb), GFP_KERNEL);
 	if (i1480_usb == NULL) {
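
The endpoint-count check used here (and in the two following patches) can be
illustrated with a small userspace model. Everything below is a sketch under
assumptions: altsetting_model and probe_model are hypothetical names, and the
real drivers look at iface->cur_altsetting->desc.bNumEndpoints as shown in
the hunk.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of an interface altsetting. */
struct altsetting_model {
	uint8_t bNumEndpoints;
	const uint8_t *endpoint_addrs;	/* NULL when the device has no endpoints */
};

/*
 * Refuse to bind when there is no endpoint, so that later code can
 * index endpoint 0 without dereferencing a NULL pointer.
 */
int probe_model(const struct altsetting_model *alt, uint8_t *first_ep_addr)
{
	assert(alt != NULL && first_ep_addr != NULL);

	if (alt->bNumEndpoints < 1)
		return -ENODEV;

	*first_ep_addr = alt->endpoint_addrs[0];
	return 0;
}
```

Doing the check at the top of probe, before any allocation, means the error
path needs no cleanup.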


* [PATCH 4.4 36/76] uwb: hwa-rc: fix NULL-deref at probe
  2017-03-28 12:29 [PATCH 4.4 00/76] 4.4.58-stable review Greg Kroah-Hartman
                   ` (32 preceding siblings ...)
  2017-03-28 12:30 ` [PATCH 4.4 35/76] uwb: i1480-dfu: fix NULL-deref at probe Greg Kroah-Hartman
@ 2017-03-28 12:30 ` Greg Kroah-Hartman
  2017-03-28 12:30 ` [PATCH 4.4 37/76] mmc: ushc: " Greg Kroah-Hartman
                   ` (41 subsequent siblings)
  75 siblings, 0 replies; 106+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-28 12:30 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Inaky Perez-Gonzalez, David Vrabel,
	Johan Hovold

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Johan Hovold <johan@kernel.org>

commit daf229b15907fbfdb6ee183aac8ca428cb57e361 upstream.

Make sure to check the number of endpoints to avoid dereferencing a
NULL-pointer should a malicious device lack endpoints.

Note that the dereference happens in the start callback which is called
during probe.

Fixes: de520b8bd552 ("uwb: add HWA radio controller driver")
Cc: Inaky Perez-Gonzalez <inaky.perez-gonzalez@intel.com>
Cc: David Vrabel <david.vrabel@csr.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/uwb/hwa-rc.c |    3 +++
 1 file changed, 3 insertions(+)

--- a/drivers/uwb/hwa-rc.c
+++ b/drivers/uwb/hwa-rc.c
@@ -825,6 +825,9 @@ static int hwarc_probe(struct usb_interf
 	struct hwarc *hwarc;
 	struct device *dev = &iface->dev;
 
+	if (iface->cur_altsetting->desc.bNumEndpoints < 1)
+		return -ENODEV;
+
 	result = -ENOMEM;
 	uwb_rc = uwb_rc_alloc();
 	if (uwb_rc == NULL) {


* [PATCH 4.4 37/76] mmc: ushc: fix NULL-deref at probe
  2017-03-28 12:29 [PATCH 4.4 00/76] 4.4.58-stable review Greg Kroah-Hartman
                   ` (33 preceding siblings ...)
  2017-03-28 12:30 ` [PATCH 4.4 36/76] uwb: hwa-rc: " Greg Kroah-Hartman
@ 2017-03-28 12:30 ` Greg Kroah-Hartman
  2017-03-28 12:30 ` [PATCH 4.4 38/76] iio: adc: ti_am335x_adc: fix fifo overrun recovery Greg Kroah-Hartman
                   ` (40 subsequent siblings)
  75 siblings, 0 replies; 106+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-28 12:30 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, David Vrabel, Johan Hovold, Ulf Hansson

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Johan Hovold <johan@kernel.org>

commit 181302dc7239add8ab1449c23ecab193f52ee6ab upstream.

Make sure to check the number of endpoints to avoid dereferencing a
NULL-pointer should a malicious device lack endpoints.

Fixes: 53f3a9e26ed5 ("mmc: USB SD Host Controller (USHC) driver")
Cc: David Vrabel <david.vrabel@csr.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/mmc/host/ushc.c |    3 +++
 1 file changed, 3 insertions(+)

--- a/drivers/mmc/host/ushc.c
+++ b/drivers/mmc/host/ushc.c
@@ -426,6 +426,9 @@ static int ushc_probe(struct usb_interfa
 	struct ushc_data *ushc;
 	int ret;
 
+	if (intf->cur_altsetting->desc.bNumEndpoints < 1)
+		return -ENODEV;
+
 	mmc = mmc_alloc_host(sizeof(struct ushc_data), &intf->dev);
 	if (mmc == NULL)
 		return -ENOMEM;


* [PATCH 4.4 38/76] iio: adc: ti_am335x_adc: fix fifo overrun recovery
  2017-03-28 12:29 [PATCH 4.4 00/76] 4.4.58-stable review Greg Kroah-Hartman
                   ` (34 preceding siblings ...)
  2017-03-28 12:30 ` [PATCH 4.4 37/76] mmc: ushc: " Greg Kroah-Hartman
@ 2017-03-28 12:30 ` Greg Kroah-Hartman
  2017-03-28 12:30 ` [PATCH 4.4 39/76] iio: hid-sensor-trigger: Change get poll value function order to avoid sensor properties losing after resume from S3 Greg Kroah-Hartman
                   ` (39 subsequent siblings)
  75 siblings, 0 replies; 106+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-28 12:30 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Michael Engl, Jonathan Cameron

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Michael Engl <michael.engl@wjw-solutions.com>

commit e83bb3e6f3efa21f4a9d883a25d0ecd9dfb431e1 upstream.

The tiadc_irq_h(int irq, void *private) function handles FIFO
overruns by clearing flags, then disabling and re-enabling the ADC
to recover.

If the ADC is running in continuous mode, FIFO overruns happen
regularly. If the ADC is disabled while a new conversion is
starting, the hardware may ignore the subsequent enable. This
stops the ADC permanently, and no more interrupts are triggered.

According to the AM335x Reference Manual (SPRUH73H October 2011 -
Revised April 2013 - Chapter 12.4 and 12.5), it is necessary to
check the ADC FSM bits in REG_ADCFSM before enabling the ADC
again, because the ADC is only actually disabled once the
current conversion has finished.

To trigger this bug it is necessary to run the ADC in continuous
mode. The ADC values of all channels need to be read in an endless
loop. The bug appears within the first 6 hours (~5.4 million
handled FIFO overruns). The user space application will hang on
reading new values from the character device.

Fixes: ca9a563805f7a ("iio: ti_am335x_adc: Add continuous sampling support")
Signed-off-by: Michael Engl <michael.engl@wjw-solutions.com>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/iio/adc/ti_am335x_adc.c |   13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

--- a/drivers/iio/adc/ti_am335x_adc.c
+++ b/drivers/iio/adc/ti_am335x_adc.c
@@ -151,7 +151,9 @@ static irqreturn_t tiadc_irq_h(int irq,
 {
 	struct iio_dev *indio_dev = private;
 	struct tiadc_device *adc_dev = iio_priv(indio_dev);
-	unsigned int status, config;
+	unsigned int status, config, adc_fsm;
+	unsigned short count = 0;
+
 	status = tiadc_readl(adc_dev, REG_IRQSTATUS);
 
 	/*
@@ -165,6 +167,15 @@ static irqreturn_t tiadc_irq_h(int irq,
 		tiadc_writel(adc_dev, REG_CTRL, config);
 		tiadc_writel(adc_dev, REG_IRQSTATUS, IRQENB_FIFO1OVRRUN
 				| IRQENB_FIFO1UNDRFLW | IRQENB_FIFO1THRES);
+
+		/* wait for idle state.
+		 * ADC needs to finish the current conversion
+		 * before disabling the module
+		 */
+		do {
+			adc_fsm = tiadc_readl(adc_dev, REG_ADCFSM);
+		} while (adc_fsm != 0x10 && count++ < 100);
+
 		tiadc_writel(adc_dev, REG_CTRL, (config | CNTRLREG_TSCSSENB));
 		return IRQ_HANDLED;
 	} else if (status & IRQENB_FIFO1THRES) {
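
The bounded wait added above can be modelled in userspace as follows. This is
a sketch, not the driver code: read_fsm_fn, wait_for_idle and fake_fsm are
hypothetical names, the register read is replaced by a callback, and unlike
the patch this variant reports a timeout to the caller. The idle value 0x10
and the bound of 100 are taken from the hunk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define FSM_IDLE	0x10u	/* idle value the patch compares against */
#define MAX_POLLS	100u	/* upper bound on register reads in this sketch */

/* Hypothetical register-read callback; a driver would read REG_ADCFSM. */
typedef unsigned int (*read_fsm_fn)(void *ctx);

/* Poll until the state machine reports idle, giving up after MAX_POLLS reads. */
bool wait_for_idle(read_fsm_fn read_fsm, void *ctx)
{
	unsigned int polls;

	assert(read_fsm != NULL);

	for (polls = 0; polls < MAX_POLLS; polls++) {
		if (read_fsm(ctx) == FSM_IDLE)
			return true;
	}
	return false;	/* still busy; the caller decides what to do */
}

/* Fake hardware: reports busy for *ctx more reads, then idle. */
unsigned int fake_fsm(void *ctx)
{
	unsigned int *remaining = ctx;

	if (*remaining == 0)
		return FSM_IDLE;
	(*remaining)--;
	return 0x00u;
}
```

The bound matters because the loop runs in interrupt context: a stuck state
machine must not turn into an endless spin.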


* [PATCH 4.4 39/76] iio: hid-sensor-trigger: Change get poll value function order to avoid sensor properties losing after resume from S3
  2017-03-28 12:29 [PATCH 4.4 00/76] 4.4.58-stable review Greg Kroah-Hartman
                   ` (35 preceding siblings ...)
  2017-03-28 12:30 ` [PATCH 4.4 38/76] iio: adc: ti_am335x_adc: fix fifo overrun recovery Greg Kroah-Hartman
@ 2017-03-28 12:30 ` Greg Kroah-Hartman
  2017-03-28 12:30 ` [PATCH 4.4 40/76] parport: fix attempt to write duplicate procfiles Greg Kroah-Hartman
                   ` (38 subsequent siblings)
  75 siblings, 0 replies; 106+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-28 12:30 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Song Hongyan, Srinivas Pandruvada,
	Jonathan Cameron

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Song Hongyan <hongyan.song@intel.com>

commit 3bec247474469f769af41e8c80d3a100dd97dd76 upstream.

In _hid_sensor_power_state(), when hid_sensor_read_poll_value() is
called, all of the sensor's properties are updated with the values
read from the sensor hardware/firmware.
In some implementations, the sensor hardware/firmware does a power
cycle during S3. In that case, after resume, once
hid_sensor_read_poll_value() is called, all of the sensor's properties
that the driver kept during S3 are reset to their default values.
If a set feature function is called first instead, the sensor
hardware/firmware is restored to its last state. So move the
hid_sensor_read_poll_value() call after the set feature calls to
avoid losing the sensor properties.

Signed-off-by: Song Hongyan <hongyan.song@intel.com>
Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/iio/common/hid-sensors/hid-sensor-trigger.c |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

--- a/drivers/iio/common/hid-sensors/hid-sensor-trigger.c
+++ b/drivers/iio/common/hid-sensors/hid-sensor-trigger.c
@@ -51,8 +51,6 @@ static int _hid_sensor_power_state(struc
 			st->report_state.report_id,
 			st->report_state.index,
 			HID_USAGE_SENSOR_PROP_REPORTING_STATE_ALL_EVENTS_ENUM);
-
-		poll_value = hid_sensor_read_poll_value(st);
 	} else {
 		int val;
 
@@ -89,7 +87,9 @@ static int _hid_sensor_power_state(struc
 	sensor_hub_get_feature(st->hsdev, st->power_state.report_id,
 			       st->power_state.index,
 			       sizeof(state_val), &state_val);
-	if (state && poll_value)
+	if (state)
+		poll_value = hid_sensor_read_poll_value(st);
+	if (poll_value > 0)
 		msleep_interruptible(poll_value * 2);
 
 	return 0;


* [PATCH 4.4 40/76] parport: fix attempt to write duplicate procfiles
  2017-03-28 12:29 [PATCH 4.4 00/76] 4.4.58-stable review Greg Kroah-Hartman
                   ` (36 preceding siblings ...)
  2017-03-28 12:30 ` [PATCH 4.4 39/76] iio: hid-sensor-trigger: Change get poll value function order to avoid sensor properties losing after resume from S3 Greg Kroah-Hartman
@ 2017-03-28 12:30 ` Greg Kroah-Hartman
  2017-03-28 12:30 ` [PATCH 4.4 41/76] ext4: mark inode dirty after converting inline directory Greg Kroah-Hartman
                   ` (37 subsequent siblings)
  75 siblings, 0 replies; 106+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-28 12:30 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, James Feeney, Sudip Mukherjee

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Sudip Mukherjee <sudipm.mukherjee@gmail.com>

commit 03270c6ac6207fc55bbf9d20d195029dca210c79 upstream.

Usually every parallel port will have a single pardev registered with
it. But the ppdev driver is an exception. This userspace parallel port
driver allows creating multiple parallel port devices for a single
parallel port. As a result, we were getting a warning like:
"sysctl table check failed:
/dev/parport/parport0/devices/ppdev0/timeslice Sysctl already exists"

Use the same logic as used in parport_register_device() and register
the proc files only once for each parallel port.

Fixes: 6fa45a226897 ("parport: add device-model to parport subsystem")
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1414656
Bugzilla: https://bugs.archlinux.org/task/52322
Tested-by: James Feeney <james@nurealm.net>
Signed-off-by: Sudip Mukherjee <sudip.mukherjee@codethink.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/parport/share.c |    6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

--- a/drivers/parport/share.c
+++ b/drivers/parport/share.c
@@ -936,8 +936,10 @@ parport_register_dev_model(struct parpor
 	 * pardevice fields. -arca
 	 */
 	port->ops->init_state(par_dev, par_dev->state);
-	port->proc_device = par_dev;
-	parport_device_proc_register(par_dev);
+	if (!test_and_set_bit(PARPORT_DEVPROC_REGISTERED, &port->devflags)) {
+		port->proc_device = par_dev;
+		parport_device_proc_register(par_dev);
+	}
 
 	return par_dev;
 
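
The register-once idea can be sketched in portable C11. This is only a model:
port_model, model_test_and_set_bit and register_device are hypothetical
names, the bit index is an assumption, and C11 atomic_fetch_or() stands in
for the kernel's test_and_set_bit(), which likewise returns the previous
value of the bit.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define DEVPROC_REGISTERED_BIT	0u	/* hypothetical bit index */

/* Hypothetical model of a port with a flags word and a registration counter. */
struct port_model {
	atomic_ulong devflags;
	int proc_registrations;
};

/* Atomically set bit nr and return whether it was already set. */
bool model_test_and_set_bit(unsigned int nr, atomic_ulong *addr)
{
	unsigned long mask = 1ul << nr;

	return (atomic_fetch_or(addr, mask) & mask) != 0;
}

/* Only the first caller per port performs the proc registration. */
void register_device(struct port_model *port)
{
	assert(port != NULL);

	if (!model_test_and_set_bit(DEVPROC_REGISTERED_BIT, &port->devflags))
		port->proc_registrations++;
}
```

Because the test and the set happen in one atomic step, two devices
registering at the same time cannot both see the bit as clear.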


* [PATCH 4.4 41/76] ext4: mark inode dirty after converting inline directory
  2017-03-28 12:29 [PATCH 4.4 00/76] 4.4.58-stable review Greg Kroah-Hartman
                   ` (37 preceding siblings ...)
  2017-03-28 12:30 ` [PATCH 4.4 40/76] parport: fix attempt to write duplicate procfiles Greg Kroah-Hartman
@ 2017-03-28 12:30 ` Greg Kroah-Hartman
  2017-03-28 12:30 ` [PATCH 4.4 42/76] mmc: sdhci: Do not disable interrupts while waiting for clock Greg Kroah-Hartman
                   ` (36 subsequent siblings)
  75 siblings, 0 replies; 106+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-28 12:30 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Eric Biggers, Theodore Tso

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Eric Biggers <ebiggers@google.com>

commit b9cf625d6ecde0d372e23ae022feead72b4228a6 upstream.

If ext4_convert_inline_data() was called on a directory with inline
data, the filesystem was left in an inconsistent state (as considered by
e2fsck) because the file size was not increased to cover the new block.
This happened because the inode was not marked dirty after i_disksize
was updated.  Fix this by marking the inode dirty at the end of
ext4_finish_convert_inline_dir().

This bug was probably not noticed before because most users mark the
inode dirty afterwards for other reasons.  But if userspace executed
FS_IOC_SET_ENCRYPTION_POLICY with invalid parameters, as exercised by
'kvm-xfstests -c adv generic/396', then the inode was never marked dirty
after updating i_disksize.

Fixes: 3c47d54170b6a678875566b1b8d6dcf57904e49b
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 fs/ext4/inline.c |    5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

--- a/fs/ext4/inline.c
+++ b/fs/ext4/inline.c
@@ -1158,10 +1158,9 @@ static int ext4_finish_convert_inline_di
 	set_buffer_uptodate(dir_block);
 	err = ext4_handle_dirty_dirent_node(handle, inode, dir_block);
 	if (err)
-		goto out;
+		return err;
 	set_buffer_verified(dir_block);
-out:
-	return err;
+	return ext4_mark_inode_dirty(handle, inode);
 }
 
 static int ext4_convert_inline_data_nolock(handle_t *handle,


* [PATCH 4.4 42/76] mmc: sdhci: Do not disable interrupts while waiting for clock
  2017-03-28 12:29 [PATCH 4.4 00/76] 4.4.58-stable review Greg Kroah-Hartman
                   ` (38 preceding siblings ...)
  2017-03-28 12:30 ` [PATCH 4.4 41/76] ext4: mark inode dirty after converting inline directory Greg Kroah-Hartman
@ 2017-03-28 12:30 ` Greg Kroah-Hartman
  2017-04-04 16:50   ` Ben Hutchings
  2017-03-28 12:30 ` [PATCH 4.4 43/76] xen/acpi: upload PM state from init-domain to Xen Greg Kroah-Hartman
                   ` (35 subsequent siblings)
  75 siblings, 1 reply; 106+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-28 12:30 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Adrian Hunter, Ulf Hansson,
	Ludovic Desroches

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Adrian Hunter <adrian.hunter@intel.com>

commit e2ebfb2142acefecc2496e71360f50d25726040b upstream.

Disabling interrupts for even a millisecond can cause problems for some
devices. That can happen when sdhci changes clock frequency because it
waits for the clock to become stable under a spin lock.

The spin lock is not necessary here. Anything that is racing with changes
to the I/O state is already broken. The mmc core already provides
synchronization via "claiming" the host.

Although the spin lock probably should be removed from the code paths that
lead to this point, such a patch would touch too much code to be suitable
for stable trees. Consequently, for this patch, just drop the spin lock
while waiting.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Tested-by: Ludovic Desroches <ludovic.desroches@microchip.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/mmc/host/sdhci.c |    4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

--- a/drivers/mmc/host/sdhci.c
+++ b/drivers/mmc/host/sdhci.c
@@ -1274,7 +1274,9 @@ clock_set:
 			return;
 		}
 		timeout--;
-		mdelay(1);
+		spin_unlock_irq(&host->lock);
+		usleep_range(900, 1100);
+		spin_lock_irq(&host->lock);
 	}
 
 	clk |= SDHCI_CLOCK_CARD_EN;


* [PATCH 4.4 43/76] xen/acpi: upload PM state from init-domain to Xen
  2017-03-28 12:29 [PATCH 4.4 00/76] 4.4.58-stable review Greg Kroah-Hartman
                   ` (39 preceding siblings ...)
  2017-03-28 12:30 ` [PATCH 4.4 42/76] mmc: sdhci: Do not disable interrupts while waiting for clock Greg Kroah-Hartman
@ 2017-03-28 12:30 ` Greg Kroah-Hartman
  2017-03-28 12:30 ` [PATCH 4.4 44/76] iommu/vt-d: Fix NULL pointer dereference in device_to_iommu Greg Kroah-Hartman
                   ` (34 subsequent siblings)
  75 siblings, 0 replies; 106+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-28 12:30 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Stanislaw Gruszka,
	Konrad Rzeszutek Wilk, Ankur Arora, Boris Ostrovsky

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Ankur Arora <ankur.a.arora@oracle.com>

commit 1914f0cd203c941bba72f9452c8290324f1ef3dc upstream.

This was broken in commit cd979883b9ed ("xen/acpi-processor:
fix enabling interrupts on syscore_resume"). do_suspend (from
xen/manage.c), and thus xen_resume_notifier, never get called on
the initial domain at resume (they are when running as a guest).

The rationale for the breaking change was that upload_pm_data()
potentially does blocking work in syscore_resume(). This patch
addresses the original issue by scheduling upload_pm_data() to
execute in workqueue context.

Cc: Stanislaw Gruszka <sgruszka@redhat.com>
Based-on-patch-by: Konrad Wilk <konrad.wilk@oracle.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/xen/xen-acpi-processor.c |   34 ++++++++++++++++++++++++++--------
 1 file changed, 26 insertions(+), 8 deletions(-)

--- a/drivers/xen/xen-acpi-processor.c
+++ b/drivers/xen/xen-acpi-processor.c
@@ -27,10 +27,10 @@
 #include <linux/init.h>
 #include <linux/module.h>
 #include <linux/types.h>
+#include <linux/syscore_ops.h>
 #include <linux/acpi.h>
 #include <acpi/processor.h>
 #include <xen/xen.h>
-#include <xen/xen-ops.h>
 #include <xen/interface/platform.h>
 #include <asm/xen/hypercall.h>
 
@@ -466,15 +466,33 @@ static int xen_upload_processor_pm_data(
 	return rc;
 }
 
-static int xen_acpi_processor_resume(struct notifier_block *nb,
-				     unsigned long action, void *data)
+static void xen_acpi_processor_resume_worker(struct work_struct *dummy)
 {
+	int rc;
+
 	bitmap_zero(acpi_ids_done, nr_acpi_bits);
-	return xen_upload_processor_pm_data();
+
+	rc = xen_upload_processor_pm_data();
+	if (rc != 0)
+		pr_info("ACPI data upload failed, error = %d\n", rc);
+}
+
+static void xen_acpi_processor_resume(void)
+{
+	static DECLARE_WORK(wq, xen_acpi_processor_resume_worker);
+
+	/*
+	 * xen_upload_processor_pm_data() calls non-atomic code.
+	 * However, the context for xen_acpi_processor_resume is syscore
+	 * with only the boot CPU online and in an atomic context.
+	 *
+	 * So defer the upload for some point safer.
+	 */
+	schedule_work(&wq);
 }
 
-struct notifier_block xen_acpi_processor_resume_nb = {
-	.notifier_call = xen_acpi_processor_resume,
+static struct syscore_ops xap_syscore_ops = {
+	.resume	= xen_acpi_processor_resume,
 };
 
 static int __init xen_acpi_processor_init(void)
@@ -527,7 +545,7 @@ static int __init xen_acpi_processor_ini
 	if (rc)
 		goto err_unregister;
 
-	xen_resume_notifier_register(&xen_acpi_processor_resume_nb);
+	register_syscore_ops(&xap_syscore_ops);
 
 	return 0;
 err_unregister:
@@ -544,7 +562,7 @@ static void __exit xen_acpi_processor_ex
 {
 	int i;
 
-	xen_resume_notifier_unregister(&xen_acpi_processor_resume_nb);
+	unregister_syscore_ops(&xap_syscore_ops);
 	kfree(acpi_ids_done);
 	kfree(acpi_id_present);
 	kfree(acpi_id_cst_present);


* [PATCH 4.4 44/76] iommu/vt-d: Fix NULL pointer dereference in device_to_iommu
  2017-03-28 12:29 [PATCH 4.4 00/76] 4.4.58-stable review Greg Kroah-Hartman
                   ` (40 preceding siblings ...)
  2017-03-28 12:30 ` [PATCH 4.4 43/76] xen/acpi: upload PM state from init-domain to Xen Greg Kroah-Hartman
@ 2017-03-28 12:30 ` Greg Kroah-Hartman
  2017-03-28 12:30 ` [PATCH 4.4 45/76] ARM: at91: pm: cpu_idle: switch DDR to power-down mode Greg Kroah-Hartman
                   ` (33 subsequent siblings)
  75 siblings, 0 replies; 106+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-28 12:30 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Koos Vriezen, Joerg Roedel

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Koos Vriezen <koos.vriezen@gmail.com>

commit 5003ae1e735e6bfe4679d9bed6846274f322e77e upstream.

The function device_to_iommu() in the Intel VT-d driver
lacks a NULL-ptr check, resulting in this oops at boot on
some platforms:

 BUG: unable to handle kernel NULL pointer dereference at 00000000000007ab
 IP: [<ffffffff8132234a>] device_to_iommu+0x11a/0x1a0
 PGD 0

 [...]

 Call Trace:
   ? find_or_alloc_domain.constprop.29+0x1a/0x300
   ? dw_dma_probe+0x561/0x580 [dw_dmac_core]
   ? __get_valid_domain_for_dev+0x39/0x120
   ? __intel_map_single+0x138/0x180
   ? intel_alloc_coherent+0xb6/0x120
   ? sst_hsw_dsp_init+0x173/0x420 [snd_soc_sst_haswell_pcm]
   ? mutex_lock+0x9/0x30
   ? kernfs_add_one+0xdb/0x130
   ? devres_add+0x19/0x60
   ? hsw_pcm_dev_probe+0x46/0xd0 [snd_soc_sst_haswell_pcm]
   ? platform_drv_probe+0x30/0x90
   ? driver_probe_device+0x1ed/0x2b0
   ? __driver_attach+0x8f/0xa0
   ? driver_probe_device+0x2b0/0x2b0
   ? bus_for_each_dev+0x55/0x90
   ? bus_add_driver+0x110/0x210
   ? 0xffffffffa11ea000
   ? driver_register+0x52/0xc0
   ? 0xffffffffa11ea000
   ? do_one_initcall+0x32/0x130
   ? free_vmap_area_noflush+0x37/0x70
   ? kmem_cache_alloc+0x88/0xd0
   ? do_init_module+0x51/0x1c4
   ? load_module+0x1ee9/0x2430
   ? show_taint+0x20/0x20
   ? kernel_read_file+0xfd/0x190
   ? SyS_finit_module+0xa3/0xb0
   ? do_syscall_64+0x4a/0xb0
   ? entry_SYSCALL64_slow_path+0x25/0x25
 Code: 78 ff ff ff 4d 85 c0 74 ee 49 8b 5a 10 0f b6 9b e0 00 00 00 41 38 98 e0 00 00 00 77 da 0f b6 eb 49 39 a8 88 00 00 00 72 ce eb 8f <41> f6 82 ab 07 00 00 04 0f 85 76 ff ff ff 0f b6 4d 08 88 0e 49
 RIP  [<ffffffff8132234a>] device_to_iommu+0x11a/0x1a0
  RSP <ffffc90001457a78>
 CR2: 00000000000007ab
 ---[ end trace 16f974b6d58d0aad ]---

Add the missing pointer check.

Fixes: 1c387188c60f53b338c20eee32db055dfe022a9b ("iommu/vt-d: Fix IOMMU lookup for SR-IOV Virtual Functions")
Signed-off-by: Koos Vriezen <koos.vriezen@gmail.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/iommu/intel-iommu.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -908,7 +908,7 @@ static struct intel_iommu *device_to_iom
 				 * which we used for the IOMMU lookup. Strictly speaking
 				 * we could do this for all PCI devices; we only need to
 				 * get the BDF# from the scope table for ACPI matches. */
-				if (pdev->is_virtfn)
+				if (pdev && pdev->is_virtfn)
 					goto got_pdev;
 
 				*bus = drhd->devices[i].bus;


* [PATCH 4.4 45/76] ARM: at91: pm: cpu_idle: switch DDR to power-down mode
  2017-03-28 12:29 [PATCH 4.4 00/76] 4.4.58-stable review Greg Kroah-Hartman
                   ` (41 preceding siblings ...)
  2017-03-28 12:30 ` [PATCH 4.4 44/76] iommu/vt-d: Fix NULL pointer dereference in device_to_iommu Greg Kroah-Hartman
@ 2017-03-28 12:30 ` Greg Kroah-Hartman
  2017-03-28 12:30 ` [PATCH 4.4 46/76] ARM: dts: at91: sama5d2: add dma properties to UART nodes Greg Kroah-Hartman
                   ` (32 subsequent siblings)
  75 siblings, 0 replies; 106+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-28 12:30 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Nicolas Ferre, Alexandre Belloni

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Nicolas Ferre <nicolas.ferre@microchip.com>

commit 60b89f1928af80b546b5c3fd8714a62f6f4b8844 upstream.

On some DDR controllers, compatible with the sama5d3 one,
the sequence to enter/exit/re-enter the self-refresh mode adds
more constraints than what is currently written in the at91_idle
driver. An actual access to the DDR chip is needed between exit
and re-entry of this mode, which is somewhat difficult to implement.
This sequence can completely hang the SoC. It is particularly
experienced on parts which embed an L2 cache if the code run
between IDLE calls fits in it...

Moreover, as the intention is to enter and exit pretty rapidly
from IDLE, the power-down mode is a good candidate.

So now we use power-down instead of self-refresh. As we can
simplify the code for sama5d3 compatible DDR controllers,
we instantiate a new sama5d3_ddr_standby() function.

Signed-off-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Fixes: 017b5522d5e3 ("ARM: at91: Add new binding for sama5d3-ddramc")
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/arm/mach-at91/pm.c |   18 +++++++++++++++++-
 1 file changed, 17 insertions(+), 1 deletion(-)

--- a/arch/arm/mach-at91/pm.c
+++ b/arch/arm/mach-at91/pm.c
@@ -286,6 +286,22 @@ static void at91_ddr_standby(void)
 		at91_ramc_write(1, AT91_DDRSDRC_LPR, saved_lpr1);
 }
 
+static void sama5d3_ddr_standby(void)
+{
+	u32 lpr0;
+	u32 saved_lpr0;
+
+	saved_lpr0 = at91_ramc_read(0, AT91_DDRSDRC_LPR);
+	lpr0 = saved_lpr0 & ~AT91_DDRSDRC_LPCB;
+	lpr0 |= AT91_DDRSDRC_LPCB_POWER_DOWN;
+
+	at91_ramc_write(0, AT91_DDRSDRC_LPR, lpr0);
+
+	cpu_do_idle();
+
+	at91_ramc_write(0, AT91_DDRSDRC_LPR, saved_lpr0);
+}
+
 /* We manage both DDRAM/SDRAM controllers, we need more than one value to
  * remember.
  */
@@ -320,7 +336,7 @@ static const struct of_device_id const r
 	{ .compatible = "atmel,at91rm9200-sdramc", .data = at91rm9200_standby },
 	{ .compatible = "atmel,at91sam9260-sdramc", .data = at91sam9_sdram_standby },
 	{ .compatible = "atmel,at91sam9g45-ddramc", .data = at91_ddr_standby },
-	{ .compatible = "atmel,sama5d3-ddramc", .data = at91_ddr_standby },
+	{ .compatible = "atmel,sama5d3-ddramc", .data = sama5d3_ddr_standby },
 	{ /*sentinel*/ }
 };
 
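
The register manipulation in sama5d3_ddr_standby() is a plain
read-modify-write of a bit field: clear the low-power field, then select
power-down. A userspace sketch of that arithmetic follows. The mask and
field values below are assumptions chosen for illustration (a two-bit field
at bit 0, with 2 meaning power-down); check the SoC header before relying on
them. lpr_select_power_down is a hypothetical helper name.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: two-bit low-power field in bits 1:0 of the LPR register. */
#define LPCB_MASK	(3u << 0)
#define LPCB_POWER_DOWN	2u

/* Return the LPR value with only the low-power field switched to power-down. */
uint32_t lpr_select_power_down(uint32_t saved_lpr)
{
	uint32_t lpr;

	/* The new field value must fit inside the mask. */
	assert((LPCB_POWER_DOWN & ~LPCB_MASK) == 0);

	lpr = saved_lpr & ~LPCB_MASK;	/* clear the old mode */
	lpr |= LPCB_POWER_DOWN;		/* select power-down */
	return lpr;
}
```

All other bits of the saved value pass through unchanged, which is why the
function in the patch can restore the register by writing saved_lpr0 back
after cpu_do_idle().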


* [PATCH 4.4 46/76] ARM: dts: at91: sama5d2: add dma properties to UART nodes
  2017-03-28 12:29 [PATCH 4.4 00/76] 4.4.58-stable review Greg Kroah-Hartman
                   ` (42 preceding siblings ...)
  2017-03-28 12:30 ` [PATCH 4.4 45/76] ARM: at91: pm: cpu_idle: switch DDR to power-down mode Greg Kroah-Hartman
@ 2017-03-28 12:30 ` Greg Kroah-Hartman
  2017-03-28 12:30 ` [PATCH 4.4 47/76] cpufreq: Restore policy min/max limits on CPU online Greg Kroah-Hartman
                   ` (31 subsequent siblings)
  75 siblings, 0 replies; 106+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-28 12:30 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Nicolas Ferre, Alexandre Belloni

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Nicolas Ferre <nicolas.ferre@atmel.com>

commit b1708b72a0959a032cd2eebb77fa9086ea3e0c84 upstream.

The dmas/dma-names properties are added to the UART nodes. Note that additional
properties are needed to enable them at the board level: check bindings for
details.

Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/arm/boot/dts/sama5d2.dtsi |   35 +++++++++++++++++++++++++++++++++++
 1 file changed, 35 insertions(+)

--- a/arch/arm/boot/dts/sama5d2.dtsi
+++ b/arch/arm/boot/dts/sama5d2.dtsi
@@ -856,6 +856,13 @@
 				compatible = "atmel,at91sam9260-usart";
 				reg = <0xf801c000 0x100>;
 				interrupts = <24 IRQ_TYPE_LEVEL_HIGH 7>;
+				dmas = <&dma0
+					(AT91_XDMAC_DT_MEM_IF(0) | AT91_XDMAC_DT_PER_IF(1) |
+					 AT91_XDMAC_DT_PERID(35))>,
+				       <&dma0
+					(AT91_XDMAC_DT_MEM_IF(0) | AT91_XDMAC_DT_PER_IF(1) |
+					 AT91_XDMAC_DT_PERID(36))>;
+				dma-names = "tx", "rx";
 				clocks = <&uart0_clk>;
 				clock-names = "usart";
 				status = "disabled";
@@ -865,6 +872,13 @@
 				compatible = "atmel,at91sam9260-usart";
 				reg = <0xf8020000 0x100>;
 				interrupts = <25 IRQ_TYPE_LEVEL_HIGH 7>;
+				dmas = <&dma0
+					(AT91_XDMAC_DT_MEM_IF(0) | AT91_XDMAC_DT_PER_IF(1) |
+					 AT91_XDMAC_DT_PERID(37))>,
+				       <&dma0
+					(AT91_XDMAC_DT_MEM_IF(0) | AT91_XDMAC_DT_PER_IF(1) |
+					 AT91_XDMAC_DT_PERID(38))>;
+				dma-names = "tx", "rx";
 				clocks = <&uart1_clk>;
 				clock-names = "usart";
 				status = "disabled";
@@ -874,6 +888,13 @@
 				compatible = "atmel,at91sam9260-usart";
 				reg = <0xf8024000 0x100>;
 				interrupts = <26 IRQ_TYPE_LEVEL_HIGH 7>;
+				dmas = <&dma0
+					(AT91_XDMAC_DT_MEM_IF(0) | AT91_XDMAC_DT_PER_IF(1) |
+					 AT91_XDMAC_DT_PERID(39))>,
+				       <&dma0
+					(AT91_XDMAC_DT_MEM_IF(0) | AT91_XDMAC_DT_PER_IF(1) |
+					 AT91_XDMAC_DT_PERID(40))>;
+				dma-names = "tx", "rx";
 				clocks = <&uart2_clk>;
 				clock-names = "usart";
 				status = "disabled";
@@ -985,6 +1006,13 @@
 				compatible = "atmel,at91sam9260-usart";
 				reg = <0xfc008000 0x100>;
 				interrupts = <27 IRQ_TYPE_LEVEL_HIGH 7>;
+				dmas = <&dma0
+					(AT91_XDMAC_DT_MEM_IF(0) | AT91_XDMAC_DT_PER_IF(1) |
+					 AT91_XDMAC_DT_PERID(41))>,
+				       <&dma0
+					(AT91_XDMAC_DT_MEM_IF(0) | AT91_XDMAC_DT_PER_IF(1) |
+					 AT91_XDMAC_DT_PERID(42))>;
+				dma-names = "tx", "rx";
 				clocks = <&uart3_clk>;
 				clock-names = "usart";
 				status = "disabled";
@@ -993,6 +1021,13 @@
 			uart4: serial@fc00c000 {
 				compatible = "atmel,at91sam9260-usart";
 				reg = <0xfc00c000 0x100>;
+				dmas = <&dma0
+					(AT91_XDMAC_DT_MEM_IF(0) | AT91_XDMAC_DT_PER_IF(1) |
+					 AT91_XDMAC_DT_PERID(43))>,
+				       <&dma0
+					(AT91_XDMAC_DT_MEM_IF(0) | AT91_XDMAC_DT_PER_IF(1) |
+					 AT91_XDMAC_DT_PERID(44))>;
+				dma-names = "tx", "rx";
 				interrupts = <28 IRQ_TYPE_LEVEL_HIGH 7>;
 				clocks = <&uart4_clk>;
 				clock-names = "usart";


* [PATCH 4.4 47/76] cpufreq: Restore policy min/max limits on CPU online
  2017-03-28 12:29 [PATCH 4.4 00/76] 4.4.58-stable review Greg Kroah-Hartman
                   ` (43 preceding siblings ...)
  2017-03-28 12:30 ` [PATCH 4.4 46/76] ARM: dts: at91: sama5d2: add dma properties to UART nodes Greg Kroah-Hartman
@ 2017-03-28 12:30 ` Greg Kroah-Hartman
  2017-03-28 12:30 ` [PATCH 4.4 48/76] libceph: force GFP_NOIO for socket allocations Greg Kroah-Hartman
                   ` (30 subsequent siblings)
  75 siblings, 0 replies; 106+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-28 12:30 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Rafael J. Wysocki, Viresh Kumar

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Viresh Kumar <viresh.kumar@linaro.org>

commit ff010472fb75670cb5c08671e820eeea3af59c87 upstream.

On CPU online the cpufreq core restores the previous governor (or
the previous "policy" setting for ->setpolicy drivers), but it does
not restore the min/max limits at the same time, which is confusing,
inconsistent and a real pain for users who set the limits and then
suspend/resume the system (using full suspend), in which case the
limits are reset on all CPUs except for the boot one.

Fix this by making cpufreq_online() restore the limits when an inactive
policy is brought online.
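
A simplified model of that behaviour, assuming a policy object that keeps the
user's requested limits next to the limits currently in effect. The struct and
function names are hypothetical, and the "new policy" branch is reduced to the
bare minimum; only the else branch mirrors what the patch adds.

```c
#include <stdbool.h>

struct limits_sketch {
	unsigned int min, max;
};

struct policy_sketch {
	unsigned int min, max;			/* limits currently in effect */
	struct limits_sketch user_policy;	/* limits last set by the user */
};

static void policy_online_sketch(struct policy_sketch *p, bool new_policy,
				 unsigned int hw_min, unsigned int hw_max)
{
	if (new_policy) {
		/* First bring-up: start from the hardware limits. */
		p->min = p->user_policy.min = hw_min;
		p->max = p->user_policy.max = hw_max;
	} else {
		/* Inactive policy coming back: reapply the user's limits. */
		p->min = p->user_policy.min;
		p->max = p->user_policy.max;
	}
}
```

Without the else branch, a CPU coming back online after suspend would keep
whatever min/max it was reinitialised with instead of the user's settings.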

The commit log and patch are inspired by Rafael's earlier work.

Reported-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/cpufreq/cpufreq.c |    3 +++
 1 file changed, 3 insertions(+)

--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -1186,6 +1186,9 @@ static int cpufreq_online(unsigned int c
 		for_each_cpu(j, policy->related_cpus)
 			per_cpu(cpufreq_cpu_data, j) = policy;
 		write_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	} else {
+		policy->min = policy->user_policy.min;
+		policy->max = policy->user_policy.max;
 	}
 
 	if (cpufreq_driver->get && !cpufreq_driver->setpolicy) {


* [PATCH 4.4 48/76] libceph: force GFP_NOIO for socket allocations
  2017-03-28 12:29 [PATCH 4.4 00/76] 4.4.58-stable review Greg Kroah-Hartman
                   ` (44 preceding siblings ...)
  2017-03-28 12:30 ` [PATCH 4.4 47/76] cpufreq: Restore policy min/max limits on CPU online Greg Kroah-Hartman
@ 2017-03-28 12:30 ` Greg Kroah-Hartman
  2017-03-28 12:43   ` Michal Hocko
  2017-03-28 12:30 ` [PATCH 4.4 49/76] raid10: increment write counter after bio is split Greg Kroah-Hartman
                   ` (29 subsequent siblings)
  75 siblings, 1 reply; 106+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-28 12:30 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Sergey Jerusalimov, Ilya Dryomov,
	Jeff Layton

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Ilya Dryomov <idryomov@gmail.com>

commit 633ee407b9d15a75ac9740ba9d3338815e1fcb95 upstream.

sock_alloc_inode() allocates socket+inode and socket_wq with
GFP_KERNEL, which is not allowed on the writeback path:

    Workqueue: ceph-msgr con_work [libceph]
    ffff8810871cb018 0000000000000046 0000000000000000 ffff881085d40000
    0000000000012b00 ffff881025cad428 ffff8810871cbfd8 0000000000012b00
    ffff880102fc1000 ffff881085d40000 ffff8810871cb038 ffff8810871cb148
    Call Trace:
    [<ffffffff816dd629>] schedule+0x29/0x70
    [<ffffffff816e066d>] schedule_timeout+0x1bd/0x200
    [<ffffffff81093ffc>] ? ttwu_do_wakeup+0x2c/0x120
    [<ffffffff81094266>] ? ttwu_do_activate.constprop.135+0x66/0x70
    [<ffffffff816deb5f>] wait_for_completion+0xbf/0x180
    [<ffffffff81097cd0>] ? try_to_wake_up+0x390/0x390
    [<ffffffff81086335>] flush_work+0x165/0x250
    [<ffffffff81082940>] ? worker_detach_from_pool+0xd0/0xd0
    [<ffffffffa03b65b1>] xlog_cil_force_lsn+0x81/0x200 [xfs]
    [<ffffffff816d6b42>] ? __slab_free+0xee/0x234
    [<ffffffffa03b4b1d>] _xfs_log_force_lsn+0x4d/0x2c0 [xfs]
    [<ffffffff811adc1e>] ? lookup_page_cgroup_used+0xe/0x30
    [<ffffffffa039a723>] ? xfs_reclaim_inode+0xa3/0x330 [xfs]
    [<ffffffffa03b4dcf>] xfs_log_force_lsn+0x3f/0xf0 [xfs]
    [<ffffffffa039a723>] ? xfs_reclaim_inode+0xa3/0x330 [xfs]
    [<ffffffffa03a62c6>] xfs_iunpin_wait+0xc6/0x1a0 [xfs]
    [<ffffffff810aa250>] ? wake_atomic_t_function+0x40/0x40
    [<ffffffffa039a723>] xfs_reclaim_inode+0xa3/0x330 [xfs]
    [<ffffffffa039ac07>] xfs_reclaim_inodes_ag+0x257/0x3d0 [xfs]
    [<ffffffffa039bb13>] xfs_reclaim_inodes_nr+0x33/0x40 [xfs]
    [<ffffffffa03ab745>] xfs_fs_free_cached_objects+0x15/0x20 [xfs]
    [<ffffffff811c0c18>] super_cache_scan+0x178/0x180
    [<ffffffff8115912e>] shrink_slab_node+0x14e/0x340
    [<ffffffff811afc3b>] ? mem_cgroup_iter+0x16b/0x450
    [<ffffffff8115af70>] shrink_slab+0x100/0x140
    [<ffffffff8115e425>] do_try_to_free_pages+0x335/0x490
    [<ffffffff8115e7f9>] try_to_free_pages+0xb9/0x1f0
    [<ffffffff816d56e4>] ? __alloc_pages_direct_compact+0x69/0x1be
    [<ffffffff81150cba>] __alloc_pages_nodemask+0x69a/0xb40
    [<ffffffff8119743e>] alloc_pages_current+0x9e/0x110
    [<ffffffff811a0ac5>] new_slab+0x2c5/0x390
    [<ffffffff816d71c4>] __slab_alloc+0x33b/0x459
    [<ffffffff815b906d>] ? sock_alloc_inode+0x2d/0xd0
    [<ffffffff8164bda1>] ? inet_sendmsg+0x71/0xc0
    [<ffffffff815b906d>] ? sock_alloc_inode+0x2d/0xd0
    [<ffffffff811a21f2>] kmem_cache_alloc+0x1a2/0x1b0
    [<ffffffff815b906d>] sock_alloc_inode+0x2d/0xd0
    [<ffffffff811d8566>] alloc_inode+0x26/0xa0
    [<ffffffff811da04a>] new_inode_pseudo+0x1a/0x70
    [<ffffffff815b933e>] sock_alloc+0x1e/0x80
    [<ffffffff815ba855>] __sock_create+0x95/0x220
    [<ffffffff815baa04>] sock_create_kern+0x24/0x30
    [<ffffffffa04794d9>] con_work+0xef9/0x2050 [libceph]
    [<ffffffffa04aa9ec>] ? rbd_img_request_submit+0x4c/0x60 [rbd]
    [<ffffffff81084c19>] process_one_work+0x159/0x4f0
    [<ffffffff8108561b>] worker_thread+0x11b/0x530
    [<ffffffff81085500>] ? create_worker+0x1d0/0x1d0
    [<ffffffff8108b6f9>] kthread+0xc9/0xe0
    [<ffffffff8108b630>] ? flush_kthread_worker+0x90/0x90
    [<ffffffff816e1b98>] ret_from_fork+0x58/0x90
    [<ffffffff8108b630>] ? flush_kthread_worker+0x90/0x90

Use memalloc_noio_{save,restore}() to temporarily force GFP_NOIO here.
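
The save/restore pairing can be modelled in user space roughly as below. This
is a sketch under assumptions: the flag value and all names with a "_sketch"
suffix are made up for illustration, and a plain global stands in for the
per-task flags word.

```c
/* Hypothetical flag bit standing in for the per-task "no I/O" marker. */
#define NOIO_FLAG_SKETCH 0x00080000u

/* Stand-in for the current task's flags word. */
static unsigned int task_flags_sketch;

/* Set the flag and return its previous state so callers can nest. */
static unsigned int noio_save_sketch(void)
{
	unsigned int old = task_flags_sketch & NOIO_FLAG_SKETCH;

	task_flags_sketch |= NOIO_FLAG_SKETCH;
	return old;
}

/* Put the flag back to exactly the state returned by the matching save. */
static void noio_restore_sketch(unsigned int old)
{
	task_flags_sketch = (task_flags_sketch & ~NOIO_FLAG_SKETCH) | old;
}

/* An allocator would consult this before starting I/O during reclaim. */
static int allocation_may_do_io_sketch(void)
{
	return !(task_flags_sketch & NOIO_FLAG_SKETCH);
}
```

Returning the old state rather than unconditionally clearing the flag is what
makes nested sections safe: an inner restore does not drop the restriction
that an outer caller is still relying on.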

Link: http://tracker.ceph.com/issues/19309
Reported-by: Sergey Jerusalimov <wintchester@gmail.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 net/ceph/messenger.c |    6 ++++++
 1 file changed, 6 insertions(+)

--- a/net/ceph/messenger.c
+++ b/net/ceph/messenger.c
@@ -7,6 +7,7 @@
 #include <linux/kthread.h>
 #include <linux/net.h>
 #include <linux/nsproxy.h>
+#include <linux/sched.h>
 #include <linux/slab.h>
 #include <linux/socket.h>
 #include <linux/string.h>
@@ -478,11 +479,16 @@ static int ceph_tcp_connect(struct ceph_
 {
 	struct sockaddr_storage *paddr = &con->peer_addr.in_addr;
 	struct socket *sock;
+	unsigned int noio_flag;
 	int ret;
 
 	BUG_ON(con->sock);
+
+	/* sock_create_kern() allocates with GFP_KERNEL */
+	noio_flag = memalloc_noio_save();
 	ret = sock_create_kern(read_pnet(&con->msgr->net), paddr->ss_family,
 			       SOCK_STREAM, IPPROTO_TCP, &sock);
+	memalloc_noio_restore(noio_flag);
 	if (ret)
 		return ret;
 	sock->sk->sk_allocation = GFP_NOFS;


* [PATCH 4.4 49/76] raid10: increment write counter after bio is split
  2017-03-28 12:29 [PATCH 4.4 00/76] 4.4.58-stable review Greg Kroah-Hartman
                   ` (45 preceding siblings ...)
  2017-03-28 12:30 ` [PATCH 4.4 48/76] libceph: force GFP_NOIO for socket allocations Greg Kroah-Hartman
@ 2017-03-28 12:30 ` Greg Kroah-Hartman
  2017-03-28 12:30 ` [PATCH 4.4 50/76] libceph: dont set weight to IN when OSD is destroyed Greg Kroah-Hartman
                   ` (28 subsequent siblings)
  75 siblings, 0 replies; 106+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-28 12:30 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Tomasz Majchrzak, Artur Paszkiewicz,
	Shaohua Li

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Tomasz Majchrzak <tomasz.majchrzak@intel.com>

commit 9b622e2bbcf049c82e2550d35fb54ac205965f50 upstream.

md pending write counter must be incremented after bio is split,
otherwise it gets decremented too many times in end bio callback and
becomes negative.
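
A toy model of the accounting problem, assuming each piece of a split write
completes exactly once and each completion decrements the counter. All names
are hypothetical; completions are run inline to keep the sketch short.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for the pending write counter. */
static int pending_sketch;

/* One completion per submitted piece, as in an end-bio callback. */
static void piece_done_sketch(void)
{
	pending_sketch--;
}

/*
 * Submit a write of 'sectors' sectors, split into pieces of at most
 * 'max_piece' sectors, and return the counter after all completions.
 */
static int submit_write_sketch(int sectors, int max_piece, bool count_per_piece)
{
	assert(max_piece > 0);

	pending_sketch = 0;
	if (!count_per_piece)
		pending_sketch++;	/* old placement: once per incoming write */

	while (sectors > 0) {
		int n = sectors < max_piece ? sectors : max_piece;

		if (count_per_piece)
			pending_sketch++;	/* new placement: once per piece */
		piece_done_sketch();
		sectors -= n;
	}
	return pending_sketch;
}
```

With a 10-sector write and 4-sector pieces, counting once per incoming write
leaves the counter at -2 after three completions, while counting once per
piece leaves it balanced at 0.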

Signed-off-by: Tomasz Majchrzak <tomasz.majchrzak@intel.com>
Reviewed-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Signed-off-by: Shaohua Li <shli@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/md/raid10.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -1072,6 +1072,8 @@ static void __make_request(struct mddev
 	int max_sectors;
 	int sectors;
 
+	md_write_start(mddev, bio);
+
 	/*
 	 * Register the new request and wait if the reconstruction
 	 * thread has put up a bar for new requests.
@@ -1455,8 +1457,6 @@ static void make_request(struct mddev *m
 		return;
 	}
 
-	md_write_start(mddev, bio);
-
 	do {
 
 		/*


* [PATCH 4.4 50/76] libceph: dont set weight to IN when OSD is destroyed
  2017-03-28 12:29 [PATCH 4.4 00/76] 4.4.58-stable review Greg Kroah-Hartman
                   ` (46 preceding siblings ...)
  2017-03-28 12:30 ` [PATCH 4.4 49/76] raid10: increment write counter after bio is split Greg Kroah-Hartman
@ 2017-03-28 12:30 ` Greg Kroah-Hartman
  2017-03-28 12:30 ` [PATCH 4.4 51/76] xfs: dont allow di_size with high bit set Greg Kroah-Hartman
                   ` (27 subsequent siblings)
  75 siblings, 0 replies; 106+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-28 12:30 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Ilya Dryomov, Sage Weil

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Ilya Dryomov <idryomov@gmail.com>

commit b581a5854eee4b7851dedb0f8c2ceb54fb902c06 upstream.

Since ceph.git commit 4e28f9e63644 ("osd/OSDMap: clear osd_info,
osd_xinfo on osd deletion"), weight is set to IN when OSD is deleted.
This changes the result of applying an incremental for clients, not
just OSDs.  Because CRUSH computations are obviously affected,
pre-4e28f9e63644 servers disagree with post-4e28f9e63644 clients on
object placement, resulting in misdirected requests.

Mirrors ceph.git commit a6009d1039a55e2c77f431662b3d6cc5a8e8e63f.

Fixes: 930c53286977 ("libceph: apply new_state before new_up_client on incrementals")
Link: http://tracker.ceph.com/issues/19122
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Sage Weil <sage@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 net/ceph/osdmap.c |    1 -
 1 file changed, 1 deletion(-)

--- a/net/ceph/osdmap.c
+++ b/net/ceph/osdmap.c
@@ -1265,7 +1265,6 @@ static int decode_new_up_state_weight(vo
 		if ((map->osd_state[osd] & CEPH_OSD_EXISTS) &&
 		    (xorstate & CEPH_OSD_EXISTS)) {
 			pr_info("osd%d does not exist\n", osd);
-			map->osd_weight[osd] = CEPH_OSD_IN;
 			ret = set_primary_affinity(map, osd,
 						   CEPH_OSD_DEFAULT_PRIMARY_AFFINITY);
 			if (ret)


* [PATCH 4.4 51/76] xfs: dont allow di_size with high bit set
  2017-03-28 12:29 [PATCH 4.4 00/76] 4.4.58-stable review Greg Kroah-Hartman
                   ` (47 preceding siblings ...)
  2017-03-28 12:30 ` [PATCH 4.4 50/76] libceph: dont set weight to IN when OSD is destroyed Greg Kroah-Hartman
@ 2017-03-28 12:30 ` Greg Kroah-Hartman
  2017-03-28 12:30 ` [PATCH 4.4 52/76] xfs: fix up xfs_swap_extent_forks inline extent handling Greg Kroah-Hartman
                   ` (26 subsequent siblings)
  75 siblings, 0 replies; 106+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-28 12:30 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Darrick J. Wong, Dave Chinner,
	Dave Chinner, Nikolay Borisov

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Darrick J. Wong <darrick.wong@oracle.com>

commit ef388e2054feedaeb05399ed654bdb06f385d294 upstream.

The on-disk field di_size is used to set i_size, which is a signed
integer of type loff_t.  If the high bit of di_size is set, we'll end up with
a negative i_size, which will cause all sorts of problems.  Since the
VFS won't let us create a file with such length, we should catch them
here in the verifier too.
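
The check itself is small. A stand-alone sketch, assuming the size is stored
on disk as 8 big-endian bytes; the helper names are hypothetical and the
byte-wise decode stands in for be64_to_cpu():

```c
#include <stdbool.h>
#include <stdint.h>

/* Decode 8 big-endian bytes into a host-order 64-bit value. */
static uint64_t be64_decode_sketch(const unsigned char b[8])
{
	uint64_t v = 0;

	for (int i = 0; i < 8; i++)
		v = (v << 8) | b[i];
	return v;
}

/*
 * A size with bit 63 set would become negative once stored in a signed
 * 64-bit file offset, so reject it up front.
 */
static bool disk_size_ok_sketch(const unsigned char disk_size[8])
{
	return (be64_decode_sketch(disk_size) & (1ULL << 63)) == 0;
}
```

The largest value that passes is 0x7fffffffffffffff, which is also the largest
value a signed 64-bit offset can hold.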

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Cc: Nikolay Borisov <n.borisov.lkml@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 fs/xfs/libxfs/xfs_inode_buf.c |    8 ++++++++
 1 file changed, 8 insertions(+)

--- a/fs/xfs/libxfs/xfs_inode_buf.c
+++ b/fs/xfs/libxfs/xfs_inode_buf.c
@@ -299,6 +299,14 @@ xfs_dinode_verify(
 	if (dip->di_magic != cpu_to_be16(XFS_DINODE_MAGIC))
 		return false;
 
+	/* don't allow invalid i_size */
+	if (be64_to_cpu(dip->di_size) & (1ULL << 63))
+		return false;
+
+	/* No zero-length symlinks. */
+	if (S_ISLNK(be16_to_cpu(dip->di_mode)) && dip->di_size == 0)
+		return false;
+
 	/* only version 3 or greater inodes are extensively verified here */
 	if (dip->di_version < 3)
 		return true;


* [PATCH 4.4 52/76] xfs: fix up xfs_swap_extent_forks inline extent handling
  2017-03-28 12:29 [PATCH 4.4 00/76] 4.4.58-stable review Greg Kroah-Hartman
                   ` (48 preceding siblings ...)
  2017-03-28 12:30 ` [PATCH 4.4 51/76] xfs: dont allow di_size with high bit set Greg Kroah-Hartman
@ 2017-03-28 12:30 ` Greg Kroah-Hartman
  2017-03-28 12:30 ` [PATCH 4.4 53/76] nl80211: fix dumpit error path RTNL deadlocks Greg Kroah-Hartman
                   ` (25 subsequent siblings)
  75 siblings, 0 replies; 106+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-28 12:30 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Eric Sandeen, Brian Foster,
	Dave Chinner, Nikolay Borisov

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Eric Sandeen <sandeen@sandeen.net>

commit 4dfce57db6354603641132fac3c887614e3ebe81 upstream.

There have been several reports over the years of NULL pointer
dereferences in xfs_trans_log_inode during xfs_fsr processes,
when the process is doing an fput and tearing down extents
on the temporary inode, something like:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
PID: 29439  TASK: ffff880550584fa0  CPU: 6   COMMAND: "xfs_fsr"
    [exception RIP: xfs_trans_log_inode+0x10]
 #9 [ffff8800a57bbbe0] xfs_bunmapi at ffffffffa037398e [xfs]
#10 [ffff8800a57bbce8] xfs_itruncate_extents at ffffffffa0391b29 [xfs]
#11 [ffff8800a57bbd88] xfs_inactive_truncate at ffffffffa0391d0c [xfs]
#12 [ffff8800a57bbdb8] xfs_inactive at ffffffffa0392508 [xfs]
#13 [ffff8800a57bbdd8] xfs_fs_evict_inode at ffffffffa035907e [xfs]
#14 [ffff8800a57bbe00] evict at ffffffff811e1b67
#15 [ffff8800a57bbe28] iput at ffffffff811e23a5
#16 [ffff8800a57bbe58] dentry_kill at ffffffff811dcfc8
#17 [ffff8800a57bbe88] dput at ffffffff811dd06c
#18 [ffff8800a57bbea8] __fput at ffffffff811c823b
#19 [ffff8800a57bbef0] ____fput at ffffffff811c846e
#20 [ffff8800a57bbf00] task_work_run at ffffffff81093b27
#21 [ffff8800a57bbf30] do_notify_resume at ffffffff81013b0c
#22 [ffff8800a57bbf50] int_signal at ffffffff8161405d

As it turns out, this is because the i_itemp pointer, along
with the d_ops pointer, has been overwritten with zeros
when we tear down the extents during truncate.  When the in-core
inode fork on the temporary inode used by xfs_fsr was originally
set up during the extent swap, we mistakenly looked at di_nextents
to determine whether all extents fit inline, but this misses extents
generated by speculative preallocation; we should be using if_bytes
instead.

This mistake corrupts the in-memory inode, and code in
xfs_iext_remove_inline eventually gets bad inputs, causing
it to memmove and memset incorrect ranges; this became apparent
because the two values in ifp->if_u2.if_inline_ext[1] contained
what should have been in d_ops and i_itemp; they were memmoved due
to incorrect array indexing and then the original locations
were zeroed with memset, again due to an array overrun.

Fix this by properly using i_df.if_bytes to determine the number
of extents, not di_nextents.
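
The corrected calculation amounts to deriving the extent count from the size
of the in-memory buffer rather than from a separately maintained counter. A
minimal sketch, assuming a 16-byte extent record and an inline capacity of two
records; the type and macro names here are hypothetical stand-ins:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 16-byte extent record (two 64-bit words). */
struct ext_rec_sketch {
	uint64_t l0, l1;
};

/* Assumed number of records that fit in the inline area. */
#define INLINE_EXTS_SKETCH 2

/* Number of extent records actually held in a fork of 'fork_bytes' bytes. */
static size_t extents_in_fork_sketch(size_t fork_bytes)
{
	return fork_bytes / sizeof(struct ext_rec_sketch);
}

/* True if the extent pointer may be aimed at the inline area. */
static bool fits_inline_sketch(size_t fork_bytes)
{
	return extents_in_fork_sketch(fork_bytes) <= INLINE_EXTS_SKETCH;
}
```

A fork holding three records (48 bytes) must not be treated as inline even if
some other counter claims two extents; sizing from the buffer avoids that
mismatch.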

Thanks to dchinner for looking at this with me and spotting the
root cause.

[nborisov: backported to 4.4]

Cc: stable@vger.kernel.org
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 fs/xfs/xfs_bmap_util.c |    7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

--- a/fs/xfs/xfs_bmap_util.c
+++ b/fs/xfs/xfs_bmap_util.c
@@ -1713,6 +1713,7 @@ xfs_swap_extents(
 	xfs_trans_t	*tp;
 	xfs_bstat_t	*sbp = &sxp->sx_stat;
 	xfs_ifork_t	*tempifp, *ifp, *tifp;
+	xfs_extnum_t	nextents;
 	int		src_log_flags, target_log_flags;
 	int		error = 0;
 	int		aforkblks = 0;
@@ -1899,7 +1900,8 @@ xfs_swap_extents(
 		 * pointer.  Otherwise it's already NULL or
 		 * pointing to the extent.
 		 */
-		if (ip->i_d.di_nextents <= XFS_INLINE_EXTS) {
+		nextents = ip->i_df.if_bytes / (uint)sizeof(xfs_bmbt_rec_t);
+		if (nextents <= XFS_INLINE_EXTS) {
 			ifp->if_u1.if_extents =
 				ifp->if_u2.if_inline_ext;
 		}
@@ -1918,7 +1920,8 @@ xfs_swap_extents(
 		 * pointer.  Otherwise it's already NULL or
 		 * pointing to the extent.
 		 */
-		if (tip->i_d.di_nextents <= XFS_INLINE_EXTS) {
+		nextents = tip->i_df.if_bytes / (uint)sizeof(xfs_bmbt_rec_t);
+		if (nextents <= XFS_INLINE_EXTS) {
 			tifp->if_u1.if_extents =
 				tifp->if_u2.if_inline_ext;
 		}


* [PATCH 4.4 53/76] nl80211: fix dumpit error path RTNL deadlocks
  2017-03-28 12:29 [PATCH 4.4 00/76] 4.4.58-stable review Greg Kroah-Hartman
                   ` (49 preceding siblings ...)
  2017-03-28 12:30 ` [PATCH 4.4 52/76] xfs: fix up xfs_swap_extent_forks inline extent handling Greg Kroah-Hartman
@ 2017-03-28 12:30 ` Greg Kroah-Hartman
  2017-03-28 12:30 ` [PATCH 4.4 54/76] USB: usbtmc: add missing endpoint sanity check Greg Kroah-Hartman
                   ` (24 subsequent siblings)
  75 siblings, 0 replies; 106+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-28 12:30 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Sowmini Varadhan, Dmitry Vyukov,
	Johannes Berg

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Johannes Berg <johannes.berg@intel.com>

commit ea90e0dc8cecba6359b481e24d9c37160f6f524f upstream.

Sowmini pointed out Dmitry's RTNL deadlock report to me, and it turns out
to be perfectly accurate - there are various error paths that miss unlock
of the RTNL.

To fix those, change the locking a bit to not be conditional in all those
nl80211_prepare_*_dump() functions, but make those require the RTNL to
start with, and fix the buggy error paths. This also let me use sparse
(by appropriately overriding the rtnl_lock/rtnl_unlock functions) to
validate the changes.
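
The resulting shape, where the caller takes the lock, the prepare helper never
touches it, and every exit goes through one unlock, can be sketched like this.
A counter stands in for the RTNL and all names are hypothetical:

```c
#include <errno.h>
#include <stdbool.h>

/* Stand-in for the lock: +1 on lock, -1 on unlock. */
static int lock_depth_sketch;

static void lock_sketch(void)
{
	lock_depth_sketch++;
}

static void unlock_sketch(void)
{
	lock_depth_sketch--;
}

/* Must be called with the lock held; it never locks or unlocks itself. */
static int prepare_dump_sketch(bool fail)
{
	return fail ? -ENODEV : 0;
}

static int dump_sketch(bool prepare_fails, bool body_fails)
{
	int err;

	lock_sketch();
	err = prepare_dump_sketch(prepare_fails);
	if (err)
		goto out;

	if (body_fails) {
		err = -EINVAL;
		goto out;
	}
	err = 0;
out:
	unlock_sketch();	/* single unlock for every path */
	return err;
}
```

Because the helper no longer locks conditionally, the lock state on return
does not depend on which branch failed, which is the property that makes the
error paths easy to audit.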

Reported-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 net/wireless/nl80211.c |  121 +++++++++++++++++++++----------------------------
 1 file changed, 53 insertions(+), 68 deletions(-)

--- a/net/wireless/nl80211.c
+++ b/net/wireless/nl80211.c
@@ -492,21 +492,17 @@ static int nl80211_prepare_wdev_dump(str
 {
 	int err;
 
-	rtnl_lock();
-
 	if (!cb->args[0]) {
 		err = nlmsg_parse(cb->nlh, GENL_HDRLEN + nl80211_fam.hdrsize,
 				  nl80211_fam.attrbuf, nl80211_fam.maxattr,
 				  nl80211_policy);
 		if (err)
-			goto out_unlock;
+			return err;
 
 		*wdev = __cfg80211_wdev_from_attrs(sock_net(skb->sk),
 						   nl80211_fam.attrbuf);
-		if (IS_ERR(*wdev)) {
-			err = PTR_ERR(*wdev);
-			goto out_unlock;
-		}
+		if (IS_ERR(*wdev))
+			return PTR_ERR(*wdev);
 		*rdev = wiphy_to_rdev((*wdev)->wiphy);
 		/* 0 is the first index - add 1 to parse only once */
 		cb->args[0] = (*rdev)->wiphy_idx + 1;
@@ -516,10 +512,8 @@ static int nl80211_prepare_wdev_dump(str
 		struct wiphy *wiphy = wiphy_idx_to_wiphy(cb->args[0] - 1);
 		struct wireless_dev *tmp;
 
-		if (!wiphy) {
-			err = -ENODEV;
-			goto out_unlock;
-		}
+		if (!wiphy)
+			return -ENODEV;
 		*rdev = wiphy_to_rdev(wiphy);
 		*wdev = NULL;
 
@@ -530,21 +524,11 @@ static int nl80211_prepare_wdev_dump(str
 			}
 		}
 
-		if (!*wdev) {
-			err = -ENODEV;
-			goto out_unlock;
-		}
+		if (!*wdev)
+			return -ENODEV;
 	}
 
 	return 0;
- out_unlock:
-	rtnl_unlock();
-	return err;
-}
-
-static void nl80211_finish_wdev_dump(struct cfg80211_registered_device *rdev)
-{
-	rtnl_unlock();
 }
 
 /* IE validation */
@@ -3884,9 +3868,10 @@ static int nl80211_dump_station(struct s
 	int sta_idx = cb->args[2];
 	int err;
 
+	rtnl_lock();
 	err = nl80211_prepare_wdev_dump(skb, cb, &rdev, &wdev);
 	if (err)
-		return err;
+		goto out_err;
 
 	if (!wdev->netdev) {
 		err = -EINVAL;
@@ -3922,7 +3907,7 @@ static int nl80211_dump_station(struct s
 	cb->args[2] = sta_idx;
 	err = skb->len;
  out_err:
-	nl80211_finish_wdev_dump(rdev);
+	rtnl_unlock();
 
 	return err;
 }
@@ -4639,9 +4624,10 @@ static int nl80211_dump_mpath(struct sk_
 	int path_idx = cb->args[2];
 	int err;
 
+	rtnl_lock();
 	err = nl80211_prepare_wdev_dump(skb, cb, &rdev, &wdev);
 	if (err)
-		return err;
+		goto out_err;
 
 	if (!rdev->ops->dump_mpath) {
 		err = -EOPNOTSUPP;
@@ -4675,7 +4661,7 @@ static int nl80211_dump_mpath(struct sk_
 	cb->args[2] = path_idx;
 	err = skb->len;
  out_err:
-	nl80211_finish_wdev_dump(rdev);
+	rtnl_unlock();
 	return err;
 }
 
@@ -4835,9 +4821,10 @@ static int nl80211_dump_mpp(struct sk_bu
 	int path_idx = cb->args[2];
 	int err;
 
+	rtnl_lock();
 	err = nl80211_prepare_wdev_dump(skb, cb, &rdev, &wdev);
 	if (err)
-		return err;
+		goto out_err;
 
 	if (!rdev->ops->dump_mpp) {
 		err = -EOPNOTSUPP;
@@ -4870,7 +4857,7 @@ static int nl80211_dump_mpp(struct sk_bu
 	cb->args[2] = path_idx;
 	err = skb->len;
  out_err:
-	nl80211_finish_wdev_dump(rdev);
+	rtnl_unlock();
 	return err;
 }
 
@@ -6806,9 +6793,12 @@ static int nl80211_dump_scan(struct sk_b
 	int start = cb->args[2], idx = 0;
 	int err;
 
+	rtnl_lock();
 	err = nl80211_prepare_wdev_dump(skb, cb, &rdev, &wdev);
-	if (err)
+	if (err) {
+		rtnl_unlock();
 		return err;
+	}
 
 	wdev_lock(wdev);
 	spin_lock_bh(&rdev->bss_lock);
@@ -6831,7 +6821,7 @@ static int nl80211_dump_scan(struct sk_b
 	wdev_unlock(wdev);
 
 	cb->args[2] = idx;
-	nl80211_finish_wdev_dump(rdev);
+	rtnl_unlock();
 
 	return skb->len;
 }
@@ -6915,9 +6905,10 @@ static int nl80211_dump_survey(struct sk
 	int res;
 	bool radio_stats;
 
+	rtnl_lock();
 	res = nl80211_prepare_wdev_dump(skb, cb, &rdev, &wdev);
 	if (res)
-		return res;
+		goto out_err;
 
 	/* prepare_wdev_dump parsed the attributes */
 	radio_stats = nl80211_fam.attrbuf[NL80211_ATTR_SURVEY_RADIO_STATS];
@@ -6958,7 +6949,7 @@ static int nl80211_dump_survey(struct sk
 	cb->args[2] = survey_idx;
 	res = skb->len;
  out_err:
-	nl80211_finish_wdev_dump(rdev);
+	rtnl_unlock();
 	return res;
 }
 
@@ -10158,17 +10149,13 @@ static int nl80211_prepare_vendor_dump(s
 	void *data = NULL;
 	unsigned int data_len = 0;
 
-	rtnl_lock();
-
 	if (cb->args[0]) {
 		/* subtract the 1 again here */
 		struct wiphy *wiphy = wiphy_idx_to_wiphy(cb->args[0] - 1);
 		struct wireless_dev *tmp;
 
-		if (!wiphy) {
-			err = -ENODEV;
-			goto out_unlock;
-		}
+		if (!wiphy)
+			return -ENODEV;
 		*rdev = wiphy_to_rdev(wiphy);
 		*wdev = NULL;
 
@@ -10189,13 +10176,11 @@ static int nl80211_prepare_vendor_dump(s
 			  nl80211_fam.attrbuf, nl80211_fam.maxattr,
 			  nl80211_policy);
 	if (err)
-		goto out_unlock;
+		return err;
 
 	if (!nl80211_fam.attrbuf[NL80211_ATTR_VENDOR_ID] ||
-	    !nl80211_fam.attrbuf[NL80211_ATTR_VENDOR_SUBCMD]) {
-		err = -EINVAL;
-		goto out_unlock;
-	}
+	    !nl80211_fam.attrbuf[NL80211_ATTR_VENDOR_SUBCMD])
+		return -EINVAL;
 
 	*wdev = __cfg80211_wdev_from_attrs(sock_net(skb->sk),
 					   nl80211_fam.attrbuf);
@@ -10204,10 +10189,8 @@ static int nl80211_prepare_vendor_dump(s
 
 	*rdev = __cfg80211_rdev_from_attrs(sock_net(skb->sk),
 					   nl80211_fam.attrbuf);
-	if (IS_ERR(*rdev)) {
-		err = PTR_ERR(*rdev);
-		goto out_unlock;
-	}
+	if (IS_ERR(*rdev))
+		return PTR_ERR(*rdev);
 
 	vid = nla_get_u32(nl80211_fam.attrbuf[NL80211_ATTR_VENDOR_ID]);
 	subcmd = nla_get_u32(nl80211_fam.attrbuf[NL80211_ATTR_VENDOR_SUBCMD]);
@@ -10220,19 +10203,15 @@ static int nl80211_prepare_vendor_dump(s
 		if (vcmd->info.vendor_id != vid || vcmd->info.subcmd != subcmd)
 			continue;
 
-		if (!vcmd->dumpit) {
-			err = -EOPNOTSUPP;
-			goto out_unlock;
-		}
+		if (!vcmd->dumpit)
+			return -EOPNOTSUPP;
 
 		vcmd_idx = i;
 		break;
 	}
 
-	if (vcmd_idx < 0) {
-		err = -EOPNOTSUPP;
-		goto out_unlock;
-	}
+	if (vcmd_idx < 0)
+		return -EOPNOTSUPP;
 
 	if (nl80211_fam.attrbuf[NL80211_ATTR_VENDOR_DATA]) {
 		data = nla_data(nl80211_fam.attrbuf[NL80211_ATTR_VENDOR_DATA]);
@@ -10249,9 +10228,6 @@ static int nl80211_prepare_vendor_dump(s
 
 	/* keep rtnl locked in successful case */
 	return 0;
- out_unlock:
-	rtnl_unlock();
-	return err;
 }
 
 static int nl80211_vendor_cmd_dump(struct sk_buff *skb,
@@ -10266,9 +10242,10 @@ static int nl80211_vendor_cmd_dump(struc
 	int err;
 	struct nlattr *vendor_data;
 
+	rtnl_lock();
 	err = nl80211_prepare_vendor_dump(skb, cb, &rdev, &wdev);
 	if (err)
-		return err;
+		goto out;
 
 	vcmd_idx = cb->args[2];
 	data = (void *)cb->args[3];
@@ -10277,18 +10254,26 @@ static int nl80211_vendor_cmd_dump(struc
 
 	if (vcmd->flags & (WIPHY_VENDOR_CMD_NEED_WDEV |
 			   WIPHY_VENDOR_CMD_NEED_NETDEV)) {
-		if (!wdev)
-			return -EINVAL;
+		if (!wdev) {
+			err = -EINVAL;
+			goto out;
+		}
 		if (vcmd->flags & WIPHY_VENDOR_CMD_NEED_NETDEV &&
-		    !wdev->netdev)
-			return -EINVAL;
+		    !wdev->netdev) {
+			err = -EINVAL;
+			goto out;
+		}
 
 		if (vcmd->flags & WIPHY_VENDOR_CMD_NEED_RUNNING) {
 			if (wdev->netdev &&
-			    !netif_running(wdev->netdev))
-				return -ENETDOWN;
-			if (!wdev->netdev && !wdev->p2p_started)
-				return -ENETDOWN;
+			    !netif_running(wdev->netdev)) {
+				err = -ENETDOWN;
+				goto out;
+			}
+			if (!wdev->netdev && !wdev->p2p_started) {
+				err = -ENETDOWN;
+				goto out;
+			}
 		}
 	}
 


* [PATCH 4.4 54/76] USB: usbtmc: add missing endpoint sanity check
  2017-03-28 12:29 [PATCH 4.4 00/76] 4.4.58-stable review Greg Kroah-Hartman
                   ` (50 preceding siblings ...)
  2017-03-28 12:30 ` [PATCH 4.4 53/76] nl80211: fix dumpit error path RTNL deadlocks Greg Kroah-Hartman
@ 2017-03-28 12:30 ` Greg Kroah-Hartman
  2017-03-28 12:30 ` [PATCH 4.4 55/76] xfs: clear _XBF_PAGES from buffers when readahead page Greg Kroah-Hartman
                   ` (23 subsequent siblings)
  75 siblings, 0 replies; 106+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-28 12:30 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Johan Hovold

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Johan Hovold <johan@kernel.org>

commit 687e0687f71ec00e0132a21fef802dee88c2f1ad upstream.

USBTMC devices are required to have a bulk-in and a bulk-out endpoint,
but the driver failed to verify this, something which could lead to the
endpoint addresses being taken from uninitialised memory.

Make sure to zero all private data as part of allocation, and add the
missing endpoint sanity check.

Note that this also addresses a more recently introduced issue, where
the interrupt-in-presence flag would also be uninitialised whenever the
optional interrupt-in endpoint is not present. This in turn could lead
to an interrupt urb being allocated, initialised and submitted based on
uninitialised values.
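
As a rough userspace sketch of the pattern (not the driver code), the
combination of a zeroing allocation and a post-scan sanity check could look
like the following. The struct layouts and the names tmc_data, ep_desc and
tmc_probe are hypothetical, and calloc() stands in for kzalloc():

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the driver's private data. */
struct tmc_data {
	uint8_t bulk_in;	/* endpoint address, 0 means "not found" */
	uint8_t bulk_out;
	uint8_t iin_ep_present;	/* stays 0 unless an interrupt-in ep is seen */
};

/* Hypothetical, simplified endpoint descriptor. */
struct ep_desc {
	uint8_t addr;
	int is_bulk;
	int is_in;
};

/*
 * Scan the descriptors, then reject devices lacking either bulk endpoint.
 * Returns 0 on success or a negative errno; *out is set only on success.
 */
static int tmc_probe(const struct ep_desc *eps, size_t n, struct tmc_data **out)
{
	/* calloc() zeroes the allocation, as kzalloc() does in the patch. */
	struct tmc_data *data = calloc(1, sizeof(*data));
	size_t i;

	assert(out != NULL);
	if (!data)
		return -ENOMEM;

	for (i = 0; i < n; i++) {
		if (!eps[i].is_bulk)
			continue;
		if (eps[i].is_in && !data->bulk_in)
			data->bulk_in = eps[i].addr;
		else if (!eps[i].is_in && !data->bulk_out)
			data->bulk_out = eps[i].addr;
	}

	/* Because the memory was zeroed, "never assigned" is detectable. */
	if (!data->bulk_in || !data->bulk_out) {
		free(data);
		return -ENODEV;
	}

	*out = data;
	return 0;
}
```

With a plain malloc() the final check would compare against whatever happened
to be in the allocation, which is why the two changes belong together.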

Fixes: dbf3e7f654c0 ("Implement an ioctl to support the USMTMC-USB488 READ_STATUS_BYTE operation.")
Fixes: 5b775f672cc9 ("USB: add USB test and measurement class driver")
Signed-off-by: Johan Hovold <johan@kernel.org>
[ johan: backport to v4.4 ]
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/usb/class/usbtmc.c |    9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

--- a/drivers/usb/class/usbtmc.c
+++ b/drivers/usb/class/usbtmc.c
@@ -1105,7 +1105,7 @@ static int usbtmc_probe(struct usb_inter
 
 	dev_dbg(&intf->dev, "%s called\n", __func__);
 
-	data = kmalloc(sizeof(*data), GFP_KERNEL);
+	data = kzalloc(sizeof(*data), GFP_KERNEL);
 	if (!data)
 		return -ENOMEM;
 
@@ -1163,6 +1163,12 @@ static int usbtmc_probe(struct usb_inter
 		}
 	}
 
+	if (!data->bulk_out || !data->bulk_in) {
+		dev_err(&intf->dev, "bulk endpoints not found\n");
+		retcode = -ENODEV;
+		goto err_put;
+	}
+
 	retcode = get_capabilities(data);
 	if (retcode)
 		dev_err(&intf->dev, "can't read capabilities\n");
@@ -1186,6 +1192,7 @@ static int usbtmc_probe(struct usb_inter
 error_register:
 	sysfs_remove_group(&intf->dev.kobj, &capability_attr_grp);
 	sysfs_remove_group(&intf->dev.kobj, &data_attr_grp);
+err_put:
 	kref_put(&data->kref, usbtmc_delete);
 	return retcode;
 }


* [PATCH 4.4 55/76] xfs: clear _XBF_PAGES from buffers when readahead page
  2017-03-28 12:29 [PATCH 4.4 00/76] 4.4.58-stable review Greg Kroah-Hartman
                   ` (51 preceding siblings ...)
  2017-03-28 12:30 ` [PATCH 4.4 54/76] USB: usbtmc: add missing endpoint sanity check Greg Kroah-Hartman
@ 2017-03-28 12:30 ` Greg Kroah-Hartman
  2017-03-28 12:30 ` [PATCH 4.4 56/76] xen: do not re-use pirq number cached in pci device msi msg data Greg Kroah-Hartman
                   ` (22 subsequent siblings)
  75 siblings, 0 replies; 106+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-28 12:30 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Darrick J. Wong, Eric Sandeen, Ivan Kozik

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Darrick J. Wong <darrick.wong@oracle.com>

commit 2aa6ba7b5ad3189cc27f14540aa2f57f0ed8df4b upstream.

If we try to allocate memory pages to back an xfs_buf that we're trying
to read, it's possible that we'll be so short on memory that the page
allocation fails.  For a blocking read we'll just wait, but for
readahead we simply dump all the pages we've collected so far.

Unfortunately, after dumping the pages we neglect to clear the
_XBF_PAGES state, which means that the subsequent call to xfs_buf_free
thinks that b_pages still points to pages we own.  It then double-frees
the b_pages pages.
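
The ownership-flag pattern can be sketched in userspace roughly as follows.
This is only an illustration under stated assumptions: BUF_OWNS_PAGES, struct
buf, buf_drop_pages() and buf_free() are hypothetical names standing in for
_XBF_PAGES, xfs_buf, the out_free_pages error path and xfs_buf_free():

```c
#include <assert.h>
#include <stdlib.h>

#define BUF_OWNS_PAGES 0x1u	/* hypothetical analogue of _XBF_PAGES */

struct buf {
	unsigned int flags;
	size_t page_count;
	void *pages[4];
};

static int frees;	/* counts releases so a double free would be visible */

static void put_page(void *p)
{
	assert(p != NULL);
	frees++;
	free(p);
}

/* Error path: drop the collected pages AND clear the ownership flag. */
static void buf_drop_pages(struct buf *bp)
{
	size_t i;

	for (i = 0; i < bp->page_count; i++)
		put_page(bp->pages[i]);
	bp->flags &= ~BUF_OWNS_PAGES;
}

/* Final teardown: releases pages only if the buffer still claims them. */
static void buf_free(struct buf *bp)
{
	size_t i;

	if (bp->flags & BUF_OWNS_PAGES)
		for (i = 0; i < bp->page_count; i++)
			put_page(bp->pages[i]);
	bp->flags = 0;
}
```

If buf_drop_pages() left the flag set, the later buf_free() would release the
same pointers a second time, which is the double free described above.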

This results in screaming about negative page refcounts from the memory
manager, which xfs oughtn't be triggering.  To reproduce this case,
mount a filesystem where the size of the inodes far outweighs the
available memory (a ~500M inode filesystem on a VM with 300MB memory
did the trick here) and run bulkstat in parallel with other memory
eating processes to put a huge load on the system.  The "check summary"
phase of xfs_scrub also works for this purpose.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Cc: Ivan Kozik <ivan@ludios.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 fs/xfs/xfs_buf.c |    1 +
 1 file changed, 1 insertion(+)

--- a/fs/xfs/xfs_buf.c
+++ b/fs/xfs/xfs_buf.c
@@ -375,6 +375,7 @@ retry:
 out_free_pages:
 	for (i = 0; i < bp->b_page_count; i++)
 		__free_page(bp->b_pages[i]);
+	bp->b_flags &= ~_XBF_PAGES;
 	return error;
 }
 


* [PATCH 4.4 56/76] xen: do not re-use pirq number cached in pci device msi msg data
  2017-03-28 12:29 [PATCH 4.4 00/76] 4.4.58-stable review Greg Kroah-Hartman
                   ` (52 preceding siblings ...)
  2017-03-28 12:30 ` [PATCH 4.4 55/76] xfs: clear _XBF_PAGES from buffers when readahead page Greg Kroah-Hartman
@ 2017-03-28 12:30 ` Greg Kroah-Hartman
  2017-03-28 12:30 ` [PATCH 4.4 57/76] igb: Workaround for igb i210 firmware issue Greg Kroah-Hartman
                   ` (21 subsequent siblings)
  75 siblings, 0 replies; 106+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-28 12:30 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Greg Kroah-Hartman, Dan Streetman, Stefano Stabellini,
	Konrad Rzeszutek Wilk, Boris Ostrovsky, Sasha Levin,
	Sumit Semwal

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Sumit Semwal <sumit.semwal@linaro.org>


From: Dan Streetman <ddstreet@ieee.org>

[ Upstream commit c74fd80f2f41d05f350bb478151021f88551afe8 ]

Revert the main part of commit:
af42b8d12f8a ("xen: fix MSI setup and teardown for PV on HVM guests")

That commit introduced reading the pci device's msi message data to see
if a pirq was previously configured for the device's msi/msix, and re-use
that pirq.  At the time, that was the correct behavior.  However, a
later change to Qemu caused it to call into the Xen hypervisor to unmap
all pirqs for a pci device, when the pci device disables its MSI/MSIX
vectors; specifically the Qemu commit:
c976437c7dba9c7444fb41df45468968aaa326ad
("qemu-xen: free all the pirqs for msi/msix when driver unload")

Once Qemu added this pirq unmapping, it was no longer correct for the
kernel to re-use the pirq number cached in the pci device msi message
data.  All Qemu releases since 2.1.0 contain the patch that unmaps the
pirqs when the pci device disables its MSI/MSIX vectors.

This bug is causing failures to initialize multiple NVMe controllers
under Xen, because the NVMe driver sets up a single MSIX vector for
each controller (concurrently), and then after using that to talk to
the controller for some configuration data, it disables the single MSIX
vector and re-configures all the MSIX vectors it needs.  So the MSIX
setup code tries to re-use the cached pirq from the first vector
for each controller, but the hypervisor has already given away that
pirq to another controller, and its initialization fails.

This is discussed in more detail at:
https://lists.xen.org/archives/html/xen-devel/2017-01/msg00447.html

Fixes: af42b8d12f8a ("xen: fix MSI setup and teardown for PV on HVM guests")
Signed-off-by: Dan Streetman <dan.streetman@canonical.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/x86/pci/xen.c |   23 +++++++----------------
 1 file changed, 7 insertions(+), 16 deletions(-)

--- a/arch/x86/pci/xen.c
+++ b/arch/x86/pci/xen.c
@@ -231,23 +231,14 @@ static int xen_hvm_setup_msi_irqs(struct
 		return 1;
 
 	for_each_pci_msi_entry(msidesc, dev) {
-		__pci_read_msi_msg(msidesc, &msg);
-		pirq = MSI_ADDR_EXT_DEST_ID(msg.address_hi) |
-			((msg.address_lo >> MSI_ADDR_DEST_ID_SHIFT) & 0xff);
-		if (msg.data != XEN_PIRQ_MSI_DATA ||
-		    xen_irq_from_pirq(pirq) < 0) {
-			pirq = xen_allocate_pirq_msi(dev, msidesc);
-			if (pirq < 0) {
-				irq = -ENODEV;
-				goto error;
-			}
-			xen_msi_compose_msg(dev, pirq, &msg);
-			__pci_write_msi_msg(msidesc, &msg);
-			dev_dbg(&dev->dev, "xen: msi bound to pirq=%d\n", pirq);
-		} else {
-			dev_dbg(&dev->dev,
-				"xen: msi already bound to pirq=%d\n", pirq);
+		pirq = xen_allocate_pirq_msi(dev, msidesc);
+		if (pirq < 0) {
+			irq = -ENODEV;
+			goto error;
 		}
+		xen_msi_compose_msg(dev, pirq, &msg);
+		__pci_write_msi_msg(msidesc, &msg);
+		dev_dbg(&dev->dev, "xen: msi bound to pirq=%d\n", pirq);
 		irq = xen_bind_pirq_msi_to_irq(dev, msidesc, pirq,
 					       (type == PCI_CAP_ID_MSI) ? nvec : 1,
 					       (type == PCI_CAP_ID_MSIX) ?


* [PATCH 4.4 57/76] igb: Workaround for igb i210 firmware issue
  2017-03-28 12:29 [PATCH 4.4 00/76] 4.4.58-stable review Greg Kroah-Hartman
                   ` (53 preceding siblings ...)
  2017-03-28 12:30 ` [PATCH 4.4 56/76] xen: do not re-use pirq number cached in pci device msi msg data Greg Kroah-Hartman
@ 2017-03-28 12:30 ` Greg Kroah-Hartman
  2017-03-28 12:30 ` [PATCH 4.4 58/76] igb: add i211 to i210 PHY workaround Greg Kroah-Hartman
                   ` (20 subsequent siblings)
  75 siblings, 0 replies; 106+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-28 12:30 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Greg Kroah-Hartman, Chris J Arges, Aaron Brown, Jeff Kirsher,
	Sasha Levin, Sumit Semwal

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Sumit Semwal <sumit.semwal@linaro.org>


From: Chris J Arges <christopherarges@gmail.com>

[ Upstream commit 4e684f59d760a2c7c716bb60190783546e2d08a1 ]

Sometimes firmware may not properly initialize I347AT4_PAGE_SELECT, causing
the probe of an igb i210 NIC to fail. This patch adds an additional zeroing
of this register during igb_get_phy_id() to work around this issue.
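
To show why a stale page selection breaks the ID read, here is a toy model of
a paged register file. Everything in it is an assumption made for
illustration: the register numbers, struct toy_phy and the phy_* helpers are
hypothetical and do not reflect the igb driver's accessors:

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SELECT_REG	22	/* hypothetical register number */
#define PHY_ID1_REG	2	/* hypothetical register number */
#define NUM_PAGES	4
#define NUM_REGS	32

/* Toy PHY: registers are banked by the value in PAGE_SELECT_REG. */
struct toy_phy {
	uint16_t page;
	uint16_t regs[NUM_PAGES][NUM_REGS];
};

static void phy_write(struct toy_phy *phy, unsigned int reg, uint16_t val)
{
	assert(reg < NUM_REGS);
	if (reg == PAGE_SELECT_REG)
		phy->page = val % NUM_PAGES;
	else
		phy->regs[phy->page][reg] = val;
}

static uint16_t phy_read(const struct toy_phy *phy, unsigned int reg)
{
	assert(reg < NUM_REGS);
	if (reg == PAGE_SELECT_REG)
		return phy->page;
	return phy->regs[phy->page][reg];
}

/* Select page 0 first so the ID read cannot land in another bank. */
static uint16_t phy_get_id(struct toy_phy *phy)
{
	phy_write(phy, PAGE_SELECT_REG, 0);
	return phy_read(phy, PHY_ID1_REG);
}
```

In this model a read of the ID register with page 2 selected returns the
page-2 contents, not the ID, which mirrors the failed probe.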

Thanks to Jochen Henneberg for the idea and original patch.

Signed-off-by: Chris J Arges <christopherarges@gmail.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/net/ethernet/intel/igb/e1000_phy.c |    4 ++++
 1 file changed, 4 insertions(+)

--- a/drivers/net/ethernet/intel/igb/e1000_phy.c
+++ b/drivers/net/ethernet/intel/igb/e1000_phy.c
@@ -77,6 +77,10 @@ s32 igb_get_phy_id(struct e1000_hw *hw)
 	s32 ret_val = 0;
 	u16 phy_id;
 
+	/* ensure PHY page selection to fix misconfigured i210 */
+	if (hw->mac.type == e1000_i210)
+		phy->ops.write_reg(hw, I347AT4_PAGE_SELECT, 0);
+
 	ret_val = phy->ops.read_reg(hw, PHY_ID1, &phy_id);
 	if (ret_val)
 		goto out;


* [PATCH 4.4 58/76] igb: add i211 to i210 PHY workaround
  2017-03-28 12:29 [PATCH 4.4 00/76] 4.4.58-stable review Greg Kroah-Hartman
                   ` (54 preceding siblings ...)
  2017-03-28 12:30 ` [PATCH 4.4 57/76] igb: Workaround for igb i210 firmware issue Greg Kroah-Hartman
@ 2017-03-28 12:30 ` Greg Kroah-Hartman
  2017-03-28 12:30 ` [PATCH 4.4 59/76] x86/hyperv: Handle unknown NMIs on one CPU when unknown_nmi_panic Greg Kroah-Hartman
                   ` (19 subsequent siblings)
  75 siblings, 0 replies; 106+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-28 12:30 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Greg Kroah-Hartman, Todd Fujinaka, Aaron Brown, Jeff Kirsher,
	Sasha Levin, Sumit Semwal

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Sumit Semwal <sumit.semwal@linaro.org>


From: Todd Fujinaka <todd.fujinaka@intel.com>

[ Upstream commit 5bc8c230e2a993b49244f9457499f17283da9ec7 ]

i210 and i211 share the same PHY but have different PCI IDs. Don't
forget i211 for any i210 workarounds.

Signed-off-by: Todd Fujinaka <todd.fujinaka@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/net/ethernet/intel/igb/e1000_phy.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/net/ethernet/intel/igb/e1000_phy.c
+++ b/drivers/net/ethernet/intel/igb/e1000_phy.c
@@ -78,7 +78,7 @@ s32 igb_get_phy_id(struct e1000_hw *hw)
 	u16 phy_id;
 
 	/* ensure PHY page selection to fix misconfigured i210 */
-	if (hw->mac.type == e1000_i210)
+	if ((hw->mac.type == e1000_i210) || (hw->mac.type == e1000_i211))
 		phy->ops.write_reg(hw, I347AT4_PAGE_SELECT, 0);
 
 	ret_val = phy->ops.read_reg(hw, PHY_ID1, &phy_id);


* [PATCH 4.4 59/76] x86/hyperv: Handle unknown NMIs on one CPU when unknown_nmi_panic
  2017-03-28 12:29 [PATCH 4.4 00/76] 4.4.58-stable review Greg Kroah-Hartman
                   ` (55 preceding siblings ...)
  2017-03-28 12:30 ` [PATCH 4.4 58/76] igb: add i211 to i210 PHY workaround Greg Kroah-Hartman
@ 2017-03-28 12:30 ` Greg Kroah-Hartman
  2017-03-28 12:30 ` [PATCH 4.4 60/76] PCI: Separate VF BAR updates from standard BAR updates Greg Kroah-Hartman
                   ` (18 subsequent siblings)
  75 siblings, 0 replies; 106+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-28 12:30 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Greg Kroah-Hartman, Vitaly Kuznetsov, K. Y. Srinivasan, devel,
	Haiyang Zhang, Thomas Gleixner, Ingo Molnar, Sasha Levin,
	Sumit Semwal

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Sumit Semwal <sumit.semwal@linaro.org>


From: Vitaly Kuznetsov <vkuznets@redhat.com>

[ Upstream commit 59107e2f48831daedc46973ce4988605ab066de3 ]

There is a feature in Hyper-V ('Debug-VM --InjectNonMaskableInterrupt')
which injects an NMI into the guest. We may want to crash the guest and do
kdump on this NMI by enabling unknown_nmi_panic. To make kdump succeed we
need to allow the kdump kernel to re-establish the VMBus connection so it
will see VMBus devices (storage, network, ...).

To properly unload VMBus making it possible to start over during kdump we
need to do the following:

 - Send an 'unload' message to the hypervisor. This can be done on any CPU
   so we do this on the crashing CPU.

 - Receive the 'unload finished' reply message. WS2012R2 delivers this
   message to the CPU which was used to establish VMBus connection during
   module load and this CPU may differ from the CPU sending 'unload'.

Receiving a VMBus message means the following:

 - There is a per-CPU slot in memory for one message. This slot can in
   theory be accessed by any CPU.

 - We get an interrupt on the CPU when a message was placed into the slot.

 - When we read the message we need to clear the slot and signal the fact
   to the hypervisor. In case there are more messages to this CPU pending
   the hypervisor will deliver the next message. The signaling is done by
   writing to an MSR so this can only be done on the appropriate CPU.

To avoid doing cross-CPU work on crash we have vmbus_wait_for_unload()
function which checks message slots for all CPUs in a loop waiting for the
'unload finished' messages. However, there is an issue which arises when
these conditions are met:

 - We're crashing on a CPU which is different from the one which was used
   to initially contact the hypervisor.

 - The CPU which was used for the initial contact is blocked with interrupts
   disabled and there is a message pending in the message slot.

In this case we won't be able to read the 'unload finished' message on the
crashing CPU. This is reproducible when we receive unknown NMIs on all CPUs
simultaneously: the first CPU entering panic() will proceed to crash and
all other CPUs will stop themselves with interrupts disabled.

The suggested solution is to handle unknown NMIs for Hyper-V guests only on
the first CPU which gets them. This will allow us to rely on the VMBus
interrupt handler being able to receive the 'unload finished' message in
case it is delivered to a different CPU.
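
The "first CPU wins" claim can be sketched in userspace with a C11
compare-and-exchange standing in for the kernel's atomic_cmpxchg(). The
function name claim_unknown_nmi() and the two return codes are hypothetical;
they play the roles of NMI_DONE (let the normal unknown-NMI path run) and
NMI_HANDLED (swallow the NMI):

```c
#include <assert.h>
#include <stdatomic.h>

enum { NMI_PASS_ON = 0, NMI_SWALLOW = 1 };	/* hypothetical codes */

/* -1 means that no CPU has claimed the unknown NMI yet. */
static atomic_int first_cpu = -1;

/*
 * Only the caller that swaps -1 for its own id proceeds to the normal
 * handling; every later caller, including the winner itself on a repeat
 * NMI, is told to swallow the event.
 */
static int claim_unknown_nmi(int cpu)
{
	int expected = -1;

	assert(cpu >= 0);
	if (atomic_compare_exchange_strong(&first_cpu, &expected, cpu))
		return NMI_PASS_ON;
	return NMI_SWALLOW;
}
```

Because the losing CPUs return without entering the panic path, they are not
parked with interrupts disabled and can still take the VMBus interrupt.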

The issue is not reproducible on WS2016 as Debug-VM delivers NMI to the
boot CPU only; WS2012R2 and earlier Hyper-V versions are affected.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Acked-by: K. Y. Srinivasan <kys@microsoft.com>
Cc: devel@linuxdriverproject.org
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Link: http://lkml.kernel.org/r/20161202100720.28121-1-vkuznets@redhat.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/x86/kernel/cpu/mshyperv.c |   24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)

--- a/arch/x86/kernel/cpu/mshyperv.c
+++ b/arch/x86/kernel/cpu/mshyperv.c
@@ -30,6 +30,7 @@
 #include <asm/apic.h>
 #include <asm/timer.h>
 #include <asm/reboot.h>
+#include <asm/nmi.h>
 
 struct ms_hyperv_info ms_hyperv;
 EXPORT_SYMBOL_GPL(ms_hyperv);
@@ -157,6 +158,26 @@ static unsigned char hv_get_nmi_reason(v
 	return 0;
 }
 
+#ifdef CONFIG_X86_LOCAL_APIC
+/*
+ * Prior to WS2016 Debug-VM sends NMIs to all CPUs which makes
+ * it dificult to process CHANNELMSG_UNLOAD in case of crash. Handle
+ * unknown NMI on the first CPU which gets it.
+ */
+static int hv_nmi_unknown(unsigned int val, struct pt_regs *regs)
+{
+	static atomic_t nmi_cpu = ATOMIC_INIT(-1);
+
+	if (!unknown_nmi_panic)
+		return NMI_DONE;
+
+	if (atomic_cmpxchg(&nmi_cpu, -1, raw_smp_processor_id()) != -1)
+		return NMI_HANDLED;
+
+	return NMI_DONE;
+}
+#endif
+
 static void __init ms_hyperv_init_platform(void)
 {
 	/*
@@ -182,6 +203,9 @@ static void __init ms_hyperv_init_platfo
 		printk(KERN_INFO "HyperV: LAPIC Timer Frequency: %#x\n",
 				lapic_timer_frequency);
 	}
+
+	register_nmi_handler(NMI_UNKNOWN, hv_nmi_unknown, NMI_FLAG_FIRST,
+			     "hv_nmi_unknown");
 #endif
 
 	if (ms_hyperv.features & HV_X64_MSR_TIME_REF_COUNT_AVAILABLE)


* [PATCH 4.4 60/76] PCI: Separate VF BAR updates from standard BAR updates
  2017-03-28 12:29 [PATCH 4.4 00/76] 4.4.58-stable review Greg Kroah-Hartman
                   ` (56 preceding siblings ...)
  2017-03-28 12:30 ` [PATCH 4.4 59/76] x86/hyperv: Handle unknown NMIs on one CPU when unknown_nmi_panic Greg Kroah-Hartman
@ 2017-03-28 12:30 ` Greg Kroah-Hartman
  2017-03-28 12:30 ` [PATCH 4.4 61/76] PCI: Remove pci_resource_bar() and pci_iov_resource_bar() Greg Kroah-Hartman
                   ` (17 subsequent siblings)
  75 siblings, 0 replies; 106+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-28 12:30 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Greg Kroah-Hartman, Bjorn Helgaas, Gavin Shan, Sasha Levin, Sumit Semwal

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Sumit Semwal <sumit.semwal@linaro.org>


From: Bjorn Helgaas <bhelgaas@google.com>

[ Upstream commit 6ffa2489c51da77564a0881a73765ea2169f955d ]

Previously pci_update_resource() used the same code path for updating
standard BARs and VF BARs in SR-IOV capabilities.

Split the VF BAR update into a new pci_iov_update_resource() internal
interface, which makes it simpler to compute the BAR address (we can get
rid of pci_resource_bar() and pci_iov_resource_bar()).

This patch:

  - Renames pci_update_resource() to pci_std_update_resource(),
  - Adds pci_iov_update_resource(),
  - Makes pci_update_resource() a wrapper that calls the appropriate one.

No functional change intended.
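
The address arithmetic in the new VF BAR path can be sketched standalone as
below. This is a simplified illustration, not the kernel function: the helper
names vf_bar_reg() and split_bar() and struct bar_words are hypothetical, and
SRIOV_BAR_OFFSET is assumed to match PCI_SRIOV_BAR:

```c
#include <assert.h>
#include <stdint.h>

#define SRIOV_BAR_OFFSET 0x24	/* assumed offset of VF BAR0 in the capability */

struct bar_words {
	uint32_t lo;	/* value for the BAR register itself */
	uint32_t hi;	/* value for the following dword of a 64-bit BAR */
};

/* Config-space offset of VF BAR number vf_bar, 4 bytes per BAR slot. */
static unsigned int vf_bar_reg(unsigned int cap_pos, unsigned int vf_bar)
{
	assert(vf_bar < 6);
	return cap_pos + SRIOV_BAR_OFFSET + 4 * vf_bar;
}

/*
 * Split a bus address into the two dwords written for a 64-bit BAR.
 * The low four bits carry the BAR type flags.  The patch shifts with
 * ">> 16 >> 16" because its address type may be only 32 bits wide;
 * with uint64_t a single ">> 32" is well defined.
 */
static struct bar_words split_bar(uint64_t bus_addr, uint32_t low_flags)
{
	struct bar_words w;

	w.lo = (uint32_t)bus_addr | (low_flags & 0xfu);
	w.hi = (uint32_t)(bus_addr >> 32);
	return w;
}
```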

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/pci/iov.c       |   50 ++++++++++++++++++++++++++++++++++++++++++++++++
 drivers/pci/pci.h       |    1 
 drivers/pci/setup-res.c |   13 ++++++++++--
 3 files changed, 62 insertions(+), 2 deletions(-)

--- a/drivers/pci/iov.c
+++ b/drivers/pci/iov.c
@@ -572,6 +572,56 @@ int pci_iov_resource_bar(struct pci_dev
 		4 * (resno - PCI_IOV_RESOURCES);
 }
 
+/**
+ * pci_iov_update_resource - update a VF BAR
+ * @dev: the PCI device
+ * @resno: the resource number
+ *
+ * Update a VF BAR in the SR-IOV capability of a PF.
+ */
+void pci_iov_update_resource(struct pci_dev *dev, int resno)
+{
+	struct pci_sriov *iov = dev->is_physfn ? dev->sriov : NULL;
+	struct resource *res = dev->resource + resno;
+	int vf_bar = resno - PCI_IOV_RESOURCES;
+	struct pci_bus_region region;
+	u32 new;
+	int reg;
+
+	/*
+	 * The generic pci_restore_bars() path calls this for all devices,
+	 * including VFs and non-SR-IOV devices.  If this is not a PF, we
+	 * have nothing to do.
+	 */
+	if (!iov)
+		return;
+
+	/*
+	 * Ignore unimplemented BARs, unused resource slots for 64-bit
+	 * BARs, and non-movable resources, e.g., those described via
+	 * Enhanced Allocation.
+	 */
+	if (!res->flags)
+		return;
+
+	if (res->flags & IORESOURCE_UNSET)
+		return;
+
+	if (res->flags & IORESOURCE_PCI_FIXED)
+		return;
+
+	pcibios_resource_to_bus(dev->bus, &region, res);
+	new = region.start;
+	new |= res->flags & ~PCI_BASE_ADDRESS_MEM_MASK;
+
+	reg = iov->pos + PCI_SRIOV_BAR + 4 * vf_bar;
+	pci_write_config_dword(dev, reg, new);
+	if (res->flags & IORESOURCE_MEM_64) {
+		new = region.start >> 16 >> 16;
+		pci_write_config_dword(dev, reg + 4, new);
+	}
+}
+
 resource_size_t __weak pcibios_iov_resource_alignment(struct pci_dev *dev,
 						      int resno)
 {
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -277,6 +277,7 @@ static inline void pci_restore_ats_state
 int pci_iov_init(struct pci_dev *dev);
 void pci_iov_release(struct pci_dev *dev);
 int pci_iov_resource_bar(struct pci_dev *dev, int resno);
+void pci_iov_update_resource(struct pci_dev *dev, int resno);
 resource_size_t pci_sriov_resource_alignment(struct pci_dev *dev, int resno);
 void pci_restore_iov_state(struct pci_dev *dev);
 int pci_iov_bus_range(struct pci_bus *bus);
--- a/drivers/pci/setup-res.c
+++ b/drivers/pci/setup-res.c
@@ -25,8 +25,7 @@
 #include <linux/slab.h>
 #include "pci.h"
 
-
-void pci_update_resource(struct pci_dev *dev, int resno)
+static void pci_std_update_resource(struct pci_dev *dev, int resno)
 {
 	struct pci_bus_region region;
 	bool disable;
@@ -110,6 +109,16 @@ void pci_update_resource(struct pci_dev
 		pci_write_config_word(dev, PCI_COMMAND, cmd);
 }
 
+void pci_update_resource(struct pci_dev *dev, int resno)
+{
+	if (resno <= PCI_ROM_RESOURCE)
+		pci_std_update_resource(dev, resno);
+#ifdef CONFIG_PCI_IOV
+	else if (resno >= PCI_IOV_RESOURCES && resno <= PCI_IOV_RESOURCE_END)
+		pci_iov_update_resource(dev, resno);
+#endif
+}
+
 int pci_claim_resource(struct pci_dev *dev, int resource)
 {
 	struct resource *res = &dev->resource[resource];


* [PATCH 4.4 61/76] PCI: Remove pci_resource_bar() and pci_iov_resource_bar()
  2017-03-28 12:29 [PATCH 4.4 00/76] 4.4.58-stable review Greg Kroah-Hartman
                   ` (57 preceding siblings ...)
  2017-03-28 12:30 ` [PATCH 4.4 60/76] PCI: Separate VF BAR updates from standard BAR updates Greg Kroah-Hartman
@ 2017-03-28 12:30 ` Greg Kroah-Hartman
  2017-03-28 12:30 ` [PATCH 4.4 62/76] PCI: Add comments about ROM BAR updating Greg Kroah-Hartman
                   ` (16 subsequent siblings)
  75 siblings, 0 replies; 106+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-28 12:30 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Greg Kroah-Hartman, Bjorn Helgaas, Gavin Shan, Sasha Levin, Sumit Semwal

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Sumit Semwal <sumit.semwal@linaro.org>


From: Bjorn Helgaas <bhelgaas@google.com>

[ Upstream commit 286c2378aaccc7343ebf17ec6cd86567659caf70 ]

pci_std_update_resource() only deals with standard BARs, so we don't have
to worry about the complications of VF BARs in an SR-IOV capability.

Compute the BAR address inline and remove pci_resource_bar().  That makes
pci_iov_resource_bar() unused, so remove that as well.
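
A standalone sketch of the inline computation, with the constants spelled out
as assumptions (BASE_ADDRESS_0 and ROM_RESOURCE mirror PCI_BASE_ADDRESS_0 and
PCI_ROM_RESOURCE; std_bar_reg() is a hypothetical helper, the patch open-codes
this logic):

```c
#include <assert.h>

#define BASE_ADDRESS_0	0x10	/* assumed offset of the first standard BAR */
#define ROM_RESOURCE	6	/* assumed resource index of the expansion ROM */

/*
 * Map a resource number to its config-space register.  Returns -1 for
 * anything that is not a standard BAR or the ROM BAR, where the patched
 * function simply returns without writing.
 */
static int std_bar_reg(int resno, int rom_base_reg)
{
	assert(resno >= 0);
	if (resno < ROM_RESOURCE)
		return BASE_ADDRESS_0 + 4 * resno;
	if (resno == ROM_RESOURCE)
		return rom_base_reg;
	return -1;
}
```

VF BAR resource numbers fall into the last branch here, which is fine because
they are now handled by pci_iov_update_resource().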

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/pci/iov.c       |   18 ------------------
 drivers/pci/pci.c       |   30 ------------------------------
 drivers/pci/pci.h       |    6 ------
 drivers/pci/setup-res.c |   13 +++++++------
 4 files changed, 7 insertions(+), 60 deletions(-)

--- a/drivers/pci/iov.c
+++ b/drivers/pci/iov.c
@@ -555,24 +555,6 @@ void pci_iov_release(struct pci_dev *dev
 }
 
 /**
- * pci_iov_resource_bar - get position of the SR-IOV BAR
- * @dev: the PCI device
- * @resno: the resource number
- *
- * Returns position of the BAR encapsulated in the SR-IOV capability.
- */
-int pci_iov_resource_bar(struct pci_dev *dev, int resno)
-{
-	if (resno < PCI_IOV_RESOURCES || resno > PCI_IOV_RESOURCE_END)
-		return 0;
-
-	BUG_ON(!dev->is_physfn);
-
-	return dev->sriov->pos + PCI_SRIOV_BAR +
-		4 * (resno - PCI_IOV_RESOURCES);
-}
-
-/**
  * pci_iov_update_resource - update a VF BAR
  * @dev: the PCI device
  * @resno: the resource number
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -4472,36 +4472,6 @@ int pci_select_bars(struct pci_dev *dev,
 }
 EXPORT_SYMBOL(pci_select_bars);
 
-/**
- * pci_resource_bar - get position of the BAR associated with a resource
- * @dev: the PCI device
- * @resno: the resource number
- * @type: the BAR type to be filled in
- *
- * Returns BAR position in config space, or 0 if the BAR is invalid.
- */
-int pci_resource_bar(struct pci_dev *dev, int resno, enum pci_bar_type *type)
-{
-	int reg;
-
-	if (resno < PCI_ROM_RESOURCE) {
-		*type = pci_bar_unknown;
-		return PCI_BASE_ADDRESS_0 + 4 * resno;
-	} else if (resno == PCI_ROM_RESOURCE) {
-		*type = pci_bar_mem32;
-		return dev->rom_base_reg;
-	} else if (resno < PCI_BRIDGE_RESOURCES) {
-		/* device specific resource */
-		*type = pci_bar_unknown;
-		reg = pci_iov_resource_bar(dev, resno);
-		if (reg)
-			return reg;
-	}
-
-	dev_err(&dev->dev, "BAR %d: invalid resource\n", resno);
-	return 0;
-}
-
 /* Some architectures require additional programming to enable VGA */
 static arch_set_vga_state_t arch_set_vga_state;
 
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -232,7 +232,6 @@ bool pci_bus_read_dev_vendor_id(struct p
 int pci_setup_device(struct pci_dev *dev);
 int __pci_read_base(struct pci_dev *dev, enum pci_bar_type type,
 		    struct resource *res, unsigned int reg);
-int pci_resource_bar(struct pci_dev *dev, int resno, enum pci_bar_type *type);
 void pci_configure_ari(struct pci_dev *dev);
 void __pci_bus_size_bridges(struct pci_bus *bus,
 			struct list_head *realloc_head);
@@ -276,7 +275,6 @@ static inline void pci_restore_ats_state
 #ifdef CONFIG_PCI_IOV
 int pci_iov_init(struct pci_dev *dev);
 void pci_iov_release(struct pci_dev *dev);
-int pci_iov_resource_bar(struct pci_dev *dev, int resno);
 void pci_iov_update_resource(struct pci_dev *dev, int resno);
 resource_size_t pci_sriov_resource_alignment(struct pci_dev *dev, int resno);
 void pci_restore_iov_state(struct pci_dev *dev);
@@ -291,10 +289,6 @@ static inline void pci_iov_release(struc
 
 {
 }
-static inline int pci_iov_resource_bar(struct pci_dev *dev, int resno)
-{
-	return 0;
-}
 static inline void pci_restore_iov_state(struct pci_dev *dev)
 {
 }
--- a/drivers/pci/setup-res.c
+++ b/drivers/pci/setup-res.c
@@ -32,7 +32,6 @@ static void pci_std_update_resource(stru
 	u16 cmd;
 	u32 new, check, mask;
 	int reg;
-	enum pci_bar_type type;
 	struct resource *res = dev->resource + resno;
 
 	if (dev->is_virtfn) {
@@ -66,14 +65,16 @@ static void pci_std_update_resource(stru
 	else
 		mask = (u32)PCI_BASE_ADDRESS_MEM_MASK;
 
-	reg = pci_resource_bar(dev, resno, &type);
-	if (!reg)
-		return;
-	if (type != pci_bar_unknown) {
+	if (resno < PCI_ROM_RESOURCE) {
+		reg = PCI_BASE_ADDRESS_0 + 4 * resno;
+	} else if (resno == PCI_ROM_RESOURCE) {
 		if (!(res->flags & IORESOURCE_ROM_ENABLE))
 			return;
+
+		reg = dev->rom_base_reg;
 		new |= PCI_ROM_ADDRESS_ENABLE;
-	}
+	} else
+		return;
 
 	/*
 	 * We can't update a 64-bit BAR atomically, so when possible,


* [PATCH 4.4 62/76] PCI: Add comments about ROM BAR updating
  2017-03-28 12:29 [PATCH 4.4 00/76] 4.4.58-stable review Greg Kroah-Hartman
                   ` (58 preceding siblings ...)
  2017-03-28 12:30 ` [PATCH 4.4 61/76] PCI: Remove pci_resource_bar() and pci_iov_resource_bar() Greg Kroah-Hartman
@ 2017-03-28 12:30 ` Greg Kroah-Hartman
  2017-03-28 12:31 ` [PATCH 4.4 63/76] PCI: Decouple IORESOURCE_ROM_ENABLE and PCI_ROM_ADDRESS_ENABLE Greg Kroah-Hartman
                   ` (15 subsequent siblings)
  75 siblings, 0 replies; 106+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-28 12:30 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Greg Kroah-Hartman, Bjorn Helgaas, Gavin Shan, Sasha Levin, Sumit Semwal

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Sumit Semwal <sumit.semwal@linaro.org>


From: Bjorn Helgaas <bhelgaas@google.com>

[ Upstream commit 0b457dde3cf8b7c76a60f8e960f21bbd4abdc416 ]

pci_update_resource() updates a hardware BAR so its address matches the
kernel's struct resource UNLESS it's a disabled ROM BAR.  We only update
those when we enable the ROM.

It's not obvious from the code why ROM BARs should be handled specially.
Apparently there are Matrox devices with defective ROM BARs that read as
zero when disabled.  That means that if pci_enable_rom() reads the disabled
BAR, sets PCI_ROM_ADDRESS_ENABLE (without re-inserting the address), and
writes it back, it would enable the ROM at address zero.
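
The failure mode can be modelled with a toy ROM BAR that reads as zero while
disabled. This is a sketch under stated assumptions: struct toy_rom_bar and
the rom_* helpers are hypothetical, and ROM_ENABLE and ROM_ADDR_MASK are
assumed to mirror PCI_ROM_ADDRESS_ENABLE and PCI_ROM_ADDRESS_MASK:

```c
#include <assert.h>
#include <stdint.h>

#define ROM_ENABLE	0x1u
#define ROM_ADDR_MASK	(~0x7ffu)

struct toy_rom_bar {
	uint32_t raw;	/* what the hardware really holds */
};

/* Defective read: returns zero whenever the ROM is disabled. */
static uint32_t rom_read(const struct toy_rom_bar *b)
{
	return (b->raw & ROM_ENABLE) ? b->raw : 0;
}

/* Naive enable: read-modify-write that trusts the read-back address. */
static void rom_enable_naive(struct toy_rom_bar *b)
{
	b->raw = rom_read(b) | ROM_ENABLE;
}

/* Robust enable: re-insert the address from the kernel's own record. */
static void rom_enable_robust(struct toy_rom_bar *b, uint32_t bus_addr)
{
	uint32_t v = rom_read(b);

	assert((bus_addr & ~ROM_ADDR_MASK) == 0);	/* must be aligned */
	v &= ~ROM_ADDR_MASK;
	v |= (bus_addr & ROM_ADDR_MASK) | ROM_ENABLE;
	b->raw = v;
}
```

The naive variant ends up with the enable bit set and an address of zero,
while the robust variant restores the intended address in the same write.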

Add comments and references to explain why we can't make the code look more
rational.

The code changes are from 755528c860b0 ("Ignore disabled ROM resources at
setup") and 8085ce084c0f ("[PATCH] Fix PCI ROM mapping").

Link: https://lkml.org/lkml/2005/8/30/138
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 [sumits: minor fixup in rom.c for 4.4.y]
Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/pci/rom.c       |    5 +++++
 drivers/pci/setup-res.c |    6 ++++++
 2 files changed, 11 insertions(+)

--- a/drivers/pci/rom.c
+++ b/drivers/pci/rom.c
@@ -31,6 +31,11 @@ int pci_enable_rom(struct pci_dev *pdev)
 	if (!res->flags)
 		return -1;
 
+	/*
+	 * Ideally pci_update_resource() would update the ROM BAR address,
+	 * and we would only set the enable bit here.  But apparently some
+	 * devices have buggy ROM BARs that read as zero when disabled.
+	 */
 	pcibios_resource_to_bus(pdev->bus, &region, res);
 	pci_read_config_dword(pdev, pdev->rom_base_reg, &rom_addr);
 	rom_addr &= ~PCI_ROM_ADDRESS_MASK;
--- a/drivers/pci/setup-res.c
+++ b/drivers/pci/setup-res.c
@@ -68,6 +68,12 @@ static void pci_std_update_resource(stru
 	if (resno < PCI_ROM_RESOURCE) {
 		reg = PCI_BASE_ADDRESS_0 + 4 * resno;
 	} else if (resno == PCI_ROM_RESOURCE) {
+
+		/*
+		 * Apparently some Matrox devices have ROM BARs that read
+		 * as zero when disabled, so don't update ROM BARs unless
+		 * they're enabled.  See https://lkml.org/lkml/2005/8/30/138.
+		 */
 		if (!(res->flags & IORESOURCE_ROM_ENABLE))
 			return;
 

^ permalink raw reply	[flat|nested] 106+ messages in thread

* [PATCH 4.4 63/76] PCI: Decouple IORESOURCE_ROM_ENABLE and PCI_ROM_ADDRESS_ENABLE
  2017-03-28 12:29 [PATCH 4.4 00/76] 4.4.58-stable review Greg Kroah-Hartman
                   ` (59 preceding siblings ...)
  2017-03-28 12:30 ` [PATCH 4.4 62/76] PCI: Add comments about ROM BAR updating Greg Kroah-Hartman
@ 2017-03-28 12:31 ` Greg Kroah-Hartman
  2017-03-28 12:31 ` [PATCH 4.4 64/76] PCI: Dont update VF BARs while VF memory space is enabled Greg Kroah-Hartman
                   ` (14 subsequent siblings)
  75 siblings, 0 replies; 106+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-28 12:31 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Greg Kroah-Hartman, Bjorn Helgaas, Gavin Shan, Sasha Levin, Sumit Semwal

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Sumit Semwal <sumit.semwal@linaro.org>


From: Bjorn Helgaas <bhelgaas@google.com>

[ Upstream commit 7a6d312b50e63f598f5b5914c4fd21878ac2b595 ]

Remove the assumption that IORESOURCE_ROM_ENABLE == PCI_ROM_ADDRESS_ENABLE.
PCI_ROM_ADDRESS_ENABLE is the ROM enable bit defined by the PCI spec, so if
we're reading or writing a BAR register value, that's what we should use.
IORESOURCE_ROM_ENABLE is a corresponding bit in struct resource flags.
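
To illustrate why the two bits should be kept apart, here is a small
sketch. EXAMPLE_RES_ROM_ENABLE is a hypothetical flag value chosen to
differ from the register bit on purpose; it is not the kernel's
definition of IORESOURCE_ROM_ENABLE.

```c
#include <stdint.h>

#define PCI_ROM_ADDRESS_ENABLE 0x01u	/* bit 0 of the ROM BAR */
/* Hypothetical resource flag, deliberately at a different bit position. */
#define EXAMPLE_RES_ROM_ENABLE (1u << 4)

/* Coupled style: only correct if both constants share a bit position. */
static uint32_t flags_coupled(uint32_t bar)
{
	return bar & EXAMPLE_RES_ROM_ENABLE;
}

/* Decoupled style: test the register bit, then set the resource flag. */
static uint32_t flags_decoupled(uint32_t bar)
{
	uint32_t flags = 0;

	if (bar & PCI_ROM_ADDRESS_ENABLE)
		flags |= EXAMPLE_RES_ROM_ENABLE;
	return flags;
}
```

For a BAR value of 0xfe000001 the coupled variant silently loses the
enable state, while the decoupled variant reports it.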

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/pci/probe.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -226,7 +226,8 @@ int __pci_read_base(struct pci_dev *dev,
 			mask64 = (u32)PCI_BASE_ADDRESS_MEM_MASK;
 		}
 	} else {
-		res->flags |= (l & IORESOURCE_ROM_ENABLE);
+		if (l & PCI_ROM_ADDRESS_ENABLE)
+			res->flags |= IORESOURCE_ROM_ENABLE;
 		l64 = l & PCI_ROM_ADDRESS_MASK;
 		sz64 = sz & PCI_ROM_ADDRESS_MASK;
 		mask64 = (u32)PCI_ROM_ADDRESS_MASK;

^ permalink raw reply	[flat|nested] 106+ messages in thread

* [PATCH 4.4 64/76] PCI: Dont update VF BARs while VF memory space is enabled
  2017-03-28 12:29 [PATCH 4.4 00/76] 4.4.58-stable review Greg Kroah-Hartman
                   ` (60 preceding siblings ...)
  2017-03-28 12:31 ` [PATCH 4.4 63/76] PCI: Decouple IORESOURCE_ROM_ENABLE and PCI_ROM_ADDRESS_ENABLE Greg Kroah-Hartman
@ 2017-03-28 12:31 ` Greg Kroah-Hartman
  2017-03-28 12:31 ` [PATCH 4.4 65/76] PCI: Update BARs using property bits appropriate for type Greg Kroah-Hartman
                   ` (13 subsequent siblings)
  75 siblings, 0 replies; 106+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-28 12:31 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Greg Kroah-Hartman, Bjorn Helgaas, Gavin Shan, Sasha Levin, Sumit Semwal

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Sumit Semwal <sumit.semwal@linaro.org>


From: Bjorn Helgaas <bhelgaas@google.com>

[ Upstream commit 546ba9f8f22f71b0202b6ba8967be5cc6dae4e21 ]

If we update a VF BAR while it's enabled, there are two potential problems:

  1) Any driver that's using the VF has a cached BAR value that is stale
     after the update, and

  2) We can't update 64-bit BARs atomically, so the intermediate state
     (new lower dword with old upper dword) may conflict with another
     device, and an access by a driver unrelated to the VF may cause a bus
     error.
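
Problem 2 can be sketched in userspace as follows. The bar64 structure
and helpers are hypothetical stand-ins for the two config-space dwords
of a 64-bit memory BAR, and the sketch assumes the new address has its
low four bits clear.

```c
#include <stdint.h>

/* A 64-bit BAR occupies two consecutive 32-bit config registers. */
struct bar64 {
	uint32_t lo;
	uint32_t hi;
};

static uint64_t bar64_addr(const struct bar64 *b)
{
	return ((uint64_t)b->hi << 32) | (b->lo & ~0xfu);
}

/* Step 1: only the lower dword has been written so far. */
static uint64_t bar64_write_lo(struct bar64 *b, uint64_t new_addr)
{
	b->lo = (uint32_t)new_addr | (b->lo & 0xfu);
	return bar64_addr(b);	/* transient: new low half, old high half */
}

/* Step 2: writing the upper dword completes the move. */
static uint64_t bar64_write_hi(struct bar64 *b, uint64_t new_addr)
{
	b->hi = (uint32_t)(new_addr >> 32);
	return bar64_addr(b);
}
```

Moving a BAR from 0x110000000 to 0x240000000 passes through
0x140000000, an address that belongs to neither the old nor the new
window and may be claimed by another device.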

Warn about attempts to update VF BARs while they are enabled.  This is a
programming error, so use dev_WARN() to get a backtrace.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/pci/iov.c |    8 ++++++++
 1 file changed, 8 insertions(+)

--- a/drivers/pci/iov.c
+++ b/drivers/pci/iov.c
@@ -567,6 +567,7 @@ void pci_iov_update_resource(struct pci_
 	struct resource *res = dev->resource + resno;
 	int vf_bar = resno - PCI_IOV_RESOURCES;
 	struct pci_bus_region region;
+	u16 cmd;
 	u32 new;
 	int reg;
 
@@ -578,6 +579,13 @@ void pci_iov_update_resource(struct pci_
 	if (!iov)
 		return;
 
+	pci_read_config_word(dev, iov->pos + PCI_SRIOV_CTRL, &cmd);
+	if ((cmd & PCI_SRIOV_CTRL_VFE) && (cmd & PCI_SRIOV_CTRL_MSE)) {
+		dev_WARN(&dev->dev, "can't update enabled VF BAR%d %pR\n",
+			 vf_bar, res);
+		return;
+	}
+
 	/*
 	 * Ignore unimplemented BARs, unused resource slots for 64-bit
 	 * BARs, and non-movable resources, e.g., those described via

^ permalink raw reply	[flat|nested] 106+ messages in thread

* [PATCH 4.4 65/76] PCI: Update BARs using property bits appropriate for type
  2017-03-28 12:29 [PATCH 4.4 00/76] 4.4.58-stable review Greg Kroah-Hartman
                   ` (61 preceding siblings ...)
  2017-03-28 12:31 ` [PATCH 4.4 64/76] PCI: Dont update VF BARs while VF memory space is enabled Greg Kroah-Hartman
@ 2017-03-28 12:31 ` Greg Kroah-Hartman
  2017-03-28 12:31 ` [PATCH 4.4 66/76] PCI: Ignore BAR updates on virtual functions Greg Kroah-Hartman
                   ` (12 subsequent siblings)
  75 siblings, 0 replies; 106+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-28 12:31 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Greg Kroah-Hartman, Bjorn Helgaas, Sasha Levin, Sumit Semwal

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Sumit Semwal <sumit.semwal@linaro.org>


From: Bjorn Helgaas <bhelgaas@google.com>

[ Upstream commit 45d004f4afefdd8d79916ee6d97a9ecd94bb1ffe ]

The BAR property bits (0-3 for memory BARs, 0-1 for I/O BARs) are supposed
to be read-only, but we do save them in res->flags and include them when
updating the BAR.

Mask the I/O property bits with ~PCI_BASE_ADDRESS_IO_MASK (0x3) instead of
PCI_REGION_FLAG_MASK (0xf) to make it obvious that we can't corrupt bits
2-3 of I/O addresses.

Use PCI_ROM_ADDRESS_MASK for ROM BARs.  This means we'll only check the top
21 bits (instead of the 28 bits we used to check) of a ROM BAR to see if
the update was successful.
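
The bit counts above can be checked with a short sketch. The enum and
helper names are hypothetical; the mask values are the 32-bit forms of
~0x03, ~0x0f and ~0x7ff.

```c
#include <stdint.h>

#define IO_MASK  0xfffffffcu	/* ~0x03  */
#define MEM_MASK 0xfffffff0u	/* ~0x0f  */
#define ROM_MASK 0xfffff800u	/* ~0x7ff */

enum bar_kind { BAR_IO, BAR_MEM, BAR_ROM };

static uint32_t bar_mask(enum bar_kind kind)
{
	switch (kind) {
	case BAR_IO:
		return IO_MASK;
	case BAR_ROM:
		return ROM_MASK;
	default:
		return MEM_MASK;
	}
}

/* Value to write: address plus the property bits valid for this type. */
static uint32_t bar_value(enum bar_kind kind, uint32_t addr, uint32_t flags)
{
	switch (kind) {
	case BAR_IO:
		return addr | (flags & ~IO_MASK);
	case BAR_MEM:
		return addr | (flags & ~MEM_MASK);
	default:
		return addr;	/* ROM: no property bits merged here */
	}
}

/* Number of address bits compared when verifying the update. */
static int address_bits(uint32_t mask)
{
	int n = 0;

	while (mask) {
		n += mask & 1u;
		mask >>= 1;
	}
	return n;
}
```

With flags of 0xd, an I/O BAR at 0x1000 is written as 0x1001 rather
than 0x100d, so bits 2-3 of the address stay untouched.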

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/pci/setup-res.c |   11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

--- a/drivers/pci/setup-res.c
+++ b/drivers/pci/setup-res.c
@@ -58,12 +58,17 @@ static void pci_std_update_resource(stru
 		return;
 
 	pcibios_resource_to_bus(dev->bus, &region, res);
+	new = region.start;
 
-	new = region.start | (res->flags & PCI_REGION_FLAG_MASK);
-	if (res->flags & IORESOURCE_IO)
+	if (res->flags & IORESOURCE_IO) {
 		mask = (u32)PCI_BASE_ADDRESS_IO_MASK;
-	else
+		new |= res->flags & ~PCI_BASE_ADDRESS_IO_MASK;
+	} else if (resno == PCI_ROM_RESOURCE) {
+		mask = (u32)PCI_ROM_ADDRESS_MASK;
+	} else {
 		mask = (u32)PCI_BASE_ADDRESS_MEM_MASK;
+		new |= res->flags & ~PCI_BASE_ADDRESS_MEM_MASK;
+	}
 
 	if (resno < PCI_ROM_RESOURCE) {
 		reg = PCI_BASE_ADDRESS_0 + 4 * resno;

^ permalink raw reply	[flat|nested] 106+ messages in thread

* [PATCH 4.4 66/76] PCI: Ignore BAR updates on virtual functions
  2017-03-28 12:29 [PATCH 4.4 00/76] 4.4.58-stable review Greg Kroah-Hartman
                   ` (62 preceding siblings ...)
  2017-03-28 12:31 ` [PATCH 4.4 65/76] PCI: Update BARs using property bits appropriate for type Greg Kroah-Hartman
@ 2017-03-28 12:31 ` Greg Kroah-Hartman
  2017-03-28 12:31 ` [PATCH 4.4 67/76] PCI: Do any VF BAR updates before enabling the BARs Greg Kroah-Hartman
                   ` (11 subsequent siblings)
  75 siblings, 0 replies; 106+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-28 12:31 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Greg Kroah-Hartman, Bjorn Helgaas, Gavin Shan, Sasha Levin, Sumit Semwal

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Sumit Semwal <sumit.semwal@linaro.org>


From: Bjorn Helgaas <bhelgaas@google.com>

[ Upstream commit 63880b230a4af502c56dde3d4588634c70c66006 ]

VF BARs are read-only zero, so updating VF BARs will not have any effect.
See the SR-IOV spec r1.1, sec 3.4.1.11.

We already ignore these updates because of 70675e0b6a1a ("PCI: Don't try to
restore VF BARs"); this merely restructures it slightly to make it easier
to split updates for standard and SR-IOV BARs.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/pci/pci.c       |    4 ----
 drivers/pci/setup-res.c |    5 ++---
 2 files changed, 2 insertions(+), 7 deletions(-)

--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -519,10 +519,6 @@ static void pci_restore_bars(struct pci_
 {
 	int i;
 
-	/* Per SR-IOV spec 3.4.1.11, VF BARs are RO zero */
-	if (dev->is_virtfn)
-		return;
-
 	for (i = 0; i < PCI_BRIDGE_RESOURCES; i++)
 		pci_update_resource(dev, i);
 }
--- a/drivers/pci/setup-res.c
+++ b/drivers/pci/setup-res.c
@@ -34,10 +34,9 @@ static void pci_std_update_resource(stru
 	int reg;
 	struct resource *res = dev->resource + resno;
 
-	if (dev->is_virtfn) {
-		dev_warn(&dev->dev, "can't update VF BAR%d\n", resno);
+	/* Per SR-IOV spec 3.4.1.11, VF BARs are RO zero */
+	if (dev->is_virtfn)
 		return;
-	}
 
 	/*
 	 * Ignore resources for unimplemented BARs and unused resource slots

^ permalink raw reply	[flat|nested] 106+ messages in thread

* [PATCH 4.4 67/76] PCI: Do any VF BAR updates before enabling the BARs
  2017-03-28 12:29 [PATCH 4.4 00/76] 4.4.58-stable review Greg Kroah-Hartman
                   ` (63 preceding siblings ...)
  2017-03-28 12:31 ` [PATCH 4.4 66/76] PCI: Ignore BAR updates on virtual functions Greg Kroah-Hartman
@ 2017-03-28 12:31 ` Greg Kroah-Hartman
  2017-03-28 12:31 ` [PATCH 4.4 68/76] vfio/spapr: Postpone allocation of userspace version of TCE table Greg Kroah-Hartman
                   ` (10 subsequent siblings)
  75 siblings, 0 replies; 106+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-28 12:31 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Greg Kroah-Hartman, Carol Soto, Gavin Shan, Bjorn Helgaas,
	Sasha Levin, Sumit Semwal

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Sumit Semwal <sumit.semwal@linaro.org>


From: Gavin Shan <gwshan@linux.vnet.ibm.com>

[ Upstream commit f40ec3c748c6912f6266c56a7f7992de61b255ed ]

Previously we enabled VFs and enabled their memory space before calling
pcibios_sriov_enable().  But pcibios_sriov_enable() may update the VF BARs:
for example, on PPC PowerNV we may change them to manage the association of
VFs to PEs.

Because 64-bit BARs cannot be updated atomically, it's unsafe to update
them while they're enabled.  The half-updated state may conflict with other
devices in the system.

Call pcibios_sriov_enable() before enabling the VFs so any BAR updates
happen while the VF BARs are disabled.

[bhelgaas: changelog]
Tested-by: Carol Soto <clsoto@us.ibm.com>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/pci/iov.c |   14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

--- a/drivers/pci/iov.c
+++ b/drivers/pci/iov.c
@@ -303,13 +303,6 @@ static int sriov_enable(struct pci_dev *
 			return rc;
 	}
 
-	pci_iov_set_numvfs(dev, nr_virtfn);
-	iov->ctrl |= PCI_SRIOV_CTRL_VFE | PCI_SRIOV_CTRL_MSE;
-	pci_cfg_access_lock(dev);
-	pci_write_config_word(dev, iov->pos + PCI_SRIOV_CTRL, iov->ctrl);
-	msleep(100);
-	pci_cfg_access_unlock(dev);
-
 	iov->initial_VFs = initial;
 	if (nr_virtfn < initial)
 		initial = nr_virtfn;
@@ -320,6 +313,13 @@ static int sriov_enable(struct pci_dev *
 		goto err_pcibios;
 	}
 
+	pci_iov_set_numvfs(dev, nr_virtfn);
+	iov->ctrl |= PCI_SRIOV_CTRL_VFE | PCI_SRIOV_CTRL_MSE;
+	pci_cfg_access_lock(dev);
+	pci_write_config_word(dev, iov->pos + PCI_SRIOV_CTRL, iov->ctrl);
+	msleep(100);
+	pci_cfg_access_unlock(dev);
+
 	for (i = 0; i < initial; i++) {
 		rc = virtfn_add(dev, i, 0);
 		if (rc)

^ permalink raw reply	[flat|nested] 106+ messages in thread

* [PATCH 4.4 68/76] vfio/spapr: Postpone allocation of userspace version of TCE table
  2017-03-28 12:29 [PATCH 4.4 00/76] 4.4.58-stable review Greg Kroah-Hartman
                   ` (64 preceding siblings ...)
  2017-03-28 12:31 ` [PATCH 4.4 67/76] PCI: Do any VF BAR updates before enabling the BARs Greg Kroah-Hartman
@ 2017-03-28 12:31 ` Greg Kroah-Hartman
  2017-03-28 12:31 ` [PATCH 4.4 69/76] block: allow WRITE_SAME commands with the SG_IO ioctl Greg Kroah-Hartman
                   ` (9 subsequent siblings)
  75 siblings, 0 replies; 106+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-28 12:31 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Greg Kroah-Hartman, Alexey Kardashevskiy, David Gibson,
	Alex Williamson, Michael Ellerman, Sasha Levin, Sumit Semwal

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Sumit Semwal <sumit.semwal@linaro.org>


From: Alexey Kardashevskiy <aik@ozlabs.ru>

[ Upstream commit 39701e56f5f16ea0cf8fc9e8472e645f8de91d23 ]

The iommu_table struct manages a hardware TCE table and a vmalloc'd
table with corresponding userspace addresses. Both are allocated when
the default DMA window is created and this happens when the very first
group is attached to a container.

As we are going to allow userspace to configure a container in one
memory context and pass the container fd to another, we have to postpone
such allocations until the container fd is passed to the destination
user process, so that locked memory is accounted against the limits of
the actual container user.

This postpones the it_userspace array allocation till it is used first
time for mapping. The unmapping path already checks if the array is
allocated.
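
The allocate-on-first-map pattern can be sketched in userspace like
this. The tce_table structure and the function names are hypothetical
and much simpler than the real driver; locked-memory accounting is
left out.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical, simplified table: the userspace view starts out unallocated. */
struct tce_table {
	unsigned long size;
	unsigned long *userspace;
};

static int userspace_view_alloc(struct tce_table *t)
{
	t->userspace = calloc(t->size, sizeof(*t->userspace));
	return t->userspace ? 0 : -ENOMEM;
}

/* Mapping allocates the view lazily, in the context of the caller. */
static int table_map(struct tce_table *t, unsigned long idx, unsigned long ua)
{
	int ret;

	if (idx >= t->size)
		return -EINVAL;

	if (!t->userspace) {
		ret = userspace_view_alloc(t);
		if (ret)
			return ret;
	}

	t->userspace[idx] = ua;
	return 0;
}

/* Unmapping tolerates a view that was never allocated. */
static void table_unmap(struct tce_table *t, unsigned long idx)
{
	if (!t->userspace || idx >= t->size)
		return;
	t->userspace[idx] = 0;
}
```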

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/vfio/vfio_iommu_spapr_tce.c |   20 +++++++-------------
 1 file changed, 7 insertions(+), 13 deletions(-)

--- a/drivers/vfio/vfio_iommu_spapr_tce.c
+++ b/drivers/vfio/vfio_iommu_spapr_tce.c
@@ -511,6 +511,12 @@ static long tce_iommu_build_v2(struct tc
 	unsigned long hpa;
 	enum dma_data_direction dirtmp;
 
+	if (!tbl->it_userspace) {
+		ret = tce_iommu_userspace_view_alloc(tbl);
+		if (ret)
+			return ret;
+	}
+
 	for (i = 0; i < pages; ++i) {
 		struct mm_iommu_table_group_mem_t *mem = NULL;
 		unsigned long *pua = IOMMU_TABLE_USERSPACE_ENTRY(tbl,
@@ -584,15 +590,6 @@ static long tce_iommu_create_table(struc
 	WARN_ON(!ret && !(*ptbl)->it_ops->free);
 	WARN_ON(!ret && ((*ptbl)->it_allocated_size != table_size));
 
-	if (!ret && container->v2) {
-		ret = tce_iommu_userspace_view_alloc(*ptbl);
-		if (ret)
-			(*ptbl)->it_ops->free(*ptbl);
-	}
-
-	if (ret)
-		decrement_locked_vm(table_size >> PAGE_SHIFT);
-
 	return ret;
 }
 
@@ -1064,10 +1061,7 @@ static int tce_iommu_take_ownership(stru
 		if (!tbl || !tbl->it_map)
 			continue;
 
-		rc = tce_iommu_userspace_view_alloc(tbl);
-		if (!rc)
-			rc = iommu_take_ownership(tbl);
-
+		rc = iommu_take_ownership(tbl);
 		if (rc) {
 			for (j = 0; j < i; ++j)
 				iommu_release_ownership(

^ permalink raw reply	[flat|nested] 106+ messages in thread

* [PATCH 4.4 69/76] block: allow WRITE_SAME commands with the SG_IO ioctl
  2017-03-28 12:29 [PATCH 4.4 00/76] 4.4.58-stable review Greg Kroah-Hartman
                   ` (65 preceding siblings ...)
  2017-03-28 12:31 ` [PATCH 4.4 68/76] vfio/spapr: Postpone allocation of userspace version of TCE table Greg Kroah-Hartman
@ 2017-03-28 12:31 ` Greg Kroah-Hartman
  2017-03-28 12:31 ` [PATCH 4.4 70/76] s390/zcrypt: Introduce CEX6 toleration Greg Kroah-Hartman
                   ` (8 subsequent siblings)
  75 siblings, 0 replies; 106+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-28 12:31 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Greg Kroah-Hartman, Mauricio Faria de Oliveira,
	Brahadambal Srinivasan, Manjunatha H R, Christoph Hellwig,
	Jens Axboe, Sasha Levin, Sumit Semwal

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Sumit Semwal <sumit.semwal@linaro.org>


From: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com>

[ Upstream commit 25cdb64510644f3e854d502d69c73f21c6df88a9 ]

The WRITE_SAME commands are not present in the blk_default_cmd_filter
write_ok list, and thus are failed with -EPERM when the SG_IO ioctl()
is executed without CAP_SYS_RAWIO capability (e.g., unprivileged users).
[ sg_io() -> blk_fill_sghdr_rq() -> blk_verify_command() -> -EPERM ]

The problem can be reproduced with the sg_write_same command:

  # sg_write_same --num 1 --xferlen 512 /dev/sda
  #

  # capsh --drop=cap_sys_rawio -- -c \
    'sg_write_same --num 1 --xferlen 512 /dev/sda'
    Write same: pass through os error: Operation not permitted
  #

For comparison, the WRITE_VERIFY command does not observe this problem,
since it is in that list:

  # capsh --drop=cap_sys_rawio -- -c \
    'sg_write_verify --num 1 --ilen 512 --lba 0 /dev/sda'
  #

So, this patch adds the WRITE_SAME commands to the list, in order
for the SG_IO ioctl to finish successfully:

  # capsh --drop=cap_sys_rawio -- -c \
    'sg_write_same --num 1 --xferlen 512 /dev/sda'
  #
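
The filter check described in this changelog can be sketched as
follows. This is a simplified userspace model, not the kernel's
blk_verify_command(): the cmd_filter structure and helper names are
hypothetical, and the read_ok list and write-permission check are
omitted. The opcodes are the SCSI values for WRITE VERIFY(10),
WRITE SAME(10) and WRITE SAME(16).

```c
#include <errno.h>
#include <limits.h>
#include <stdbool.h>

#define OP_WRITE_VERIFY  0x2e
#define OP_WRITE_SAME    0x41
#define OP_WRITE_SAME_16 0x93

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical stand-in for the per-opcode permission bitmap. */
struct cmd_filter {
	unsigned long write_ok[256 / (sizeof(unsigned long) * CHAR_BIT)];
};

static void filter_allow(struct cmd_filter *f, unsigned char op)
{
	f->write_ok[op / BITS_PER_WORD] |= 1UL << (op % BITS_PER_WORD);
}

static bool filter_test(const struct cmd_filter *f, unsigned char op)
{
	return f->write_ok[op / BITS_PER_WORD] & (1UL << (op % BITS_PER_WORD));
}

/* Returns 0 if the opcode may be sent, -EPERM otherwise. */
static int verify_command(const struct cmd_filter *f, unsigned char op,
			  bool has_rawio)
{
	if (has_rawio)
		return 0;	/* privileged callers bypass the filter */
	return filter_test(f, op) ? 0 : -EPERM;
}
```

An unprivileged caller gets -EPERM for OP_WRITE_SAME until the opcode
has been added with filter_allow(), which mirrors the effect of the
added __set_bit() calls.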

That case happens to be exercised by QEMU KVM guests with 'scsi-block' devices
(qemu "-device scsi-block" [1], libvirt "<disk type='block' device='lun'>" [2]),
which employ the SG_IO ioctl() and run as an unprivileged user (libvirt-qemu).

In that scenario, when a filesystem (e.g., ext4) performs its zero-out calls,
which are translated to write-same calls in the guest kernel, and then into
SG_IO ioctls to the host kernel, SCSI I/O errors may be observed in the guest:

  [...] sd 0:0:0:0: [sda] tag#0 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
  [...] sd 0:0:0:0: [sda] tag#0 Sense Key : Aborted Command [current]
  [...] sd 0:0:0:0: [sda] tag#0 Add. Sense: I/O process terminated
  [...] sd 0:0:0:0: [sda] tag#0 CDB: Write Same(10) 41 00 01 04 e0 78 00 00 08 00
  [...] blk_update_request: I/O error, dev sda, sector 17096824

Links:
[1] http://git.qemu.org/?p=qemu.git;a=commit;h=336a6915bc7089fb20fea4ba99972ad9a97c5f52
[2] https://libvirt.org/formatdomain.html#elementsDisks (see 'disk' -> 'device')

Signed-off-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com>
Signed-off-by: Brahadambal Srinivasan <latha@linux.vnet.ibm.com>
Reported-by: Manjunatha H R <manjuhr1@in.ibm.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 block/scsi_ioctl.c |    3 +++
 1 file changed, 3 insertions(+)

--- a/block/scsi_ioctl.c
+++ b/block/scsi_ioctl.c
@@ -182,6 +182,9 @@ static void blk_set_cmd_filter_defaults(
 	__set_bit(WRITE_16, filter->write_ok);
 	__set_bit(WRITE_LONG, filter->write_ok);
 	__set_bit(WRITE_LONG_2, filter->write_ok);
+	__set_bit(WRITE_SAME, filter->write_ok);
+	__set_bit(WRITE_SAME_16, filter->write_ok);
+	__set_bit(WRITE_SAME_32, filter->write_ok);
 	__set_bit(ERASE, filter->write_ok);
 	__set_bit(GPCMD_MODE_SELECT_10, filter->write_ok);
 	__set_bit(MODE_SELECT, filter->write_ok);

^ permalink raw reply	[flat|nested] 106+ messages in thread

* [PATCH 4.4 70/76] s390/zcrypt: Introduce CEX6 toleration
  2017-03-28 12:29 [PATCH 4.4 00/76] 4.4.58-stable review Greg Kroah-Hartman
                   ` (66 preceding siblings ...)
  2017-03-28 12:31 ` [PATCH 4.4 69/76] block: allow WRITE_SAME commands with the SG_IO ioctl Greg Kroah-Hartman
@ 2017-03-28 12:31 ` Greg Kroah-Hartman
  2017-03-28 12:31 ` [PATCH 4.4 71/76] uvcvideo: uvc_scan_fallback() for webcams with broken chain Greg Kroah-Hartman
                   ` (7 subsequent siblings)
  75 siblings, 0 replies; 106+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-28 12:31 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Greg Kroah-Hartman, Harald Freudenberger, Martin Schwidefsky,
	Sasha Levin, Sumit Semwal

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Sumit Semwal <sumit.semwal@linaro.org>


From: Harald Freudenberger <freude@linux.vnet.ibm.com>

[ Upstream commit b3e8652bcbfa04807e44708d4d0c8cdad39c9215 ]

Signed-off-by: Harald Freudenberger <freude@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/s390/crypto/ap_bus.c |    3 +++
 drivers/s390/crypto/ap_bus.h |    1 +
 2 files changed, 4 insertions(+)

--- a/drivers/s390/crypto/ap_bus.c
+++ b/drivers/s390/crypto/ap_bus.c
@@ -1651,6 +1651,9 @@ static void ap_scan_bus(struct work_stru
 		ap_dev->queue_depth = queue_depth;
 		ap_dev->raw_hwtype = device_type;
 		ap_dev->device_type = device_type;
+		/* CEX6 toleration: map to CEX5 */
+		if (device_type == AP_DEVICE_TYPE_CEX6)
+			ap_dev->device_type = AP_DEVICE_TYPE_CEX5;
 		ap_dev->functions = device_functions;
 		spin_lock_init(&ap_dev->lock);
 		INIT_LIST_HEAD(&ap_dev->pendingq);
--- a/drivers/s390/crypto/ap_bus.h
+++ b/drivers/s390/crypto/ap_bus.h
@@ -105,6 +105,7 @@ static inline int ap_test_bit(unsigned i
 #define AP_DEVICE_TYPE_CEX3C	9
 #define AP_DEVICE_TYPE_CEX4	10
 #define AP_DEVICE_TYPE_CEX5	11
+#define AP_DEVICE_TYPE_CEX6	12
 
 /*
  * Known function facilities

^ permalink raw reply	[flat|nested] 106+ messages in thread

* [PATCH 4.4 71/76] uvcvideo: uvc_scan_fallback() for webcams with broken chain
  2017-03-28 12:29 [PATCH 4.4 00/76] 4.4.58-stable review Greg Kroah-Hartman
                   ` (67 preceding siblings ...)
  2017-03-28 12:31 ` [PATCH 4.4 70/76] s390/zcrypt: Introduce CEX6 toleration Greg Kroah-Hartman
@ 2017-03-28 12:31 ` Greg Kroah-Hartman
  2017-03-28 12:31 ` [PATCH 4.4 72/76] ACPI / blacklist: add _REV quirks for Dell Precision 5520 and 3520 Greg Kroah-Hartman
                   ` (6 subsequent siblings)
  75 siblings, 0 replies; 106+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-28 12:31 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Greg Kroah-Hartman, Henrik Ingo, Laurent Pinchart,
	Mauro Carvalho Chehab, Sasha Levin, Sumit Semwal

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Sumit Semwal <sumit.semwal@linaro.org>


From: Henrik Ingo <henrik.ingo@avoinelama.fi>

[ Upstream commit e950267ab802c8558f1100eafd4087fd039ad634 ]

Some devices have invalid baSourceID references, causing uvc_scan_chain()
to fail, but if we just take the entities we can find and put them
together in the most sensible chain we can think of, turns out they do
work anyway. Note: This heuristic assumes there is a single chain.
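
The linking step of the heuristic can be sketched in userspace as
follows. The ent structure and link_fallback() are hypothetical
simplifications; the real code works on struct uvc_entity lists and
also filters units by type and pad count.

```c
#include <stddef.h>

enum ent_type { ENT_ITERM, ENT_UNIT, ENT_OTERM };

/* Hypothetical, simplified entity: one source reference per entity. */
struct ent {
	int id;
	enum ent_type type;
	int source_id;
};

/*
 * Link output terminal -> units -> input terminal, ignoring whatever
 * source IDs the device reported. Returns 0 on success, -1 unless there
 * is exactly one input and one output terminal.
 */
static int link_fallback(struct ent *ents, size_t n)
{
	struct ent *iterm = NULL, *oterm = NULL, *prev;
	size_t i;

	for (i = 0; i < n; i++) {
		if (ents[i].type == ENT_ITERM) {
			if (iterm)
				return -1;
			iterm = &ents[i];
		}
		if (ents[i].type == ENT_OTERM) {
			if (oterm)
				return -1;
			oterm = &ents[i];
		}
	}
	if (!iterm || !oterm)
		return -1;

	prev = oterm;
	/* Walk descriptor order backwards: the chain is built output to input. */
	for (i = n; i-- > 0; ) {
		if (ents[i].type != ENT_UNIT)
			continue;
		prev->source_id = ents[i].id;
		prev = &ents[i];
	}
	prev->source_id = iterm->id;
	return 0;
}
```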

At the time of writing, devices known to have such a broken chain are
  - Acer Integrated Camera (5986:055a)
  - Realtek rtl157a7 (0bda:57a7)

Signed-off-by: Henrik Ingo <henrik.ingo@avoinelama.fi>
Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/media/usb/uvc/uvc_driver.c |  118 +++++++++++++++++++++++++++++++++++--
 1 file changed, 112 insertions(+), 6 deletions(-)

--- a/drivers/media/usb/uvc/uvc_driver.c
+++ b/drivers/media/usb/uvc/uvc_driver.c
@@ -1595,6 +1595,114 @@ static const char *uvc_print_chain(struc
 	return buffer;
 }
 
+static struct uvc_video_chain *uvc_alloc_chain(struct uvc_device *dev)
+{
+	struct uvc_video_chain *chain;
+
+	chain = kzalloc(sizeof(*chain), GFP_KERNEL);
+	if (chain == NULL)
+		return NULL;
+
+	INIT_LIST_HEAD(&chain->entities);
+	mutex_init(&chain->ctrl_mutex);
+	chain->dev = dev;
+	v4l2_prio_init(&chain->prio);
+
+	return chain;
+}
+
+/*
+ * Fallback heuristic for devices that don't connect units and terminals in a
+ * valid chain.
+ *
+ * Some devices have invalid baSourceID references, causing uvc_scan_chain()
+ * to fail, but if we just take the entities we can find and put them together
+ * in the most sensible chain we can think of, turns out they do work anyway.
+ * Note: This heuristic assumes there is a single chain.
+ *
+ * At the time of writing, devices known to have such a broken chain are
+ *  - Acer Integrated Camera (5986:055a)
+ *  - Realtek rtl157a7 (0bda:57a7)
+ */
+static int uvc_scan_fallback(struct uvc_device *dev)
+{
+	struct uvc_video_chain *chain;
+	struct uvc_entity *iterm = NULL;
+	struct uvc_entity *oterm = NULL;
+	struct uvc_entity *entity;
+	struct uvc_entity *prev;
+
+	/*
+	 * Start by locating the input and output terminals. We only support
+	 * devices with exactly one of each for now.
+	 */
+	list_for_each_entry(entity, &dev->entities, list) {
+		if (UVC_ENTITY_IS_ITERM(entity)) {
+			if (iterm)
+				return -EINVAL;
+			iterm = entity;
+		}
+
+		if (UVC_ENTITY_IS_OTERM(entity)) {
+			if (oterm)
+				return -EINVAL;
+			oterm = entity;
+		}
+	}
+
+	if (iterm == NULL || oterm == NULL)
+		return -EINVAL;
+
+	/* Allocate the chain and fill it. */
+	chain = uvc_alloc_chain(dev);
+	if (chain == NULL)
+		return -ENOMEM;
+
+	if (uvc_scan_chain_entity(chain, oterm) < 0)
+		goto error;
+
+	prev = oterm;
+
+	/*
+	 * Add all Processing and Extension Units with two pads. The order
+	 * doesn't matter much, use reverse list traversal to connect units in
+	 * UVC descriptor order as we build the chain from output to input. This
+	 * leads to units appearing in the order meant by the manufacturer for
+	 * the cameras known to require this heuristic.
+	 */
+	list_for_each_entry_reverse(entity, &dev->entities, list) {
+		if (entity->type != UVC_VC_PROCESSING_UNIT &&
+		    entity->type != UVC_VC_EXTENSION_UNIT)
+			continue;
+
+		if (entity->num_pads != 2)
+			continue;
+
+		if (uvc_scan_chain_entity(chain, entity) < 0)
+			goto error;
+
+		prev->baSourceID[0] = entity->id;
+		prev = entity;
+	}
+
+	if (uvc_scan_chain_entity(chain, iterm) < 0)
+		goto error;
+
+	prev->baSourceID[0] = iterm->id;
+
+	list_add_tail(&chain->list, &dev->chains);
+
+	uvc_trace(UVC_TRACE_PROBE,
+		  "Found a video chain by fallback heuristic (%s).\n",
+		  uvc_print_chain(chain));
+
+	return 0;
+
+error:
+	kfree(chain);
+	return -EINVAL;
+}
+
 /*
  * Scan the device for video chains and register video devices.
  *
@@ -1617,15 +1725,10 @@ static int uvc_scan_device(struct uvc_de
 		if (term->chain.next || term->chain.prev)
 			continue;
 
-		chain = kzalloc(sizeof(*chain), GFP_KERNEL);
+		chain = uvc_alloc_chain(dev);
 		if (chain == NULL)
 			return -ENOMEM;
 
-		INIT_LIST_HEAD(&chain->entities);
-		mutex_init(&chain->ctrl_mutex);
-		chain->dev = dev;
-		v4l2_prio_init(&chain->prio);
-
 		term->flags |= UVC_ENTITY_FLAG_DEFAULT;
 
 		if (uvc_scan_chain(chain, term) < 0) {
@@ -1639,6 +1742,9 @@ static int uvc_scan_device(struct uvc_de
 		list_add_tail(&chain->list, &dev->chains);
 	}
 
+	if (list_empty(&dev->chains))
+		uvc_scan_fallback(dev);
+
 	if (list_empty(&dev->chains)) {
 		uvc_printk(KERN_INFO, "No valid video chain found.\n");
 		return -1;

^ permalink raw reply	[flat|nested] 106+ messages in thread

* [PATCH 4.4 72/76] ACPI / blacklist: add _REV quirks for Dell Precision 5520 and 3520
  2017-03-28 12:29 [PATCH 4.4 00/76] 4.4.58-stable review Greg Kroah-Hartman
                   ` (68 preceding siblings ...)
  2017-03-28 12:31 ` [PATCH 4.4 71/76] uvcvideo: uvc_scan_fallback() for webcams with broken chain Greg Kroah-Hartman
@ 2017-03-28 12:31 ` Greg Kroah-Hartman
  2017-03-28 12:31 ` [PATCH 4.4 73/76] ACPI / blacklist: Make Dell Latitude 3350 ethernet work Greg Kroah-Hartman
                   ` (5 subsequent siblings)
  75 siblings, 0 replies; 106+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-28 12:31 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Greg Kroah-Hartman, Alex Hung, Rafael J. Wysocki, Sasha Levin,
	Sumit Semwal

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Sumit Semwal <sumit.semwal@linaro.org>


From: Alex Hung <alex.hung@canonical.com>

[ Upstream commit 9523b9bf6dceef6b0215e90b2348cd646597f796 ]

The Dell Precision 5520 and 3520 hang at login and during suspend or reboot.

It turns out that adding them to acpi_rev_dmi_table[] helps to work
around those issues.

Signed-off-by: Alex Hung <alex.hung@canonical.com>
[ rjw: Changelog ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/acpi/blacklist.c |   16 ++++++++++++++++
 1 file changed, 16 insertions(+)

--- a/drivers/acpi/blacklist.c
+++ b/drivers/acpi/blacklist.c
@@ -346,6 +346,22 @@ static struct dmi_system_id acpi_osi_dmi
 		      DMI_MATCH(DMI_PRODUCT_NAME, "XPS 13 9343"),
 		},
 	},
+	{
+	 .callback = dmi_enable_rev_override,
+	 .ident = "DELL Precision 5520",
+	 .matches = {
+		      DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc."),
+		      DMI_MATCH(DMI_PRODUCT_NAME, "Precision 5520"),
+		},
+	},
+	{
+	 .callback = dmi_enable_rev_override,
+	 .ident = "DELL Precision 3520",
+	 .matches = {
+		      DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc."),
+		      DMI_MATCH(DMI_PRODUCT_NAME, "Precision 3520"),
+		},
+	},
 #endif
 	{}
 };

^ permalink raw reply	[flat|nested] 106+ messages in thread

* [PATCH 4.4 73/76] ACPI / blacklist: Make Dell Latitude 3350 ethernet work
  2017-03-28 12:29 [PATCH 4.4 00/76] 4.4.58-stable review Greg Kroah-Hartman
                   ` (69 preceding siblings ...)
  2017-03-28 12:31 ` [PATCH 4.4 72/76] ACPI / blacklist: add _REV quirks for Dell Precision 5520 and 3520 Greg Kroah-Hartman
@ 2017-03-28 12:31 ` Greg Kroah-Hartman
  2017-03-28 12:31 ` [PATCH 4.4 74/76] serial: 8250_pci: Detach low-level driver during PCI error recovery Greg Kroah-Hartman
                   ` (4 subsequent siblings)
  75 siblings, 0 replies; 106+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-28 12:31 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Greg Kroah-Hartman, Michael Pobega, Rafael J. Wysocki,
	Sasha Levin, Sumit Semwal

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Sumit Semwal <sumit.semwal@linaro.org>


From: Michael Pobega <mpobega@neverware.com>

[ Upstream commit 708f5dcc21ae9b35f395865fc154b0105baf4de4 ]

The Dell Latitude 3350's ethernet card attempts to use a reserved
IRQ (18), resulting in ACPI being unable to enable the ethernet.

Adding it to acpi_rev_dmi_table[] helps to work around this problem.
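
As a rough illustration of how such a quirk table is consumed, here is a
minimal user-space sketch.  Everything in it (struct quirk_entry,
check_system(), enable_rev_override(), the NULL-ident terminator) is a
hypothetical stand-in and not the kernel's struct dmi_system_id or
dmi_check_system(); it only assumes that matching is a substring
comparison against the firmware-provided vendor and product strings and
that the callback of each matching entry is run.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct dmi_system_id; not the kernel type. */
struct quirk_entry {
	int (*callback)(const struct quirk_entry *id);
	const char *ident;	/* human-readable name, NULL ends the table */
	const char *vendor;	/* substring expected in the system vendor */
	const char *product;	/* substring expected in the product name */
};

int rev_override_enabled;	/* models the effect of the quirk */

int enable_rev_override(const struct quirk_entry *id)
{
	(void)id;
	rev_override_enabled = 1;
	return 0;
}

const struct quirk_entry quirk_table[] = {
	{ enable_rev_override, "DELL Precision 5520", "Dell Inc.", "Precision 5520" },
	{ enable_rev_override, "DELL Precision 3520", "Dell Inc.", "Precision 3520" },
	{ enable_rev_override, "DELL Latitude 3350", "Dell Inc.", "Latitude 3350" },
	{ NULL, NULL, NULL, NULL }	/* terminator */
};

/* Run the callback of every matching entry; return the number of matches. */
int check_system(const struct quirk_entry *table,
		 const char *vendor, const char *product)
{
	int count = 0;

	for (; table->ident; table++) {
		if (!strstr(vendor, table->vendor) ||
		    !strstr(product, table->product))
			continue;
		count++;
		if (table->callback && table->callback(table))
			break;	/* a non-zero return stops the scan */
	}
	return count;
}

/* Small self-check: a listed model triggers the quirk. */
void self_check(void)
{
	rev_override_enabled = 0;
	assert(check_system(quirk_table, "Dell Inc.", "Latitude 3350") == 1);
	assert(rev_override_enabled == 1);
}
```

With this shape, supporting another affected machine only means adding
one more table entry, which is what the hunk below does.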

Signed-off-by: Michael Pobega <mpobega@neverware.com>
[ rjw: Changelog ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/acpi/blacklist.c |   12 ++++++++++++
 1 file changed, 12 insertions(+)

--- a/drivers/acpi/blacklist.c
+++ b/drivers/acpi/blacklist.c
@@ -362,6 +362,18 @@ static struct dmi_system_id acpi_osi_dmi
 		      DMI_MATCH(DMI_PRODUCT_NAME, "Precision 3520"),
 		},
 	},
+	/*
+	 * Resolves a quirk with the Dell Latitude 3350 that
+	 * causes the ethernet adapter to not function.
+	 */
+	{
+	 .callback = dmi_enable_rev_override,
+	 .ident = "DELL Latitude 3350",
+	 .matches = {
+		      DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc."),
+		      DMI_MATCH(DMI_PRODUCT_NAME, "Latitude 3350"),
+		},
+	},
 #endif
 	{}
 };

* [PATCH 4.4 74/76] serial: 8250_pci: Detach low-level driver during PCI error recovery
  2017-03-28 12:29 [PATCH 4.4 00/76] 4.4.58-stable review Greg Kroah-Hartman
                   ` (70 preceding siblings ...)
  2017-03-28 12:31 ` [PATCH 4.4 73/76] ACPI / blacklist: Make Dell Latitude 3350 ethernet work Greg Kroah-Hartman
@ 2017-03-28 12:31 ` Greg Kroah-Hartman
  2017-04-04 20:26   ` Ben Hutchings
  2017-03-28 12:31 ` [PATCH 4.4 75/76] fbcon: Fix vc attr at deinit Greg Kroah-Hartman
                   ` (3 subsequent siblings)
  75 siblings, 1 reply; 106+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-28 12:31 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Greg Kroah-Hartman, Gabriel Krisman Bertazi, Sasha Levin, Sumit Semwal

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Sumit Semwal <sumit.semwal@linaro.org>


From: Gabriel Krisman Bertazi <krisman@linux.vnet.ibm.com>

[ Upstream commit f209fa03fc9d131b3108c2e4936181eabab87416 ]

During a PCI error recovery, like the ones provoked by EEH on the ppc64
platform, all IO to the device must be blocked while the recovery is
completed.  Current 8250_pci implementation only suspends the port
instead of detaching it, which doesn't prevent incoming accesses like
TIOCMGET and TIOCMSET calls from reaching the device.  Those end up
racing with the EEH recovery, crashing it.  Similar races were also
observed when opening the device and when shutting it down during
recovery.

This patch implements a more robust IO blockage for the 8250_pci
recovery by unregistering the port at the beginning of the procedure and
re-adding it afterwards.  Since the port is detached from the uart
layer, we can be sure that no request will make it through to the device
during recovery.  This is similar to the solution used by the JSM serial
driver.
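
The detach-then-rebuild idea can be modeled in a few lines of user-space
C.  This is only a sketch under stated assumptions: struct board, struct
port_set and every function below are hypothetical simplifications, not
the 8250_pci code.  The point it tries to show is that once the ports
are unregistered, a request has nothing to reach, and that remembering
the board description is what makes re-creating the ports possible.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model types; the real ones live in 8250_pci.c. */
struct board {
	int num_ports;
};

struct port_set {
	const struct board *board;	/* remembered so ports can be rebuilt */
	int nr;				/* number of ports currently registered */
};

/* Register all ports of the board; requests can reach them afterwards. */
void init_ports(struct port_set *ps, const struct board *board)
{
	ps->board = board;
	ps->nr = board->num_ports;
}

/* Unregister every port but keep the board pointer for later. */
void detach_ports(struct port_set *ps)
{
	ps->nr = 0;
}

/* A request only gets through if its line is still registered. */
bool handle_request(const struct port_set *ps, int line)
{
	return line >= 0 && line < ps->nr;
}

/* Error detected: block all I/O by detaching instead of suspending. */
void io_error_detected(struct port_set *ps)
{
	detach_ports(ps);
}

/* Recovery finished: rebuild the ports from the remembered board. */
void io_resume(struct port_set *ps)
{
	init_ports(ps, ps->board);
}

/* Small self-check of the whole cycle. */
void self_check(void)
{
	const struct board b = { 2 };
	struct port_set ps;

	init_ports(&ps, &b);
	assert(handle_request(&ps, 1));
	io_error_detected(&ps);
	assert(!handle_request(&ps, 1));
	io_resume(&ps);
	assert(handle_request(&ps, 1));
}
```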

I thank Peter Hurley <peter@hurleysoftware.com> for valuable input on
this one over one year ago.

Signed-off-by: Gabriel Krisman Bertazi <krisman@linux.vnet.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/tty/serial/8250/8250_pci.c |   23 +++++++++++++++++++----
 1 file changed, 19 insertions(+), 4 deletions(-)

--- a/drivers/tty/serial/8250/8250_pci.c
+++ b/drivers/tty/serial/8250/8250_pci.c
@@ -57,6 +57,7 @@ struct serial_private {
 	unsigned int		nr;
 	void __iomem		*remapped_bar[PCI_NUM_BAR_RESOURCES];
 	struct pci_serial_quirk	*quirk;
+	const struct pciserial_board *board;
 	int			line[0];
 };
 
@@ -4058,6 +4059,7 @@ pciserial_init_ports(struct pci_dev *dev
 		}
 	}
 	priv->nr = i;
+	priv->board = board;
 	return priv;
 
 err_deinit:
@@ -4068,7 +4070,7 @@ err_out:
 }
 EXPORT_SYMBOL_GPL(pciserial_init_ports);
 
-void pciserial_remove_ports(struct serial_private *priv)
+void pciserial_detach_ports(struct serial_private *priv)
 {
 	struct pci_serial_quirk *quirk;
 	int i;
@@ -4088,7 +4090,11 @@ void pciserial_remove_ports(struct seria
 	quirk = find_quirk(priv->dev);
 	if (quirk->exit)
 		quirk->exit(priv->dev);
+}
 
+void pciserial_remove_ports(struct serial_private *priv)
+{
+	pciserial_detach_ports(priv);
 	kfree(priv);
 }
 EXPORT_SYMBOL_GPL(pciserial_remove_ports);
@@ -5819,7 +5825,7 @@ static pci_ers_result_t serial8250_io_er
 		return PCI_ERS_RESULT_DISCONNECT;
 
 	if (priv)
-		pciserial_suspend_ports(priv);
+		pciserial_detach_ports(priv);
 
 	pci_disable_device(dev);
 
@@ -5844,9 +5850,18 @@ static pci_ers_result_t serial8250_io_sl
 static void serial8250_io_resume(struct pci_dev *dev)
 {
 	struct serial_private *priv = pci_get_drvdata(dev);
+	const struct pciserial_board *board;
 
-	if (priv)
-		pciserial_resume_ports(priv);
+	if (!priv)
+		return;
+
+	board = priv->board;
+	kfree(priv);
+	priv = pciserial_init_ports(dev, board);
+
+	if (!IS_ERR(priv)) {
+		pci_set_drvdata(dev, priv);
+	}
 }
 
 static const struct pci_error_handlers serial8250_err_handler = {

* [PATCH 4.4 75/76] fbcon: Fix vc attr at deinit
  2017-03-28 12:29 [PATCH 4.4 00/76] 4.4.58-stable review Greg Kroah-Hartman
                   ` (71 preceding siblings ...)
  2017-03-28 12:31 ` [PATCH 4.4 74/76] serial: 8250_pci: Detach low-level driver during PCI error recovery Greg Kroah-Hartman
@ 2017-03-28 12:31 ` Greg Kroah-Hartman
  2017-03-28 12:31 ` [PATCH 4.4 76/76] crypto: algif_hash - avoid zero-sized array Greg Kroah-Hartman
                   ` (2 subsequent siblings)
  75 siblings, 0 replies; 106+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-28 12:31 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Takashi Iwai,
	Bartlomiej Zolnierkiewicz, Arnd Bergmann

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Takashi Iwai <tiwai@suse.de>

commit 8aac7f34369726d1a158788ae8aff3002d5eb528 upstream.

fbcon can deal with vc_hi_font_mask (the upper 256 chars) and adjust
the vc attrs dynamically when vc_hi_font_mask is changed at
fbcon_init().  When the vc_hi_font_mask is set, it remaps the attrs in
the existing console buffer with one bit shift up (for 9 bits), while
it remaps with one bit shift down (for 8 bits) when the value is
cleared.  It works fine as long as the font gets updated after fbcon
was initialized.

However, we hit a bizarre problem when the console is switched to
another fb driver (typically from vesafb or efifb to drmfb).  When
switching to the new fb driver, we temporarily rebind the console to
the dummy console, then rebind to the new driver.  During the
switch, we leave the modified attrs as is.  Thus, the new fbcon
takes over the old buffer as if it contained 8-bit chars
(although the attrs are still shifted for 9 bits), and effectively
this results in yellow text instead of the original white
text, as found in the bugzilla entry below.

An easy fix for this is to re-adjust the attrs before leaving the
fbcon at con_deinit callback.  Since the code to adjust the attrs is
already present in the current fbcon code, in this patch, we simply
factor out the relevant code, and call it from fbcon_deinit().
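
To make the color change concrete, here is a small user-space sketch of
the per-cell remapping.  The down-shift expression is the one visible in
the diff context below; the up-shift expression is assumed to be its
counterpart for color consoles, and the function names are hypothetical.
A screen cell is treated as a 16-bit value with the character in the low
bits and the attribute above it.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Remap one cell from the 8-bit layout to the 9-bit layout: the
 * attribute byte moves up by one bit, freeing bit 8 for the character.
 * (Assumed counterpart of the down-shift shown in the patch.)
 */
uint16_t cell_to_hi_font(uint16_t c)
{
	return (uint16_t)(((c & 0xff00) << 1) | (c & 0xff));
}

/* Inverse remap, used when vc_hi_font_mask is cleared. */
uint16_t cell_from_hi_font(uint16_t c)
{
	return (uint16_t)(((c & 0xfe00) >> 1) | (c & 0xff));
}

/* Attribute byte as a console expecting 8-bit chars would read it. */
uint8_t attr_8bit(uint16_t c)
{
	return (uint8_t)(c >> 8);
}

/* Small self-check: light grey on black misread after the up-shift. */
void self_check(void)
{
	uint16_t cell = 0x0741;		/* 'A' with attribute 0x07 */
	uint16_t shifted = cell_to_hi_font(cell);

	assert(shifted == 0x0e41);
	assert(attr_8bit(shifted) == 0x0e);	/* foreground 14 */
	assert(cell_from_hi_font(shifted) == cell);
}
```

Attribute 0x07 is the usual light grey on black, and foreground value 14
is yellow in the standard 16-color palette, which matches the symptom
described above when the shifted buffer is read with the 8-bit layout.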

Bugzilla: https://bugzilla.suse.com/show_bug.cgi?id=1000619
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/video/console/fbcon.c |   67 +++++++++++++++++++++++++-----------------
 1 file changed, 40 insertions(+), 27 deletions(-)

--- a/drivers/video/console/fbcon.c
+++ b/drivers/video/console/fbcon.c
@@ -1168,6 +1168,8 @@ static void fbcon_free_font(struct displ
 	p->userfont = 0;
 }
 
+static void set_vc_hi_font(struct vc_data *vc, bool set);
+
 static void fbcon_deinit(struct vc_data *vc)
 {
 	struct display *p = &fb_display[vc->vc_num];
@@ -1203,6 +1205,9 @@ finished:
 	if (free_font)
 		vc->vc_font.data = NULL;
 
+	if (vc->vc_hi_font_mask)
+		set_vc_hi_font(vc, false);
+
 	if (!con_is_bound(&fb_con))
 		fbcon_exit();
 
@@ -2439,32 +2444,10 @@ static int fbcon_get_font(struct vc_data
 	return 0;
 }
 
-static int fbcon_do_set_font(struct vc_data *vc, int w, int h,
-			     const u8 * data, int userfont)
+/* set/clear vc_hi_font_mask and update vc attrs accordingly */
+static void set_vc_hi_font(struct vc_data *vc, bool set)
 {
-	struct fb_info *info = registered_fb[con2fb_map[vc->vc_num]];
-	struct fbcon_ops *ops = info->fbcon_par;
-	struct display *p = &fb_display[vc->vc_num];
-	int resize;
-	int cnt;
-	char *old_data = NULL;
-
-	if (CON_IS_VISIBLE(vc) && softback_lines)
-		fbcon_set_origin(vc);
-
-	resize = (w != vc->vc_font.width) || (h != vc->vc_font.height);
-	if (p->userfont)
-		old_data = vc->vc_font.data;
-	if (userfont)
-		cnt = FNTCHARCNT(data);
-	else
-		cnt = 256;
-	vc->vc_font.data = (void *)(p->fontdata = data);
-	if ((p->userfont = userfont))
-		REFCOUNT(data)++;
-	vc->vc_font.width = w;
-	vc->vc_font.height = h;
-	if (vc->vc_hi_font_mask && cnt == 256) {
+	if (!set) {
 		vc->vc_hi_font_mask = 0;
 		if (vc->vc_can_do_color) {
 			vc->vc_complement_mask >>= 1;
@@ -2487,7 +2470,7 @@ static int fbcon_do_set_font(struct vc_d
 			    ((c & 0xfe00) >> 1) | (c & 0xff);
 			vc->vc_attr >>= 1;
 		}
-	} else if (!vc->vc_hi_font_mask && cnt == 512) {
+	} else {
 		vc->vc_hi_font_mask = 0x100;
 		if (vc->vc_can_do_color) {
 			vc->vc_complement_mask <<= 1;
@@ -2519,8 +2502,38 @@ static int fbcon_do_set_font(struct vc_d
 			} else
 				vc->vc_video_erase_char = c & ~0x100;
 		}
-
 	}
+}
+
+static int fbcon_do_set_font(struct vc_data *vc, int w, int h,
+			     const u8 * data, int userfont)
+{
+	struct fb_info *info = registered_fb[con2fb_map[vc->vc_num]];
+	struct fbcon_ops *ops = info->fbcon_par;
+	struct display *p = &fb_display[vc->vc_num];
+	int resize;
+	int cnt;
+	char *old_data = NULL;
+
+	if (CON_IS_VISIBLE(vc) && softback_lines)
+		fbcon_set_origin(vc);
+
+	resize = (w != vc->vc_font.width) || (h != vc->vc_font.height);
+	if (p->userfont)
+		old_data = vc->vc_font.data;
+	if (userfont)
+		cnt = FNTCHARCNT(data);
+	else
+		cnt = 256;
+	vc->vc_font.data = (void *)(p->fontdata = data);
+	if ((p->userfont = userfont))
+		REFCOUNT(data)++;
+	vc->vc_font.width = w;
+	vc->vc_font.height = h;
+	if (vc->vc_hi_font_mask && cnt == 256)
+		set_vc_hi_font(vc, false);
+	else if (!vc->vc_hi_font_mask && cnt == 512)
+		set_vc_hi_font(vc, true);
 
 	if (resize) {
 		int cols, rows;

* [PATCH 4.4 76/76] crypto: algif_hash - avoid zero-sized array
  2017-03-28 12:29 [PATCH 4.4 00/76] 4.4.58-stable review Greg Kroah-Hartman
                   ` (72 preceding siblings ...)
  2017-03-28 12:31 ` [PATCH 4.4 75/76] fbcon: Fix vc attr at deinit Greg Kroah-Hartman
@ 2017-03-28 12:31 ` Greg Kroah-Hartman
  2017-03-28 19:38 ` [PATCH 4.4 00/76] 4.4.58-stable review Shuah Khan
  2017-03-29  2:58 ` Guenter Roeck
  75 siblings, 0 replies; 106+ messages in thread
From: Greg Kroah-Hartman @ 2017-03-28 12:31 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Jiri Slaby, Herbert Xu, Sasha Levin,
	Arnd Bergmann, David S. Miller

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Jiri Slaby <jslaby@suse.cz>

commit 6207119444595d287b1e9e83a2066c17209698f3 upstream.

With this reproducer:
  struct sockaddr_alg alg = {
          .salg_family = 0x26,
          .salg_type = "hash",
          .salg_feat = 0xf,
          .salg_mask = 0x5,
          .salg_name = "digest_null",
  };
  int sock, sock2;

  sock = socket(AF_ALG, SOCK_SEQPACKET, 0);
  bind(sock, (struct sockaddr *)&alg, sizeof(alg));
  sock2 = accept(sock, NULL, NULL);
  setsockopt(sock, SOL_ALG, ALG_SET_KEY, "\x9b\xca", 2);
  accept(sock2, NULL, NULL);

==== 8< ======== 8< ======== 8< ======== 8< ====

one can immediately see an UBSAN warning:
UBSAN: Undefined behaviour in crypto/algif_hash.c:187:7
variable length array bound value 0 <= 0
CPU: 0 PID: 15949 Comm: syz-executor Tainted: G            E      4.4.30-0-default #1
...
Call Trace:
...
 [<ffffffff81d598fd>] ? __ubsan_handle_vla_bound_not_positive+0x13d/0x188
 [<ffffffff81d597c0>] ? __ubsan_handle_out_of_bounds+0x1bc/0x1bc
 [<ffffffffa0e2204d>] ? hash_accept+0x5bd/0x7d0 [algif_hash]
 [<ffffffffa0e2293f>] ? hash_accept_nokey+0x3f/0x51 [algif_hash]
 [<ffffffffa0e206b0>] ? hash_accept_parent_nokey+0x4a0/0x4a0 [algif_hash]
 [<ffffffff8235c42b>] ? SyS_accept+0x2b/0x40

It is a correct warning, as hash state is propagated to accept as zero,
but creating a zero-length variable array is not allowed in C.

Fix this as proposed by Herbert -- do "?: 1" on that site. No sizeof or
similar happens in the code there, so we just allocate one byte even
though we do not use the array.
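
For readers who have not met the "?: 1" form: it is the GNU C
conditional with an omitted middle operand, where "a ? : b" yields a if
a is non-zero and b otherwise, evaluating a only once.  The sketch below
is a user-space illustration only; export_state() and clamp_statesize()
are hypothetical names, and the statesize parameter stands in for the
value returned by crypto_ahash_statesize().  It uses sizeof purely to
make the resulting array size observable.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* GNU extension: statesize if non-zero, otherwise 1. */
size_t clamp_statesize(size_t statesize)
{
	return statesize ? : 1;
}

/*
 * Declare a variable-length array that is never zero-sized and
 * report how many bytes it occupies.
 */
size_t export_state(size_t statesize)
{
	char state[statesize ? : 1];

	memset(state, 0, sizeof(state));
	return sizeof(state);
}

/* Small self-check: zero is bumped to one, other sizes are kept. */
void self_check(void)
{
	assert(export_state(0) == 1);
	assert(export_state(64) == 64);
}
```

Since this relies on a GNU extension, it builds with gcc and clang but
may draw a warning in pedantic mode.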

Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: "David S. Miller" <davem@davemloft.net> (maintainer:CRYPTO API)
Reported-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 crypto/algif_hash.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/crypto/algif_hash.c
+++ b/crypto/algif_hash.c
@@ -184,7 +184,7 @@ static int hash_accept(struct socket *so
 	struct alg_sock *ask = alg_sk(sk);
 	struct hash_ctx *ctx = ask->private;
 	struct ahash_request *req = &ctx->req;
-	char state[crypto_ahash_statesize(crypto_ahash_reqtfm(req))];
+	char state[crypto_ahash_statesize(crypto_ahash_reqtfm(req)) ? : 1];
 	struct sock *sk2;
 	struct alg_sock *ask2;
 	struct hash_ctx *ctx2;

* Re: [PATCH 4.4 48/76] libceph: force GFP_NOIO for socket allocations
  2017-03-28 12:30 ` [PATCH 4.4 48/76] libceph: force GFP_NOIO for socket allocations Greg Kroah-Hartman
@ 2017-03-28 12:43   ` Michal Hocko
  2017-03-28 13:23     ` Ilya Dryomov
  0 siblings, 1 reply; 106+ messages in thread
From: Michal Hocko @ 2017-03-28 12:43 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: linux-kernel, stable, Sergey Jerusalimov, Ilya Dryomov, Jeff Layton

On Tue 28-03-17 14:30:45, Greg KH wrote:
> 4.4-stable review patch.  If anyone has any objections, please let me know.

I haven't seen the original patch but the changelog makes me worried.
How exactly is this a problem? Where do we lock up? Does rbd/libceph take
any xfs locks?

> ------------------
> 
> From: Ilya Dryomov <idryomov@gmail.com>
> 
> commit 633ee407b9d15a75ac9740ba9d3338815e1fcb95 upstream.
> 
> sock_alloc_inode() allocates socket+inode and socket_wq with
> GFP_KERNEL, which is not allowed on the writeback path:
> 
>     Workqueue: ceph-msgr con_work [libceph]
>     ffff8810871cb018 0000000000000046 0000000000000000 ffff881085d40000
>     0000000000012b00 ffff881025cad428 ffff8810871cbfd8 0000000000012b00
>     ffff880102fc1000 ffff881085d40000 ffff8810871cb038 ffff8810871cb148
>     Call Trace:
>     [<ffffffff816dd629>] schedule+0x29/0x70
>     [<ffffffff816e066d>] schedule_timeout+0x1bd/0x200
>     [<ffffffff81093ffc>] ? ttwu_do_wakeup+0x2c/0x120
>     [<ffffffff81094266>] ? ttwu_do_activate.constprop.135+0x66/0x70
>     [<ffffffff816deb5f>] wait_for_completion+0xbf/0x180
>     [<ffffffff81097cd0>] ? try_to_wake_up+0x390/0x390
>     [<ffffffff81086335>] flush_work+0x165/0x250
>     [<ffffffff81082940>] ? worker_detach_from_pool+0xd0/0xd0
>     [<ffffffffa03b65b1>] xlog_cil_force_lsn+0x81/0x200 [xfs]
>     [<ffffffff816d6b42>] ? __slab_free+0xee/0x234
>     [<ffffffffa03b4b1d>] _xfs_log_force_lsn+0x4d/0x2c0 [xfs]
>     [<ffffffff811adc1e>] ? lookup_page_cgroup_used+0xe/0x30
>     [<ffffffffa039a723>] ? xfs_reclaim_inode+0xa3/0x330 [xfs]
>     [<ffffffffa03b4dcf>] xfs_log_force_lsn+0x3f/0xf0 [xfs]
>     [<ffffffffa039a723>] ? xfs_reclaim_inode+0xa3/0x330 [xfs]
>     [<ffffffffa03a62c6>] xfs_iunpin_wait+0xc6/0x1a0 [xfs]
>     [<ffffffff810aa250>] ? wake_atomic_t_function+0x40/0x40
>     [<ffffffffa039a723>] xfs_reclaim_inode+0xa3/0x330 [xfs]
>     [<ffffffffa039ac07>] xfs_reclaim_inodes_ag+0x257/0x3d0 [xfs]
>     [<ffffffffa039bb13>] xfs_reclaim_inodes_nr+0x33/0x40 [xfs]
>     [<ffffffffa03ab745>] xfs_fs_free_cached_objects+0x15/0x20 [xfs]
>     [<ffffffff811c0c18>] super_cache_scan+0x178/0x180
>     [<ffffffff8115912e>] shrink_slab_node+0x14e/0x340
>     [<ffffffff811afc3b>] ? mem_cgroup_iter+0x16b/0x450
>     [<ffffffff8115af70>] shrink_slab+0x100/0x140
>     [<ffffffff8115e425>] do_try_to_free_pages+0x335/0x490
>     [<ffffffff8115e7f9>] try_to_free_pages+0xb9/0x1f0
>     [<ffffffff816d56e4>] ? __alloc_pages_direct_compact+0x69/0x1be
>     [<ffffffff81150cba>] __alloc_pages_nodemask+0x69a/0xb40
>     [<ffffffff8119743e>] alloc_pages_current+0x9e/0x110
>     [<ffffffff811a0ac5>] new_slab+0x2c5/0x390
>     [<ffffffff816d71c4>] __slab_alloc+0x33b/0x459
>     [<ffffffff815b906d>] ? sock_alloc_inode+0x2d/0xd0
>     [<ffffffff8164bda1>] ? inet_sendmsg+0x71/0xc0
>     [<ffffffff815b906d>] ? sock_alloc_inode+0x2d/0xd0
>     [<ffffffff811a21f2>] kmem_cache_alloc+0x1a2/0x1b0
>     [<ffffffff815b906d>] sock_alloc_inode+0x2d/0xd0
>     [<ffffffff811d8566>] alloc_inode+0x26/0xa0
>     [<ffffffff811da04a>] new_inode_pseudo+0x1a/0x70
>     [<ffffffff815b933e>] sock_alloc+0x1e/0x80
>     [<ffffffff815ba855>] __sock_create+0x95/0x220
>     [<ffffffff815baa04>] sock_create_kern+0x24/0x30
>     [<ffffffffa04794d9>] con_work+0xef9/0x2050 [libceph]
>     [<ffffffffa04aa9ec>] ? rbd_img_request_submit+0x4c/0x60 [rbd]
>     [<ffffffff81084c19>] process_one_work+0x159/0x4f0
>     [<ffffffff8108561b>] worker_thread+0x11b/0x530
>     [<ffffffff81085500>] ? create_worker+0x1d0/0x1d0
>     [<ffffffff8108b6f9>] kthread+0xc9/0xe0
>     [<ffffffff8108b630>] ? flush_kthread_worker+0x90/0x90
>     [<ffffffff816e1b98>] ret_from_fork+0x58/0x90
>     [<ffffffff8108b630>] ? flush_kthread_worker+0x90/0x90
> 
> Use memalloc_noio_{save,restore}() to temporarily force GFP_NOIO here.
> 
> Link: http://tracker.ceph.com/issues/19309
> Reported-by: Sergey Jerusalimov <wintchester@gmail.com>
> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
> Reviewed-by: Jeff Layton <jlayton@redhat.com>
> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> 
> ---
>  net/ceph/messenger.c |    6 ++++++
>  1 file changed, 6 insertions(+)
> 
> --- a/net/ceph/messenger.c
> +++ b/net/ceph/messenger.c
> @@ -7,6 +7,7 @@
>  #include <linux/kthread.h>
>  #include <linux/net.h>
>  #include <linux/nsproxy.h>
> +#include <linux/sched.h>
>  #include <linux/slab.h>
>  #include <linux/socket.h>
>  #include <linux/string.h>
> @@ -478,11 +479,16 @@ static int ceph_tcp_connect(struct ceph_
>  {
>  	struct sockaddr_storage *paddr = &con->peer_addr.in_addr;
>  	struct socket *sock;
> +	unsigned int noio_flag;
>  	int ret;
>  
>  	BUG_ON(con->sock);
> +
> +	/* sock_create_kern() allocates with GFP_KERNEL */
> +	noio_flag = memalloc_noio_save();
>  	ret = sock_create_kern(read_pnet(&con->msgr->net), paddr->ss_family,
>  			       SOCK_STREAM, IPPROTO_TCP, &sock);
> +	memalloc_noio_restore(noio_flag);
>  	if (ret)
>  		return ret;
>  	sock->sk->sk_allocation = GFP_NOFS;
> 

-- 
Michal Hocko
SUSE Labs

* Re: [PATCH 4.4 48/76] libceph: force GFP_NOIO for socket allocations
  2017-03-28 12:43   ` Michal Hocko
@ 2017-03-28 13:23     ` Ilya Dryomov
  2017-03-28 13:30       ` Michal Hocko
  0 siblings, 1 reply; 106+ messages in thread
From: Ilya Dryomov @ 2017-03-28 13:23 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Greg Kroah-Hartman, linux-kernel, stable, Sergey Jerusalimov,
	Jeff Layton

On Tue, Mar 28, 2017 at 2:43 PM, Michal Hocko <mhocko@kernel.org> wrote:
> On Tue 28-03-17 14:30:45, Greg KH wrote:
>> 4.4-stable review patch.  If anyone has any objections, please let me know.
>
> I haven't seen the original patch but the changelog makes me worried.
> How exactly is this a problem? Where do we lock up? Does rbd/libceph take
> any xfs locks?

No, it doesn't.  This is just another instance of "using GFP_KERNEL on
the writeback path may lead to a deadlock" with nothing extra to it.

XFS is writing out data, libceph messenger worker tries to open
a socket and recurses back into XFS because the sockfs inode is
allocated with GFP_KERNEL.  The message with some of the data never
goes out and eventually we get a deadlock.
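
The way a save/restore pair can force a stricter allocation context on
code that hardcodes GFP_KERNEL can be modeled in user space as below.
This is a sketch under stated assumptions, not kernel code: the flag
values, task_flags, noio_save(), noio_restore() and effective_gfp() are
all hypothetical stand-ins for the per-task flag and the
memalloc_noio_{save,restore}() helpers used by the patch.

```c
#include <assert.h>

/* Hypothetical flag values for a user-space model; not the kernel's. */
#define MODEL_GFP_IO		0x1u	/* allocation may start I/O */
#define MODEL_GFP_FS		0x2u	/* allocation may call into a filesystem */
#define MODEL_GFP_KERNEL	(MODEL_GFP_IO | MODEL_GFP_FS)
#define MODEL_PF_NOIO		0x1u	/* per-task "no I/O" marker */

unsigned int task_flags;	/* stands in for the current task's flags */

/* Set the marker and return its previous state so nesting works. */
unsigned int noio_save(void)
{
	unsigned int old = task_flags & MODEL_PF_NOIO;

	task_flags |= MODEL_PF_NOIO;
	return old;
}

/* Put the marker back to what it was before the matching save. */
void noio_restore(unsigned int old)
{
	task_flags = (task_flags & ~MODEL_PF_NOIO) | old;
}

/* What the allocator effectively honours for a requested mask. */
unsigned int effective_gfp(unsigned int gfp)
{
	if (task_flags & MODEL_PF_NOIO)
		gfp &= ~(MODEL_GFP_IO | MODEL_GFP_FS);
	return gfp;
}

/* Small self-check: the callee's mask is narrowed only inside the scope. */
void self_check(void)
{
	unsigned int saved;

	task_flags = 0;
	assert(effective_gfp(MODEL_GFP_KERNEL) == MODEL_GFP_KERNEL);
	saved = noio_save();
	assert(effective_gfp(MODEL_GFP_KERNEL) == 0);
	noio_restore(saved);
	assert(effective_gfp(MODEL_GFP_KERNEL) == MODEL_GFP_KERNEL);
}
```

Returning the old state from the save call is what lets such scopes
nest: an inner restore does not clear a marker set by an outer scope.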

I've only included the offending stack trace.  I guess I should have
stressed that ceph-msgr workqueue is used for reclaim.

Thanks,

                Ilya

* Re: [PATCH 4.4 48/76] libceph: force GFP_NOIO for socket allocations
  2017-03-28 13:23     ` Ilya Dryomov
@ 2017-03-28 13:30       ` Michal Hocko
  2017-03-29  9:21         ` Ilya Dryomov
  0 siblings, 1 reply; 106+ messages in thread
From: Michal Hocko @ 2017-03-28 13:30 UTC (permalink / raw)
  To: Ilya Dryomov
  Cc: Greg Kroah-Hartman, linux-kernel, stable, Sergey Jerusalimov,
	Jeff Layton

On Tue 28-03-17 15:23:58, Ilya Dryomov wrote:
> On Tue, Mar 28, 2017 at 2:43 PM, Michal Hocko <mhocko@kernel.org> wrote:
> > On Tue 28-03-17 14:30:45, Greg KH wrote:
> >> 4.4-stable review patch.  If anyone has any objections, please let me know.
> >
> > I haven't seen the original patch but the changelog makes me worried.
> > How exactly is this a problem? Where do we lock up? Does rbd/libceph take
> > any xfs locks?
> 
> No, it doesn't.  This is just another instance of "using GFP_KERNEL on
> the writeback path may lead to a deadlock" with nothing extra to it.
> 
> XFS is writing out data, libceph messenger worker tries to open
> a socket and recurses back into XFS because the sockfs inode is
> allocated with GFP_KERNEL.  The message with some of the data never
> goes out and eventually we get a deadlock.
> 
> I've only included the offending stack trace.  I guess I should have
> stressed that ceph-msgr workqueue is used for reclaim.

Could you be more specific about the lockup scenario? I still do not get
how this would lead to a deadlock.
-- 
Michal Hocko
SUSE Labs

* Re: [PATCH 4.4 00/76] 4.4.58-stable review
  2017-03-28 12:29 [PATCH 4.4 00/76] 4.4.58-stable review Greg Kroah-Hartman
                   ` (73 preceding siblings ...)
  2017-03-28 12:31 ` [PATCH 4.4 76/76] crypto: algif_hash - avoid zero-sized array Greg Kroah-Hartman
@ 2017-03-28 19:38 ` Shuah Khan
  2017-03-29  2:58 ` Guenter Roeck
  75 siblings, 0 replies; 106+ messages in thread
From: Shuah Khan @ 2017-03-28 19:38 UTC (permalink / raw)
  To: Greg Kroah-Hartman, linux-kernel
  Cc: torvalds, akpm, linux, patches, ben.hutchings, stable, Shuah Khan

On 03/28/2017 06:29 AM, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 4.4.58 release.
> There are 76 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
> 
> Responses should be made by Thu Mar 30 12:25:40 UTC 2017.
> Anything received after that time might be too late.
> 
> The whole patch series can be found in one patch at:
> 	kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.4.58-rc1.gz
> or in the git tree and branch at:
>   git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.4.y
> and the diffstat can be found below.
> 
> thanks,
> 
> greg k-h
> 

Compiled and booted on my test system. No dmesg regressions.

thanks,
-- Shuah

* Re: [PATCH 4.4 00/76] 4.4.58-stable review
  2017-03-28 12:29 [PATCH 4.4 00/76] 4.4.58-stable review Greg Kroah-Hartman
                   ` (74 preceding siblings ...)
  2017-03-28 19:38 ` [PATCH 4.4 00/76] 4.4.58-stable review Shuah Khan
@ 2017-03-29  2:58 ` Guenter Roeck
  75 siblings, 0 replies; 106+ messages in thread
From: Guenter Roeck @ 2017-03-29  2:58 UTC (permalink / raw)
  To: Greg Kroah-Hartman, linux-kernel
  Cc: torvalds, akpm, shuahkh, patches, ben.hutchings, stable

On 03/28/2017 05:29 AM, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 4.4.58 release.
> There are 76 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Thu Mar 30 12:25:40 UTC 2017.
> Anything received after that time might be too late.
>

Build results:
	total: 149 pass: 149 fail: 0
Qemu test results:
	total: 115 pass: 115 fail: 0

Details are available at http://kerneltests.org/builders.

Guenter

* Re: [PATCH 4.4 48/76] libceph: force GFP_NOIO for socket allocations
  2017-03-28 13:30       ` Michal Hocko
@ 2017-03-29  9:21         ` Ilya Dryomov
  2017-03-29 10:41           ` Michal Hocko
  0 siblings, 1 reply; 106+ messages in thread
From: Ilya Dryomov @ 2017-03-29  9:21 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Greg Kroah-Hartman, linux-kernel, stable, Sergey Jerusalimov,
	Jeff Layton

On Tue, Mar 28, 2017 at 3:30 PM, Michal Hocko <mhocko@kernel.org> wrote:
> On Tue 28-03-17 15:23:58, Ilya Dryomov wrote:
>> On Tue, Mar 28, 2017 at 2:43 PM, Michal Hocko <mhocko@kernel.org> wrote:
>> > On Tue 28-03-17 14:30:45, Greg KH wrote:
>> >> 4.4-stable review patch.  If anyone has any objections, please let me know.
>> >
>> > I haven't seen the original patch but the changelog makes me worried.
>> > How exactly is this a problem? Where do we lock up? Does rbd/libceph take
>> > any xfs locks?
>>
>> No, it doesn't.  This is just another instance of "using GFP_KERNEL on
>> the writeback path may lead to a deadlock" with nothing extra to it.
>>
>> XFS is writing out data, libceph messenger worker tries to open
>> a socket and recurses back into XFS because the sockfs inode is
>> allocated with GFP_KERNEL.  The message with some of the data never
>> goes out and eventually we get a deadlock.
>>
>> I've only included the offending stack trace.  I guess I should have
>> stressed that ceph-msgr workqueue is used for reclaim.
>
> Could you be more specific about the lockup scenario? I still do not get
> how this would lead to a deadlock.

This is a set of stack traces from http://tracker.ceph.com/issues/19309
(linked in the changelog):

Workqueue: ceph-msgr con_work [libceph]
ffff8810871cb018 0000000000000046 0000000000000000 ffff881085d40000
0000000000012b00 ffff881025cad428 ffff8810871cbfd8 0000000000012b00
ffff880102fc1000 ffff881085d40000 ffff8810871cb038 ffff8810871cb148
Call Trace:
[<ffffffff816dd629>] schedule+0x29/0x70
[<ffffffff816e066d>] schedule_timeout+0x1bd/0x200
[<ffffffff81093ffc>] ? ttwu_do_wakeup+0x2c/0x120
[<ffffffff81094266>] ? ttwu_do_activate.constprop.135+0x66/0x70
[<ffffffff816deb5f>] wait_for_completion+0xbf/0x180
[<ffffffff81097cd0>] ? try_to_wake_up+0x390/0x390
[<ffffffff81086335>] flush_work+0x165/0x250
[<ffffffff81082940>] ? worker_detach_from_pool+0xd0/0xd0
[<ffffffffa03b65b1>] xlog_cil_force_lsn+0x81/0x200 [xfs]
[<ffffffff816d6b42>] ? __slab_free+0xee/0x234
[<ffffffffa03b4b1d>] _xfs_log_force_lsn+0x4d/0x2c0 [xfs]
[<ffffffff811adc1e>] ? lookup_page_cgroup_used+0xe/0x30
[<ffffffffa039a723>] ? xfs_reclaim_inode+0xa3/0x330 [xfs]
[<ffffffffa03b4dcf>] xfs_log_force_lsn+0x3f/0xf0 [xfs]
[<ffffffffa039a723>] ? xfs_reclaim_inode+0xa3/0x330 [xfs]
[<ffffffffa03a62c6>] xfs_iunpin_wait+0xc6/0x1a0 [xfs]
[<ffffffff810aa250>] ? wake_atomic_t_function+0x40/0x40
[<ffffffffa039a723>] xfs_reclaim_inode+0xa3/0x330 [xfs]
[<ffffffffa039ac07>] xfs_reclaim_inodes_ag+0x257/0x3d0 [xfs]
[<ffffffffa039bb13>] xfs_reclaim_inodes_nr+0x33/0x40 [xfs]
[<ffffffffa03ab745>] xfs_fs_free_cached_objects+0x15/0x20 [xfs]
[<ffffffff811c0c18>] super_cache_scan+0x178/0x180
[<ffffffff8115912e>] shrink_slab_node+0x14e/0x340
[<ffffffff811afc3b>] ? mem_cgroup_iter+0x16b/0x450
[<ffffffff8115af70>] shrink_slab+0x100/0x140
[<ffffffff8115e425>] do_try_to_free_pages+0x335/0x490
[<ffffffff8115e7f9>] try_to_free_pages+0xb9/0x1f0
[<ffffffff816d56e4>] ? __alloc_pages_direct_compact+0x69/0x1be
[<ffffffff81150cba>] __alloc_pages_nodemask+0x69a/0xb40
[<ffffffff8119743e>] alloc_pages_current+0x9e/0x110
[<ffffffff811a0ac5>] new_slab+0x2c5/0x390
[<ffffffff816d71c4>] __slab_alloc+0x33b/0x459
[<ffffffff815b906d>] ? sock_alloc_inode+0x2d/0xd0
[<ffffffff8164bda1>] ? inet_sendmsg+0x71/0xc0
[<ffffffff815b906d>] ? sock_alloc_inode+0x2d/0xd0
[<ffffffff811a21f2>] kmem_cache_alloc+0x1a2/0x1b0
[<ffffffff815b906d>] sock_alloc_inode+0x2d/0xd0
[<ffffffff811d8566>] alloc_inode+0x26/0xa0
[<ffffffff811da04a>] new_inode_pseudo+0x1a/0x70
[<ffffffff815b933e>] sock_alloc+0x1e/0x80
[<ffffffff815ba855>] __sock_create+0x95/0x220
[<ffffffff815baa04>] sock_create_kern+0x24/0x30
[<ffffffffa04794d9>] con_work+0xef9/0x2050 [libceph]
[<ffffffffa04aa9ec>] ? rbd_img_request_submit+0x4c/0x60 [rbd]
[<ffffffff81084c19>] process_one_work+0x159/0x4f0
[<ffffffff8108561b>] worker_thread+0x11b/0x530
[<ffffffff81085500>] ? create_worker+0x1d0/0x1d0
[<ffffffff8108b6f9>] kthread+0xc9/0xe0
[<ffffffff8108b630>] ? flush_kthread_worker+0x90/0x90
[<ffffffff816e1b98>] ret_from_fork+0x58/0x90
[<ffffffff8108b630>] ? flush_kthread_worker+0x90/0x90

We are writing out data on ceph_connection X:

ceph_con_workfn
  mutex_lock(&con->mutex)  # ceph_connection::mutex
  try_write
    ceph_tcp_connect
      sock_create_kern
        GFP_KERNEL allocation
          allocator recurses into XFS, more I/O is issued


Workqueue: rbd rbd_request_workfn [rbd]
ffff880047a83b38 0000000000000046 ffff881025350c00 ffff8800383fa9e0
0000000000012b00 0000000000000000 ffff880047a83fd8 0000000000012b00
ffff88014b638860 ffff8800383fa9e0 ffff880047a83b38 ffff8810878dc1b8
Call Trace:
[<ffffffff816dd629>] schedule+0x29/0x70
[<ffffffff816dd906>] schedule_preempt_disabled+0x16/0x20
[<ffffffff816df755>] __mutex_lock_slowpath+0xa5/0x110
[<ffffffffa048ad66>] ? ceph_str_hash+0x26/0x80 [libceph]
[<ffffffff816df7f6>] mutex_lock+0x36/0x4a
[<ffffffffa04784fd>] ceph_con_send+0x4d/0x130 [libceph]
[<ffffffffa047d3f0>] __send_queued+0x120/0x150 [libceph]
[<ffffffffa047fe7b>] __ceph_osdc_start_request+0x5b/0xd0 [libceph]
[<ffffffffa047ff41>] ceph_osdc_start_request+0x51/0x80 [libceph]
[<ffffffffa04a8050>] rbd_obj_request_submit.isra.27+0x10/0x20 [rbd]
[<ffffffffa04aa6de>] rbd_img_obj_request_submit+0x23e/0x500 [rbd]
[<ffffffffa04aa9ec>] rbd_img_request_submit+0x4c/0x60 [rbd]
[<ffffffffa04ab3d5>] rbd_request_workfn+0x305/0x410 [rbd]
[<ffffffff81084c19>] process_one_work+0x159/0x4f0
[<ffffffff8108561b>] worker_thread+0x11b/0x530
[<ffffffff81085500>] ? create_worker+0x1d0/0x1d0
[<ffffffff8108b6f9>] kthread+0xc9/0xe0
[<ffffffff8108b630>] ? flush_kthread_worker+0x90/0x90
[<ffffffff816e1b98>] ret_from_fork+0x58/0x90
[<ffffffff8108b630>] ? flush_kthread_worker+0x90/0x90

Here is that I/O.  We grab ceph_osd_client::request_mutex, but
ceph_connection::mutex is being held by the worker that recursed into
XFS:

rbd_queue_workfn
  ceph_osdc_start_request
    mutex_lock(&osdc->request_mutex);
    ceph_con_send
      mutex_lock(&con->mutex)  # deadlock


Workqueue: ceph-msgr con_work [libceph]
ffff88014a89fc08 0000000000000046 ffff88014a89fc18 ffff88013a2d90c0
0000000000012b00 0000000000000000 ffff88014a89ffd8 0000000000012b00
ffff880015a210c0 ffff88013a2d90c0 0000000000000000 ffff882028a84798
Call Trace:
[<ffffffff816dd629>] schedule+0x29/0x70
[<ffffffff816dd906>] schedule_preempt_disabled+0x16/0x20
[<ffffffff816df755>] __mutex_lock_slowpath+0xa5/0x110
[<ffffffff816df7f6>] mutex_lock+0x36/0x4a
[<ffffffffa047ec1f>] alloc_msg+0xcf/0x210 [libceph]
[<ffffffffa0479c55>] con_work+0x1675/0x2050 [libceph]
[<ffffffff81093ffc>] ? ttwu_do_wakeup+0x2c/0x120
[<ffffffff81094266>] ? ttwu_do_activate.constprop.135+0x66/0x70
[<ffffffff81084c19>] process_one_work+0x159/0x4f0
[<ffffffff8108561b>] worker_thread+0x11b/0x530
[<ffffffff81085500>] ? create_worker+0x1d0/0x1d0
[<ffffffff8108b6f9>] kthread+0xc9/0xe0
[<ffffffff8108b630>] ? flush_kthread_worker+0x90/0x90
[<ffffffff816e1b98>] ret_from_fork+0x58/0x90
[<ffffffff8108b630>] ? flush_kthread_worker+0x90/0x90

Workqueue: ceph-msgr con_work [libceph]
ffff88014c10fc08 0000000000000046 ffff88013a2d9988 ffff88013a2d9920
0000000000012b00 0000000000000000 ffff88014c10ffd8 0000000000012b00
ffffffff81c1b4a0 ffff88013a2d9920 0000000000000000 ffff882028a84798
Call Trace:
[<ffffffff816dd629>] schedule+0x29/0x70
[<ffffffff816dd906>] schedule_preempt_disabled+0x16/0x20
[<ffffffff816df755>] __mutex_lock_slowpath+0xa5/0x110
[<ffffffff816df7f6>] mutex_lock+0x36/0x4a
[<ffffffffa047ec1f>] alloc_msg+0xcf/0x210 [libceph]
[<ffffffffa0479c55>] con_work+0x1675/0x2050 [libceph]
[<ffffffff810a076c>] ? put_prev_entity+0x3c/0x2e0
[<ffffffff8109b315>] ? sched_clock_cpu+0x95/0xd0
[<ffffffff81084c19>] process_one_work+0x159/0x4f0
[<ffffffff8108561b>] worker_thread+0x11b/0x530
[<ffffffff81085500>] ? create_worker+0x1d0/0x1d0
[<ffffffff8108b6f9>] kthread+0xc9/0xe0
[<ffffffff8108b630>] ? flush_kthread_worker+0x90/0x90
[<ffffffff816e1b98>] ret_from_fork+0x58/0x90
[<ffffffff8108b630>] ? flush_kthread_worker+0x90/0x90

These two are replies on ceph_connections Y and Z, which need
ceph_osd_client::request_mutex to figure out which requests can be
completed:

alloc_msg
  get_reply
    mutex_lock(&osdc->request_mutex);

Eventually everything else blocks on ceph_osd_client::request_mutex,
since it's used for both submitting requests and handling replies.
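
As a rough illustration only (a userspace model, not kernel code), the
wait-for relationships described above can be written down as a small
graph and walked.  All names here are hypothetical; the edge from the
reclaim I/O to the rbd worker is the assumption made in this analysis,
and the second table models what would change if the socket allocation
did not wait on I/O at all:

```c
#include <assert.h>

/* Hypothetical labels for the tasks named in the traces above. */
enum node {
	CON_WORKER_X,	/* holds con->mutex, allocating a socket */
	XFS_IO,		/* I/O that reclaim is assumed to wait for */
	RBD_WORKER,	/* holds request_mutex, wants con->mutex */
	CON_WORKER_Y,	/* reply handler, wants request_mutex */
	CON_WORKER_Z,	/* reply handler, wants request_mutex */
	NR_NODES
};

#define RUNNABLE (-1)

/* waits_on[a] == b means "a cannot make progress until b does". */
static const int with_gfp_kernel[NR_NODES] = {
	[CON_WORKER_X] = XFS_IO,	/* allocation recursed into the FS */
	[XFS_IO]       = RBD_WORKER,	/* that I/O is submitted via rbd */
	[RBD_WORKER]   = CON_WORKER_X,	/* blocked on con->mutex */
	[CON_WORKER_Y] = RBD_WORKER,	/* blocked on request_mutex */
	[CON_WORKER_Z] = RBD_WORKER,
};

static const int with_gfp_noio[NR_NODES] = {
	[CON_WORKER_X] = RUNNABLE,	/* allocation does not wait on I/O */
	[XFS_IO]       = RBD_WORKER,
	[RBD_WORKER]   = CON_WORKER_X,
	[CON_WORKER_Y] = RBD_WORKER,
	[CON_WORKER_Z] = RBD_WORKER,
};

/*
 * Every node has at most one outgoing edge, so if NR_NODES edges can be
 * followed without reaching a runnable task, the walk has entered a cycle.
 */
static int is_stuck(const int *waits_on, int start)
{
	int cur = start;

	assert(start >= 0 && start < NR_NODES);
	for (int steps = 0; steps < NR_NODES; steps++) {
		cur = waits_on[cur];
		if (cur == RUNNABLE)
			return 0;
	}
	return 1;
}
```

In this model every task in the first table is stuck, while in the
second table the chain ends at the con->mutex holder.  It says nothing
about whether the reclaim edge really exists on a given kernel.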

This really is a straightforward "using GFP_KERNEL on the writeback
path isn't allowed" case.  I'm not sure what made you worried here.

Thanks,

                Ilya

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [PATCH 4.4 48/76] libceph: force GFP_NOIO for socket allocations
  2017-03-29  9:21         ` Ilya Dryomov
@ 2017-03-29 10:41           ` Michal Hocko
  2017-03-29 10:55             ` Michal Hocko
  2017-03-29 11:05             ` Brian Foster
  0 siblings, 2 replies; 106+ messages in thread
From: Michal Hocko @ 2017-03-29 10:41 UTC (permalink / raw)
  To: Ilya Dryomov
  Cc: Greg Kroah-Hartman, linux-kernel, stable, Sergey Jerusalimov,
	Jeff Layton, linux-xfs

[CC xfs guys]

On Wed 29-03-17 11:21:44, Ilya Dryomov wrote:
[...]
> This is a set of stack traces from http://tracker.ceph.com/issues/19309
> (linked in the changelog):
> 
> Workqueue: ceph-msgr con_work [libceph]
> ffff8810871cb018 0000000000000046 0000000000000000 ffff881085d40000
> 0000000000012b00 ffff881025cad428 ffff8810871cbfd8 0000000000012b00
> ffff880102fc1000 ffff881085d40000 ffff8810871cb038 ffff8810871cb148
> Call Trace:
> [<ffffffff816dd629>] schedule+0x29/0x70
> [<ffffffff816e066d>] schedule_timeout+0x1bd/0x200
> [<ffffffff81093ffc>] ? ttwu_do_wakeup+0x2c/0x120
> [<ffffffff81094266>] ? ttwu_do_activate.constprop.135+0x66/0x70
> [<ffffffff816deb5f>] wait_for_completion+0xbf/0x180
> [<ffffffff81097cd0>] ? try_to_wake_up+0x390/0x390
> [<ffffffff81086335>] flush_work+0x165/0x250

I suspect this is xlog_cil_push_now -> flush_work(&cil->xc_push_work)
right? I kind of got lost where this waits on an IO.

> [<ffffffff81082940>] ? worker_detach_from_pool+0xd0/0xd0
> [<ffffffffa03b65b1>] xlog_cil_force_lsn+0x81/0x200 [xfs]
> [<ffffffff816d6b42>] ? __slab_free+0xee/0x234
> [<ffffffffa03b4b1d>] _xfs_log_force_lsn+0x4d/0x2c0 [xfs]
> [<ffffffff811adc1e>] ? lookup_page_cgroup_used+0xe/0x30
> [<ffffffffa039a723>] ? xfs_reclaim_inode+0xa3/0x330 [xfs]
> [<ffffffffa03b4dcf>] xfs_log_force_lsn+0x3f/0xf0 [xfs]
> [<ffffffffa039a723>] ? xfs_reclaim_inode+0xa3/0x330 [xfs]
> [<ffffffffa03a62c6>] xfs_iunpin_wait+0xc6/0x1a0 [xfs]
> [<ffffffff810aa250>] ? wake_atomic_t_function+0x40/0x40
> [<ffffffffa039a723>] xfs_reclaim_inode+0xa3/0x330 [xfs]
[...]
> [<ffffffff815b933e>] sock_alloc+0x1e/0x80
> [<ffffffff815ba855>] __sock_create+0x95/0x220
> [<ffffffff815baa04>] sock_create_kern+0x24/0x30
> [<ffffffffa04794d9>] con_work+0xef9/0x2050 [libceph]
> [<ffffffffa04aa9ec>] ? rbd_img_request_submit+0x4c/0x60 [rbd]
> [<ffffffff81084c19>] process_one_work+0x159/0x4f0
> [<ffffffff8108561b>] worker_thread+0x11b/0x530
> [<ffffffff81085500>] ? create_worker+0x1d0/0x1d0
> [<ffffffff8108b6f9>] kthread+0xc9/0xe0
> [<ffffffff8108b630>] ? flush_kthread_worker+0x90/0x90
> [<ffffffff816e1b98>] ret_from_fork+0x58/0x90
> [<ffffffff8108b630>] ? flush_kthread_worker+0x90/0x90
>
> We are writing out data on ceph_connection X:
> 
> ceph_con_workfn
>   mutex_lock(&con->mutex)  # ceph_connection::mutex
>   try_write
>     ceph_tcp_connect
>       sock_create_kern
>         GFP_KERNEL allocation
>           allocator recurses into XFS, more I/O is issued

I am not sure this is actually true. XFS tends to do I/O from
separate kworkers rather than from the direct reclaim context.

> Workqueue: rbd rbd_request_workfn [rbd]
> ffff880047a83b38 0000000000000046 ffff881025350c00 ffff8800383fa9e0
> 0000000000012b00 0000000000000000 ffff880047a83fd8 0000000000012b00
> ffff88014b638860 ffff8800383fa9e0 ffff880047a83b38 ffff8810878dc1b8
> Call Trace:
> [<ffffffff816dd629>] schedule+0x29/0x70
> [<ffffffff816dd906>] schedule_preempt_disabled+0x16/0x20
> [<ffffffff816df755>] __mutex_lock_slowpath+0xa5/0x110
> [<ffffffffa048ad66>] ? ceph_str_hash+0x26/0x80 [libceph]
> [<ffffffff816df7f6>] mutex_lock+0x36/0x4a
> [<ffffffffa04784fd>] ceph_con_send+0x4d/0x130 [libceph]
> [<ffffffffa047d3f0>] __send_queued+0x120/0x150 [libceph]
> [<ffffffffa047fe7b>] __ceph_osdc_start_request+0x5b/0xd0 [libceph]
> [<ffffffffa047ff41>] ceph_osdc_start_request+0x51/0x80 [libceph]
> [<ffffffffa04a8050>] rbd_obj_request_submit.isra.27+0x10/0x20 [rbd]
> [<ffffffffa04aa6de>] rbd_img_obj_request_submit+0x23e/0x500 [rbd]
> [<ffffffffa04aa9ec>] rbd_img_request_submit+0x4c/0x60 [rbd]
> [<ffffffffa04ab3d5>] rbd_request_workfn+0x305/0x410 [rbd]
> [<ffffffff81084c19>] process_one_work+0x159/0x4f0
> [<ffffffff8108561b>] worker_thread+0x11b/0x530
> [<ffffffff81085500>] ? create_worker+0x1d0/0x1d0
> [<ffffffff8108b6f9>] kthread+0xc9/0xe0
> [<ffffffff8108b630>] ? flush_kthread_worker+0x90/0x90
> [<ffffffff816e1b98>] ret_from_fork+0x58/0x90
> [<ffffffff8108b630>] ? flush_kthread_worker+0x90/0x90
> 
> Here is that I/O.  We grab ceph_osd_client::request_mutex, but
> ceph_connection::mutex is being held by the worker that recursed into
> XFS:
> 
> rbd_queue_workfn
>   ceph_osdc_start_request
>     mutex_lock(&osdc->request_mutex);
>     ceph_con_send
>       mutex_lock(&con->mutex)  # deadlock
> 
> 
> Workqueue: ceph-msgr con_work [libceph]
> ffff88014a89fc08 0000000000000046 ffff88014a89fc18 ffff88013a2d90c0
> 0000000000012b00 0000000000000000 ffff88014a89ffd8 0000000000012b00
> ffff880015a210c0 ffff88013a2d90c0 0000000000000000 ffff882028a84798
> Call Trace:
> [<ffffffff816dd629>] schedule+0x29/0x70
> [<ffffffff816dd906>] schedule_preempt_disabled+0x16/0x20
> [<ffffffff816df755>] __mutex_lock_slowpath+0xa5/0x110
> [<ffffffff816df7f6>] mutex_lock+0x36/0x4a
> [<ffffffffa047ec1f>] alloc_msg+0xcf/0x210 [libceph]
> [<ffffffffa0479c55>] con_work+0x1675/0x2050 [libceph]
> [<ffffffff81093ffc>] ? ttwu_do_wakeup+0x2c/0x120
> [<ffffffff81094266>] ? ttwu_do_activate.constprop.135+0x66/0x70
> [<ffffffff81084c19>] process_one_work+0x159/0x4f0
> [<ffffffff8108561b>] worker_thread+0x11b/0x530
> [<ffffffff81085500>] ? create_worker+0x1d0/0x1d0
> [<ffffffff8108b6f9>] kthread+0xc9/0xe0
> [<ffffffff8108b630>] ? flush_kthread_worker+0x90/0x90
> [<ffffffff816e1b98>] ret_from_fork+0x58/0x90
> [<ffffffff8108b630>] ? flush_kthread_worker+0x90/0x90
> 
> Workqueue: ceph-msgr con_work [libceph]
> ffff88014c10fc08 0000000000000046 ffff88013a2d9988 ffff88013a2d9920
> 0000000000012b00 0000000000000000 ffff88014c10ffd8 0000000000012b00
> ffffffff81c1b4a0 ffff88013a2d9920 0000000000000000 ffff882028a84798
> Call Trace:
> [<ffffffff816dd629>] schedule+0x29/0x70
> [<ffffffff816dd906>] schedule_preempt_disabled+0x16/0x20
> [<ffffffff816df755>] __mutex_lock_slowpath+0xa5/0x110
> [<ffffffff816df7f6>] mutex_lock+0x36/0x4a
> [<ffffffffa047ec1f>] alloc_msg+0xcf/0x210 [libceph]
> [<ffffffffa0479c55>] con_work+0x1675/0x2050 [libceph]
> [<ffffffff810a076c>] ? put_prev_entity+0x3c/0x2e0
> [<ffffffff8109b315>] ? sched_clock_cpu+0x95/0xd0
> [<ffffffff81084c19>] process_one_work+0x159/0x4f0
> [<ffffffff8108561b>] worker_thread+0x11b/0x530
> [<ffffffff81085500>] ? create_worker+0x1d0/0x1d0
> [<ffffffff8108b6f9>] kthread+0xc9/0xe0
> [<ffffffff8108b630>] ? flush_kthread_worker+0x90/0x90
> [<ffffffff816e1b98>] ret_from_fork+0x58/0x90
> [<ffffffff8108b630>] ? flush_kthread_worker+0x90/0x90
> 
> These two are replies on ceph_connections Y and Z, which need
> ceph_osd_client::request_mutex to figure out which requests can be
> completed:
> 
> alloc_msg
>   get_reply
>     mutex_lock(&osdc->request_mutex);
> 
> Eventually everything else blocks on ceph_osd_client::request_mutex,
> since it's used for both submitting requests and handling replies.
> 
> This really is a straightforward "using GFP_KERNEL on the writeback
> path isn't allowed" case.  I'm not sure what made you worried here.

I am still not sure the dependency is there. But if it is, and con->mutex
is the lock under which it is dangerous to recurse back into the FS, then
please wrap the whole scope which takes the lock with memalloc_noio_save
(or memalloc_nofs_save, currently sitting in the mmotm tree, if you can
wait until that API gets merged), with a big fat comment explaining why
that is needed. Sticking the scope protection further down the path is
just hard to understand later on. And as already mentioned, NOFS/NOIO
contexts are (ab)used way too much without a clear/good reason.
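
To illustrate the scope idea, here is a minimal userspace model (a
sketch, not the libceph change).  Everything prefixed model_ or MODEL_
is hypothetical: the flag values are made up, and the static variable
stands in for the per-task bit that memalloc_noio_save() sets in the
kernel.  The point is only that the deep callee keeps asking for the
equivalent of GFP_KERNEL and the scope around the lock decides what it
effectively gets:

```c
#include <assert.h>

/* Illustrative flag values; the kernel's real GFP bits differ. */
#define MODEL_GFP_IO		0x1u
#define MODEL_GFP_FS		0x2u
#define MODEL_GFP_KERNEL	(MODEL_GFP_IO | MODEL_GFP_FS)

/*
 * Stand-in for the per-task NOIO bit.  It is per-task in the kernel;
 * a plain static is enough for this single-threaded model.
 */
static unsigned int model_task_noio;

/* Enter a NOIO scope and return the previous state for nesting. */
static unsigned int model_noio_save(void)
{
	unsigned int old = model_task_noio;

	model_task_noio = 1;
	return old;
}

/* Leave the scope by restoring whatever state the caller saved. */
static void model_noio_restore(unsigned int old)
{
	model_task_noio = old;
}

/* What the allocator would effectively honor for this task. */
static unsigned int model_effective_gfp(unsigned int gfp)
{
	if (model_task_noio)
		gfp &= ~(MODEL_GFP_IO | MODEL_GFP_FS);
	return gfp;
}

/* Deep callee that knows nothing about the locks held above it. */
static unsigned int model_sock_create(void)
{
	return model_effective_gfp(MODEL_GFP_KERNEL);
}

/* Models the section that runs with con->mutex held. */
static unsigned int model_con_work(void)
{
	unsigned int noio_flag, used;

	/* mutex_lock(&con->mutex) would be taken here. */
	/*
	 * The whole locked scope is marked NOIO, with the reason stated
	 * next to the lock rather than down in the allocation path.
	 */
	noio_flag = model_noio_save();
	used = model_sock_create();
	model_noio_restore(noio_flag);
	/* mutex_unlock(&con->mutex) would be dropped here. */

	return used;
}
```

Outside the scope model_sock_create() still sees the full mask, so the
restriction stays confined to the locked region.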
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [PATCH 4.4 48/76] libceph: force GFP_NOIO for socket allocations
  2017-03-29 10:41           ` Michal Hocko
@ 2017-03-29 10:55             ` Michal Hocko
  2017-03-29 11:10               ` Ilya Dryomov
  2017-03-29 11:05             ` Brian Foster
  1 sibling, 1 reply; 106+ messages in thread
From: Michal Hocko @ 2017-03-29 10:55 UTC (permalink / raw)
  To: Ilya Dryomov
  Cc: Greg Kroah-Hartman, linux-kernel, stable, Sergey Jerusalimov,
	Jeff Layton, linux-xfs

On Wed 29-03-17 12:41:26, Michal Hocko wrote:
[...]
> > ceph_con_workfn
> >   mutex_lock(&con->mutex)  # ceph_connection::mutex
> >   try_write
> >     ceph_tcp_connect
> >       sock_create_kern
> >         GFP_KERNEL allocation
> >           allocator recurses into XFS, more I/O is issued

One more note. So what happens if this is a GFP_NOIO request which
cannot make any progress? Your IO thread is blocked on con->mutex,
as you write below, but the above thread cannot proceed either. So I am
_really_ not sure this actually helps.

[...]
> > 
> > rbd_queue_workfn
> >   ceph_osdc_start_request
> >     mutex_lock(&osdc->request_mutex);
> >     ceph_con_send
> >       mutex_lock(&con->mutex)  # deadlock

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [PATCH 4.4 48/76] libceph: force GFP_NOIO for socket allocations
  2017-03-29 10:41           ` Michal Hocko
  2017-03-29 10:55             ` Michal Hocko
@ 2017-03-29 11:05             ` Brian Foster
  2017-03-29 11:14               ` Ilya Dryomov
  1 sibling, 1 reply; 106+ messages in thread
From: Brian Foster @ 2017-03-29 11:05 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Ilya Dryomov, Greg Kroah-Hartman, linux-kernel, stable,
	Sergey Jerusalimov, Jeff Layton, linux-xfs

On Wed, Mar 29, 2017 at 12:41:26PM +0200, Michal Hocko wrote:
> [CC xfs guys]
> 
> On Wed 29-03-17 11:21:44, Ilya Dryomov wrote:
> [...]
> > This is a set of stack traces from http://tracker.ceph.com/issues/19309
> > (linked in the changelog):
> > 
> > Workqueue: ceph-msgr con_work [libceph]
> > ffff8810871cb018 0000000000000046 0000000000000000 ffff881085d40000
> > 0000000000012b00 ffff881025cad428 ffff8810871cbfd8 0000000000012b00
> > ffff880102fc1000 ffff881085d40000 ffff8810871cb038 ffff8810871cb148
> > Call Trace:
> > [<ffffffff816dd629>] schedule+0x29/0x70
> > [<ffffffff816e066d>] schedule_timeout+0x1bd/0x200
> > [<ffffffff81093ffc>] ? ttwu_do_wakeup+0x2c/0x120
> > [<ffffffff81094266>] ? ttwu_do_activate.constprop.135+0x66/0x70
> > [<ffffffff816deb5f>] wait_for_completion+0xbf/0x180
> > [<ffffffff81097cd0>] ? try_to_wake_up+0x390/0x390
> > [<ffffffff81086335>] flush_work+0x165/0x250
> 
> I suspect this is xlog_cil_push_now -> flush_work(&cil->xc_push_work)
> right? I kind of got lost where this waits on an IO.
> 

Yep. That means a CIL push is already in progress. We wait on that to
complete here. After that, the resulting task queues execution of
xlog_cil_push_work()->xlog_cil_push() on m_cil_workqueue. That task may
submit I/O to the log.

I don't see any reference to xlog_cil_push() anywhere in the traces here
or in the bug referenced above, however..?

Brian

> > [<ffffffff81082940>] ? worker_detach_from_pool+0xd0/0xd0
> > [<ffffffffa03b65b1>] xlog_cil_force_lsn+0x81/0x200 [xfs]
> > [<ffffffff816d6b42>] ? __slab_free+0xee/0x234
> > [<ffffffffa03b4b1d>] _xfs_log_force_lsn+0x4d/0x2c0 [xfs]
> > [<ffffffff811adc1e>] ? lookup_page_cgroup_used+0xe/0x30
> > [<ffffffffa039a723>] ? xfs_reclaim_inode+0xa3/0x330 [xfs]
> > [<ffffffffa03b4dcf>] xfs_log_force_lsn+0x3f/0xf0 [xfs]
> > [<ffffffffa039a723>] ? xfs_reclaim_inode+0xa3/0x330 [xfs]
> > [<ffffffffa03a62c6>] xfs_iunpin_wait+0xc6/0x1a0 [xfs]
> > [<ffffffff810aa250>] ? wake_atomic_t_function+0x40/0x40
> > [<ffffffffa039a723>] xfs_reclaim_inode+0xa3/0x330 [xfs]
> [...]
> > [<ffffffff815b933e>] sock_alloc+0x1e/0x80
> > [<ffffffff815ba855>] __sock_create+0x95/0x220
> > [<ffffffff815baa04>] sock_create_kern+0x24/0x30
> > [<ffffffffa04794d9>] con_work+0xef9/0x2050 [libceph]
> > [<ffffffffa04aa9ec>] ? rbd_img_request_submit+0x4c/0x60 [rbd]
> > [<ffffffff81084c19>] process_one_work+0x159/0x4f0
> > [<ffffffff8108561b>] worker_thread+0x11b/0x530
> > [<ffffffff81085500>] ? create_worker+0x1d0/0x1d0
> > [<ffffffff8108b6f9>] kthread+0xc9/0xe0
> > [<ffffffff8108b630>] ? flush_kthread_worker+0x90/0x90
> > [<ffffffff816e1b98>] ret_from_fork+0x58/0x90
> > [<ffffffff8108b630>] ? flush_kthread_worker+0x90/0x90
> >
> > We are writing out data on ceph_connection X:
> > 
> > ceph_con_workfn
> >   mutex_lock(&con->mutex)  # ceph_connection::mutex
> >   try_write
> >     ceph_tcp_connect
> >       sock_create_kern
> >         GFP_KERNEL allocation
> >           allocator recurses into XFS, more I/O is issued
> 
> I am not sure this is actually true. XFS tends to do I/O from
> separate kworkers rather than from the direct reclaim context.
> 
> > Workqueue: rbd rbd_request_workfn [rbd]
> > ffff880047a83b38 0000000000000046 ffff881025350c00 ffff8800383fa9e0
> > 0000000000012b00 0000000000000000 ffff880047a83fd8 0000000000012b00
> > ffff88014b638860 ffff8800383fa9e0 ffff880047a83b38 ffff8810878dc1b8
> > Call Trace:
> > [<ffffffff816dd629>] schedule+0x29/0x70
> > [<ffffffff816dd906>] schedule_preempt_disabled+0x16/0x20
> > [<ffffffff816df755>] __mutex_lock_slowpath+0xa5/0x110
> > [<ffffffffa048ad66>] ? ceph_str_hash+0x26/0x80 [libceph]
> > [<ffffffff816df7f6>] mutex_lock+0x36/0x4a
> > [<ffffffffa04784fd>] ceph_con_send+0x4d/0x130 [libceph]
> > [<ffffffffa047d3f0>] __send_queued+0x120/0x150 [libceph]
> > [<ffffffffa047fe7b>] __ceph_osdc_start_request+0x5b/0xd0 [libceph]
> > [<ffffffffa047ff41>] ceph_osdc_start_request+0x51/0x80 [libceph]
> > [<ffffffffa04a8050>] rbd_obj_request_submit.isra.27+0x10/0x20 [rbd]
> > [<ffffffffa04aa6de>] rbd_img_obj_request_submit+0x23e/0x500 [rbd]
> > [<ffffffffa04aa9ec>] rbd_img_request_submit+0x4c/0x60 [rbd]
> > [<ffffffffa04ab3d5>] rbd_request_workfn+0x305/0x410 [rbd]
> > [<ffffffff81084c19>] process_one_work+0x159/0x4f0
> > [<ffffffff8108561b>] worker_thread+0x11b/0x530
> > [<ffffffff81085500>] ? create_worker+0x1d0/0x1d0
> > [<ffffffff8108b6f9>] kthread+0xc9/0xe0
> > [<ffffffff8108b630>] ? flush_kthread_worker+0x90/0x90
> > [<ffffffff816e1b98>] ret_from_fork+0x58/0x90
> > [<ffffffff8108b630>] ? flush_kthread_worker+0x90/0x90
> > 
> > Here is that I/O.  We grab ceph_osd_client::request_mutex, but
> > ceph_connection::mutex is being held by the worker that recursed into
> > XFS:
> > 
> > rbd_queue_workfn
> >   ceph_osdc_start_request
> >     mutex_lock(&osdc->request_mutex);
> >     ceph_con_send
> >       mutex_lock(&con->mutex)  # deadlock
> > 
> > 
> > Workqueue: ceph-msgr con_work [libceph]
> > ffff88014a89fc08 0000000000000046 ffff88014a89fc18 ffff88013a2d90c0
> > 0000000000012b00 0000000000000000 ffff88014a89ffd8 0000000000012b00
> > ffff880015a210c0 ffff88013a2d90c0 0000000000000000 ffff882028a84798
> > Call Trace:
> > [<ffffffff816dd629>] schedule+0x29/0x70
> > [<ffffffff816dd906>] schedule_preempt_disabled+0x16/0x20
> > [<ffffffff816df755>] __mutex_lock_slowpath+0xa5/0x110
> > [<ffffffff816df7f6>] mutex_lock+0x36/0x4a
> > [<ffffffffa047ec1f>] alloc_msg+0xcf/0x210 [libceph]
> > [<ffffffffa0479c55>] con_work+0x1675/0x2050 [libceph]
> > [<ffffffff81093ffc>] ? ttwu_do_wakeup+0x2c/0x120
> > [<ffffffff81094266>] ? ttwu_do_activate.constprop.135+0x66/0x70
> > [<ffffffff81084c19>] process_one_work+0x159/0x4f0
> > [<ffffffff8108561b>] worker_thread+0x11b/0x530
> > [<ffffffff81085500>] ? create_worker+0x1d0/0x1d0
> > [<ffffffff8108b6f9>] kthread+0xc9/0xe0
> > [<ffffffff8108b630>] ? flush_kthread_worker+0x90/0x90
> > [<ffffffff816e1b98>] ret_from_fork+0x58/0x90
> > [<ffffffff8108b630>] ? flush_kthread_worker+0x90/0x90
> > 
> > Workqueue: ceph-msgr con_work [libceph]
> > ffff88014c10fc08 0000000000000046 ffff88013a2d9988 ffff88013a2d9920
> > 0000000000012b00 0000000000000000 ffff88014c10ffd8 0000000000012b00
> > ffffffff81c1b4a0 ffff88013a2d9920 0000000000000000 ffff882028a84798
> > Call Trace:
> > [<ffffffff816dd629>] schedule+0x29/0x70
> > [<ffffffff816dd906>] schedule_preempt_disabled+0x16/0x20
> > [<ffffffff816df755>] __mutex_lock_slowpath+0xa5/0x110
> > [<ffffffff816df7f6>] mutex_lock+0x36/0x4a
> > [<ffffffffa047ec1f>] alloc_msg+0xcf/0x210 [libceph]
> > [<ffffffffa0479c55>] con_work+0x1675/0x2050 [libceph]
> > [<ffffffff810a076c>] ? put_prev_entity+0x3c/0x2e0
> > [<ffffffff8109b315>] ? sched_clock_cpu+0x95/0xd0
> > [<ffffffff81084c19>] process_one_work+0x159/0x4f0
> > [<ffffffff8108561b>] worker_thread+0x11b/0x530
> > [<ffffffff81085500>] ? create_worker+0x1d0/0x1d0
> > [<ffffffff8108b6f9>] kthread+0xc9/0xe0
> > [<ffffffff8108b630>] ? flush_kthread_worker+0x90/0x90
> > [<ffffffff816e1b98>] ret_from_fork+0x58/0x90
> > [<ffffffff8108b630>] ? flush_kthread_worker+0x90/0x90
> > 
> > These two are replies on ceph_connections Y and Z, which need
> > ceph_osd_client::request_mutex to figure out which requests can be
> > completed:
> > 
> > alloc_msg
> >   get_reply
> >     mutex_lock(&osdc->request_mutex);
> > 
> > Eventually everything else blocks on ceph_osd_client::request_mutex,
> > since it's used for both submitting requests and handling replies.
> > 
> > This really is a straightforward "using GFP_KERNEL on the writeback
> > path isn't allowed" case.  I'm not sure what made you worried here.
> 
> I am still not sure the dependency is there. But if it is, and con->mutex
> is the lock under which it is dangerous to recurse back into the FS, then
> please wrap the whole scope which takes the lock with memalloc_noio_save
> (or memalloc_nofs_save, currently sitting in the mmotm tree, if you can
> wait until that API gets merged), with a big fat comment explaining why
> that is needed. Sticking the scope protection further down the path is
> just hard to understand later on. And as already mentioned, NOFS/NOIO
> contexts are (ab)used way too much without a clear/good reason.
> -- 
> Michal Hocko
> SUSE Labs

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [PATCH 4.4 48/76] libceph: force GFP_NOIO for socket allocations
  2017-03-29 10:55             ` Michal Hocko
@ 2017-03-29 11:10               ` Ilya Dryomov
  2017-03-29 11:16                 ` Michal Hocko
  0 siblings, 1 reply; 106+ messages in thread
From: Ilya Dryomov @ 2017-03-29 11:10 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Greg Kroah-Hartman, linux-kernel, stable, Sergey Jerusalimov,
	Jeff Layton, linux-xfs

On Wed, Mar 29, 2017 at 12:55 PM, Michal Hocko <mhocko@kernel.org> wrote:
> On Wed 29-03-17 12:41:26, Michal Hocko wrote:
> [...]
>> > ceph_con_workfn
>> >   mutex_lock(&con->mutex)  # ceph_connection::mutex
>> >   try_write
>> >     ceph_tcp_connect
>> >       sock_create_kern
>> >         GFP_KERNEL allocation
>> >           allocator recurses into XFS, more I/O is issued
>
> One more note. So what happens if this is a GFP_NOIO request which
> cannot make any progress? Your IO thread is blocked on con->mutex,
> as you write below, but the above thread cannot proceed either. So I am
> _really_ not sure this actually helps.

This is not the only I/O worker.  A ceph cluster typically consists of
at least a few OSDs and can be as large as thousands of OSDs.  This is
the reason we are calling sock_create_kern() on the writeback path in
the first place: pre-opening thousands of sockets isn't feasible.

Thanks,

                Ilya

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [PATCH 4.4 48/76] libceph: force GFP_NOIO for socket allocations
  2017-03-29 11:05             ` Brian Foster
@ 2017-03-29 11:14               ` Ilya Dryomov
  2017-03-29 11:18                 ` Michal Hocko
  0 siblings, 1 reply; 106+ messages in thread
From: Ilya Dryomov @ 2017-03-29 11:14 UTC (permalink / raw)
  To: Brian Foster
  Cc: Michal Hocko, Greg Kroah-Hartman, linux-kernel, stable,
	Sergey Jerusalimov, Jeff Layton, linux-xfs

On Wed, Mar 29, 2017 at 1:05 PM, Brian Foster <bfoster@redhat.com> wrote:
> On Wed, Mar 29, 2017 at 12:41:26PM +0200, Michal Hocko wrote:
>> [CC xfs guys]
>>
>> On Wed 29-03-17 11:21:44, Ilya Dryomov wrote:
>> [...]
>> > This is a set of stack traces from http://tracker.ceph.com/issues/19309
>> > (linked in the changelog):
>> >
>> > Workqueue: ceph-msgr con_work [libceph]
>> > ffff8810871cb018 0000000000000046 0000000000000000 ffff881085d40000
>> > 0000000000012b00 ffff881025cad428 ffff8810871cbfd8 0000000000012b00
>> > ffff880102fc1000 ffff881085d40000 ffff8810871cb038 ffff8810871cb148
>> > Call Trace:
>> > [<ffffffff816dd629>] schedule+0x29/0x70
>> > [<ffffffff816e066d>] schedule_timeout+0x1bd/0x200
>> > [<ffffffff81093ffc>] ? ttwu_do_wakeup+0x2c/0x120
>> > [<ffffffff81094266>] ? ttwu_do_activate.constprop.135+0x66/0x70
>> > [<ffffffff816deb5f>] wait_for_completion+0xbf/0x180
>> > [<ffffffff81097cd0>] ? try_to_wake_up+0x390/0x390
>> > [<ffffffff81086335>] flush_work+0x165/0x250
>>
>> I suspect this is xlog_cil_push_now -> flush_work(&cil->xc_push_work)
>> right? I kind of got lost where this waits on an IO.
>>
>
> Yep. That means a CIL push is already in progress. We wait on that to
> complete here. After that, the resulting task queues execution of
> xlog_cil_push_work()->xlog_cil_push() on m_cil_workqueue. That task may
> submit I/O to the log.
>
> I don't see any reference to xlog_cil_push() anywhere in the traces here
> or in the bug referenced above, however..?

Well, it's prefaced with "Interesting is:"...  Sergey (the original
reporter, CCed here) might still have the rest of them.

Thanks,

                Ilya

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [PATCH 4.4 48/76] libceph: force GFP_NOIO for socket allocations
  2017-03-29 11:10               ` Ilya Dryomov
@ 2017-03-29 11:16                 ` Michal Hocko
  2017-03-29 14:25                   ` Ilya Dryomov
  0 siblings, 1 reply; 106+ messages in thread
From: Michal Hocko @ 2017-03-29 11:16 UTC (permalink / raw)
  To: Ilya Dryomov
  Cc: Greg Kroah-Hartman, linux-kernel, stable, Sergey Jerusalimov,
	Jeff Layton, linux-xfs

On Wed 29-03-17 13:10:01, Ilya Dryomov wrote:
> On Wed, Mar 29, 2017 at 12:55 PM, Michal Hocko <mhocko@kernel.org> wrote:
> > On Wed 29-03-17 12:41:26, Michal Hocko wrote:
> > [...]
> >> > ceph_con_workfn
> >> >   mutex_lock(&con->mutex)  # ceph_connection::mutex
> >> >   try_write
> >> >     ceph_tcp_connect
> >> >       sock_create_kern
> >> >         GFP_KERNEL allocation
> >> >           allocator recurses into XFS, more I/O is issued
> >
> > One more note. So what happens if this is a GFP_NOIO request which
> > cannot make any progress? Your IO thread is blocked on con->mutex,
> > as you write below, but the above thread cannot proceed either. So I am
> > _really_ not sure this actually helps.
> 
> This is not the only I/O worker.  A ceph cluster typically consists of
> at least a few OSDs and can be as large as thousands of OSDs.  This is
> the reason we are calling sock_create_kern() on the writeback path in
> the first place: pre-opening thousands of sockets isn't feasible.

Sorry for being dense here, but what actually guarantees forward
progress? My current understanding is that the deadlock is caused by
con->mutex being held while the allocation cannot make forward
progress. I can imagine this would be possible if the other I/O flushers
depend on this lock. But then a NOIO vs. KERNEL allocation doesn't make
much difference. What am I missing?
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [PATCH 4.4 48/76] libceph: force GFP_NOIO for socket allocations
  2017-03-29 11:14               ` Ilya Dryomov
@ 2017-03-29 11:18                 ` Michal Hocko
  2017-03-29 11:49                   ` Brian Foster
  0 siblings, 1 reply; 106+ messages in thread
From: Michal Hocko @ 2017-03-29 11:18 UTC (permalink / raw)
  To: Ilya Dryomov
  Cc: Brian Foster, Greg Kroah-Hartman, linux-kernel, stable,
	Sergey Jerusalimov, Jeff Layton, linux-xfs

On Wed 29-03-17 13:14:42, Ilya Dryomov wrote:
> On Wed, Mar 29, 2017 at 1:05 PM, Brian Foster <bfoster@redhat.com> wrote:
> > On Wed, Mar 29, 2017 at 12:41:26PM +0200, Michal Hocko wrote:
> >> [CC xfs guys]
> >>
> >> On Wed 29-03-17 11:21:44, Ilya Dryomov wrote:
> >> [...]
> >> > This is a set of stack traces from http://tracker.ceph.com/issues/19309
> >> > (linked in the changelog):
> >> >
> >> > Workqueue: ceph-msgr con_work [libceph]
> >> > ffff8810871cb018 0000000000000046 0000000000000000 ffff881085d40000
> >> > 0000000000012b00 ffff881025cad428 ffff8810871cbfd8 0000000000012b00
> >> > ffff880102fc1000 ffff881085d40000 ffff8810871cb038 ffff8810871cb148
> >> > Call Trace:
> >> > [<ffffffff816dd629>] schedule+0x29/0x70
> >> > [<ffffffff816e066d>] schedule_timeout+0x1bd/0x200
> >> > [<ffffffff81093ffc>] ? ttwu_do_wakeup+0x2c/0x120
> >> > [<ffffffff81094266>] ? ttwu_do_activate.constprop.135+0x66/0x70
> >> > [<ffffffff816deb5f>] wait_for_completion+0xbf/0x180
> >> > [<ffffffff81097cd0>] ? try_to_wake_up+0x390/0x390
> >> > [<ffffffff81086335>] flush_work+0x165/0x250
> >>
> >> I suspect this is xlog_cil_push_now -> flush_work(&cil->xc_push_work)
> >> right? I kind of got lost where this waits on an IO.
> >>
> >
> > Yep. That means a CIL push is already in progress. We wait on that to
> > complete here. After that, the resulting task queues execution of
> > xlog_cil_push_work()->xlog_cil_push() on m_cil_workqueue. That task may
> > submit I/O to the log.
> >
> > I don't see any reference to xlog_cil_push() anywhere in the traces here
> > or in the bug referenced above, however..?
> 
> Well, it's prefaced with "Interesting is:"...  Sergey (the original
> reporter, CCed here) might still have the rest of them.

JFTR
http://tracker.ceph.com/attachments/download/2769/full_kern_trace.txt
[288420.754637] Workqueue: xfs-cil/rbd1 xlog_cil_push_work [xfs]
[288420.754638]  ffff880130c1fb38 0000000000000046 ffff880130c1fac8 ffff880130d72180
[288420.754640]  0000000000012b00 ffff880130c1fad8 ffff880130c1ffd8 0000000000012b00
[288420.754641]  ffff8810297b6480 ffff880130d72180 ffffffffa03b1264 ffff8820263d6800
[288420.754643] Call Trace:
[288420.754652]  [<ffffffffa03b1264>] ? xlog_bdstrat+0x34/0x70 [xfs]
[288420.754653]  [<ffffffff816dd629>] schedule+0x29/0x70
[288420.754661]  [<ffffffffa03b3b9c>] xlog_state_get_iclog_space+0xdc/0x2e0 [xfs]
[288420.754669]  [<ffffffffa03b1264>] ? xlog_bdstrat+0x34/0x70 [xfs]
[288420.754670]  [<ffffffff81097cd0>] ? try_to_wake_up+0x390/0x390
[288420.754678]  [<ffffffffa03b4090>] xlog_write+0x190/0x730 [xfs]
[288420.754686]  [<ffffffffa03b5d9e>] xlog_cil_push+0x24e/0x3e0 [xfs]
[288420.754693]  [<ffffffffa03b5f45>] xlog_cil_push_work+0x15/0x20 [xfs]
[288420.754695]  [<ffffffff81084c19>] process_one_work+0x159/0x4f0
[288420.754697]  [<ffffffff81084fdc>] process_scheduled_works+0x2c/0x40
[288420.754698]  [<ffffffff8108579b>] worker_thread+0x29b/0x530
[288420.754699]  [<ffffffff81085500>] ? create_worker+0x1d0/0x1d0
[288420.754701]  [<ffffffff8108b6f9>] kthread+0xc9/0xe0
[288420.754703]  [<ffffffff8108b630>] ? flush_kthread_worker+0x90/0x90
[288420.754705]  [<ffffffff816e1b98>] ret_from_fork+0x58/0x90
[288420.754707]  [<ffffffff8108b630>] ? flush_kthread_worker+0x90/0x90
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [PATCH 4.4 48/76] libceph: force GFP_NOIO for socket allocations
  2017-03-29 11:18                 ` Michal Hocko
@ 2017-03-29 11:49                   ` Brian Foster
  2017-03-29 14:30                     ` Ilya Dryomov
  0 siblings, 1 reply; 106+ messages in thread
From: Brian Foster @ 2017-03-29 11:49 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Ilya Dryomov, Greg Kroah-Hartman, linux-kernel, stable,
	Sergey Jerusalimov, Jeff Layton, linux-xfs

On Wed, Mar 29, 2017 at 01:18:34PM +0200, Michal Hocko wrote:
> On Wed 29-03-17 13:14:42, Ilya Dryomov wrote:
> > On Wed, Mar 29, 2017 at 1:05 PM, Brian Foster <bfoster@redhat.com> wrote:
> > > On Wed, Mar 29, 2017 at 12:41:26PM +0200, Michal Hocko wrote:
> > >> [CC xfs guys]
> > >>
> > >> On Wed 29-03-17 11:21:44, Ilya Dryomov wrote:
> > >> [...]
> > >> > This is a set of stack traces from http://tracker.ceph.com/issues/19309
> > >> > (linked in the changelog):
> > >> >
> > >> > Workqueue: ceph-msgr con_work [libceph]
> > >> > ffff8810871cb018 0000000000000046 0000000000000000 ffff881085d40000
> > >> > 0000000000012b00 ffff881025cad428 ffff8810871cbfd8 0000000000012b00
> > >> > ffff880102fc1000 ffff881085d40000 ffff8810871cb038 ffff8810871cb148
> > >> > Call Trace:
> > >> > [<ffffffff816dd629>] schedule+0x29/0x70
> > >> > [<ffffffff816e066d>] schedule_timeout+0x1bd/0x200
> > >> > [<ffffffff81093ffc>] ? ttwu_do_wakeup+0x2c/0x120
> > >> > [<ffffffff81094266>] ? ttwu_do_activate.constprop.135+0x66/0x70
> > >> > [<ffffffff816deb5f>] wait_for_completion+0xbf/0x180
> > >> > [<ffffffff81097cd0>] ? try_to_wake_up+0x390/0x390
> > >> > [<ffffffff81086335>] flush_work+0x165/0x250
> > >>
> > >> I suspect this is xlog_cil_push_now -> flush_work(&cil->xc_push_work)
> > >> right? I kind of got lost where this waits on an IO.
> > >>
> > >
> > > Yep. That means a CIL push is already in progress. We wait on that to
> > > complete here. After that, the resulting task queues execution of
> > > xlog_cil_push_work()->xlog_cil_push() on m_cil_workqueue. That task may
> > > submit I/O to the log.
> > >
> > > I don't see any reference to xlog_cil_push() anywhere in the traces here
> > > or in the bug referenced above, however..?
> > 
> > Well, it's prefaced with "Interesting is:"...  Sergey (the original
> > reporter, CCed here) might still have the rest of them.
> 
> JFTR
> http://tracker.ceph.com/attachments/download/2769/full_kern_trace.txt
> [288420.754637] Workqueue: xfs-cil/rbd1 xlog_cil_push_work [xfs]
> [288420.754638]  ffff880130c1fb38 0000000000000046 ffff880130c1fac8 ffff880130d72180
> [288420.754640]  0000000000012b00 ffff880130c1fad8 ffff880130c1ffd8 0000000000012b00
> [288420.754641]  ffff8810297b6480 ffff880130d72180 ffffffffa03b1264 ffff8820263d6800
> [288420.754643] Call Trace:
> [288420.754652]  [<ffffffffa03b1264>] ? xlog_bdstrat+0x34/0x70 [xfs]
> [288420.754653]  [<ffffffff816dd629>] schedule+0x29/0x70
> [288420.754661]  [<ffffffffa03b3b9c>] xlog_state_get_iclog_space+0xdc/0x2e0 [xfs]
> [288420.754669]  [<ffffffffa03b1264>] ? xlog_bdstrat+0x34/0x70 [xfs]
> [288420.754670]  [<ffffffff81097cd0>] ? try_to_wake_up+0x390/0x390
> [288420.754678]  [<ffffffffa03b4090>] xlog_write+0x190/0x730 [xfs]
> [288420.754686]  [<ffffffffa03b5d9e>] xlog_cil_push+0x24e/0x3e0 [xfs]
> [288420.754693]  [<ffffffffa03b5f45>] xlog_cil_push_work+0x15/0x20 [xfs]
> [288420.754695]  [<ffffffff81084c19>] process_one_work+0x159/0x4f0
> [288420.754697]  [<ffffffff81084fdc>] process_scheduled_works+0x2c/0x40
> [288420.754698]  [<ffffffff8108579b>] worker_thread+0x29b/0x530
> [288420.754699]  [<ffffffff81085500>] ? create_worker+0x1d0/0x1d0
> [288420.754701]  [<ffffffff8108b6f9>] kthread+0xc9/0xe0
> [288420.754703]  [<ffffffff8108b630>] ? flush_kthread_worker+0x90/0x90
> [288420.754705]  [<ffffffff816e1b98>] ret_from_fork+0x58/0x90
> [288420.754707]  [<ffffffff8108b630>] ? flush_kthread_worker+0x90/0x90

Ah, thanks. According to the above, xfs-cil is waiting on log space to
free up. This means xfs-cil is probably in:

	xlog_state_get_iclog_space()
	  ->xlog_wait(&log->l_flush_wait, &log->l_icloglock);

l_flush_wait is awoken during log I/O completion handling via the
xfs-log workqueue. That guy is here:

[288420.773968] INFO: task kworker/6:3:420227 blocked for more than 300 seconds.
[288420.773986]       Not tainted 3.18.43-40 #1
[288420.773997] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[288420.774017] kworker/6:3     D ffff880103893650     0 420227      2 0x00000000
[288420.774027] Workqueue: xfs-log/rbd1 xfs_log_worker [xfs]
[288420.774028]  ffff88010357fac8 0000000000000046 0000000000000000 ffff880103893240
[288420.774030]  0000000000012b00 ffff880146361128 ffff88010357ffd8 0000000000012b00
[288420.774031]  ffff8810297b7540 ffff880103893240 ffff88010357fae8 ffff88010357fbf8
[288420.774033] Call Trace:
[288420.774035]  [<ffffffff816dd629>] schedule+0x29/0x70
[288420.774036]  [<ffffffff816e066d>] schedule_timeout+0x1bd/0x200
[288420.774038]  [<ffffffff81093ffc>] ? ttwu_do_wakeup+0x2c/0x120
[288420.774040]  [<ffffffff81094266>] ? ttwu_do_activate.constprop.135+0x66/0x70
[288420.774042]  [<ffffffff816deb5f>] wait_for_completion+0xbf/0x180
[288420.774043]  [<ffffffff81097cd0>] ? try_to_wake_up+0x390/0x390
[288420.774044]  [<ffffffff81086335>] flush_work+0x165/0x250
[288420.774046]  [<ffffffff81082940>] ? worker_detach_from_pool+0xd0/0xd0
[288420.774054]  [<ffffffffa03b65b1>] xlog_cil_force_lsn+0x81/0x200 [xfs]
[288420.774056]  [<ffffffff8109f3cc>] ? dequeue_entity+0x17c/0x520
[288420.774063]  [<ffffffffa03b478e>] _xfs_log_force+0x6e/0x280 [xfs]
[288420.774065]  [<ffffffff810a076c>] ? put_prev_entity+0x3c/0x2e0
[288420.774067]  [<ffffffff8109b315>] ? sched_clock_cpu+0x95/0xd0
[288420.774068]  [<ffffffff810145a2>] ? __switch_to+0xf2/0x5f0
[288420.774076]  [<ffffffffa03b49d9>] xfs_log_force+0x39/0xe0 [xfs]
[288420.774083]  [<ffffffffa03b4aa8>] xfs_log_worker+0x28/0x50 [xfs]
[288420.774085]  [<ffffffff81084c19>] process_one_work+0x159/0x4f0
[288420.774086]  [<ffffffff8108561b>] worker_thread+0x11b/0x530
[288420.774088]  [<ffffffff81085500>] ? create_worker+0x1d0/0x1d0
[288420.774089]  [<ffffffff8108b6f9>] kthread+0xc9/0xe0
[288420.774091]  [<ffffffff8108b630>] ? flush_kthread_worker+0x90/0x90
[288420.774093]  [<ffffffff816e1b98>] ret_from_fork+0x58/0x90
[288420.774095]  [<ffffffff8108b630>] ? flush_kthread_worker+0x90/0x90

... which is back waiting on xfs-cil.
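
The circular wait described above can be written down as a tiny wait-for
graph. This is only an illustrative userspace sketch in Python, not kernel
code: the node names are labels taken from the two traces, and find_cycle()
is a hypothetical helper that assumes each worker waits on at most one other.

```python
def find_cycle(waits_for, start):
    """Follow 'waits for' edges from start; return the cycle reached, or None."""
    position = {}   # node -> index in path, for nodes already visited
    path = []
    node = start
    while node is not None and node not in position:
        position[node] = len(path)
        path.append(node)
        node = waits_for.get(node)   # None means this node is not blocked
    if node is None:
        return None
    return path[position[node]:]


# Edges read off the two hung-task reports above (labels only, not kernel objects).
waits_for = {
    "xfs-cil": "xfs-log",  # xlog_cil_push_work sleeps on l_flush_wait
    "xfs-log": "xfs-cil",  # xfs_log_worker sits in flush_work() on the CIL push
}

print(find_cycle(waits_for, "xfs-cil"))
```

Neither worker can make progress unless something outside the cycle (for
example the log I/O completing) wakes one of them.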

Ilya,

Have you looked into this[1] patch by any chance? Note that 7a29ac474
("xfs: give all workqueues rescuer threads") may also be a potential
band aid for this. Or IOW, the lack thereof in v3.18.z may make this
problem more likely.

Brian

[1] http://www.spinics.net/lists/linux-xfs/msg04886.html

> -- 
> Michal Hocko
> SUSE Labs


* Re: [PATCH 4.4 48/76] libceph: force GFP_NOIO for socket allocations
  2017-03-29 11:16                 ` Michal Hocko
@ 2017-03-29 14:25                   ` Ilya Dryomov
  2017-03-30  6:25                     ` Michal Hocko
  0 siblings, 1 reply; 106+ messages in thread
From: Ilya Dryomov @ 2017-03-29 14:25 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Greg Kroah-Hartman, linux-kernel, stable, Sergey Jerusalimov,
	Jeff Layton, linux-xfs

On Wed, Mar 29, 2017 at 1:16 PM, Michal Hocko <mhocko@kernel.org> wrote:
> On Wed 29-03-17 13:10:01, Ilya Dryomov wrote:
>> On Wed, Mar 29, 2017 at 12:55 PM, Michal Hocko <mhocko@kernel.org> wrote:
>> > On Wed 29-03-17 12:41:26, Michal Hocko wrote:
>> > [...]
>> >> > ceph_con_workfn
>> >> >   mutex_lock(&con->mutex)  # ceph_connection::mutex
>> >> >   try_write
>> >> >     ceph_tcp_connect
>> >> >       sock_create_kern
>> >> >         GFP_KERNEL allocation
>> >> >           allocator recurses into XFS, more I/O is issued
>> >
>> > One more note. So what happens if this is a GFP_NOIO request which
>> > cannot make any progress? Your IO thread is blocked on con->mutex
>> > as you write below but the above thread cannot proceed as well. So I am
>> > _really_ not sure this actually helps.
>>
>> This is not the only I/O worker.  A ceph cluster typically consists of
>> at least a few OSDs and can be as large as thousands of OSDs.  This is
>> the reason we are calling sock_create_kern() on the writeback path in
>> the first place: pre-opening thousands of sockets isn't feasible.
>
> Sorry for being dense here but what actually guarantees the forward
> progress? My current understanding is that the deadlock is caused by
> con->mutex being held while the allocation cannot make a forward
> progress. I can imagine this would be possible if the other io flushers
> depend on this lock. But then NOIO vs. KERNEL allocation doesn't make
> much difference. What am I missing?

con->mutex is per-ceph_connection, osdc->request_mutex is global and is
the real problem here because we need both on the submit side, at least
in 3.18.  You are correct that even with GFP_NOIO this code may lock up
in theory, however I think it's very unlikely in practice.

We got rid of osdc->request_mutex in 4.7, so these workers are almost
independent in newer kernels and should be able to free up memory for
those blocked on GFP_NOIO retries with their respective con->mutex
held.  Using GFP_KERNEL and thus allowing the recursion is just asking
for an AA deadlock on con->mutex OTOH, so it does make a difference.

I'm a little confused by this discussion because for me this patch was
a no-brainer...  Locking aside, you said it was the stack trace in the
changelog that got your attention -- are you saying it's OK for a block
device to recurse back into the filesystem when doing I/O, potentially
generating more I/O?

Thanks,

                Ilya


* Re: [PATCH 4.4 48/76] libceph: force GFP_NOIO for socket allocations
  2017-03-29 11:49                   ` Brian Foster
@ 2017-03-29 14:30                     ` Ilya Dryomov
  0 siblings, 0 replies; 106+ messages in thread
From: Ilya Dryomov @ 2017-03-29 14:30 UTC (permalink / raw)
  To: Brian Foster
  Cc: Michal Hocko, Greg Kroah-Hartman, linux-kernel, stable,
	Sergey Jerusalimov, Jeff Layton, linux-xfs

On Wed, Mar 29, 2017 at 1:49 PM, Brian Foster <bfoster@redhat.com> wrote:
> On Wed, Mar 29, 2017 at 01:18:34PM +0200, Michal Hocko wrote:
>> On Wed 29-03-17 13:14:42, Ilya Dryomov wrote:
>> > On Wed, Mar 29, 2017 at 1:05 PM, Brian Foster <bfoster@redhat.com> wrote:
>> > > On Wed, Mar 29, 2017 at 12:41:26PM +0200, Michal Hocko wrote:
>> > >> [CC xfs guys]
>> > >>
>> > >> On Wed 29-03-17 11:21:44, Ilya Dryomov wrote:
>> > >> [...]
>> > >> > This is a set of stack traces from http://tracker.ceph.com/issues/19309
>> > >> > (linked in the changelog):
>> > >> >
>> > >> > Workqueue: ceph-msgr con_work [libceph]
>> > >> > ffff8810871cb018 0000000000000046 0000000000000000 ffff881085d40000
>> > >> > 0000000000012b00 ffff881025cad428 ffff8810871cbfd8 0000000000012b00
>> > >> > ffff880102fc1000 ffff881085d40000 ffff8810871cb038 ffff8810871cb148
>> > >> > Call Trace:
>> > >> > [<ffffffff816dd629>] schedule+0x29/0x70
>> > >> > [<ffffffff816e066d>] schedule_timeout+0x1bd/0x200
>> > >> > [<ffffffff81093ffc>] ? ttwu_do_wakeup+0x2c/0x120
>> > >> > [<ffffffff81094266>] ? ttwu_do_activate.constprop.135+0x66/0x70
>> > >> > [<ffffffff816deb5f>] wait_for_completion+0xbf/0x180
>> > >> > [<ffffffff81097cd0>] ? try_to_wake_up+0x390/0x390
>> > >> > [<ffffffff81086335>] flush_work+0x165/0x250
>> > >>
>> > >> I suspect this is xlog_cil_push_now -> flush_work(&cil->xc_push_work)
>> > >> right? I kind of got lost where this waits on an IO.
>> > >>
>> > >
>> > > Yep. That means a CIL push is already in progress. We wait on that to
>> > > complete here. After that, the resulting task queues execution of
>> > > xlog_cil_push_work()->xlog_cil_push() on m_cil_workqueue. That task may
>> > > submit I/O to the log.
>> > >
>> > > I don't see any reference to xlog_cil_push() anywhere in the traces here
>> > > or in the bug referenced above, however..?
>> >
>> > Well, it's prefaced with "Interesting is:"...  Sergey (the original
>> > reporter, CCed here) might still have the rest of them.
>>
>> JFTR
>> http://tracker.ceph.com/attachments/download/2769/full_kern_trace.txt
>> [288420.754637] Workqueue: xfs-cil/rbd1 xlog_cil_push_work [xfs]
>> [288420.754638]  ffff880130c1fb38 0000000000000046 ffff880130c1fac8 ffff880130d72180
>> [288420.754640]  0000000000012b00 ffff880130c1fad8 ffff880130c1ffd8 0000000000012b00
>> [288420.754641]  ffff8810297b6480 ffff880130d72180 ffffffffa03b1264 ffff8820263d6800
>> [288420.754643] Call Trace:
>> [288420.754652]  [<ffffffffa03b1264>] ? xlog_bdstrat+0x34/0x70 [xfs]
>> [288420.754653]  [<ffffffff816dd629>] schedule+0x29/0x70
>> [288420.754661]  [<ffffffffa03b3b9c>] xlog_state_get_iclog_space+0xdc/0x2e0 [xfs]
>> [288420.754669]  [<ffffffffa03b1264>] ? xlog_bdstrat+0x34/0x70 [xfs]
>> [288420.754670]  [<ffffffff81097cd0>] ? try_to_wake_up+0x390/0x390
>> [288420.754678]  [<ffffffffa03b4090>] xlog_write+0x190/0x730 [xfs]
>> [288420.754686]  [<ffffffffa03b5d9e>] xlog_cil_push+0x24e/0x3e0 [xfs]
>> [288420.754693]  [<ffffffffa03b5f45>] xlog_cil_push_work+0x15/0x20 [xfs]
>> [288420.754695]  [<ffffffff81084c19>] process_one_work+0x159/0x4f0
>> [288420.754697]  [<ffffffff81084fdc>] process_scheduled_works+0x2c/0x40
>> [288420.754698]  [<ffffffff8108579b>] worker_thread+0x29b/0x530
>> [288420.754699]  [<ffffffff81085500>] ? create_worker+0x1d0/0x1d0
>> [288420.754701]  [<ffffffff8108b6f9>] kthread+0xc9/0xe0
>> [288420.754703]  [<ffffffff8108b630>] ? flush_kthread_worker+0x90/0x90
>> [288420.754705]  [<ffffffff816e1b98>] ret_from_fork+0x58/0x90
>> [288420.754707]  [<ffffffff8108b630>] ? flush_kthread_worker+0x90/0x90
>
> Ah, thanks. According to above, xfs_cil is waiting on log space to free
> up. This means xfs-cil is probably in:
>
>         xlog_state_get_iclog_space()
>           ->xlog_wait(&log->l_flush_wait, &log->l_icloglock);
>
> l_flush_wait is awoken during log I/O completion handling via the
> xfs-log workqueue. That guy is here:
>
> [288420.773968] INFO: task kworker/6:3:420227 blocked for more than 300 seconds.
> [288420.773986]       Not tainted 3.18.43-40 #1
> [288420.773997] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [288420.774017] kworker/6:3     D ffff880103893650     0 420227      2 0x00000000
> [288420.774027] Workqueue: xfs-log/rbd1 xfs_log_worker [xfs]
> [288420.774028]  ffff88010357fac8 0000000000000046 0000000000000000 ffff880103893240
> [288420.774030]  0000000000012b00 ffff880146361128 ffff88010357ffd8 0000000000012b00
> [288420.774031]  ffff8810297b7540 ffff880103893240 ffff88010357fae8 ffff88010357fbf8
> [288420.774033] Call Trace:
> [288420.774035]  [<ffffffff816dd629>] schedule+0x29/0x70
> [288420.774036]  [<ffffffff816e066d>] schedule_timeout+0x1bd/0x200
> [288420.774038]  [<ffffffff81093ffc>] ? ttwu_do_wakeup+0x2c/0x120
> [288420.774040]  [<ffffffff81094266>] ? ttwu_do_activate.constprop.135+0x66/0x70
> [288420.774042]  [<ffffffff816deb5f>] wait_for_completion+0xbf/0x180
> [288420.774043]  [<ffffffff81097cd0>] ? try_to_wake_up+0x390/0x390
> [288420.774044]  [<ffffffff81086335>] flush_work+0x165/0x250
> [288420.774046]  [<ffffffff81082940>] ? worker_detach_from_pool+0xd0/0xd0
> [288420.774054]  [<ffffffffa03b65b1>] xlog_cil_force_lsn+0x81/0x200 [xfs]
> [288420.774056]  [<ffffffff8109f3cc>] ? dequeue_entity+0x17c/0x520
> [288420.774063]  [<ffffffffa03b478e>] _xfs_log_force+0x6e/0x280 [xfs]
> [288420.774065]  [<ffffffff810a076c>] ? put_prev_entity+0x3c/0x2e0
> [288420.774067]  [<ffffffff8109b315>] ? sched_clock_cpu+0x95/0xd0
> [288420.774068]  [<ffffffff810145a2>] ? __switch_to+0xf2/0x5f0
> [288420.774076]  [<ffffffffa03b49d9>] xfs_log_force+0x39/0xe0 [xfs]
> [288420.774083]  [<ffffffffa03b4aa8>] xfs_log_worker+0x28/0x50 [xfs]
> [288420.774085]  [<ffffffff81084c19>] process_one_work+0x159/0x4f0
> [288420.774086]  [<ffffffff8108561b>] worker_thread+0x11b/0x530
> [288420.774088]  [<ffffffff81085500>] ? create_worker+0x1d0/0x1d0
> [288420.774089]  [<ffffffff8108b6f9>] kthread+0xc9/0xe0
> [288420.774091]  [<ffffffff8108b630>] ? flush_kthread_worker+0x90/0x90
> [288420.774093]  [<ffffffff816e1b98>] ret_from_fork+0x58/0x90
> [288420.774095]  [<ffffffff8108b630>] ? flush_kthread_worker+0x90/0x90
>
> ... which is back waiting on xfs-cil.
>
> Ilya,
>
> Have you looked into this[1] patch by any chance? Note that 7a29ac474
> ("xfs: give all workqueues rescuer threads") may also be a potential
> band aid for this. Or IOW, the lack thereof in v3.18.z may make this
> problem more likely.

No, I haven't -- this was a clear rbd/libceph bug to me.

Thanks,

                Ilya


* Re: [PATCH 4.4 48/76] libceph: force GFP_NOIO for socket allocations
  2017-03-29 14:25                   ` Ilya Dryomov
@ 2017-03-30  6:25                     ` Michal Hocko
  2017-03-30 10:02                       ` Ilya Dryomov
  2017-03-30 13:53                       ` Ilya Dryomov
  0 siblings, 2 replies; 106+ messages in thread
From: Michal Hocko @ 2017-03-30  6:25 UTC (permalink / raw)
  To: Ilya Dryomov
  Cc: Greg Kroah-Hartman, linux-kernel, stable, Sergey Jerusalimov,
	Jeff Layton, linux-xfs

On Wed 29-03-17 16:25:18, Ilya Dryomov wrote:
> On Wed, Mar 29, 2017 at 1:16 PM, Michal Hocko <mhocko@kernel.org> wrote:
> > On Wed 29-03-17 13:10:01, Ilya Dryomov wrote:
> >> On Wed, Mar 29, 2017 at 12:55 PM, Michal Hocko <mhocko@kernel.org> wrote:
> >> > On Wed 29-03-17 12:41:26, Michal Hocko wrote:
> >> > [...]
> >> >> > ceph_con_workfn
> >> >> >   mutex_lock(&con->mutex)  # ceph_connection::mutex
> >> >> >   try_write
> >> >> >     ceph_tcp_connect
> >> >> >       sock_create_kern
> >> >> >         GFP_KERNEL allocation
> >> >> >           allocator recurses into XFS, more I/O is issued
> >> >
> >> > One more note. So what happens if this is a GFP_NOIO request which
> >> > cannot make any progress? Your IO thread is blocked on con->mutex
> >> > as you write below but the above thread cannot proceed as well. So I am
> >> > _really_ not sure this actually helps.
> >>
> >> This is not the only I/O worker.  A ceph cluster typically consists of
> >> at least a few OSDs and can be as large as thousands of OSDs.  This is
> >> the reason we are calling sock_create_kern() on the writeback path in
> >> the first place: pre-opening thousands of sockets isn't feasible.
> >
> > Sorry for being dense here but what actually guarantees the forward
> > progress? My current understanding is that the deadlock is caused by
> > con->mutex being held while the allocation cannot make a forward
> > progress. I can imagine this would be possible if the other io flushers
> > depend on this lock. But then NOIO vs. KERNEL allocation doesn't make
> > much difference. What am I missing?
> 
> con->mutex is per-ceph_connection, osdc->request_mutex is global and is
> the real problem here because we need both on the submit side, at least
> in 3.18.  You are correct that even with GFP_NOIO this code may lock up
> in theory, however I think it's very unlikely in practice.

No, it would just make such a bug more obscure. The real problem seems
to be that you rely on locks which cannot guarantee forward progress
in the IO path. And that is a bug IMHO.
 
> We got rid of osdc->request_mutex in 4.7, so these workers are almost
> independent in newer kernels and should be able to free up memory for
> those blocked on GFP_NOIO retries with their respective con->mutex
> held.  Using GFP_KERNEL and thus allowing the recursion is just asking
> for an AA deadlock on con->mutex OTOH, so it does make a difference.

You keep saying this but so far I haven't heard how the AA deadlock is
possible. Both GFP_KERNEL and GFP_NOIO can stall for an unbounded amount
of time and that would cause you problems AFAIU.

> I'm a little confused by this discussion because for me this patch was
> a no-brainer...

No, it is a brainer. Recursion prevention has to be carefully thought
through. The lack of such care has left us with thousands of GFP_NOFS
uses all over the kernel without a clear or proper justification.
Adding more on top doesn't help long term maintainability.

> Locking aside, you said it was the stack trace in the changelog that
> got your attention

No, it is the use of the scope GFP_NOIO API without a proper
explanation which caught my attention.
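
For readers unfamiliar with the term, "scope API" here presumably refers to
the memalloc_noio_save()/memalloc_noio_restore() pair, which marks the
current task so that allocations made inside the scope behave as GFP_NOIO.
The following is a rough userspace model in Python of how such a save and
restore pair nests. All names and the flag value are made up for this sketch
and are not the kernel's definitions.

```python
PF_MEMALLOC_NOIO = 0x1  # bit value is arbitrary in this model


class Task:
    """Minimal stand-in for the current task's flags."""

    def __init__(self):
        self.flags = 0


def noio_save(task):
    """Set the NOIO bit and return its previous state, for the matching restore."""
    old = task.flags & PF_MEMALLOC_NOIO
    task.flags |= PF_MEMALLOC_NOIO
    return old


def noio_restore(task, old):
    """Put the NOIO bit back to the state recorded by noio_save()."""
    task.flags = (task.flags & ~PF_MEMALLOC_NOIO) | old


def reclaim_may_do_io(task, caller_allows_io):
    """I/O from reclaim is possible only if the caller allows it and no scope forbids it."""
    return caller_allows_io and not (task.flags & PF_MEMALLOC_NOIO)


task = Task()
outer = noio_save(task)
inner = noio_save(task)                 # nested scope
noio_restore(task, inner)
print(reclaim_may_do_io(task, True))    # still inside the outer scope
noio_restore(task, outer)
print(reclaim_may_do_io(task, True))    # scope fully closed
```

Because the restore writes back the saved state rather than clearing the
bit, an inner scope ending does not lift the restriction set by an outer one.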

> are you saying it's OK for a block
> device to recurse back into the filesystem when doing I/O, potentially
> generating more I/O?

No, a block device has to guarantee forward progress when allocating,
and so should use mempools or other means to achieve that.
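
To illustrate the mempool idea for anyone following along: a reserve of
objects is set aside up front, and it is used only when the regular allocator
cannot deliver. Below is a toy userspace model in Python, under the
assumption that the regular allocator is tried first and the reserve second.
ReservePool and failing_alloc are hypothetical names for this sketch, not the
kernel mempool interface.

```python
class ReservePool:
    """Toy model of a mempool: try the normal allocator, then a fixed reserve."""

    def __init__(self, min_nr, alloc):
        self._min_nr = min_nr
        self._alloc = alloc
        # The reserve is filled up front, while allocation is still easy.
        self._reserve = [object() for _ in range(min_nr)]

    def get(self):
        obj = self._alloc()
        if obj is not None:
            return obj
        if self._reserve:
            return self._reserve.pop()
        # A real mempool would sleep here until put() returns an element.
        return None

    def put(self, obj):
        # Refill the reserve first; anything beyond min_nr is simply dropped.
        if len(self._reserve) < self._min_nr:
            self._reserve.append(obj)


def failing_alloc():
    """Stand-in for an allocator that cannot make progress under memory pressure."""
    return None


pool = ReservePool(2, failing_alloc)
first = pool.get()
second = pool.get()
print(first is not None, second is not None, pool.get() is None)
pool.put(first)
print(pool.get() is first)
```

The guarantee comes from the fact that elements are returned when the I/O
that used them completes, so a waiter depends only on I/O already in flight
and not on reclaim.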

-- 
Michal Hocko
SUSE Labs


* Re: [PATCH 4.4 48/76] libceph: force GFP_NOIO for socket allocations
  2017-03-30  6:25                     ` Michal Hocko
@ 2017-03-30 10:02                       ` Ilya Dryomov
  2017-03-30 11:21                         ` Michal Hocko
  2017-03-30 13:53                       ` Ilya Dryomov
  1 sibling, 1 reply; 106+ messages in thread
From: Ilya Dryomov @ 2017-03-30 10:02 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Greg Kroah-Hartman, linux-kernel, stable, Sergey Jerusalimov,
	Jeff Layton, linux-xfs

On Thu, Mar 30, 2017 at 8:25 AM, Michal Hocko <mhocko@kernel.org> wrote:
> On Wed 29-03-17 16:25:18, Ilya Dryomov wrote:
>> On Wed, Mar 29, 2017 at 1:16 PM, Michal Hocko <mhocko@kernel.org> wrote:
>> > On Wed 29-03-17 13:10:01, Ilya Dryomov wrote:
>> >> On Wed, Mar 29, 2017 at 12:55 PM, Michal Hocko <mhocko@kernel.org> wrote:
>> >> > On Wed 29-03-17 12:41:26, Michal Hocko wrote:
>> >> > [...]
>> >> >> > ceph_con_workfn
>> >> >> >   mutex_lock(&con->mutex)  # ceph_connection::mutex
>> >> >> >   try_write
>> >> >> >     ceph_tcp_connect
>> >> >> >       sock_create_kern
>> >> >> >         GFP_KERNEL allocation
>> >> >> >           allocator recurses into XFS, more I/O is issued
>> >> >
>> >> > One more note. So what happens if this is a GFP_NOIO request which
>> >> > cannot make any progress? Your IO thread is blocked on con->mutex
>> >> > as you write below but the above thread cannot proceed as well. So I am
>> >> > _really_ not sure this actually helps.
>> >>
>> >> This is not the only I/O worker.  A ceph cluster typically consists of
>> >> at least a few OSDs and can be as large as thousands of OSDs.  This is
>> >> the reason we are calling sock_create_kern() on the writeback path in
>> >> the first place: pre-opening thousands of sockets isn't feasible.
>> >
>> > Sorry for being dense here but what actually guarantees the forward
>> > progress? My current understanding is that the deadlock is caused by
>> > con->mutex being held while the allocation cannot make a forward
>> > progress. I can imagine this would be possible if the other io flushers
>> > depend on this lock. But then NOIO vs. KERNEL allocation doesn't make
>> > much difference. What am I missing?
>>
>> con->mutex is per-ceph_connection, osdc->request_mutex is global and is
>> the real problem here because we need both on the submit side, at least
>> in 3.18.  You are correct that even with GFP_NOIO this code may lock up
>> in theory, however I think it's very unlikely in practice.
>
> No, it would just make such a bug more obscure. The real problem seems
> to be that you rely on locks which cannot guarantee a forward progress
> in the IO path. And that is a bug IMHO.

Just to be clear: the "may lock up" comment above goes for 3.18, which
is where these stack traces came from.  osdc->request_mutex which stood
in the way of other ceph_connection workers is no more.

>
>> We got rid of osdc->request_mutex in 4.7, so these workers are almost
>> independent in newer kernels and should be able to free up memory for
>> those blocked on GFP_NOIO retries with their respective con->mutex
>> held.  Using GFP_KERNEL and thus allowing the recursion is just asking
>> for an AA deadlock on con->mutex OTOH, so it does make a difference.
>
> You keep saying this but so far I haven't heard how the AA deadlock is
> possible. Both GFP_KERNEL and GFP_NOIO can stall for an unbounded amount
> of time and that would cause you problems AFAIU.

Suppose we have an I/O for OSD X, which means it's got to go through
ceph_connection X:

ceph_con_workfn
  mutex_lock(&con->mutex)
    try_write
      ceph_tcp_connect
        sock_create_kern
          GFP_KERNEL allocation

Suppose that generates another I/O for OSD X and blocks on it.  Well,
it's got to go through the same ceph_connection:

rbd_queue_workfn
  ceph_osdc_start_request
    ceph_con_send
      mutex_lock(&con->mutex)  # deadlock, OSD X worker is knocked out

Now if that was a GFP_NOIO allocation, we would simply block in the
allocator.  The placement algorithm distributes objects across the OSDs
in a pseudo-random fashion, so even if we had a whole bunch of I/Os for
that OSD, some other I/Os for other OSDs would complete in the meantime
and free up memory.  If we are under the kind of memory pressure that
makes GFP_NOIO allocations block for an extended period of time, we are
bound to have a lot of pre-open sockets, as we would have done at least
some flushing by then.
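
The scenario can be reduced to a small userspace model. This is a sketch in
Python under simplifying assumptions, not libceph code: con_mutex stands in
for con->mutex, the function names are invented for the example, and a
non-blocking acquire is used so the script reports the deadlock instead of
hanging. It also does not model the GFP_NOIO allocation itself stalling.

```python
import threading

con_mutex = threading.Lock()  # non-recursive, like a kernel mutex


def submit_io_to_osd_x():
    """Stand-in for the submit path, which needs the connection mutex.

    Returns False where the real code would block forever.
    """
    if not con_mutex.acquire(blocking=False):
        return False
    con_mutex.release()
    return True


def allocate(may_do_io):
    """Stand-in for an allocation under memory pressure."""
    if may_do_io:
        # GFP_KERNEL-like: reclaim may end up waiting on I/O to the same OSD.
        if not submit_io_to_osd_x():
            return "would deadlock"
    # GFP_NOIO-like: reclaim never enters the I/O path, so the mutex is not retaken.
    return "ok"


def con_work(may_do_io):
    """Stand-in for the connection worker: allocate while holding the mutex."""
    with con_mutex:
        return allocate(may_do_io)


print(con_work(True))
print(con_work(False))
```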

Thanks,

                Ilya


* Re: [PATCH 4.4 48/76] libceph: force GFP_NOIO for socket allocations
  2017-03-30 10:02                       ` Ilya Dryomov
@ 2017-03-30 11:21                         ` Michal Hocko
  2017-03-30 13:48                           ` Ilya Dryomov
  0 siblings, 1 reply; 106+ messages in thread
From: Michal Hocko @ 2017-03-30 11:21 UTC (permalink / raw)
  To: Ilya Dryomov
  Cc: Greg Kroah-Hartman, linux-kernel, stable, Sergey Jerusalimov,
	Jeff Layton, linux-xfs

On Thu 30-03-17 12:02:03, Ilya Dryomov wrote:
> On Thu, Mar 30, 2017 at 8:25 AM, Michal Hocko <mhocko@kernel.org> wrote:
> > On Wed 29-03-17 16:25:18, Ilya Dryomov wrote:
[...]
> >> We got rid of osdc->request_mutex in 4.7, so these workers are almost
> >> independent in newer kernels and should be able to free up memory for
> >> those blocked on GFP_NOIO retries with their respective con->mutex
> >> held.  Using GFP_KERNEL and thus allowing the recursion is just asking
> >> for an AA deadlock on con->mutex OTOH, so it does make a difference.
> >
> > You keep saying this but so far I haven't heard how the AA deadlock is
> > possible. Both GFP_KERNEL and GFP_NOIO can stall for an unbounded amount
> > of time and that would cause you problems AFAIU.
> 
> Suppose we have an I/O for OSD X, which means it's got to go through
> ceph_connection X:
> 
> ceph_con_workfn
>   mutex_lock(&con->mutex)
>     try_write
>       ceph_tcp_connect
>         sock_create_kern
>           GFP_KERNEL allocation
> 
> Suppose that generates another I/O for OSD X and blocks on it.

Yeah, I understand that, but I am asking _who_ is going to generate
that IO. We do not do writeback from the direct reclaim path. I am not
familiar with Ceph at all, but do any of its (slab) shrinkers generate
IO to recurse back?

> Well,
> it's got to go through the same ceph_connection:
> 
> rbd_queue_workfn
>   ceph_osdc_start_request
>     ceph_con_send
>       mutex_lock(&con->mutex)  # deadlock, OSD X worker is knocked out
> 
> Now if that was a GFP_NOIO allocation, we would simply block in the
> allocator.  The placement algorithm distributes objects across the OSDs
> in a pseudo-random fashion, so even if we had a whole bunch of I/Os for
> that OSD, some other I/Os for other OSDs would complete in the meantime
> and free up memory.  If we are under the kind of memory pressure that
> makes GFP_NOIO allocations block for an extended period of time, we are
> bound to have a lot of pre-open sockets, as we would have done at least
> some flushing by then.

How is this any different from xfs waiting for its IO to be done?
-- 
Michal Hocko
SUSE Labs


* Re: [PATCH 4.4 48/76] libceph: force GFP_NOIO for socket allocations
  2017-03-30 11:21                         ` Michal Hocko
@ 2017-03-30 13:48                           ` Ilya Dryomov
  2017-03-30 14:36                             ` Michal Hocko
  0 siblings, 1 reply; 106+ messages in thread
From: Ilya Dryomov @ 2017-03-30 13:48 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Greg Kroah-Hartman, linux-kernel, stable, Sergey Jerusalimov,
	Jeff Layton, linux-xfs

On Thu, Mar 30, 2017 at 1:21 PM, Michal Hocko <mhocko@kernel.org> wrote:
> On Thu 30-03-17 12:02:03, Ilya Dryomov wrote:
>> On Thu, Mar 30, 2017 at 8:25 AM, Michal Hocko <mhocko@kernel.org> wrote:
>> > On Wed 29-03-17 16:25:18, Ilya Dryomov wrote:
> [...]
>> >> We got rid of osdc->request_mutex in 4.7, so these workers are almost
>> >> independent in newer kernels and should be able to free up memory for
>> >> those blocked on GFP_NOIO retries with their respective con->mutex
>> >> held.  Using GFP_KERNEL and thus allowing the recursion is just asking
>> >> for an AA deadlock on con->mutex OTOH, so it does make a difference.
>> >
>> > You keep saying this but so far I haven't heard how the AA deadlock is
>> > possible. Both GFP_KERNEL and GFP_NOIO can stall for an unbounded amount
>> > of time and that would cause you problems AFAIU.
>>
>> Suppose we have an I/O for OSD X, which means it's got to go through
>> ceph_connection X:
>>
>> ceph_con_workfn
>>   mutex_lock(&con->mutex)
>>     try_write
>>       ceph_tcp_connect
>>         sock_create_kern
>>           GFP_KERNEL allocation
>>
>> Suppose that generates another I/O for OSD X and blocks on it.
>
> Yeah, I have understand that but I am asking _who_ is going to generate
> that IO. We do not do writeback from the direct reclaim path. I am not

It doesn't have to be a newly issued I/O, it could also be a wait on
something that depends on another I/O to OSD X, but I can't back this
up with any actual stack traces because the ones we have are too old.

That's just one scenario though.  With such recursion allowed, we can
just as easily deadlock in the filesystem.  Here are a couple of traces
circa 4.8, where it's the mutex in xfs_reclaim_inodes_ag():

cc1             D ffff92243fad8180     0  6772   6770 0x00000080
ffff9224d107b200 ffff922438de2f40 ffff922e8304fed8 ffff9224d107b200
ffff922ea7554000 ffff923034fb0618 0000000000000000 ffff9224d107b200
ffff9230368e5400 ffff92303788b000 ffffffff951eb4e1 0000003e00095bc0
Call Trace:
[<ffffffff951eb4e1>] ? schedule+0x31/0x80
[<ffffffffc0ab0570>] ? _xfs_log_force_lsn+0x1b0/0x340 [xfs]
[<ffffffff94ca5790>] ? wake_up_q+0x60/0x60
[<ffffffffc0a9f7ff>] ? __xfs_iunpin_wait+0x9f/0x160 [xfs]
[<ffffffffc0ab0730>] ? xfs_log_force_lsn+0x30/0xb0 [xfs]
[<ffffffffc0a97041>] ? xfs_reclaim_inode+0x131/0x370 [xfs]
[<ffffffffc0a9f7ff>] ? __xfs_iunpin_wait+0x9f/0x160 [xfs]
[<ffffffff94cbcf80>] ? autoremove_wake_function+0x40/0x40
[<ffffffffc0a97041>] ? xfs_reclaim_inode+0x131/0x370 [xfs]
[<ffffffffc0a97442>] ? xfs_reclaim_inodes_ag+0x1c2/0x2d0 [xfs]
[<ffffffff94cb197c>] ? enqueue_task_fair+0x5c/0x920
[<ffffffff94c35895>] ? sched_clock+0x5/0x10
[<ffffffff94ca47e0>] ? check_preempt_curr+0x50/0x90
[<ffffffff94ca4834>] ? ttwu_do_wakeup+0x14/0xe0
[<ffffffff94ca53c3>] ? try_to_wake_up+0x53/0x3a0
[<ffffffffc0a98331>] ? xfs_reclaim_inodes_nr+0x31/0x40 [xfs]
[<ffffffff94e05bfe>] ? super_cache_scan+0x17e/0x190
[<ffffffff94d919f3>] ? shrink_slab.part.38+0x1e3/0x3d0
[<ffffffff94d9616a>] ? shrink_node+0x10a/0x320
[<ffffffff94d96474>] ? do_try_to_free_pages+0xf4/0x350
[<ffffffff94d967ba>] ? try_to_free_pages+0xea/0x1b0
[<ffffffff94d863bd>] ? __alloc_pages_nodemask+0x61d/0xe60
[<ffffffff94dd918a>] ? alloc_pages_vma+0xba/0x280
[<ffffffff94db0f8b>] ? wp_page_copy+0x45b/0x6c0
[<ffffffff94db3e12>] ? alloc_set_pte+0x2e2/0x5f0
[<ffffffff94db2169>] ? do_wp_page+0x4a9/0x7e0
[<ffffffff94db4bd2>] ? handle_mm_fault+0x872/0x1250
[<ffffffff94c65a53>] ? __do_page_fault+0x1e3/0x500
[<ffffffff951f0cd8>] ? page_fault+0x28/0x30

kworker/9:3     D ffff92303f318180     0 20732      2 0x00000080
Workqueue: ceph-msgr ceph_con_workfn [libceph]
 ffff923035dd4480 ffff923038f8a0c0 0000000000000001 000000009eb27318
 ffff92269eb28000 ffff92269eb27338 ffff923036b145ac ffff923035dd4480
 00000000ffffffff ffff923036b145b0 ffffffff951eb4e1 ffff923036b145a8
Call Trace:
 [<ffffffff951eb4e1>] ? schedule+0x31/0x80
 [<ffffffff951eb77a>] ? schedule_preempt_disabled+0xa/0x10
 [<ffffffff951ed1f4>] ? __mutex_lock_slowpath+0xb4/0x130
 [<ffffffff951ed28b>] ? mutex_lock+0x1b/0x30
 [<ffffffffc0a974b3>] ? xfs_reclaim_inodes_ag+0x233/0x2d0 [xfs]
 [<ffffffff94d92ba5>] ? move_active_pages_to_lru+0x125/0x270
 [<ffffffff94f2b985>] ? radix_tree_gang_lookup_tag+0xc5/0x1c0
 [<ffffffff94dad0f3>] ? __list_lru_walk_one.isra.3+0x33/0x120
 [<ffffffffc0a98331>] ? xfs_reclaim_inodes_nr+0x31/0x40 [xfs]
 [<ffffffff94e05bfe>] ? super_cache_scan+0x17e/0x190
 [<ffffffff94d919f3>] ? shrink_slab.part.38+0x1e3/0x3d0
 [<ffffffff94d9616a>] ? shrink_node+0x10a/0x320
 [<ffffffff94d96474>] ? do_try_to_free_pages+0xf4/0x350
 [<ffffffff94d967ba>] ? try_to_free_pages+0xea/0x1b0
 [<ffffffff94d863bd>] ? __alloc_pages_nodemask+0x61d/0xe60
 [<ffffffff94ddf42d>] ? cache_grow_begin+0x9d/0x560
 [<ffffffff94ddfb88>] ? fallback_alloc+0x148/0x1c0
 [<ffffffff94de09db>] ? __kmalloc+0x1eb/0x580
# a buggy ceph_connection worker doing a GFP_KERNEL allocation

xz              D ffff92303f358180     0  5932   5928 0x00000084
 ffff921a56201180 ffff923038f8ae00 ffff92303788b2c8 0000000000000001
 ffff921e90234000 ffff921e90233820 ffff923036b14eac ffff921a56201180
 00000000ffffffff ffff923036b14eb0 ffffffff951eb4e1 ffff923036b14ea8
Call Trace:
 [<ffffffff951eb4e1>] ? schedule+0x31/0x80
 [<ffffffff951eb77a>] ? schedule_preempt_disabled+0xa/0x10
 [<ffffffff951ed1f4>] ? __mutex_lock_slowpath+0xb4/0x130
 [<ffffffff951ed28b>] ? mutex_lock+0x1b/0x30
 [<ffffffffc0a974b3>] ? xfs_reclaim_inodes_ag+0x233/0x2d0 [xfs]
 [<ffffffff94f2b985>] ? radix_tree_gang_lookup_tag+0xc5/0x1c0
 [<ffffffff94dad0f3>] ? __list_lru_walk_one.isra.3+0x33/0x120
 [<ffffffffc0a98331>] ? xfs_reclaim_inodes_nr+0x31/0x40 [xfs]
 [<ffffffff94e05bfe>] ? super_cache_scan+0x17e/0x190
 [<ffffffff94d919f3>] ? shrink_slab.part.38+0x1e3/0x3d0
 [<ffffffff94d9616a>] ? shrink_node+0x10a/0x320
 [<ffffffff94d96474>] ? do_try_to_free_pages+0xf4/0x350
 [<ffffffff94d967ba>] ? try_to_free_pages+0xea/0x1b0
 [<ffffffff94d863bd>] ? __alloc_pages_nodemask+0x61d/0xe60
 [<ffffffff94dd73b1>] ? alloc_pages_current+0x91/0x140
 [<ffffffff94e0ab98>] ? pipe_write+0x208/0x3f0
 [<ffffffff94e01b08>] ? new_sync_write+0xd8/0x130
 [<ffffffff94e02293>] ? vfs_write+0xb3/0x1a0
 [<ffffffff94e03672>] ? SyS_write+0x52/0xc0
 [<ffffffff94c03b8a>] ? do_syscall_64+0x7a/0xd0
 [<ffffffff951ef9a5>] ? entry_SYSCALL64_slow_path+0x25/0x25

We have since fixed that allocation site, but the point is it was
a combination of direct reclaim and GFP_KERNEL recursion.

> familiar with Ceph at all but does any of its (slab) shrinkers generate
> IO to recurse back?

We don't register any custom shrinkers.  This is XFS on top of rbd,
a ceph-backed block device.

>
>> Well,
>> it's got to go through the same ceph_connection:
>>
>> rbd_queue_workfn
>>   ceph_osdc_start_request
>>     ceph_con_send
>>       mutex_lock(&con->mutex)  # deadlock, OSD X worker is knocked out
>>
>> Now if that was a GFP_NOIO allocation, we would simply block in the
>> allocator.  The placement algorithm distributes objects across the OSDs
>> in a pseudo-random fashion, so even if we had a whole bunch of I/Os for
>> that OSD, some other I/Os for other OSDs would complete in the meantime
>> and free up memory.  If we are under the kind of memory pressure that
>> makes GFP_NOIO allocations block for an extended period of time, we are
>> bound to have a lot of pre-open sockets, as we would have done at least
>> some flushing by then.
>
> How is this any different from xfs waiting for its IO to be done?

I feel like we are talking past each other here.  If the worker in
question isn't deadlocked, it will eventually get its socket and start
flushing I/O.  If it has deadlocked, it won't...

Thanks,

                Ilya


* Re: [PATCH 4.4 48/76] libceph: force GFP_NOIO for socket allocations
  2017-03-30  6:25                     ` Michal Hocko
  2017-03-30 10:02                       ` Ilya Dryomov
@ 2017-03-30 13:53                       ` Ilya Dryomov
  2017-03-30 13:59                         ` Michal Hocko
  1 sibling, 1 reply; 106+ messages in thread
From: Ilya Dryomov @ 2017-03-30 13:53 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Greg Kroah-Hartman, linux-kernel, stable, Sergey Jerusalimov,
	Jeff Layton, linux-xfs

On Thu, Mar 30, 2017 at 8:25 AM, Michal Hocko <mhocko@kernel.org> wrote:
> On Wed 29-03-17 16:25:18, Ilya Dryomov wrote:
>> On Wed, Mar 29, 2017 at 1:16 PM, Michal Hocko <mhocko@kernel.org> wrote:
>> > On Wed 29-03-17 13:10:01, Ilya Dryomov wrote:
>> >> On Wed, Mar 29, 2017 at 12:55 PM, Michal Hocko <mhocko@kernel.org> wrote:
>> >> > On Wed 29-03-17 12:41:26, Michal Hocko wrote:
>> >> > [...]
>> >> >> > ceph_con_workfn
>> >> >> >   mutex_lock(&con->mutex)  # ceph_connection::mutex
>> >> >> >   try_write
>> >> >> >     ceph_tcp_connect
>> >> >> >       sock_create_kern
>> >> >> >         GFP_KERNEL allocation
>> >> >> >           allocator recurses into XFS, more I/O is issued
>> >> >
>> >> > One more note. So what happens if this is a GFP_NOIO request which
>> >> > cannot make any progress? Your IO thread is blocked on con->mutex
>> >> > as you write below but the above thread cannot proceed as well. So I am
>> >> > _really_ not sure this actually helps.
>> >>
>> >> This is not the only I/O worker.  A ceph cluster typically consists of
>> >> at least a few OSDs and can be as large as thousands of OSDs.  This is
>> >> the reason we are calling sock_create_kern() on the writeback path in
>> >> the first place: pre-opening thousands of sockets isn't feasible.
>> >
>> > Sorry for being dense here but what actually guarantees the forward
>> > progress? My current understanding is that the deadlock is caused by
>> > con->mutex being held while the allocation cannot make a forward
>> > progress. I can imagine this would be possible if the other io flushers
>> > depend on this lock. But then NOIO vs. KERNEL allocation doesn't make
>> > much difference. What am I missing?
>>
>> con->mutex is per-ceph_connection, osdc->request_mutex is global and is
>> the real problem here because we need both on the submit side, at least
>> in 3.18.  You are correct that even with GFP_NOIO this code may lock up
>> in theory, however I think it's very unlikely in practice.
>
> No, it would just make such a bug more obscure. The real problem seems
> to be that you rely on locks which cannot guarantee a forward progress
> in the IO path. And that is a bug IMHO.
>
>> We got rid of osdc->request_mutex in 4.7, so these workers are almost
>> independent in newer kernels and should be able to free up memory for
>> those blocked on GFP_NOIO retries with their respective con->mutex
>> held.  Using GFP_KERNEL and thus allowing the recursion is just asking
>> for an AA deadlock on con->mutex OTOH, so it does make a difference.
>
> You keep saying this but so far I haven't heard how the AA deadlock is
> possible. Both GFP_KERNEL and GFP_NOIO can stall for an unbounded amount
> of time and that would cause you problems AFAIU.
>
>> I'm a little confused by this discussion because for me this patch was
>> a no-brainer...
>
> No, it is a brainer. Because recursion prevention should be carefully
> thought through. The lack of this approach has caused that we have
> thousands of GFP_NOFS uses all over the kernel without a clear or proper
> justification. Adding more on top doesn't help long term
> maintainability.
>
>> Locking aside, you said it was the stack trace in the changelog that
>> got your attention
>
> No, it is the usage of the scope GFP_NOIO API without a proper
> explanation which caught my attention.
>
>> are you saying it's OK for a block
>> device to recurse back into the filesystem when doing I/O, potentially
>> generating more I/O?
>
> No, block device has to make a forward progress guarantee when
> allocating and so use mempools or other means to achieve the same.

OK, let me put this differently.  Do you agree that a block device
cannot make _any_ kind of progress guarantee if it does a GFP_KERNEL
allocation in the I/O path?

Thanks,

                Ilya


* Re: [PATCH 4.4 48/76] libceph: force GFP_NOIO for socket allocations
  2017-03-30 13:53                       ` Ilya Dryomov
@ 2017-03-30 13:59                         ` Michal Hocko
  0 siblings, 0 replies; 106+ messages in thread
From: Michal Hocko @ 2017-03-30 13:59 UTC (permalink / raw)
  To: Ilya Dryomov
  Cc: Greg Kroah-Hartman, linux-kernel, stable, Sergey Jerusalimov,
	Jeff Layton, linux-xfs

On Thu 30-03-17 15:53:35, Ilya Dryomov wrote:
> On Thu, Mar 30, 2017 at 8:25 AM, Michal Hocko <mhocko@kernel.org> wrote:
> > On Wed 29-03-17 16:25:18, Ilya Dryomov wrote:
[...]
> >> are you saying it's OK for a block
> >> device to recurse back into the filesystem when doing I/O, potentially
> >> generating more I/O?
> >
> > No, block device has to make a forward progress guarantee when
> > allocating and so use mempools or other means to achieve the same.
> 
> OK, let me put this differently.  Do you agree that a block device
> cannot make _any_ kind of progress guarantee if it does a GFP_KERNEL
> allocation in the I/O path?

yes that is correct. And the same is correct for GFP_NOIO allocations as
well.
-- 
Michal Hocko
SUSE Labs


* Re: [PATCH 4.4 48/76] libceph: force GFP_NOIO for socket allocations
  2017-03-30 13:48                           ` Ilya Dryomov
@ 2017-03-30 14:36                             ` Michal Hocko
  2017-03-30 15:06                               ` Ilya Dryomov
  0 siblings, 1 reply; 106+ messages in thread
From: Michal Hocko @ 2017-03-30 14:36 UTC (permalink / raw)
  To: Ilya Dryomov
  Cc: Greg Kroah-Hartman, linux-kernel, stable, Sergey Jerusalimov,
	Jeff Layton, linux-xfs

On Thu 30-03-17 15:48:42, Ilya Dryomov wrote:
> On Thu, Mar 30, 2017 at 1:21 PM, Michal Hocko <mhocko@kernel.org> wrote:
[...]
> > familiar with Ceph at all but does any of its (slab) shrinkers generate
> > IO to recurse back?
> 
> We don't register any custom shrinkers.  This is XFS on top of rbd,
> a ceph-backed block device.

OK, that was the part I was missing. So you depend on the XFS to make a
forward progress here.

> >> Well,
> >> it's got to go through the same ceph_connection:
> >>
> >> rbd_queue_workfn
> >>   ceph_osdc_start_request
> >>     ceph_con_send
> >>       mutex_lock(&con->mutex)  # deadlock, OSD X worker is knocked out
> >>
> >> Now if that was a GFP_NOIO allocation, we would simply block in the
> >> allocator.  The placement algorithm distributes objects across the OSDs
> >> in a pseudo-random fashion, so even if we had a whole bunch of I/Os for
> >> that OSD, some other I/Os for other OSDs would complete in the meantime
> >> and free up memory.  If we are under the kind of memory pressure that
> >> makes GFP_NOIO allocations block for an extended period of time, we are
> >> bound to have a lot of pre-open sockets, as we would have done at least
> >> some flushing by then.
> >
> > How is this any different from xfs waiting for its IO to be done?
> 
> I feel like we are talking past each other here.  If the worker in
> question isn't deadlocked, it will eventually get its socket and start
> flushing I/O.  If it has deadlocked, it won't...

But if the allocation is stuck then the holder of the lock cannot make
a forward progress and it is effectively deadlocked because other IO
depends on the lock it holds. Maybe I just ask bad questions but what
makes GFP_NOIO different from GFP_KERNEL here. We know that the latter
might need to wait for an IO to finish in the shrinker but it itself
doesn't get the lock in question directly. The former depends on the
allocator forward progress as well and that in turn wait for somebody
else to proceed with the IO. So to me any blocking allocation while
holding a lock which blocks further IO to complete is simply broken.
-- 
Michal Hocko
SUSE Labs


* Re: [PATCH 4.4 48/76] libceph: force GFP_NOIO for socket allocations
  2017-03-30 14:36                             ` Michal Hocko
@ 2017-03-30 15:06                               ` Ilya Dryomov
  2017-03-30 16:12                                 ` Michal Hocko
  0 siblings, 1 reply; 106+ messages in thread
From: Ilya Dryomov @ 2017-03-30 15:06 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Greg Kroah-Hartman, linux-kernel, stable, Sergey Jerusalimov,
	Jeff Layton, linux-xfs

On Thu, Mar 30, 2017 at 4:36 PM, Michal Hocko <mhocko@kernel.org> wrote:
> On Thu 30-03-17 15:48:42, Ilya Dryomov wrote:
>> On Thu, Mar 30, 2017 at 1:21 PM, Michal Hocko <mhocko@kernel.org> wrote:
> [...]
>> > familiar with Ceph at all but does any of its (slab) shrinkers generate
>> > IO to recurse back?
>>
>> We don't register any custom shrinkers.  This is XFS on top of rbd,
>> a ceph-backed block device.
>
> OK, that was the part I was missing. So you depend on the XFS to make a
> forward progress here.
>
>> >> Well,
>> >> it's got to go through the same ceph_connection:
>> >>
>> >> rbd_queue_workfn
>> >>   ceph_osdc_start_request
>> >>     ceph_con_send
>> >>       mutex_lock(&con->mutex)  # deadlock, OSD X worker is knocked out
>> >>
>> >> Now if that was a GFP_NOIO allocation, we would simply block in the
>> >> allocator.  The placement algorithm distributes objects across the OSDs
>> >> in a pseudo-random fashion, so even if we had a whole bunch of I/Os for
>> >> that OSD, some other I/Os for other OSDs would complete in the meantime
>> >> and free up memory.  If we are under the kind of memory pressure that
>> >> makes GFP_NOIO allocations block for an extended period of time, we are
>> >> bound to have a lot of pre-open sockets, as we would have done at least
>> >> some flushing by then.
>> >
>> > How is this any different from xfs waiting for its IO to be done?
>>
>> I feel like we are talking past each other here.  If the worker in
>> question isn't deadlocked, it will eventually get its socket and start
>> flushing I/O.  If it has deadlocked, it won't...
>
> But if the allocation is stuck then the holder of the lock cannot make
> a forward progress and it is effectively deadlocked because other IO
> depends on the lock it holds. Maybe I just ask bad questions but what

Only I/O to the same OSD.  A typical ceph cluster has dozens of OSDs,
so there is plenty of room for other in-flight I/Os to finish and move
the allocator forward.  The lock in question is per-ceph_connection
(read: per-OSD).

> makes GFP_NOIO different from GFP_KERNEL here. We know that the latter
> might need to wait for an IO to finish in the shrinker but it itself
> doesn't get the lock in question directly. The former depends on the
> allocator forward progress as well and that in turn wait for somebody
> else to proceed with the IO. So to me any blocking allocation while
> holding a lock which blocks further IO to complete is simply broken.

Right, with GFP_NOIO we simply wait -- there is nothing wrong with
a blocking allocation, at least in the general case.  With GFP_KERNEL
we deadlock, either in rbd/libceph (less likely) or in the filesystem
above (more likely, shown in the xfs_reclaim_inodes_ag() traces you
omitted in your quote).
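
To make the distinction concrete, here is a minimal single-threaded
userspace model of the libceph variant of the deadlock.  It is only
a sketch: every toy_* name is made up for illustration, a failed
toy_lock() stands in for a mutex_lock() that would never return, and
real reclaim reaches the connection through the filesystem and block
layer rather than by a direct call.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
enum toy_gfp { TOY_GFP_NOIO, TOY_GFP_KERNEL };

struct toy_con {
	bool mutex_held;	/* stands in for ceph_connection::mutex */
	int sends;		/* requests queued on this connection */
};

/* Non-recursive lock: a second acquisition by the holder cannot succeed. */
static bool toy_lock(struct toy_con *con)
{
	if (con->mutex_held)
		return false;	/* a real mutex_lock() would hang here */
	con->mutex_held = true;
	return true;
}

static void toy_unlock(struct toy_con *con)
{
	con->mutex_held = false;
}

/* Submit side: models the ceph_con_send step, which needs the mutex. */
static bool toy_con_send(struct toy_con *con)
{
	if (!toy_lock(con))
		return false;
	con->sends++;
	toy_unlock(con);
	return true;
}

/*
 * Allocator: in this model only a GFP_KERNEL request may enter reclaim,
 * and reclaim issues writeback that lands on the same connection.
 */
static bool toy_alloc(struct toy_con *con, enum toy_gfp gfp)
{
	if (gfp == TOY_GFP_KERNEL)
		return toy_con_send(con);	/* recursion into the I/O path */
	return true;	/* NOIO: wait for other I/O, no recursion */
}

/* Worker: takes the mutex, then allocates.  False means self-deadlock. */
static bool toy_con_workfn(struct toy_con *con, enum toy_gfp gfp)
{
	bool ok;

	if (!toy_lock(con))
		return false;
	ok = toy_alloc(con, gfp);
	toy_unlock(con);
	return ok;
}
```

Calling toy_con_workfn() with TOY_GFP_NOIO returns true, while
TOY_GFP_KERNEL returns false because the worker ends up asking for
the mutex it already holds.  The model says nothing about whether
a NOIO allocation eventually succeeds, which is the part Michal is
questioning.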

Thanks,

                Ilya


* Re: [PATCH 4.4 48/76] libceph: force GFP_NOIO for socket allocations
  2017-03-30 15:06                               ` Ilya Dryomov
@ 2017-03-30 16:12                                 ` Michal Hocko
  2017-03-30 17:19                                   ` Ilya Dryomov
  0 siblings, 1 reply; 106+ messages in thread
From: Michal Hocko @ 2017-03-30 16:12 UTC (permalink / raw)
  To: Ilya Dryomov
  Cc: Greg Kroah-Hartman, linux-kernel, stable, Sergey Jerusalimov,
	Jeff Layton, linux-xfs

On Thu 30-03-17 17:06:51, Ilya Dryomov wrote:
[...]
> > But if the allocation is stuck then the holder of the lock cannot make
> > a forward progress and it is effectively deadlocked because other IO
> > depends on the lock it holds. Maybe I just ask bad questions but what
> 
> Only I/O to the same OSD.  A typical ceph cluster has dozens of OSDs,
> so there is plenty of room for other in-flight I/Os to finish and move
> the allocator forward.  The lock in question is per-ceph_connection
> (read: per-OSD).
> 
> > makes GFP_NOIO different from GFP_KERNEL here. We know that the latter
> > might need to wait for an IO to finish in the shrinker but it itself
> > doesn't get the lock in question directly. The former depends on the
> > allocator forward progress as well and that in turn wait for somebody
> > else to proceed with the IO. So to me any blocking allocation while
> > holding a lock which blocks further IO to complete is simply broken.
> 
> Right, with GFP_NOIO we simply wait -- there is nothing wrong with
> a blocking allocation, at least in the general case.  With GFP_KERNEL
> we deadlock, either in rbd/libceph (less likely) or in the filesystem
> above (more likely, shown in the xfs_reclaim_inodes_ag() traces you
> omitted in your quote).

I am not convinced. It seems you are relying on something that is not
guaranteed fundamentally. AFAIU all the IO paths should _guarantee_
and use mempools for that purpose if they need to allocate.

But, hey, I will not argue as my understanding of ceph is close to
zero. You are the maintainer so it is your call. I would just really
appreciate if you could document this as much as possible (ideally
at the place where you call memalloc_noio_save and describe the lock
dependency there).

Thanks!
-- 
Michal Hocko
SUSE Labs


* Re: [PATCH 4.4 48/76] libceph: force GFP_NOIO for socket allocations
  2017-03-30 16:12                                 ` Michal Hocko
@ 2017-03-30 17:19                                   ` Ilya Dryomov
  2017-03-30 18:44                                     ` Michal Hocko
  0 siblings, 1 reply; 106+ messages in thread
From: Ilya Dryomov @ 2017-03-30 17:19 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Greg Kroah-Hartman, linux-kernel, stable, Sergey Jerusalimov,
	Jeff Layton, linux-xfs

On Thu, Mar 30, 2017 at 6:12 PM, Michal Hocko <mhocko@kernel.org> wrote:
> On Thu 30-03-17 17:06:51, Ilya Dryomov wrote:
> [...]
>> > But if the allocation is stuck then the holder of the lock cannot make
>> > a forward progress and it is effectively deadlocked because other IO
>> > depends on the lock it holds. Maybe I just ask bad questions but what
>>
>> Only I/O to the same OSD.  A typical ceph cluster has dozens of OSDs,
>> so there is plenty of room for other in-flight I/Os to finish and move
>> the allocator forward.  The lock in question is per-ceph_connection
>> (read: per-OSD).
>>
>> > makes GFP_NOIO different from GFP_KERNEL here. We know that the latter
>> > might need to wait for an IO to finish in the shrinker but it itself
>> > doesn't get the lock in question directly. The former depends on the
>> > allocator forward progress as well and that in turn wait for somebody
>> > else to proceed with the IO. So to me any blocking allocation while
>> > holding a lock which blocks further IO to complete is simply broken.
>>
>> Right, with GFP_NOIO we simply wait -- there is nothing wrong with
>> a blocking allocation, at least in the general case.  With GFP_KERNEL
>> we deadlock, either in rbd/libceph (less likely) or in the filesystem
>> above (more likely, shown in the xfs_reclaim_inodes_ag() traces you
>> omitted in your quote).
>
> I am not convinced. It seems you are relying on something that is not
> guaranteed fundamentally. AFAIU all the IO paths should _guarantee_
> and use mempools for that purpose if they need to allocate.
>
> But, hey, I will not argue as my understanding of ceph is close to
> zero. You are the maintainer so it is your call. I would just really
> appreciate if you could document this as much as possible (ideally
> at the place where you call memalloc_noio_save and describe the lock
> dependency there).

It's certainly not perfect (especially this socket case -- putting
together a pool of sockets is not easy) and I'm sure one could poke
some holes in the entire thing, but I'm convinced we are much better
off with the memalloc_noio_{save,restore}() pair in there.
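
For readers unfamiliar with the scope API, the idea can be modelled
in userspace roughly as follows.  This is a sketch under stated
assumptions, not kernel code: the toy_* names and flag values are
invented, toy_task_flags stands in for the per-task flags word, and
toy_sock_create() stands in for a callee such as sock_create_kern()
that asks for GFP_KERNEL internally and cannot be handed a gfp mask.

```c
#include <assert.h>

/* Hypothetical flag bits for a userspace model; not the kernel's values. */
#define TOY_GFP_IO	0x1u
#define TOY_GFP_FS	0x2u
#define TOY_GFP_KERNEL	(TOY_GFP_IO | TOY_GFP_FS)
#define TOY_PF_NOIO	0x1u

static unsigned int toy_task_flags;	/* models the per-task flags word */

/* Enter a NOIO scope; return the previous state of the bit for restore. */
static unsigned int toy_noio_save(void)
{
	unsigned int old = toy_task_flags & TOY_PF_NOIO;

	toy_task_flags |= TOY_PF_NOIO;
	return old;
}

/* Leave the scope, putting the bit back to what the caller saved. */
static void toy_noio_restore(unsigned int old)
{
	toy_task_flags = (toy_task_flags & ~TOY_PF_NOIO) | old;
}

/* The mask the allocator would effectively act on for a request. */
static unsigned int toy_effective_gfp(unsigned int gfp)
{
	if (toy_task_flags & TOY_PF_NOIO)
		gfp &= ~(TOY_GFP_IO | TOY_GFP_FS);
	return gfp;
}

/* Models a callee that hard-codes GFP_KERNEL for its allocations. */
static unsigned int toy_sock_create(void)
{
	return toy_effective_gfp(TOY_GFP_KERNEL);
}
```

Because the previous bit is saved and restored rather than cleared
unconditionally, scopes nest: restoring an inner scope leaves the
outer one in force.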

I'll try to come up with a better comment, but the problem is that it
can be an arbitrary lock in an arbitrary filesystem, not just libceph's
con->mutex, so it's hard to be specific.

Do I have your OK to poke Greg to get the backports going?

Thanks,

                Ilya


* Re: [PATCH 4.4 48/76] libceph: force GFP_NOIO for socket allocations
  2017-03-30 17:19                                   ` Ilya Dryomov
@ 2017-03-30 18:44                                     ` Michal Hocko
  0 siblings, 0 replies; 106+ messages in thread
From: Michal Hocko @ 2017-03-30 18:44 UTC (permalink / raw)
  To: Ilya Dryomov
  Cc: Greg Kroah-Hartman, linux-kernel, stable, Sergey Jerusalimov,
	Jeff Layton, linux-xfs

On Thu 30-03-17 19:19:59, Ilya Dryomov wrote:
> On Thu, Mar 30, 2017 at 6:12 PM, Michal Hocko <mhocko@kernel.org> wrote:
> > On Thu 30-03-17 17:06:51, Ilya Dryomov wrote:
> > [...]
> >> > But if the allocation is stuck then the holder of the lock cannot make
> >> > a forward progress and it is effectively deadlocked because other IO
> >> > depends on the lock it holds. Maybe I just ask bad questions but what
> >>
> >> Only I/O to the same OSD.  A typical ceph cluster has dozens of OSDs,
> >> so there is plenty of room for other in-flight I/Os to finish and move
> >> the allocator forward.  The lock in question is per-ceph_connection
> >> (read: per-OSD).
> >>
> >> > makes GFP_NOIO different from GFP_KERNEL here. We know that the latter
> >> > might need to wait for an IO to finish in the shrinker but it itself
> >> > doesn't get the lock in question directly. The former depends on the
> >> > allocator forward progress as well and that in turn wait for somebody
> >> > else to proceed with the IO. So to me any blocking allocation while
> >> > holding a lock which blocks further IO to complete is simply broken.
> >>
> >> Right, with GFP_NOIO we simply wait -- there is nothing wrong with
> >> a blocking allocation, at least in the general case.  With GFP_KERNEL
> >> we deadlock, either in rbd/libceph (less likely) or in the filesystem
> >> above (more likely, shown in the xfs_reclaim_inodes_ag() traces you
> >> omitted in your quote).
> >
> > I am not convinced. It seems you are relying on something that is not
> > guaranteed fundamentally. AFAIU all the IO paths should _guarantee_
> > and use mempools for that purpose if they need to allocate.
> >
> > But, hey, I will not argue as my understanding of ceph is close to
> > zero. You are the maintainer so it is your call. I would just really
> > appreciate if you could document this as much as possible (ideally
> > at the place where you call memalloc_noio_save and describe the lock
> > dependency there).
> 
> It's certainly not perfect (especially this socket case -- putting
> together a pool of sockets is not easy) and I'm sure one could poke
> some holes in the entire thing,

I would recommend testing under a heavy memory pressure (involving OOM
killer invocations) with a lot of IO pressure to see what falls out.

> but I'm convinced we are much better
> off with the memalloc_noio_{save,restore}() pair in there.
> 
> I'll try to come up with a better comment, but the problem is that it
> can be an arbitrary lock in an arbitrary filesystem, not just libceph's
> con->mutex, so it's hard to be specific.

But the particular path should describe what is the deadlock scenario
regardless of the FS (xfs is likely not the only one to wait for the
IO to finish).

> Do I have your OK to poke Greg to get the backports going?

As I've said, it's your call, if you feel comfortable with this then I
will certainly not stand in the way.

-- 
Michal Hocko
SUSE Labs


* Re: [PATCH 4.4 42/76] mmc: sdhci: Do not disable interrupts while waiting for clock
  2017-03-28 12:30 ` [PATCH 4.4 42/76] mmc: sdhci: Do not disable interrupts while waiting for clock Greg Kroah-Hartman
@ 2017-04-04 16:50   ` Ben Hutchings
  2017-04-06 12:12     ` Ludovic Desroches
  0 siblings, 1 reply; 106+ messages in thread
From: Ben Hutchings @ 2017-04-04 16:50 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: linux-kernel, stable, Adrian Hunter, Ludovic Desroches,
	Greg Kroah-Hartman

On Tue, 2017-03-28 at 14:30 +0200, Greg Kroah-Hartman wrote:
> 4.4-stable review patch.  If anyone has any objections, please let me know.
> 
> ------------------
> 
> From: Adrian Hunter <adrian.hunter@intel.com>
> 
> commit e2ebfb2142acefecc2496e71360f50d25726040b upstream.
> 
> Disabling interrupts for even a millisecond can cause problems for some
> devices. That can happen when sdhci changes clock frequency because it
> waits for the clock to become stable under a spin lock.
> 
> The spin lock is not necessary here. Anything that is racing with changes
> to the I/O state is already broken. The mmc core already provides
> synchronization via "claiming" the host.
[...]

In mainline, drivers/mmc/host/sdhci-of-at91.c has a slightly different
version of this code that seems to have the same issue.  In 4.4 there's
another (conditional) mdelay(1) further up this function that seems to
be related to that hardware, and probably ought to have an unlock/lock
around it.
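
The unlock/lock pattern suggested here can be sketched with a small
userspace model.  All toy_* names are hypothetical, the "lock" only
tracks whether interrupts would be disabled, and the counter records
delays taken in that state; it is not the driver code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: true while the spin lock (and IRQ-off) is held. */
static bool toy_irqs_off;
/* Number of delays that ran while interrupts were disabled. */
static int toy_delays_with_irqs_off;

static void toy_lock_irqsave(void)
{
	toy_irqs_off = true;
}

static void toy_unlock_irqrestore(void)
{
	toy_irqs_off = false;
}

/* Stands in for mdelay(1); only records the state it was called in. */
static void toy_mdelay(void)
{
	if (toy_irqs_off)
		toy_delays_with_irqs_off++;
}

/*
 * Called with the lock held.  When drop_lock is set, the lock is
 * released around the delay and re-taken before returning, so the
 * caller's locking state is unchanged.
 */
static void toy_set_clock(bool drop_lock)
{
	if (drop_lock)
		toy_unlock_irqrestore();
	toy_mdelay();
	if (drop_lock)
		toy_lock_irqsave();
}
```

With drop_lock false the delay is counted as running with interrupts
off; with drop_lock true it is not, and the lock is held again on
return.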

Ben.

-- 
Ben Hutchings
Software Developer, Codethink Ltd.


* Re: [PATCH 4.4 74/76] serial: 8250_pci: Detach low-level driver during PCI error recovery
  2017-03-28 12:31 ` [PATCH 4.4 74/76] serial: 8250_pci: Detach low-level driver during PCI error recovery Greg Kroah-Hartman
@ 2017-04-04 20:26   ` Ben Hutchings
  0 siblings, 0 replies; 106+ messages in thread
From: Ben Hutchings @ 2017-04-04 20:26 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Gabriel Krisman Bertazi
  Cc: linux-kernel, stable, Sasha Levin, Sumit Semwal

On Tue, 2017-03-28 at 14:31 +0200, Greg Kroah-Hartman wrote:
[...]
>  static void serial8250_io_resume(struct pci_dev *dev)
>  {
>  	struct serial_private *priv = pci_get_drvdata(dev);
> +	const struct pciserial_board *board;
>  
> -	if (priv)
> -		pciserial_resume_ports(priv);
> +	if (!priv)
> +		return;
> +
> +	board = priv->board;
> +	kfree(priv);
> +	priv = pciserial_init_ports(dev, board);
> +
> +	if (!IS_ERR(priv)) {
> +		pci_set_drvdata(dev, priv);
> +	}
>  }

On error, this leaves drvdata as a dangling pointer.  Removing the
device or driver will then cause a use-after-free.  (And setting drvdata
to NULL isn't enough to fix this as there is no null pointer check in
pciserial_remove_ports().)
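
One possible shape of a fix, modelled in userspace: clear the
published pointer before re-initialising, and have the remove path
tolerate NULL.  This is a sketch only; the toy_* names are
hypothetical, malloc()/free() stand in for the kernel allocator, and
it is not the change that was applied upstream.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for serial_private and the PCI device. */
struct toy_priv { int nr_ports; };
struct toy_dev { struct toy_priv *drvdata; };

/* Models port initialisation; 'fail' forces the error path. */
static struct toy_priv *toy_init_ports(int fail)
{
	struct toy_priv *p;

	if (fail)
		return NULL;
	p = malloc(sizeof(*p));
	if (p)
		p->nr_ports = 1;
	return p;
}

static void toy_io_resume(struct toy_dev *dev, int fail)
{
	struct toy_priv *priv = dev->drvdata;

	if (!priv)
		return;

	free(priv);
	dev->drvdata = NULL;	/* never leave the freed pointer published */

	priv = toy_init_ports(fail);
	if (priv)
		dev->drvdata = priv;
}

static void toy_remove(struct toy_dev *dev)
{
	struct toy_priv *priv = dev->drvdata;

	if (!priv)		/* the check noted as missing above */
		return;
	free(priv);
	dev->drvdata = NULL;
}
```

After a failed toy_io_resume() the device has no private data, and
toy_remove() returns without touching freed memory.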

Ben.

-- 
Ben Hutchings
Software Developer, Codethink Ltd.


* Re: [PATCH 4.4 42/76] mmc: sdhci: Do not disable interrupts while waiting for clock
  2017-04-04 16:50   ` Ben Hutchings
@ 2017-04-06 12:12     ` Ludovic Desroches
  2017-04-06 14:22       ` Ben Hutchings
  0 siblings, 1 reply; 106+ messages in thread
From: Ludovic Desroches @ 2017-04-06 12:12 UTC (permalink / raw)
  To: Ben Hutchings
  Cc: Ulf Hansson, linux-kernel, stable, Adrian Hunter,
	Ludovic Desroches, Greg Kroah-Hartman

On Tue, Apr 04, 2017 at 05:50:50PM +0100, Ben Hutchings wrote:
> On Tue, 2017-03-28 at 14:30 +0200, Greg Kroah-Hartman wrote:
> > 4.4-stable review patch.  If anyone has any objections, please let me know.
> > 
> > ------------------
> > 
> > From: Adrian Hunter <adrian.hunter@intel.com>
> > 
> > commit e2ebfb2142acefecc2496e71360f50d25726040b upstream.
> > 
> > Disabling interrupts for even a millisecond can cause problems for some
> > devices. That can happen when sdhci changes clock frequency because it
> > waits for the clock to become stable under a spin lock.
> > 
> > The spin lock is not necessary here. Anything that is racing with changes
> > to the I/O state is already broken. The mmc core already provides
> > synchronization via "claiming" the host.
> [...]
> 
> In mainline, drivers/mmc/host/sdhci-of-at91.c has a slightly different
> version of this code that seems to have the same issue.  In 4.4 there's
> another (conditional) mdelay(1) further up this function that seems to
> be related to that hardware, and probably ought to have an unlock/lock
> around it.

Right, how do you want to proceed? Do you want me to send a patch on top
of it to manage this extra mdelay?

Regards

Ludovic


* Re: [PATCH 4.4 42/76] mmc: sdhci: Do not disable interrupts while waiting for clock
  2017-04-06 12:12     ` Ludovic Desroches
@ 2017-04-06 14:22       ` Ben Hutchings
  0 siblings, 0 replies; 106+ messages in thread
From: Ben Hutchings @ 2017-04-06 14:22 UTC (permalink / raw)
  To: Ludovic Desroches
  Cc: Ulf Hansson, linux-kernel, stable, Adrian Hunter, Greg Kroah-Hartman

On Thu, 2017-04-06 at 14:12 +0200, Ludovic Desroches wrote:
> On Tue, Apr 04, 2017 at 05:50:50PM +0100, Ben Hutchings wrote:
> > On Tue, 2017-03-28 at 14:30 +0200, Greg Kroah-Hartman wrote:
> > > 4.4-stable review patch.  If anyone has any objections, please let me know.
> > > 
> > > ------------------
> > > 
> > > From: Adrian Hunter <adrian.hunter@intel.com>
> > > 
> > > commit e2ebfb2142acefecc2496e71360f50d25726040b upstream.
> > > 
> > > Disabling interrupts for even a millisecond can cause problems for some
> > > devices. That can happen when sdhci changes clock frequency because it
> > > waits for the clock to become stable under a spin lock.
> > > 
> > > The spin lock is not necessary here. Anything that is racing with changes
> > > to the I/O state is already broken. The mmc core already provides
> > > synchronization via "claiming" the host.
> > [...]
> > 
> > In mainline, drivers/mmc/host/sdhci-of-at91.c has a slightly different
> > version of this code that seems to have the same issue.  In 4.4 there's
> > another (conditional) mdelay(1) further up this function that seems to
> > be related to that hardware, and probably ought to have an unlock/lock
> > around it.
> 
> Right, how do you want to proceed? Do you want me to send a patch on top
> of it to manage this extra mdelay?

This change doesn't appear to break anything; I'm just saying that it's
an incomplete fix.  The other case where there's a delay with IRQs
disabled should be fixed with an additional patch.

Ben.

-- 
Ben Hutchings
Software Developer, Codethink Ltd.


end of thread

Thread overview: 106+ messages
2017-03-28 12:29 [PATCH 4.4 00/76] 4.4.58-stable review Greg Kroah-Hartman
2017-03-28 12:29 ` [PATCH 4.4 01/76] net/openvswitch: Set the ipv6 source tunnel key address attribute correctly Greg Kroah-Hartman
2017-03-28 12:29 ` [PATCH 4.4 02/76] net: bcmgenet: Do not suspend PHY if Wake-on-LAN is enabled Greg Kroah-Hartman
2017-03-28 12:30 ` [PATCH 4.4 03/76] net: properly release sk_frag.page Greg Kroah-Hartman
2017-03-28 12:30 ` [PATCH 4.4 04/76] amd-xgbe: Fix jumbo MTU processing on newer hardware Greg Kroah-Hartman
2017-03-28 12:30 ` [PATCH 4.4 05/76] net: unix: properly re-increment inflight counter of GC discarded candidates Greg Kroah-Hartman
2017-03-28 12:30 ` [PATCH 4.4 06/76] net/mlx5: Increase number of max QPs in default profile Greg Kroah-Hartman
2017-03-28 12:30 ` [PATCH 4.4 07/76] net/mlx5e: Count LRO packets correctly Greg Kroah-Hartman
2017-03-28 12:30 ` [PATCH 4.4 08/76] net: bcmgenet: remove bcmgenet_internal_phy_setup() Greg Kroah-Hartman
2017-03-28 12:30 ` [PATCH 4.4 09/76] ipv4: provide stronger user input validation in nl_fib_input() Greg Kroah-Hartman
2017-03-28 12:30 ` [PATCH 4.4 10/76] socket, bpf: fix sk_filter use after free in sk_clone_lock Greg Kroah-Hartman
2017-03-28 12:30 ` [PATCH 4.4 11/76] tcp: initialize icsk_ack.lrcvtime at session start time Greg Kroah-Hartman
2017-03-28 12:30 ` [PATCH 4.4 12/76] Input: elan_i2c - add ASUS EeeBook X205TA special touchpad fw Greg Kroah-Hartman
2017-03-28 12:30 ` [PATCH 4.4 13/76] Input: i8042 - add noloop quirk for Dell Embedded Box PC 3000 Greg Kroah-Hartman
2017-03-28 12:30 ` [PATCH 4.4 14/76] Input: iforce - validate number of endpoints before using them Greg Kroah-Hartman
2017-03-28 12:30 ` [PATCH 4.4 15/76] Input: ims-pcu " Greg Kroah-Hartman
2017-03-28 12:30 ` [PATCH 4.4 16/76] Input: hanwang " Greg Kroah-Hartman
2017-03-28 12:30 ` [PATCH 4.4 17/76] Input: yealink " Greg Kroah-Hartman
2017-03-28 12:30 ` [PATCH 4.4 18/76] Input: cm109 " Greg Kroah-Hartman
2017-03-28 12:30 ` [PATCH 4.4 19/76] Input: kbtab " Greg Kroah-Hartman
2017-03-28 12:30 ` [PATCH 4.4 20/76] Input: sur40 " Greg Kroah-Hartman
2017-03-28 12:30 ` [PATCH 4.4 21/76] ALSA: seq: Fix racy cell insertions during snd_seq_pool_done() Greg Kroah-Hartman
2017-03-28 12:30 ` [PATCH 4.4 22/76] ALSA: ctxfi: Fix the incorrect check of dma_set_mask() call Greg Kroah-Hartman
2017-03-28 12:30 ` [PATCH 4.4 23/76] ALSA: hda - Adding a group of pin definition to fix headset problem Greg Kroah-Hartman
2017-03-28 12:30 ` [PATCH 4.4 24/76] USB: serial: option: add Quectel UC15, UC20, EC21, and EC25 modems Greg Kroah-Hartman
2017-03-28 12:30 ` [PATCH 4.4 27/76] usb: gadget: f_uvc: Fix SuperSpeed companion descriptors wBytesPerInterval Greg Kroah-Hartman
2017-03-28 12:30 ` [PATCH 4.4 28/76] usb-core: Add LINEAR_FRAME_INTR_BINTERVAL USB quirk Greg Kroah-Hartman
2017-03-28 12:30 ` [PATCH 4.4 29/76] USB: uss720: fix NULL-deref at probe Greg Kroah-Hartman
2017-03-28 12:30 ` [PATCH 4.4 30/76] USB: lvtest: " Greg Kroah-Hartman
2017-03-28 12:30 ` [PATCH 4.4 31/76] USB: idmouse: " Greg Kroah-Hartman
2017-03-28 12:30 ` [PATCH 4.4 32/76] USB: wusbcore: " Greg Kroah-Hartman
2017-03-28 12:30 ` [PATCH 4.4 33/76] usb: musb: cppi41: dont check early-TX-interrupt for Isoch transfer Greg Kroah-Hartman
2017-03-28 12:30 ` [PATCH 4.4 34/76] usb: hub: Fix crash after failure to read BOS descriptor Greg Kroah-Hartman
2017-03-28 12:30 ` [PATCH 4.4 35/76] uwb: i1480-dfu: fix NULL-deref at probe Greg Kroah-Hartman
2017-03-28 12:30 ` [PATCH 4.4 36/76] uwb: hwa-rc: " Greg Kroah-Hartman
2017-03-28 12:30 ` [PATCH 4.4 37/76] mmc: ushc: " Greg Kroah-Hartman
2017-03-28 12:30 ` [PATCH 4.4 38/76] iio: adc: ti_am335x_adc: fix fifo overrun recovery Greg Kroah-Hartman
2017-03-28 12:30 ` [PATCH 4.4 39/76] iio: hid-sensor-trigger: Change get poll value function order to avoid sensor properties losing after resume from S3 Greg Kroah-Hartman
2017-03-28 12:30 ` [PATCH 4.4 40/76] parport: fix attempt to write duplicate procfiles Greg Kroah-Hartman
2017-03-28 12:30 ` [PATCH 4.4 41/76] ext4: mark inode dirty after converting inline directory Greg Kroah-Hartman
2017-03-28 12:30 ` [PATCH 4.4 42/76] mmc: sdhci: Do not disable interrupts while waiting for clock Greg Kroah-Hartman
2017-04-04 16:50   ` Ben Hutchings
2017-04-06 12:12     ` Ludovic Desroches
2017-04-06 14:22       ` Ben Hutchings
2017-03-28 12:30 ` [PATCH 4.4 43/76] xen/acpi: upload PM state from init-domain to Xen Greg Kroah-Hartman
2017-03-28 12:30 ` [PATCH 4.4 44/76] iommu/vt-d: Fix NULL pointer dereference in device_to_iommu Greg Kroah-Hartman
2017-03-28 12:30 ` [PATCH 4.4 45/76] ARM: at91: pm: cpu_idle: switch DDR to power-down mode Greg Kroah-Hartman
2017-03-28 12:30 ` [PATCH 4.4 46/76] ARM: dts: at91: sama5d2: add dma properties to UART nodes Greg Kroah-Hartman
2017-03-28 12:30 ` [PATCH 4.4 47/76] cpufreq: Restore policy min/max limits on CPU online Greg Kroah-Hartman
2017-03-28 12:30 ` [PATCH 4.4 48/76] libceph: force GFP_NOIO for socket allocations Greg Kroah-Hartman
2017-03-28 12:43   ` Michal Hocko
2017-03-28 13:23     ` Ilya Dryomov
2017-03-28 13:30       ` Michal Hocko
2017-03-29  9:21         ` Ilya Dryomov
2017-03-29 10:41           ` Michal Hocko
2017-03-29 10:55             ` Michal Hocko
2017-03-29 11:10               ` Ilya Dryomov
2017-03-29 11:16                 ` Michal Hocko
2017-03-29 14:25                   ` Ilya Dryomov
2017-03-30  6:25                     ` Michal Hocko
2017-03-30 10:02                       ` Ilya Dryomov
2017-03-30 11:21                         ` Michal Hocko
2017-03-30 13:48                           ` Ilya Dryomov
2017-03-30 14:36                             ` Michal Hocko
2017-03-30 15:06                               ` Ilya Dryomov
2017-03-30 16:12                                 ` Michal Hocko
2017-03-30 17:19                                   ` Ilya Dryomov
2017-03-30 18:44                                     ` Michal Hocko
2017-03-30 13:53                       ` Ilya Dryomov
2017-03-30 13:59                         ` Michal Hocko
2017-03-29 11:05             ` Brian Foster
2017-03-29 11:14               ` Ilya Dryomov
2017-03-29 11:18                 ` Michal Hocko
2017-03-29 11:49                   ` Brian Foster
2017-03-29 14:30                     ` Ilya Dryomov
2017-03-28 12:30 ` [PATCH 4.4 49/76] raid10: increment write counter after bio is split Greg Kroah-Hartman
2017-03-28 12:30 ` [PATCH 4.4 50/76] libceph: dont set weight to IN when OSD is destroyed Greg Kroah-Hartman
2017-03-28 12:30 ` [PATCH 4.4 51/76] xfs: dont allow di_size with high bit set Greg Kroah-Hartman
2017-03-28 12:30 ` [PATCH 4.4 52/76] xfs: fix up xfs_swap_extent_forks inline extent handling Greg Kroah-Hartman
2017-03-28 12:30 ` [PATCH 4.4 53/76] nl80211: fix dumpit error path RTNL deadlocks Greg Kroah-Hartman
2017-03-28 12:30 ` [PATCH 4.4 54/76] USB: usbtmc: add missing endpoint sanity check Greg Kroah-Hartman
2017-03-28 12:30 ` [PATCH 4.4 55/76] xfs: clear _XBF_PAGES from buffers when readahead page Greg Kroah-Hartman
2017-03-28 12:30 ` [PATCH 4.4 56/76] xen: do not re-use pirq number cached in pci device msi msg data Greg Kroah-Hartman
2017-03-28 12:30 ` [PATCH 4.4 57/76] igb: Workaround for igb i210 firmware issue Greg Kroah-Hartman
2017-03-28 12:30 ` [PATCH 4.4 58/76] igb: add i211 to i210 PHY workaround Greg Kroah-Hartman
2017-03-28 12:30 ` [PATCH 4.4 59/76] x86/hyperv: Handle unknown NMIs on one CPU when unknown_nmi_panic Greg Kroah-Hartman
2017-03-28 12:30 ` [PATCH 4.4 60/76] PCI: Separate VF BAR updates from standard BAR updates Greg Kroah-Hartman
2017-03-28 12:30 ` [PATCH 4.4 61/76] PCI: Remove pci_resource_bar() and pci_iov_resource_bar() Greg Kroah-Hartman
2017-03-28 12:30 ` [PATCH 4.4 62/76] PCI: Add comments about ROM BAR updating Greg Kroah-Hartman
2017-03-28 12:31 ` [PATCH 4.4 63/76] PCI: Decouple IORESOURCE_ROM_ENABLE and PCI_ROM_ADDRESS_ENABLE Greg Kroah-Hartman
2017-03-28 12:31 ` [PATCH 4.4 64/76] PCI: Dont update VF BARs while VF memory space is enabled Greg Kroah-Hartman
2017-03-28 12:31 ` [PATCH 4.4 65/76] PCI: Update BARs using property bits appropriate for type Greg Kroah-Hartman
2017-03-28 12:31 ` [PATCH 4.4 66/76] PCI: Ignore BAR updates on virtual functions Greg Kroah-Hartman
2017-03-28 12:31 ` [PATCH 4.4 67/76] PCI: Do any VF BAR updates before enabling the BARs Greg Kroah-Hartman
2017-03-28 12:31 ` [PATCH 4.4 68/76] vfio/spapr: Postpone allocation of userspace version of TCE table Greg Kroah-Hartman
2017-03-28 12:31 ` [PATCH 4.4 69/76] block: allow WRITE_SAME commands with the SG_IO ioctl Greg Kroah-Hartman
2017-03-28 12:31 ` [PATCH 4.4 70/76] s390/zcrypt: Introduce CEX6 toleration Greg Kroah-Hartman
2017-03-28 12:31 ` [PATCH 4.4 71/76] uvcvideo: uvc_scan_fallback() for webcams with broken chain Greg Kroah-Hartman
2017-03-28 12:31 ` [PATCH 4.4 72/76] ACPI / blacklist: add _REV quirks for Dell Precision 5520 and 3520 Greg Kroah-Hartman
2017-03-28 12:31 ` [PATCH 4.4 73/76] ACPI / blacklist: Make Dell Latitude 3350 ethernet work Greg Kroah-Hartman
2017-03-28 12:31 ` [PATCH 4.4 74/76] serial: 8250_pci: Detach low-level driver during PCI error recovery Greg Kroah-Hartman
2017-04-04 20:26   ` Ben Hutchings
2017-03-28 12:31 ` [PATCH 4.4 75/76] fbcon: Fix vc attr at deinit Greg Kroah-Hartman
2017-03-28 12:31 ` [PATCH 4.4 76/76] crypto: algif_hash - avoid zero-sized array Greg Kroah-Hartman
2017-03-28 19:38 ` [PATCH 4.4 00/76] 4.4.58-stable review Shuah Khan
2017-03-29  2:58 ` Guenter Roeck
