linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 4.5 000/101] 4.5.5-stable review
@ 2016-05-17  1:20 Greg Kroah-Hartman
  2016-05-17  1:20 ` [PATCH 4.5 001/101] staging: wilc1000: remove extraneous variable Greg Kroah-Hartman
                   ` (92 more replies)
  0 siblings, 93 replies; 94+ messages in thread
From: Greg Kroah-Hartman @ 2016-05-17  1:20 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, torvalds, akpm, linux, shuah.kh, patches, stable

This is the start of the stable review cycle for the 4.5.5 release.
There are 101 patches in this series, all will be posted as a response
to this one.  If anyone has any issues with these being applied, please
let me know.

Responses should be made by Thu May 19 01:14:44 UTC 2016.
Anything received after that time might be too late.

The whole patch series can be found in one patch at:
	kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.5.5-rc1.gz
and the diffstat can be found below.

thanks,

greg k-h

-------------
Pseudo-Shortlog of commits:

Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    Linux 4.5.5-rc1

Linus Torvalds <torvalds@linux-foundation.org>
    nf_conntrack: avoid kernel pointer value leak in slab name

Yauhen Kharuzhy <yauhen.kharuzhy@zavadatar.com>
    btrfs: Reset IO error counters before start of device replacing

Josef Bacik <jbacik@fb.com>
    Btrfs: don't use src fd for printk

David Sterba <dsterba@suse.com>
    btrfs: fallback to vmalloc in btrfs_compare_tree

Mark Fasheh <mfasheh@suse.de>
    btrfs: handle non-fatal errors in btrfs_qgroup_inherit()

Liu Bo <bo.li.liu@oracle.com>
    Btrfs: fix invalid reference in replace_path

Alex Lyakas <alex.bolshoy@gmail.com>
    btrfs: do not write corrupted metadata blocks to disk

Alex Lyakas <alex.bolshoy@gmail.com>
    btrfs: csum_tree_block: return proper errno value

Filipe Manana <fdmanana@suse.com>
    Btrfs: do not collect ordered extents when logging that inode exists

Filipe Manana <fdmanana@suse.com>
    Btrfs: fix race when checking if we can skip fsync'ing an inode

Filipe Manana <fdmanana@suse.com>
    Btrfs: fix deadlock between direct IO reads and buffered writes

Filipe Manana <fdmanana@suse.com>
    Btrfs: fix extent_same allowing destination offset beyond i_size

Filipe Manana <fdmanana@suse.com>
    Btrfs: fix file loss on log replay after renaming a file and fsync

Filipe Manana <fdmanana@suse.com>
    Btrfs: fix unreplayable log after snapshot delete + parent dir fsync

David Sterba <dsterba@suse.com>
    btrfs: change max_inline default to 2048

David Sterba <dsterba@suse.com>
    btrfs: remove error message from search ioctl for nonexistent tree

Josef Bacik <jbacik@fb.com>
    Btrfs: fix truncate_space_check

Zhao Lei <zhaolei@cn.fujitsu.com>
    btrfs: reada: Fix in-segment calculation for reada

Alex Deucher <alexander.deucher@amd.com>
    drm/amdgpu: fix DP mode validation

Alex Deucher <alexander.deucher@amd.com>
    drm/radeon: fix DP mode validation

Arindam Nath <arindam.nath@amd.com>
    drm/radeon: fix DP link training issue with second 4K monitor

Imre Deak <imre.deak@intel.com>
    drm/i915/bdw: Add missing delay during L3 SQC credit programming

Lyude <cpaul@redhat.com>
    Revert "drm/i915: start adding dp mst audio"

Daniel Vetter <daniel.vetter@ffwll.ch>
    drm/i915: Bail out of pipe config compute loop on LPT

Lucas Stach <dev@lynxeye.de>
    drm/radeon: fix PLL sharing on DCE6.1 (v2)

Ville Syrjälä <ville.syrjala@linux.intel.com>
    drm/i915: Update CDCLK_FREQ register on BDW after changing cdclk frequency

Mauro Carvalho Chehab <mchehab@osg.samsung.com>
    Revert "[media] videobuf2-v4l2: Verify planes array in buffer dequeueing"

Marek Szyprowski <m.szyprowski@samsung.com>
    Input: max8997-haptic - fix NULL pointer dereference

Al Viro <viro@zeniv.linux.org.uk>
    get_rock_ridge_filename(): handle malformed NM entries

Steven Rostedt <rostedt@goodmis.org>
    tools lib traceevent: Do not reassign parg after collapse_tree()

Johannes Thumshirn <jthumshirn@suse.de>
    qla1280: Don't allocate 512kb of host tags

Al Viro <viro@zeniv.linux.org.uk>
    atomic_open(): fix the handling of create_error

Hans de Goede <hdegoede@redhat.com>
    regulator: axp20x: Fix axp22x ldo_io voltage ranges

Krzysztof Kozlowski <k.kozlowski@samsung.com>
    regulator: s2mps11: Fix invalid selector mask and voltages for buck9

Wanpeng Li <wanpeng.li@hotmail.com>
    workqueue: fix rebind bound workers warning

Boris Brezillon <boris.brezillon@free-electrons.com>
    ARM: dts: at91: sam9x5: Fix the memory range assigned to the PMC

Miklos Szeredi <mszeredi@redhat.com>
    vfs: rename: check backing inode being equal

Miklos Szeredi <mszeredi@redhat.com>
    vfs: add vfs_select_inode() helper

Alexander Shishkin <alexander.shishkin@linux.intel.com>
    perf/core: Disable the event on a truncated AUX record

Namhyung Kim <namhyung@kernel.org>
    perf diff: Fix duplicated output column

Jack Pham <jackp@codeaurora.org>
    regmap: spmi: Fix regmap_spmi_ext_read in multi-byte case

Ludovic Desroches <ludovic.desroches@atmel.com>
    pinctrl: at91-pio4: fix pull-up/down logic

Ben Hutchings <ben.hutchings@codethink.co.uk>
    spi: spi-ti-qspi: Handle truncated frames properly

Ben Hutchings <ben.hutchings@codethink.co.uk>
    spi: spi-ti-qspi: Fix FLEN and WLEN settings if bits_per_word is overridden

Jarkko Nikula <jarkko.nikula@linux.intel.com>
    spi: pxa2xx: Do not detect number of enabled chip selects on Intel SPT

Takashi Iwai <tiwai@suse.de>
    ALSA: hda - Fix broken reconfig

Kaho Ng <ngkaho1234@gmail.com>
    ALSA: hda - Fix white noise on Asus UX501VW headset

Yura Pakhuchiy <pakhuchiy@gmail.com>
    ALSA: hda - Fix subwoofer pin on ASUS N751 and N551

Takashi Iwai <tiwai@suse.de>
    ALSA: usb-audio: Yet another Phoneix Audio device quirk

Takashi Iwai <tiwai@suse.de>
    ALSA: usb-audio: Quirk for yet another Phoenix Audio devices (v2)

Herbert Xu <herbert@gondor.apana.org.au>
    crypto: testmgr - Use kmalloc memory for RSA input

Herbert Xu <herbert@gondor.apana.org.au>
    crypto: hash - Fix page length clamping in hash walk

Tadeusz Struk <tadeusz.struk@intel.com>
    crypto: qat - fix adf_ctl_drv.c:undefined reference to adf_init_pf_wq

Tadeusz Struk <tadeusz.struk@intel.com>
    crypto: qat - fix invalid pf2vf_resp_wq logic

Andrea Arcangeli <aarcange@redhat.com>
    mm: thp: calculate the mapcount correctly for THP pages during WP faults

Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
    zsmalloc: fix zs_can_compact() integer overflow

Junxiao Bi <junxiao.bi@oracle.com>
    ocfs2: fix posix_acl_create deadlock

Junxiao Bi <junxiao.bi@oracle.com>
    ocfs2: revert using ocfs2_acl_chmod to avoid inode cluster lock hang

Paolo Abeni <pabeni@redhat.com>
    net/route: enforce hoplimit max value

Eric Dumazet <edumazet@google.com>
    tcp: refresh skb timestamp at retransmit time

xypron.glpk@gmx.de <xypron.glpk@gmx.de>
    net: thunderx: avoid exposing kernel stack

Kangjie Lu <kangjielu@gmail.com>
    net: fix a kernel infoleak in x25 module

Mikko Rapeli <mikko.rapeli@iki.fi>
    uapi glibc compat: fix compile errors when glibc net/if.h included before linux/if.h

Linus Lüssing <linus.luessing@c0d3.blue>
    bridge: fix igmp / mld query parsing

Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
    net: bridge: fix old ioctl unlocked net device walk

Ian Campbell <ian.campbell@docker.com>
    VSOCK: do not disconnect socket when peer has shutdown SEND only

Daniel Jurgens <danielj@mellanox.com>
    net/mlx4_en: Fix endianness bug in IPV6 csum calculation

Kangjie Lu <kangjielu@gmail.com>
    net: fix infoleak in rtnetlink

Kangjie Lu <kangjielu@gmail.com>
    net: fix infoleak in llc

Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
    net: fec: only clear a queue's work bit if the queue was emptied

Nicolas Dichtel <nicolas.dichtel@6wind.com>
    ipv6/ila: fix nlsize calculation for lwtunnel

Neil Horman <nhorman@tuxdriver.com>
    netem: Segment GSO packets on enqueue

WANG Cong <xiyou.wangcong@gmail.com>
    sch_dsmark: update backlog as well

WANG Cong <xiyou.wangcong@gmail.com>
    sch_htb: update backlog as well

WANG Cong <xiyou.wangcong@gmail.com>
    net_sched: update hierarchical backlog too

WANG Cong <xiyou.wangcong@gmail.com>
    net_sched: introduce qdisc_replace() helper

Jiri Benc <jbenc@redhat.com>
    gre: do not pull header in ICMP error processing

Tim Bingham <tbingham@akamai.com>
    net: Implement net_dbg_ratelimited() for CONFIG_DYNAMIC_DEBUG case

Alexei Starovoitov <ast@fb.com>
    samples/bpf: fix trace_output example

Alexei Starovoitov <ast@fb.com>
    bpf: fix check_map_func_compatibility logic

Alexei Starovoitov <ast@fb.com>
    bpf: fix refcnt overflow

Jann Horn <jannh@google.com>
    bpf: fix double-fdput in replace_map_fd_with_map_ptr()

Eric Dumazet <edumazet@google.com>
    net/mlx4_en: fix spurious timestamping callbacks

Paolo Abeni <pabeni@redhat.com>
    ipv4/fib: don't warn when primary address is missing if in_dev is dead

Saeed Mahameed <saeedm@mellanox.com>
    net/mlx5e: Use vport MTU rather than physical port MTU

Saeed Mahameed <saeedm@mellanox.com>
    net/mlx5e: Fix minimum MTU

Saeed Mahameed <saeedm@mellanox.com>
    net/mlx5e: Device's mtu field is u16 and not int

Maor Gottlieb <maorg@mellanox.com>
    net/mlx5_core: Fix soft lockup in steering error flow

Simon Horman <simon.horman@netronome.com>
    openvswitch: use flow protocol when recalculating ipv6 checksums

Ben Hutchings <ben@decadent.org.uk>
    atl2: Disable unimplemented scatter/gather feature

Joe Stringer <joe@ovn.org>
    openvswitch: Orphan skbs before IPv6 defrag

Daniel Borkmann <daniel@iogearbox.net>
    vlan: pull on __vlan_insert_tag error path and fix csum correction

Daniel Borkmann <daniel@iogearbox.net>
    net: use skb_postpush_rcsum instead of own implementations

Craig Gallek <kraig@google.com>
    soreuseport: fix ordering for mixed v4/v6 sockets

Bjørn Mork <bjorn@mork.no>
    cdc_mbim: apply "NDP to end" quirk to all Huawei devices

Alexei Starovoitov <ast@fb.com>
    bpf/verifier: reject invalid LD_ABS | BPF_DW instruction

Lars Persson <lars.persson@axis.com>
    net: sched: do not requeue a NULL skb

Mathias Krause <minipli@googlemail.com>
    packet: fix heap info leak in PACKET_DIAG_MCLIST sock_diag interface

Chris Friesen <chris.friesen@windriver.com>
    route: do not cache fib route info on local routes with oif

David S. Miller <davem@davemloft.net>
    decnet: Do not build routes to devices without decnet private data.

Arnd Bergmann <arnd@arndb.de>
    staging: wilc1000: remove extraneous variable


-------------

Diffstat:

 Makefile                                           |  4 +-
 arch/arm/boot/dts/at91sam9x5.dtsi                  |  2 +-
 crypto/ahash.c                                     |  3 +-
 crypto/testmgr.c                                   | 27 ++++--
 drivers/base/regmap/regmap-spmi.c                  |  2 +-
 drivers/crypto/qat/qat_common/adf_common_drv.h     | 11 +++
 drivers/crypto/qat/qat_common/adf_ctl_drv.c        |  6 ++
 drivers/crypto/qat/qat_common/adf_sriov.c          | 26 +++---
 drivers/gpu/drm/amd/amdgpu/atombios_dp.c           |  4 +-
 drivers/gpu/drm/i915/i915_debugfs.c                | 16 ----
 drivers/gpu/drm/i915/i915_reg.h                    |  2 +
 drivers/gpu/drm/i915/intel_audio.c                 |  9 +-
 drivers/gpu/drm/i915/intel_crt.c                   |  8 +-
 drivers/gpu/drm/i915/intel_ddi.c                   | 24 ++----
 drivers/gpu/drm/i915/intel_display.c               |  2 +
 drivers/gpu/drm/i915/intel_dp_mst.c                | 22 -----
 drivers/gpu/drm/i915/intel_drv.h                   |  2 -
 drivers/gpu/drm/i915/intel_pm.c                    |  6 ++
 drivers/gpu/drm/radeon/atombios_crtc.c             | 10 +++
 drivers/gpu/drm/radeon/atombios_dp.c               |  4 +-
 drivers/gpu/drm/radeon/radeon_dp_auxch.c           |  2 +-
 drivers/infiniband/hw/mlx5/main.c                  |  4 +-
 drivers/input/misc/max8997_haptic.c                |  6 +-
 drivers/media/v4l2-core/videobuf2-v4l2.c           |  6 --
 drivers/net/ethernet/atheros/atlx/atl2.c           |  2 +-
 drivers/net/ethernet/cavium/thunder/nicvf_queues.c |  4 +
 drivers/net/ethernet/freescale/fec_main.c          | 10 ++-
 drivers/net/ethernet/mellanox/mlx4/en_rx.c         |  2 +-
 drivers/net/ethernet/mellanox/mlx4/en_tx.c         |  6 +-
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c  | 57 ++++++++++---
 drivers/net/ethernet/mellanox/mlx5/core/fs_core.c  | 46 ++++------
 drivers/net/ethernet/mellanox/mlx5/core/port.c     | 10 +--
 drivers/net/ethernet/mellanox/mlx5/core/vport.c    | 40 +++++++++
 drivers/net/usb/cdc_mbim.c                         |  9 +-
 drivers/pinctrl/pinctrl-at91-pio4.c                |  2 +
 drivers/regulator/axp20x-regulator.c               |  4 +-
 drivers/regulator/s2mps11.c                        | 28 ++++--
 drivers/scsi/qla1280.c                             |  2 +-
 drivers/spi/spi-pxa2xx.c                           |  2 +-
 drivers/spi/spi-ti-qspi.c                          | 45 ++++++----
 drivers/staging/wilc1000/wilc_spi.c                |  2 -
 fs/btrfs/ctree.c                                   | 12 ++-
 fs/btrfs/ctree.h                                   |  2 +-
 fs/btrfs/dev-replace.c                             |  2 +
 fs/btrfs/disk-io.c                                 | 28 +++---
 fs/btrfs/file.c                                    |  9 +-
 fs/btrfs/inode.c                                   | 36 +++++++-
 fs/btrfs/ioctl.c                                   | 10 ++-
 fs/btrfs/qgroup.c                                  | 54 +++++++-----
 fs/btrfs/reada.c                                   |  4 +-
 fs/btrfs/relocation.c                              |  1 +
 fs/btrfs/tree-log.c                                | 99 +++++++++++++++++++---
 fs/btrfs/tree-log.h                                |  2 +
 fs/isofs/rock.c                                    | 13 ++-
 fs/namei.c                                         | 26 ++----
 fs/ocfs2/acl.c                                     | 87 +++++++++++++++++++
 fs/ocfs2/acl.h                                     |  5 ++
 fs/ocfs2/file.c                                    |  4 +-
 fs/ocfs2/namei.c                                   | 23 +----
 fs/ocfs2/refcounttree.c                            | 17 +---
 fs/ocfs2/xattr.c                                   | 14 ++-
 fs/ocfs2/xattr.h                                   |  4 +-
 fs/open.c                                          | 12 +--
 include/linux/bpf.h                                |  3 +-
 include/linux/dcache.h                             | 12 +++
 include/linux/mfd/samsung/s2mps11.h                |  2 +
 include/linux/mlx5/driver.h                        |  6 +-
 include/linux/mlx5/vport.h                         |  2 +
 include/linux/mm.h                                 |  9 ++
 include/linux/net.h                                | 10 ++-
 include/linux/rculist_nulls.h                      | 39 +++++++++
 include/linux/swap.h                               |  6 +-
 include/net/codel.h                                |  4 +
 include/net/sch_generic.h                          | 20 ++++-
 include/net/sock.h                                 |  6 +-
 include/uapi/linux/if.h                            | 28 ++++++
 include/uapi/linux/libc-compat.h                   | 44 ++++++++++
 kernel/bpf/inode.c                                 |  7 +-
 kernel/bpf/syscall.c                               | 24 +++++-
 kernel/bpf/verifier.c                              | 66 +++++++++------
 kernel/events/ring_buffer.c                        | 10 ++-
 kernel/workqueue.c                                 | 11 +++
 mm/huge_memory.c                                   | 71 ++++++++++++++--
 mm/memory.c                                        | 22 +++--
 mm/swapfile.c                                      | 13 +--
 mm/zsmalloc.c                                      |  7 +-
 net/bridge/br_ioctl.c                              |  5 +-
 net/bridge/br_multicast.c                          | 12 +--
 net/core/rtnetlink.c                               | 18 ++--
 net/core/skbuff.c                                  | 11 +--
 net/decnet/dn_route.c                              |  9 +-
 net/ipv4/fib_frontend.c                            |  6 +-
 net/ipv4/fib_semantics.c                           |  2 +
 net/ipv4/ip_gre.c                                  | 11 ++-
 net/ipv4/route.c                                   | 12 +++
 net/ipv4/tcp_output.c                              |  6 +-
 net/ipv4/udp.c                                     |  9 +-
 net/ipv6/ila/ila_lwt.c                             |  3 +-
 net/ipv6/reassembly.c                              |  6 +-
 net/ipv6/route.c                                   |  2 +
 net/llc/af_llc.c                                   |  1 +
 net/netfilter/nf_conntrack_core.c                  |  4 +-
 net/openvswitch/actions.c                          | 12 ++-
 net/openvswitch/conntrack.c                        |  1 +
 net/openvswitch/vport-netdev.c                     |  2 +-
 net/openvswitch/vport.h                            |  7 --
 net/packet/af_packet.c                             |  1 +
 net/sched/sch_api.c                                |  8 +-
 net/sched/sch_cbq.c                                | 12 +--
 net/sched/sch_choke.c                              |  6 +-
 net/sched/sch_codel.c                              | 10 ++-
 net/sched/sch_drr.c                                |  9 +-
 net/sched/sch_dsmark.c                             | 11 +--
 net/sched/sch_fq.c                                 |  4 +-
 net/sched/sch_fq_codel.c                           | 17 ++--
 net/sched/sch_generic.c                            |  5 +-
 net/sched/sch_hfsc.c                               |  9 +-
 net/sched/sch_hhf.c                                | 10 ++-
 net/sched/sch_htb.c                                | 24 +++---
 net/sched/sch_multiq.c                             | 16 ++--
 net/sched/sch_netem.c                              | 74 +++++++++++++---
 net/sched/sch_pie.c                                |  5 +-
 net/sched/sch_prio.c                               | 15 ++--
 net/sched/sch_qfq.c                                |  9 +-
 net/sched/sch_red.c                                | 10 +--
 net/sched/sch_sfb.c                                | 10 +--
 net/sched/sch_sfq.c                                | 16 ++--
 net/sched/sch_tbf.c                                | 15 ++--
 net/vmw_vsock/af_vsock.c                           | 21 +----
 net/x25/x25_facilities.c                           |  1 +
 samples/bpf/trace_output_kern.c                    |  1 -
 sound/pci/hda/hda_sysfs.c                          |  8 --
 sound/pci/hda/patch_realtek.c                      | 13 +++
 sound/usb/quirks.c                                 |  3 +
 tools/lib/traceevent/parse-filter.c                |  4 +-
 tools/perf/util/sort.c                             |  3 +
 136 files changed, 1251 insertions(+), 603 deletions(-)

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH 4.5 001/101] staging: wilc1000: remove extraneous variable
  2016-05-17  1:20 [PATCH 4.5 000/101] 4.5.5-stable review Greg Kroah-Hartman
@ 2016-05-17  1:20 ` Greg Kroah-Hartman
  2016-05-17  1:20 ` [PATCH 4.5 002/101] decnet: Do not build routes to devices without decnet private data Greg Kroah-Hartman
                   ` (91 subsequent siblings)
  92 siblings, 0 replies; 94+ messages in thread
From: Greg Kroah-Hartman @ 2016-05-17  1:20 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Arnd Bergmann

4.5-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Arnd Bergmann <arnd@arndb.de>

commit ce7b516f3f9e11fe4ee06fad0d7e853bb6e8f160 upstream.

Building wilc1000 with clang currently fails in the staging-next branch:

drivers/staging/wilc1000/wilc_spi.c:123:34: warning: tentative definition of variable with internal linkage has incomplete non-array type 'const struct wilc1000_ops' [-Wtentative-definition-incomplete-type]
static const struct wilc1000_ops wilc1000_spi_ops;

The reason is that wilc1000_ops was left behind after a recent cleanup,
and is completely unused and also uninitialized and const and has an
incomplete type.

Removing the variable is obviously correct, and gets rid of the warning.
No idea why gcc does not complain about it though.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/staging/wilc1000/wilc_spi.c |    2 --
 1 file changed, 2 deletions(-)

--- a/drivers/staging/wilc1000/wilc_spi.c
+++ b/drivers/staging/wilc1000/wilc_spi.c
@@ -120,8 +120,6 @@ static u8 crc7(u8 crc, const u8 *buffer,
 
 #define USE_SPI_DMA     0
 
-static const struct wilc1000_ops wilc1000_spi_ops;
-
 static int wilc_bus_probe(struct spi_device *spi)
 {
 	int ret, gpio;

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH 4.5 002/101] decnet: Do not build routes to devices without decnet private data.
  2016-05-17  1:20 [PATCH 4.5 000/101] 4.5.5-stable review Greg Kroah-Hartman
  2016-05-17  1:20 ` [PATCH 4.5 001/101] staging: wilc1000: remove extraneous variable Greg Kroah-Hartman
@ 2016-05-17  1:20 ` Greg Kroah-Hartman
  2016-05-17  1:20 ` [PATCH 4.5 003/101] route: do not cache fib route info on local routes with oif Greg Kroah-Hartman
                   ` (90 subsequent siblings)
  92 siblings, 0 replies; 94+ messages in thread
From: Greg Kroah-Hartman @ 2016-05-17  1:20 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, David S. Miller

4.5-stable review patch.  If anyone has any objections, please let me know.

------------------

From: "David S. Miller" <davem@davemloft.net>

[ Upstream commit a36a0d4008488fa545c74445d69eaf56377d5d4e ]

In particular, make sure we check for decnet private presence
for loopback devices.

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/decnet/dn_route.c |    9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

--- a/net/decnet/dn_route.c
+++ b/net/decnet/dn_route.c
@@ -1034,10 +1034,13 @@ source_ok:
 	if (!fld.daddr) {
 		fld.daddr = fld.saddr;
 
-		err = -EADDRNOTAVAIL;
 		if (dev_out)
 			dev_put(dev_out);
+		err = -EINVAL;
 		dev_out = init_net.loopback_dev;
+		if (!dev_out->dn_ptr)
+			goto out;
+		err = -EADDRNOTAVAIL;
 		dev_hold(dev_out);
 		if (!fld.daddr) {
 			fld.daddr =
@@ -1110,6 +1113,8 @@ source_ok:
 		if (dev_out == NULL)
 			goto out;
 		dn_db = rcu_dereference_raw(dev_out->dn_ptr);
+		if (!dn_db)
+			goto e_inval;
 		/* Possible improvement - check all devices for local addr */
 		if (dn_dev_islocal(dev_out, fld.daddr)) {
 			dev_put(dev_out);
@@ -1151,6 +1156,8 @@ select_source:
 			dev_put(dev_out);
 		dev_out = init_net.loopback_dev;
 		dev_hold(dev_out);
+		if (!dev_out->dn_ptr)
+			goto e_inval;
 		fld.flowidn_oif = dev_out->ifindex;
 		if (res.fi)
 			dn_fib_info_put(res.fi);

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH 4.5 003/101] route: do not cache fib route info on local routes with oif
  2016-05-17  1:20 [PATCH 4.5 000/101] 4.5.5-stable review Greg Kroah-Hartman
  2016-05-17  1:20 ` [PATCH 4.5 001/101] staging: wilc1000: remove extraneous variable Greg Kroah-Hartman
  2016-05-17  1:20 ` [PATCH 4.5 002/101] decnet: Do not build routes to devices without decnet private data Greg Kroah-Hartman
@ 2016-05-17  1:20 ` Greg Kroah-Hartman
  2016-05-17  1:20 ` [PATCH 4.5 004/101] packet: fix heap info leak in PACKET_DIAG_MCLIST sock_diag interface Greg Kroah-Hartman
                   ` (89 subsequent siblings)
  92 siblings, 0 replies; 94+ messages in thread
From: Greg Kroah-Hartman @ 2016-05-17  1:20 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Allain Legacy, Julian Anastasov,
	David S. Miller

4.5-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Chris Friesen <chris.friesen@windriver.com>

[ Upstream commit d6d5e999e5df67f8ec20b6be45e2229455ee3699 ]

For local routes that require a particular output interface we do not want
to cache the result.  Caching the result causes incorrect behaviour when
there are multiple source addresses on the interface.  The end result
being that if the intended recipient is waiting on that interface for the
packet he won't receive it because it will be delivered on the loopback
interface and the IP_PKTINFO ipi_ifindex will be set to the loopback
interface as well.

This can be tested by running a program such as "dhcp_release" which
attempts to inject a packet on a particular interface so that it is
received by another program on the same board.  The receiving process
should see an IP_PKTINFO ipi_ifndex value of the source interface
(e.g., eth1) instead of the loopback interface (e.g., lo).  The packet
will still appear on the loopback interface in tcpdump but the important
aspect is that the CMSG info is correct.

Sample dhcp_release command line:

   dhcp_release eth1 192.168.204.222 02:11:33:22:44:66

Signed-off-by: Allain Legacy <allain.legacy@windriver.com>
Signed off-by: Chris Friesen <chris.friesen@windriver.com>
Reviewed-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/ipv4/route.c |   12 ++++++++++++
 1 file changed, 12 insertions(+)

--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -2045,6 +2045,18 @@ static struct rtable *__mkroute_output(c
 		 */
 		if (fi && res->prefixlen < 4)
 			fi = NULL;
+	} else if ((type == RTN_LOCAL) && (orig_oif != 0) &&
+		   (orig_oif != dev_out->ifindex)) {
+		/* For local routes that require a particular output interface
+		 * we do not want to cache the result.  Caching the result
+		 * causes incorrect behaviour when there are multiple source
+		 * addresses on the interface, the end result being that if the
+		 * intended recipient is waiting on that interface for the
+		 * packet he won't receive it because it will be delivered on
+		 * the loopback interface and the IP_PKTINFO ipi_ifindex will
+		 * be set to the loopback interface as well.
+		 */
+		fi = NULL;
 	}
 
 	fnhe = NULL;

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH 4.5 004/101] packet: fix heap info leak in PACKET_DIAG_MCLIST sock_diag interface
  2016-05-17  1:20 [PATCH 4.5 000/101] 4.5.5-stable review Greg Kroah-Hartman
                   ` (2 preceding siblings ...)
  2016-05-17  1:20 ` [PATCH 4.5 003/101] route: do not cache fib route info on local routes with oif Greg Kroah-Hartman
@ 2016-05-17  1:20 ` Greg Kroah-Hartman
  2016-05-17  1:20 ` [PATCH 4.5 005/101] net: sched: do not requeue a NULL skb Greg Kroah-Hartman
                   ` (88 subsequent siblings)
  92 siblings, 0 replies; 94+ messages in thread
From: Greg Kroah-Hartman @ 2016-05-17  1:20 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Mathias Krause, Eric W. Biederman,
	Pavel Emelyanov, Pavel Emelyanov, David S. Miller

4.5-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Mathias Krause <minipli@googlemail.com>

[ Upstream commit 309cf37fe2a781279b7675d4bb7173198e532867 ]

Because we miss to wipe the remainder of i->addr[] in packet_mc_add(),
pdiag_put_mclist() leaks uninitialized heap bytes via the
PACKET_DIAG_MCLIST netlink attribute.

Fix this by explicitly memset(0)ing the remaining bytes in i->addr[].

Fixes: eea68e2f1a00 ("packet: Report socket mclist info via diag module")
Signed-off-by: Mathias Krause <minipli@googlemail.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Pavel Emelyanov <xemul@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/packet/af_packet.c |    1 +
 1 file changed, 1 insertion(+)

--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -3436,6 +3436,7 @@ static int packet_mc_add(struct sock *sk
 	i->ifindex = mreq->mr_ifindex;
 	i->alen = mreq->mr_alen;
 	memcpy(i->addr, mreq->mr_address, i->alen);
+	memset(i->addr + i->alen, 0, sizeof(i->addr) - i->alen);
 	i->count = 1;
 	i->next = po->mclist;
 	po->mclist = i;

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH 4.5 005/101] net: sched: do not requeue a NULL skb
  2016-05-17  1:20 [PATCH 4.5 000/101] 4.5.5-stable review Greg Kroah-Hartman
                   ` (3 preceding siblings ...)
  2016-05-17  1:20 ` [PATCH 4.5 004/101] packet: fix heap info leak in PACKET_DIAG_MCLIST sock_diag interface Greg Kroah-Hartman
@ 2016-05-17  1:20 ` Greg Kroah-Hartman
  2016-05-17  1:20 ` [PATCH 4.5 006/101] bpf/verifier: reject invalid LD_ABS | BPF_DW instruction Greg Kroah-Hartman
                   ` (87 subsequent siblings)
  92 siblings, 0 replies; 94+ messages in thread
From: Greg Kroah-Hartman @ 2016-05-17  1:20 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Lars Persson, Eric Dumazet, David S. Miller

4.5-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Lars Persson <lars.persson@axis.com>

[ Upstream commit 3dcd493fbebfd631913df6e2773cc295d3bf7d22 ]

A failure in validate_xmit_skb_list() triggered an unconditional call
to dev_requeue_skb with skb=NULL. This slowly grows the queue
discipline's qlen count until all traffic through the queue stops.

We take the optimistic approach and continue running the queue after a
failure since it is unknown if later packets also will fail in the
validate path.

Fixes: 55a93b3ea780 ("qdisc: validate skb without holding lock")
Signed-off-by: Lars Persson <larper@axis.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/sched/sch_generic.c |    5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -159,12 +159,15 @@ int sch_direct_xmit(struct sk_buff *skb,
 	if (validate)
 		skb = validate_xmit_skb_list(skb, dev);
 
-	if (skb) {
+	if (likely(skb)) {
 		HARD_TX_LOCK(dev, txq, smp_processor_id());
 		if (!netif_xmit_frozen_or_stopped(txq))
 			skb = dev_hard_start_xmit(skb, dev, txq, &ret);
 
 		HARD_TX_UNLOCK(dev, txq);
+	} else {
+		spin_lock(root_lock);
+		return qdisc_qlen(q);
 	}
 	spin_lock(root_lock);
 

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH 4.5 006/101] bpf/verifier: reject invalid LD_ABS | BPF_DW instruction
  2016-05-17  1:20 [PATCH 4.5 000/101] 4.5.5-stable review Greg Kroah-Hartman
                   ` (4 preceding siblings ...)
  2016-05-17  1:20 ` [PATCH 4.5 005/101] net: sched: do not requeue a NULL skb Greg Kroah-Hartman
@ 2016-05-17  1:20 ` Greg Kroah-Hartman
  2016-05-17  1:20 ` [PATCH 4.5 009/101] net: use skb_postpush_rcsum instead of own implementations Greg Kroah-Hartman
                   ` (86 subsequent siblings)
  92 siblings, 0 replies; 94+ messages in thread
From: Greg Kroah-Hartman @ 2016-05-17  1:20 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Alexei Starovoitov, Daniel Borkmann,
	David S. Miller

4.5-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Alexei Starovoitov <ast@fb.com>

[ Upstream commit d82bccc69041a51f7b7b9b4a36db0772f4cdba21 ]

verifier must check for reserved size bits in instruction opcode and
reject BPF_LD | BPF_ABS | BPF_DW and BPF_LD | BPF_IND | BPF_DW instructions,
otherwise interpreter will WARN_RATELIMIT on them during execution.

Fixes: ddd872bc3098 ("bpf: verifier: add checks for BPF_ABS | BPF_IND instructions")
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 kernel/bpf/verifier.c |    1 +
 1 file changed, 1 insertion(+)

--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -1348,6 +1348,7 @@ static int check_ld_abs(struct verifier_
 	}
 
 	if (insn->dst_reg != BPF_REG_0 || insn->off != 0 ||
+	    BPF_SIZE(insn->code) == BPF_DW ||
 	    (mode == BPF_ABS && insn->src_reg != BPF_REG_0)) {
 		verbose("BPF_LD_ABS uses reserved fields\n");
 		return -EINVAL;

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH 4.5 009/101] net: use skb_postpush_rcsum instead of own implementations
  2016-05-17  1:20 [PATCH 4.5 000/101] 4.5.5-stable review Greg Kroah-Hartman
                   ` (5 preceding siblings ...)
  2016-05-17  1:20 ` [PATCH 4.5 006/101] bpf/verifier: reject invalid LD_ABS | BPF_DW instruction Greg Kroah-Hartman
@ 2016-05-17  1:20 ` Greg Kroah-Hartman
  2016-05-17  1:20 ` [PATCH 4.5 010/101] vlan: pull on __vlan_insert_tag error path and fix csum correction Greg Kroah-Hartman
                   ` (85 subsequent siblings)
  92 siblings, 0 replies; 94+ messages in thread
From: Greg Kroah-Hartman @ 2016-05-17  1:20 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Daniel Borkmann, Tom Herbert,
	Alexei Starovoitov, David S. Miller

4.5-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Daniel Borkmann <daniel@iogearbox.net>

[ Upstream commit 6b83d28a55a891a9d70fc61ccb1c138e47dcbe74 ]

Replace individual implementations with the recently introduced
skb_postpush_rcsum() helper.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Tom Herbert <tom@herbertland.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/core/skbuff.c              |    4 +---
 net/ipv6/reassembly.c          |    6 ++----
 net/openvswitch/actions.c      |    8 +++-----
 net/openvswitch/vport-netdev.c |    2 +-
 net/openvswitch/vport.h        |    7 -------
 5 files changed, 7 insertions(+), 20 deletions(-)

--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -4433,9 +4433,7 @@ int skb_vlan_push(struct sk_buff *skb, _
 		skb->mac_len += VLAN_HLEN;
 		__skb_pull(skb, offset);
 
-		if (skb->ip_summed == CHECKSUM_COMPLETE)
-			skb->csum = csum_add(skb->csum, csum_partial(skb->data
-					+ (2 * ETH_ALEN), VLAN_HLEN, 0));
+		skb_postpush_rcsum(skb, skb->data + (2 * ETH_ALEN), VLAN_HLEN);
 	}
 	__vlan_hwaccel_put_tag(skb, vlan_proto, vlan_tci);
 	return 0;
--- a/net/ipv6/reassembly.c
+++ b/net/ipv6/reassembly.c
@@ -496,10 +496,8 @@ static int ip6_frag_reasm(struct frag_qu
 	IP6CB(head)->flags |= IP6SKB_FRAGMENTED;
 
 	/* Yes, and fold redundant checksum back. 8) */
-	if (head->ip_summed == CHECKSUM_COMPLETE)
-		head->csum = csum_partial(skb_network_header(head),
-					  skb_network_header_len(head),
-					  head->csum);
+	skb_postpush_rcsum(head, skb_network_header(head),
+			   skb_network_header_len(head));
 
 	rcu_read_lock();
 	IP6_INC_STATS_BH(net, __in6_dev_get(dev), IPSTATS_MIB_REASMOKS);
--- a/net/openvswitch/actions.c
+++ b/net/openvswitch/actions.c
@@ -158,9 +158,7 @@ static int push_mpls(struct sk_buff *skb
 	new_mpls_lse = (__be32 *)skb_mpls_header(skb);
 	*new_mpls_lse = mpls->mpls_lse;
 
-	if (skb->ip_summed == CHECKSUM_COMPLETE)
-		skb->csum = csum_add(skb->csum, csum_partial(new_mpls_lse,
-							     MPLS_HLEN, 0));
+	skb_postpush_rcsum(skb, new_mpls_lse, MPLS_HLEN);
 
 	hdr = eth_hdr(skb);
 	hdr->h_proto = mpls->mpls_ethertype;
@@ -280,7 +278,7 @@ static int set_eth_addr(struct sk_buff *
 	ether_addr_copy_masked(eth_hdr(skb)->h_dest, key->eth_dst,
 			       mask->eth_dst);
 
-	ovs_skb_postpush_rcsum(skb, eth_hdr(skb), ETH_ALEN * 2);
+	skb_postpush_rcsum(skb, eth_hdr(skb), ETH_ALEN * 2);
 
 	ether_addr_copy(flow_key->eth.src, eth_hdr(skb)->h_source);
 	ether_addr_copy(flow_key->eth.dst, eth_hdr(skb)->h_dest);
@@ -639,7 +637,7 @@ static int ovs_vport_output(struct net *
 	/* Reconstruct the MAC header.  */
 	skb_push(skb, data->l2_len);
 	memcpy(skb->data, &data->l2_data, data->l2_len);
-	ovs_skb_postpush_rcsum(skb, skb->data, data->l2_len);
+	skb_postpush_rcsum(skb, skb->data, data->l2_len);
 	skb_reset_mac_header(skb);
 
 	ovs_vport_send(vport, skb);
--- a/net/openvswitch/vport-netdev.c
+++ b/net/openvswitch/vport-netdev.c
@@ -58,7 +58,7 @@ static void netdev_port_receive(struct s
 		return;
 
 	skb_push(skb, ETH_HLEN);
-	ovs_skb_postpush_rcsum(skb, skb->data, ETH_HLEN);
+	skb_postpush_rcsum(skb, skb->data, ETH_HLEN);
 	ovs_vport_receive(vport, skb, skb_tunnel_info(skb));
 	return;
 error:
--- a/net/openvswitch/vport.h
+++ b/net/openvswitch/vport.h
@@ -185,13 +185,6 @@ static inline struct vport *vport_from_p
 int ovs_vport_receive(struct vport *, struct sk_buff *,
 		      const struct ip_tunnel_info *);
 
-static inline void ovs_skb_postpush_rcsum(struct sk_buff *skb,
-				      const void *start, unsigned int len)
-{
-	if (skb->ip_summed == CHECKSUM_COMPLETE)
-		skb->csum = csum_add(skb->csum, csum_partial(start, len, 0));
-}
-
 static inline const char *ovs_vport_name(struct vport *vport)
 {
 	return vport->dev->name;

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH 4.5 010/101] vlan: pull on __vlan_insert_tag error path and fix csum correction
  2016-05-17  1:20 [PATCH 4.5 000/101] 4.5.5-stable review Greg Kroah-Hartman
                   ` (6 preceding siblings ...)
  2016-05-17  1:20 ` [PATCH 4.5 009/101] net: use skb_postpush_rcsum instead of own implementations Greg Kroah-Hartman
@ 2016-05-17  1:20 ` Greg Kroah-Hartman
  2016-05-17  1:20 ` [PATCH 4.5 011/101] openvswitch: Orphan skbs before IPv6 defrag Greg Kroah-Hartman
                   ` (84 subsequent siblings)
  92 siblings, 0 replies; 94+ messages in thread
From: Greg Kroah-Hartman @ 2016-05-17  1:20 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Daniel Borkmann, Jiri Pirko, David S. Miller

4.5-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Daniel Borkmann <daniel@iogearbox.net>

[ Upstream commit 9241e2df4fbc648a92ea0752918e05c26255649e ]

When __vlan_insert_tag() fails from skb_vlan_push() path due to the
skb_cow_head(), we need to undo the __skb_push() in the error path
as well that was done earlier to move skb->data pointer to mac header.

Moreover, I noticed that when in the non-error path the __skb_pull()
is done and the original offset to mac header was non-zero, we fixup
from a wrong skb->data offset in the checksum complete processing.

So the skb_postpush_rcsum() really needs to be done before __skb_pull()
where skb->data still points to the mac header start and thus operates
under the same conditions as in __vlan_insert_tag().

Fixes: 93515d53b133 ("net: move vlan pop/push functions into common code")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/core/skbuff.c |    7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -4427,13 +4427,16 @@ int skb_vlan_push(struct sk_buff *skb, _
 		__skb_push(skb, offset);
 		err = __vlan_insert_tag(skb, skb->vlan_proto,
 					skb_vlan_tag_get(skb));
-		if (err)
+		if (err) {
+			__skb_pull(skb, offset);
 			return err;
+		}
+
 		skb->protocol = skb->vlan_proto;
 		skb->mac_len += VLAN_HLEN;
-		__skb_pull(skb, offset);
 
 		skb_postpush_rcsum(skb, skb->data + (2 * ETH_ALEN), VLAN_HLEN);
+		__skb_pull(skb, offset);
 	}
 	__vlan_hwaccel_put_tag(skb, vlan_proto, vlan_tci);
 	return 0;

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH 4.5 011/101] openvswitch: Orphan skbs before IPv6 defrag
  2016-05-17  1:20 [PATCH 4.5 000/101] 4.5.5-stable review Greg Kroah-Hartman
                   ` (7 preceding siblings ...)
  2016-05-17  1:20 ` [PATCH 4.5 010/101] vlan: pull on __vlan_insert_tag error path and fix csum correction Greg Kroah-Hartman
@ 2016-05-17  1:20 ` Greg Kroah-Hartman
  2016-05-17  1:20 ` [PATCH 4.5 012/101] atl2: Disable unimplemented scatter/gather feature Greg Kroah-Hartman
                   ` (83 subsequent siblings)
  92 siblings, 0 replies; 94+ messages in thread
From: Greg Kroah-Hartman @ 2016-05-17  1:20 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Daniele Di Proietto, Joe Stringer,
	David S. Miller

4.5-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Joe Stringer <joe@ovn.org>

[ Upstream commit 49e261a8a21e0960a3f7ff187a453ba1c1149053 ]

This is the IPv6 counterpart to commit 8282f27449bf ("inet: frag: Always
orphan skbs inside ip_defrag()").

Prior to commit 029f7f3b8701 ("netfilter: ipv6: nf_defrag: avoid/free
clone operations"), ipv6 fragments sent to nf_ct_frag6_gather() would be
cloned (implicitly orphaning) prior to queueing for reassembly. As such,
when the IPv6 message is eventually reassembled, the skb->sk for all
fragments would be NULL. After that commit was introduced, rather than
cloning, the original skbs were queued directly without orphaning. The
end result is that all frags except for the first and last may have a
socket attached.

This commit explicitly orphans such skbs during nf_ct_frag6_gather() to
prevent BUG_ON(skb->sk) during a later call to ip6_fragment().

kernel BUG at net/ipv6/ip6_output.c:631!
[...]
Call Trace:
 <IRQ>
 [<ffffffff810be8f7>] ? __lock_acquire+0x927/0x20a0
 [<ffffffffa042c7c0>] ? do_output.isra.28+0x1b0/0x1b0 [openvswitch]
 [<ffffffff810bb8a2>] ? __lock_is_held+0x52/0x70
 [<ffffffffa042c587>] ovs_fragment+0x1f7/0x280 [openvswitch]
 [<ffffffff810bdab5>] ? mark_held_locks+0x75/0xa0
 [<ffffffff817be416>] ? _raw_spin_unlock_irqrestore+0x36/0x50
 [<ffffffff81697ea0>] ? dst_discard_out+0x20/0x20
 [<ffffffff81697e80>] ? dst_ifdown+0x80/0x80
 [<ffffffffa042c703>] do_output.isra.28+0xf3/0x1b0 [openvswitch]
 [<ffffffffa042d279>] do_execute_actions+0x709/0x12c0 [openvswitch]
 [<ffffffffa04340a4>] ? ovs_flow_stats_update+0x74/0x1e0 [openvswitch]
 [<ffffffffa04340d1>] ? ovs_flow_stats_update+0xa1/0x1e0 [openvswitch]
 [<ffffffff817be387>] ? _raw_spin_unlock+0x27/0x40
 [<ffffffffa042de75>] ovs_execute_actions+0x45/0x120 [openvswitch]
 [<ffffffffa0432d65>] ovs_dp_process_packet+0x85/0x150 [openvswitch]
 [<ffffffff817be387>] ? _raw_spin_unlock+0x27/0x40
 [<ffffffffa042def4>] ovs_execute_actions+0xc4/0x120 [openvswitch]
 [<ffffffffa0432d65>] ovs_dp_process_packet+0x85/0x150 [openvswitch]
 [<ffffffffa04337f2>] ? key_extract+0x442/0xc10 [openvswitch]
 [<ffffffffa043b26d>] ovs_vport_receive+0x5d/0xb0 [openvswitch]
 [<ffffffff810be8f7>] ? __lock_acquire+0x927/0x20a0
 [<ffffffff810be8f7>] ? __lock_acquire+0x927/0x20a0
 [<ffffffff810be8f7>] ? __lock_acquire+0x927/0x20a0
 [<ffffffff817be416>] ? _raw_spin_unlock_irqrestore+0x36/0x50
 [<ffffffffa043c11d>] internal_dev_xmit+0x6d/0x150 [openvswitch]
 [<ffffffffa043c0b5>] ? internal_dev_xmit+0x5/0x150 [openvswitch]
 [<ffffffff8168fb5f>] dev_hard_start_xmit+0x2df/0x660
 [<ffffffff8168f5ea>] ? validate_xmit_skb.isra.105.part.106+0x1a/0x2b0
 [<ffffffff81690925>] __dev_queue_xmit+0x8f5/0x950
 [<ffffffff81690080>] ? __dev_queue_xmit+0x50/0x950
 [<ffffffff810bdab5>] ? mark_held_locks+0x75/0xa0
 [<ffffffff81690990>] dev_queue_xmit+0x10/0x20
 [<ffffffff8169a418>] neigh_resolve_output+0x178/0x220
 [<ffffffff81752759>] ? ip6_finish_output2+0x219/0x7b0
 [<ffffffff81752759>] ip6_finish_output2+0x219/0x7b0
 [<ffffffff817525a5>] ? ip6_finish_output2+0x65/0x7b0
 [<ffffffff816cde2b>] ? ip_idents_reserve+0x6b/0x80
 [<ffffffff8175488f>] ? ip6_fragment+0x93f/0xc50
 [<ffffffff81754af1>] ip6_fragment+0xba1/0xc50
 [<ffffffff81752540>] ? ip6_flush_pending_frames+0x40/0x40
 [<ffffffff81754c6b>] ip6_finish_output+0xcb/0x1d0
 [<ffffffff81754dcf>] ip6_output+0x5f/0x1a0
 [<ffffffff81754ba0>] ? ip6_fragment+0xc50/0xc50
 [<ffffffff81797fbd>] ip6_local_out+0x3d/0x80
 [<ffffffff817554df>] ip6_send_skb+0x2f/0xc0
 [<ffffffff817555bd>] ip6_push_pending_frames+0x4d/0x50
 [<ffffffff817796cc>] icmpv6_push_pending_frames+0xac/0xe0
 [<ffffffff8177a4be>] icmpv6_echo_reply+0x42e/0x500
 [<ffffffff8177acbf>] icmpv6_rcv+0x4cf/0x580
 [<ffffffff81755ac7>] ip6_input_finish+0x1a7/0x690
 [<ffffffff81755925>] ? ip6_input_finish+0x5/0x690
 [<ffffffff817567a0>] ip6_input+0x30/0xa0
 [<ffffffff81755920>] ? ip6_rcv_finish+0x1a0/0x1a0
 [<ffffffff817557ce>] ip6_rcv_finish+0x4e/0x1a0
 [<ffffffff8175640f>] ipv6_rcv+0x45f/0x7c0
 [<ffffffff81755fe6>] ? ipv6_rcv+0x36/0x7c0
 [<ffffffff81755780>] ? ip6_make_skb+0x1c0/0x1c0
 [<ffffffff8168b649>] __netif_receive_skb_core+0x229/0xb80
 [<ffffffff810bdab5>] ? mark_held_locks+0x75/0xa0
 [<ffffffff8168c07f>] ? process_backlog+0x6f/0x230
 [<ffffffff8168bfb6>] __netif_receive_skb+0x16/0x70
 [<ffffffff8168c088>] process_backlog+0x78/0x230
 [<ffffffff8168c0ed>] ? process_backlog+0xdd/0x230
 [<ffffffff8168db43>] net_rx_action+0x203/0x480
 [<ffffffff810bdab5>] ? mark_held_locks+0x75/0xa0
 [<ffffffff817c156e>] __do_softirq+0xde/0x49f
 [<ffffffff81752768>] ? ip6_finish_output2+0x228/0x7b0
 [<ffffffff817c070c>] do_softirq_own_stack+0x1c/0x30
 <EOI>
 [<ffffffff8106f88b>] do_softirq.part.18+0x3b/0x40
 [<ffffffff8106f946>] __local_bh_enable_ip+0xb6/0xc0
 [<ffffffff81752791>] ip6_finish_output2+0x251/0x7b0
 [<ffffffff81754af1>] ? ip6_fragment+0xba1/0xc50
 [<ffffffff816cde2b>] ? ip_idents_reserve+0x6b/0x80
 [<ffffffff8175488f>] ? ip6_fragment+0x93f/0xc50
 [<ffffffff81754af1>] ip6_fragment+0xba1/0xc50
 [<ffffffff81752540>] ? ip6_flush_pending_frames+0x40/0x40
 [<ffffffff81754c6b>] ip6_finish_output+0xcb/0x1d0
 [<ffffffff81754dcf>] ip6_output+0x5f/0x1a0
 [<ffffffff81754ba0>] ? ip6_fragment+0xc50/0xc50
 [<ffffffff81797fbd>] ip6_local_out+0x3d/0x80
 [<ffffffff817554df>] ip6_send_skb+0x2f/0xc0
 [<ffffffff817555bd>] ip6_push_pending_frames+0x4d/0x50
 [<ffffffff81778558>] rawv6_sendmsg+0xa28/0xe30
 [<ffffffff81719097>] ? inet_sendmsg+0xc7/0x1d0
 [<ffffffff817190d6>] inet_sendmsg+0x106/0x1d0
 [<ffffffff81718fd5>] ? inet_sendmsg+0x5/0x1d0
 [<ffffffff8166d078>] sock_sendmsg+0x38/0x50
 [<ffffffff8166d4d6>] SYSC_sendto+0xf6/0x170
 [<ffffffff8100201b>] ? trace_hardirqs_on_thunk+0x1b/0x1d
 [<ffffffff8166e38e>] SyS_sendto+0xe/0x10
 [<ffffffff817bebe5>] entry_SYSCALL_64_fastpath+0x18/0xa8
Code: 06 48 83 3f 00 75 26 48 8b 87 d8 00 00 00 2b 87 d0 00 00 00 48 39 d0 72 14 8b 87 e4 00 00 00 83 f8 01 75 09 48 83 7f 18 00 74 9a <0f> 0b 41 8b 86 cc 00 00 00 49 8#
RIP  [<ffffffff8175468a>] ip6_fragment+0x73a/0xc50
 RSP <ffff880072803120>

Fixes: 029f7f3b8701 ("netfilter: ipv6: nf_defrag: avoid/free clone
operations")
Reported-by: Daniele Di Proietto <diproiettod@vmware.com>
Signed-off-by: Joe Stringer <joe@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/openvswitch/conntrack.c |    1 +
 1 file changed, 1 insertion(+)

--- a/net/openvswitch/conntrack.c
+++ b/net/openvswitch/conntrack.c
@@ -320,6 +320,7 @@ static int handle_fragments(struct net *
 	} else if (key->eth.type == htons(ETH_P_IPV6)) {
 		enum ip6_defrag_users user = IP6_DEFRAG_CONNTRACK_IN + zone;
 
+		skb_orphan(skb);
 		memset(IP6CB(skb), 0, sizeof(struct inet6_skb_parm));
 		err = nf_ct_frag6_gather(net, skb, user);
 		if (err)

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH 4.5 012/101] atl2: Disable unimplemented scatter/gather feature
  2016-05-17  1:20 [PATCH 4.5 000/101] 4.5.5-stable review Greg Kroah-Hartman
                   ` (8 preceding siblings ...)
  2016-05-17  1:20 ` [PATCH 4.5 011/101] openvswitch: Orphan skbs before IPv6 defrag Greg Kroah-Hartman
@ 2016-05-17  1:20 ` Greg Kroah-Hartman
  2016-05-17  1:20 ` [PATCH 4.5 013/101] openvswitch: use flow protocol when recalculating ipv6 checksums Greg Kroah-Hartman
                   ` (82 subsequent siblings)
  92 siblings, 0 replies; 94+ messages in thread
From: Greg Kroah-Hartman @ 2016-05-17  1:20 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Justin Yackoski, Ben Hutchings,
	David S. Miller

4.5-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Ben Hutchings <ben@decadent.org.uk>

[ Upstream commit f43bfaeddc79effbf3d0fcb53ca477cca66f3db8 ]

atl2 includes NETIF_F_SG in hw_features even though it has no support
for non-linear skbs.  This bug was originally harmless since the
driver does not claim to implement checksum offload and that used to
be a requirement for SG.

Now that SG and checksum offload are independent features, if you
explicitly enable SG *and* use one of the rare protocols that can use
SG without checkusm offload, this potentially leaks sensitive
information (before you notice that it just isn't working).  Therefore
this obscure bug has been designated CVE-2016-2117.

Reported-by: Justin Yackoski <jyackoski@crypto-nite.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Fixes: ec5f06156423 ("net: Kill link between CSUM and SG features.")
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/net/ethernet/atheros/atlx/atl2.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/net/ethernet/atheros/atlx/atl2.c
+++ b/drivers/net/ethernet/atheros/atlx/atl2.c
@@ -1412,7 +1412,7 @@ static int atl2_probe(struct pci_dev *pd
 
 	err = -EIO;
 
-	netdev->hw_features = NETIF_F_SG | NETIF_F_HW_VLAN_CTAG_RX;
+	netdev->hw_features = NETIF_F_HW_VLAN_CTAG_RX;
 	netdev->features |= (NETIF_F_HW_VLAN_CTAG_TX | NETIF_F_HW_VLAN_CTAG_RX);
 
 	/* Init PHY as early as possible due to power saving issue  */

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH 4.5 013/101] openvswitch: use flow protocol when recalculating ipv6 checksums
  2016-05-17  1:20 [PATCH 4.5 000/101] 4.5.5-stable review Greg Kroah-Hartman
                   ` (9 preceding siblings ...)
  2016-05-17  1:20 ` [PATCH 4.5 012/101] atl2: Disable unimplemented scatter/gather feature Greg Kroah-Hartman
@ 2016-05-17  1:20 ` Greg Kroah-Hartman
  2016-05-17  1:20 ` [PATCH 4.5 014/101] net/mlx5_core: Fix soft lockup in steering error flow Greg Kroah-Hartman
                   ` (81 subsequent siblings)
  92 siblings, 0 replies; 94+ messages in thread
From: Greg Kroah-Hartman @ 2016-05-17  1:20 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Jarno Rajahalme, Simon Horman,
	David S. Miller

4.5-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Simon Horman <simon.horman@netronome.com>

[ Upstream commit b4f70527f052b0c00be4d7cac562baa75b212df5 ]

When using masked actions the ipv6_proto field of an action
to set IPv6 fields may be zero rather than the prevailing protocol
which will result in skipping checksum recalculation.

This patch resolves the problem by relying on the protocol
in the flow key rather than that in the set field action.

Fixes: 83d2b9ba1abc ("net: openvswitch: Support masked set actions.")
Cc: Jarno Rajahalme <jrajahalme@nicira.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/openvswitch/actions.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/net/openvswitch/actions.c
+++ b/net/openvswitch/actions.c
@@ -461,7 +461,7 @@ static int set_ipv6(struct sk_buff *skb,
 		mask_ipv6_addr(saddr, key->ipv6_src, mask->ipv6_src, masked);
 
 		if (unlikely(memcmp(saddr, masked, sizeof(masked)))) {
-			set_ipv6_addr(skb, key->ipv6_proto, saddr, masked,
+			set_ipv6_addr(skb, flow_key->ip.proto, saddr, masked,
 				      true);
 			memcpy(&flow_key->ipv6.addr.src, masked,
 			       sizeof(flow_key->ipv6.addr.src));
@@ -483,7 +483,7 @@ static int set_ipv6(struct sk_buff *skb,
 							     NULL, &flags)
 					       != NEXTHDR_ROUTING);
 
-			set_ipv6_addr(skb, key->ipv6_proto, daddr, masked,
+			set_ipv6_addr(skb, flow_key->ip.proto, daddr, masked,
 				      recalc_csum);
 			memcpy(&flow_key->ipv6.addr.dst, masked,
 			       sizeof(flow_key->ipv6.addr.dst));

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH 4.5 014/101] net/mlx5_core: Fix soft lockup in steering error flow
  2016-05-17  1:20 [PATCH 4.5 000/101] 4.5.5-stable review Greg Kroah-Hartman
                   ` (10 preceding siblings ...)
  2016-05-17  1:20 ` [PATCH 4.5 013/101] openvswitch: use flow protocol when recalculating ipv6 checksums Greg Kroah-Hartman
@ 2016-05-17  1:20 ` Greg Kroah-Hartman
  2016-05-17  1:20 ` [PATCH 4.5 015/101] net/mlx5e: Devices mtu field is u16 and not int Greg Kroah-Hartman
                   ` (80 subsequent siblings)
  92 siblings, 0 replies; 94+ messages in thread
From: Greg Kroah-Hartman @ 2016-05-17  1:20 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Maor Gottlieb, Amir Vadai,
	Saeed Mahameed, David S. Miller

4.5-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Maor Gottlieb <maorg@mellanox.com>

[ Upstream commit c3f9bf628bc7edda298897d952f5e761137229c9 ]

In the error flow of adding flow rule to auto-grouped flow
table, we call to tree_remove_node.

tree_remove_node locks the node's parent, however the node's parent
is already locked by mlx5_add_flow_rule and this causes a deadlock.
After this patch, if we failed to add the flow rule, we unlock the
flow table before calling to tree_remove_node.

fixes: f0d22d187473 ('net/mlx5_core: Introduce flow steering autogrouped
flow table')
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Reported-by: Amir Vadai <amir@vadai.me>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/net/ethernet/mellanox/mlx5/core/fs_core.c |   46 ++++++++--------------
 1 file changed, 17 insertions(+), 29 deletions(-)

--- a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
@@ -957,33 +957,6 @@ unlock_fg:
 	return rule;
 }
 
-static struct mlx5_flow_rule *add_rule_to_auto_fg(struct mlx5_flow_table *ft,
-						  u8 match_criteria_enable,
-						  u32 *match_criteria,
-						  u32 *match_value,
-						  u8 action,
-						  u32 flow_tag,
-						  struct mlx5_flow_destination *dest)
-{
-	struct mlx5_flow_rule *rule;
-	struct mlx5_flow_group *g;
-
-	g = create_autogroup(ft, match_criteria_enable, match_criteria);
-	if (IS_ERR(g))
-		return (void *)g;
-
-	rule = add_rule_fg(g, match_value,
-			   action, flow_tag, dest);
-	if (IS_ERR(rule)) {
-		/* Remove assumes refcount > 0 and autogroup creates a group
-		 * with a refcount = 0.
-		 */
-		tree_get_node(&g->node);
-		tree_remove_node(&g->node);
-	}
-	return rule;
-}
-
 struct mlx5_flow_rule *
 mlx5_add_flow_rule(struct mlx5_flow_table *ft,
 		   u8 match_criteria_enable,
@@ -1008,8 +981,23 @@ mlx5_add_flow_rule(struct mlx5_flow_tabl
 				goto unlock;
 		}
 
-	rule = add_rule_to_auto_fg(ft, match_criteria_enable, match_criteria,
-				   match_value, action, flow_tag, dest);
+	g = create_autogroup(ft, match_criteria_enable, match_criteria);
+	if (IS_ERR(g)) {
+		rule = (void *)g;
+		goto unlock;
+	}
+
+	rule = add_rule_fg(g, match_value,
+			   action, flow_tag, dest);
+	if (IS_ERR(rule)) {
+		/* Remove assumes refcount > 0 and autogroup creates a group
+		 * with a refcount = 0.
+		 */
+		unlock_ref_node(&ft->node);
+		tree_get_node(&g->node);
+		tree_remove_node(&g->node);
+		return rule;
+	}
 unlock:
 	unlock_ref_node(&ft->node);
 	return rule;

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH 4.5 015/101] net/mlx5e: Devices mtu field is u16 and not int
  2016-05-17  1:20 [PATCH 4.5 000/101] 4.5.5-stable review Greg Kroah-Hartman
                   ` (11 preceding siblings ...)
  2016-05-17  1:20 ` [PATCH 4.5 014/101] net/mlx5_core: Fix soft lockup in steering error flow Greg Kroah-Hartman
@ 2016-05-17  1:20 ` Greg Kroah-Hartman
  2016-05-17  1:20 ` [PATCH 4.5 016/101] net/mlx5e: Fix minimum MTU Greg Kroah-Hartman
                   ` (79 subsequent siblings)
  92 siblings, 0 replies; 94+ messages in thread
From: Greg Kroah-Hartman @ 2016-05-17  1:20 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Saeed Mahameed, David S. Miller

4.5-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Saeed Mahameed <saeedm@mellanox.com>

[ Upstream commit 046339eaab26804f52f6604877f5674f70815b26 ]

For set/query MTU port firmware commands the MTU field
is 16 bits, here I changed all the "int mtu" parameters
of the functions wrapping those firmware commands to be u16.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/infiniband/hw/mlx5/main.c                 |    4 ++--
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c |    4 ++--
 drivers/net/ethernet/mellanox/mlx5/core/port.c    |   10 +++++-----
 include/linux/mlx5/driver.h                       |    6 +++---
 4 files changed, 12 insertions(+), 12 deletions(-)

--- a/drivers/infiniband/hw/mlx5/main.c
+++ b/drivers/infiniband/hw/mlx5/main.c
@@ -654,8 +654,8 @@ static int mlx5_query_hca_port(struct ib
 	struct mlx5_ib_dev *dev = to_mdev(ibdev);
 	struct mlx5_core_dev *mdev = dev->mdev;
 	struct mlx5_hca_vport_context *rep;
-	int max_mtu;
-	int oper_mtu;
+	u16 max_mtu;
+	u16 oper_mtu;
 	int err;
 	u8 ib_link_width_oper;
 	u8 vl_hw_cap;
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -1393,7 +1393,7 @@ static int mlx5e_set_dev_port_mtu(struct
 {
 	struct mlx5e_priv *priv = netdev_priv(netdev);
 	struct mlx5_core_dev *mdev = priv->mdev;
-	int hw_mtu;
+	u16 hw_mtu;
 	int err;
 
 	err = mlx5_set_port_mtu(mdev, MLX5E_SW2HW_MTU(netdev->mtu), 1);
@@ -1911,7 +1911,7 @@ static int mlx5e_change_mtu(struct net_d
 	struct mlx5e_priv *priv = netdev_priv(netdev);
 	struct mlx5_core_dev *mdev = priv->mdev;
 	bool was_opened;
-	int max_mtu;
+	u16 max_mtu;
 	int err = 0;
 
 	mlx5_query_port_max_mtu(mdev, &max_mtu, 1);
--- a/drivers/net/ethernet/mellanox/mlx5/core/port.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/port.c
@@ -246,8 +246,8 @@ int mlx5_query_port_admin_status(struct
 }
 EXPORT_SYMBOL_GPL(mlx5_query_port_admin_status);
 
-static void mlx5_query_port_mtu(struct mlx5_core_dev *dev, int *admin_mtu,
-				int *max_mtu, int *oper_mtu, u8 port)
+static void mlx5_query_port_mtu(struct mlx5_core_dev *dev, u16 *admin_mtu,
+				u16 *max_mtu, u16 *oper_mtu, u8 port)
 {
 	u32 in[MLX5_ST_SZ_DW(pmtu_reg)];
 	u32 out[MLX5_ST_SZ_DW(pmtu_reg)];
@@ -267,7 +267,7 @@ static void mlx5_query_port_mtu(struct m
 		*admin_mtu = MLX5_GET(pmtu_reg, out, admin_mtu);
 }
 
-int mlx5_set_port_mtu(struct mlx5_core_dev *dev, int mtu, u8 port)
+int mlx5_set_port_mtu(struct mlx5_core_dev *dev, u16 mtu, u8 port)
 {
 	u32 in[MLX5_ST_SZ_DW(pmtu_reg)];
 	u32 out[MLX5_ST_SZ_DW(pmtu_reg)];
@@ -282,14 +282,14 @@ int mlx5_set_port_mtu(struct mlx5_core_d
 }
 EXPORT_SYMBOL_GPL(mlx5_set_port_mtu);
 
-void mlx5_query_port_max_mtu(struct mlx5_core_dev *dev, int *max_mtu,
+void mlx5_query_port_max_mtu(struct mlx5_core_dev *dev, u16 *max_mtu,
 			     u8 port)
 {
 	mlx5_query_port_mtu(dev, NULL, max_mtu, NULL, port);
 }
 EXPORT_SYMBOL_GPL(mlx5_query_port_max_mtu);
 
-void mlx5_query_port_oper_mtu(struct mlx5_core_dev *dev, int *oper_mtu,
+void mlx5_query_port_oper_mtu(struct mlx5_core_dev *dev, u16 *oper_mtu,
 			      u8 port)
 {
 	mlx5_query_port_mtu(dev, NULL, NULL, oper_mtu, port);
--- a/include/linux/mlx5/driver.h
+++ b/include/linux/mlx5/driver.h
@@ -813,9 +813,9 @@ int mlx5_set_port_admin_status(struct ml
 int mlx5_query_port_admin_status(struct mlx5_core_dev *dev,
 				 enum mlx5_port_status *status);
 
-int mlx5_set_port_mtu(struct mlx5_core_dev *dev, int mtu, u8 port);
-void mlx5_query_port_max_mtu(struct mlx5_core_dev *dev, int *max_mtu, u8 port);
-void mlx5_query_port_oper_mtu(struct mlx5_core_dev *dev, int *oper_mtu,
+int mlx5_set_port_mtu(struct mlx5_core_dev *dev, u16 mtu, u8 port);
+void mlx5_query_port_max_mtu(struct mlx5_core_dev *dev, u16 *max_mtu, u8 port);
+void mlx5_query_port_oper_mtu(struct mlx5_core_dev *dev, u16 *oper_mtu,
 			      u8 port);
 
 int mlx5_query_port_vl_hw_cap(struct mlx5_core_dev *dev,

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH 4.5 016/101] net/mlx5e: Fix minimum MTU
  2016-05-17  1:20 [PATCH 4.5 000/101] 4.5.5-stable review Greg Kroah-Hartman
                   ` (12 preceding siblings ...)
  2016-05-17  1:20 ` [PATCH 4.5 015/101] net/mlx5e: Devices mtu field is u16 and not int Greg Kroah-Hartman
@ 2016-05-17  1:20 ` Greg Kroah-Hartman
  2016-05-17  1:20 ` [PATCH 4.5 017/101] net/mlx5e: Use vport MTU rather than physical port MTU Greg Kroah-Hartman
                   ` (78 subsequent siblings)
  92 siblings, 0 replies; 94+ messages in thread
From: Greg Kroah-Hartman @ 2016-05-17  1:20 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Saeed Mahameed, David S. Miller

4.5-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Saeed Mahameed <saeedm@mellanox.com>

[ Upstream commit d8edd2469ace550db707798180d1c84d81f93bca ]

Minimum MTU that can be set in Connectx4 device is 68.

This fixes the case where a user wants to set invalid MTU,
the driver will fail to satisfy this request and the interface
will stay down.

It is better to report an error and continue working with old
mtu.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c |   11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -1906,22 +1906,27 @@ static int mlx5e_set_features(struct net
 	return err;
 }
 
+#define MXL5_HW_MIN_MTU 64
+#define MXL5E_MIN_MTU (MXL5_HW_MIN_MTU + ETH_FCS_LEN)
+
 static int mlx5e_change_mtu(struct net_device *netdev, int new_mtu)
 {
 	struct mlx5e_priv *priv = netdev_priv(netdev);
 	struct mlx5_core_dev *mdev = priv->mdev;
 	bool was_opened;
 	u16 max_mtu;
+	u16 min_mtu;
 	int err = 0;
 
 	mlx5_query_port_max_mtu(mdev, &max_mtu, 1);
 
 	max_mtu = MLX5E_HW2SW_MTU(max_mtu);
+	min_mtu = MLX5E_HW2SW_MTU(MXL5E_MIN_MTU);
 
-	if (new_mtu > max_mtu) {
+	if (new_mtu > max_mtu || new_mtu < min_mtu) {
 		netdev_err(netdev,
-			   "%s: Bad MTU (%d) > (%d) Max\n",
-			   __func__, new_mtu, max_mtu);
+			   "%s: Bad MTU (%d), valid range is: [%d..%d]\n",
+			   __func__, new_mtu, min_mtu, max_mtu);
 		return -EINVAL;
 	}
 

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH 4.5 017/101] net/mlx5e: Use vport MTU rather than physical port MTU
  2016-05-17  1:20 [PATCH 4.5 000/101] 4.5.5-stable review Greg Kroah-Hartman
                   ` (13 preceding siblings ...)
  2016-05-17  1:20 ` [PATCH 4.5 016/101] net/mlx5e: Fix minimum MTU Greg Kroah-Hartman
@ 2016-05-17  1:20 ` Greg Kroah-Hartman
  2016-05-17  1:20 ` [PATCH 4.5 018/101] ipv4/fib: dont warn when primary address is missing if in_dev is dead Greg Kroah-Hartman
                   ` (77 subsequent siblings)
  92 siblings, 0 replies; 94+ messages in thread
From: Greg Kroah-Hartman @ 2016-05-17  1:20 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Saeed Mahameed, David S. Miller

4.5-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Saeed Mahameed <saeedm@mellanox.com>

[ Upstream commit cd255efff9baadd654d6160e52d17ae7c568c9d3 ]

Set and report vport MTU rather than physical MTU,
Driver will set both vport and physical port mtu and will
rely on the query of vport mtu.

SRIOV VFs have to report their MTU to their vport manager (PF),
and this will allow them to work with any MTU they need
without failing the request.

Also for some cases where the PF is not a port owner, PF can
work with MTU less than the physical port mtu if set physical
port mtu didn't take effect.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c |   44 +++++++++++++++++-----
 drivers/net/ethernet/mellanox/mlx5/core/vport.c   |   40 ++++++++++++++++++++
 include/linux/mlx5/vport.h                        |    2 +
 3 files changed, 77 insertions(+), 9 deletions(-)

--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -1389,24 +1389,50 @@ static int mlx5e_refresh_tirs_self_loopb
 	return 0;
 }
 
-static int mlx5e_set_dev_port_mtu(struct net_device *netdev)
+static int mlx5e_set_mtu(struct mlx5e_priv *priv, u16 mtu)
 {
-	struct mlx5e_priv *priv = netdev_priv(netdev);
 	struct mlx5_core_dev *mdev = priv->mdev;
-	u16 hw_mtu;
+	u16 hw_mtu = MLX5E_SW2HW_MTU(mtu);
 	int err;
 
-	err = mlx5_set_port_mtu(mdev, MLX5E_SW2HW_MTU(netdev->mtu), 1);
+	err = mlx5_set_port_mtu(mdev, hw_mtu, 1);
 	if (err)
 		return err;
 
-	mlx5_query_port_oper_mtu(mdev, &hw_mtu, 1);
+	/* Update vport context MTU */
+	mlx5_modify_nic_vport_mtu(mdev, hw_mtu);
+	return 0;
+}
+
+static void mlx5e_query_mtu(struct mlx5e_priv *priv, u16 *mtu)
+{
+	struct mlx5_core_dev *mdev = priv->mdev;
+	u16 hw_mtu = 0;
+	int err;
+
+	err = mlx5_query_nic_vport_mtu(mdev, &hw_mtu);
+	if (err || !hw_mtu) /* fallback to port oper mtu */
+		mlx5_query_port_oper_mtu(mdev, &hw_mtu, 1);
+
+	*mtu = MLX5E_HW2SW_MTU(hw_mtu);
+}
+
+static int mlx5e_set_dev_port_mtu(struct net_device *netdev)
+{
+	struct mlx5e_priv *priv = netdev_priv(netdev);
+	u16 mtu;
+	int err;
+
+	err = mlx5e_set_mtu(priv, netdev->mtu);
+	if (err)
+		return err;
 
-	if (MLX5E_HW2SW_MTU(hw_mtu) != netdev->mtu)
-		netdev_warn(netdev, "%s: Port MTU %d is different than netdev mtu %d\n",
-			    __func__, MLX5E_HW2SW_MTU(hw_mtu), netdev->mtu);
+	mlx5e_query_mtu(priv, &mtu);
+	if (mtu != netdev->mtu)
+		netdev_warn(netdev, "%s: VPort MTU %d is different than netdev mtu %d\n",
+			    __func__, mtu, netdev->mtu);
 
-	netdev->mtu = MLX5E_HW2SW_MTU(hw_mtu);
+	netdev->mtu = mtu;
 	return 0;
 }
 
--- a/drivers/net/ethernet/mellanox/mlx5/core/vport.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/vport.c
@@ -196,6 +196,46 @@ int mlx5_modify_nic_vport_mac_address(st
 }
 EXPORT_SYMBOL_GPL(mlx5_modify_nic_vport_mac_address);
 
+int mlx5_query_nic_vport_mtu(struct mlx5_core_dev *mdev, u16 *mtu)
+{
+	int outlen = MLX5_ST_SZ_BYTES(query_nic_vport_context_out);
+	u32 *out;
+	int err;
+
+	out = mlx5_vzalloc(outlen);
+	if (!out)
+		return -ENOMEM;
+
+	err = mlx5_query_nic_vport_context(mdev, 0, out, outlen);
+	if (!err)
+		*mtu = MLX5_GET(query_nic_vport_context_out, out,
+				nic_vport_context.mtu);
+
+	kvfree(out);
+	return err;
+}
+EXPORT_SYMBOL_GPL(mlx5_query_nic_vport_mtu);
+
+int mlx5_modify_nic_vport_mtu(struct mlx5_core_dev *mdev, u16 mtu)
+{
+	int inlen = MLX5_ST_SZ_BYTES(modify_nic_vport_context_in);
+	void *in;
+	int err;
+
+	in = mlx5_vzalloc(inlen);
+	if (!in)
+		return -ENOMEM;
+
+	MLX5_SET(modify_nic_vport_context_in, in, field_select.mtu, 1);
+	MLX5_SET(modify_nic_vport_context_in, in, nic_vport_context.mtu, mtu);
+
+	err = mlx5_modify_nic_vport_context(mdev, in, inlen);
+
+	kvfree(in);
+	return err;
+}
+EXPORT_SYMBOL_GPL(mlx5_modify_nic_vport_mtu);
+
 int mlx5_query_nic_vport_mac_list(struct mlx5_core_dev *dev,
 				  u32 vport,
 				  enum mlx5_list_type list_type,
--- a/include/linux/mlx5/vport.h
+++ b/include/linux/mlx5/vport.h
@@ -45,6 +45,8 @@ int mlx5_query_nic_vport_mac_address(str
 				     u16 vport, u8 *addr);
 int mlx5_modify_nic_vport_mac_address(struct mlx5_core_dev *dev,
 				      u16 vport, u8 *addr);
+int mlx5_query_nic_vport_mtu(struct mlx5_core_dev *mdev, u16 *mtu);
+int mlx5_modify_nic_vport_mtu(struct mlx5_core_dev *mdev, u16 mtu);
 int mlx5_query_nic_vport_system_image_guid(struct mlx5_core_dev *mdev,
 					   u64 *system_image_guid);
 int mlx5_query_nic_vport_node_guid(struct mlx5_core_dev *mdev, u64 *node_guid);

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH 4.5 018/101] ipv4/fib: dont warn when primary address is missing if in_dev is dead
  2016-05-17  1:20 [PATCH 4.5 000/101] 4.5.5-stable review Greg Kroah-Hartman
                   ` (14 preceding siblings ...)
  2016-05-17  1:20 ` [PATCH 4.5 017/101] net/mlx5e: Use vport MTU rather than physical port MTU Greg Kroah-Hartman
@ 2016-05-17  1:20 ` Greg Kroah-Hartman
  2016-05-17  1:20 ` [PATCH 4.5 019/101] net/mlx4_en: fix spurious timestamping callbacks Greg Kroah-Hartman
                   ` (76 subsequent siblings)
  92 siblings, 0 replies; 94+ messages in thread
From: Greg Kroah-Hartman @ 2016-05-17  1:20 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Paolo Abeni, David S. Miller

4.5-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Paolo Abeni <pabeni@redhat.com>

[ Upstream commit 391a20333b8393ef2e13014e6e59d192c5594471 ]

After commit fbd40ea0180a ("ipv4: Don't do expensive useless work
during inetdev destroy.") when deleting an interface,
fib_del_ifaddr() can be executed without any primary address
present on the dead interface.

The above is safe, but triggers some "bug: prim == NULL" warnings.

This commit avoids warning if the in_dev is dead

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/ipv4/fib_frontend.c |    6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

--- a/net/ipv4/fib_frontend.c
+++ b/net/ipv4/fib_frontend.c
@@ -904,7 +904,11 @@ void fib_del_ifaddr(struct in_ifaddr *if
 	if (ifa->ifa_flags & IFA_F_SECONDARY) {
 		prim = inet_ifa_byprefix(in_dev, any, ifa->ifa_mask);
 		if (!prim) {
-			pr_warn("%s: bug: prim == NULL\n", __func__);
+			/* if the device has been deleted, we don't perform
+			 * address promotion
+			 */
+			if (!in_dev->dead)
+				pr_warn("%s: bug: prim == NULL\n", __func__);
 			return;
 		}
 		if (iprim && iprim != prim) {

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH 4.5 019/101] net/mlx4_en: fix spurious timestamping callbacks
  2016-05-17  1:20 [PATCH 4.5 000/101] 4.5.5-stable review Greg Kroah-Hartman
                   ` (15 preceding siblings ...)
  2016-05-17  1:20 ` [PATCH 4.5 018/101] ipv4/fib: dont warn when primary address is missing if in_dev is dead Greg Kroah-Hartman
@ 2016-05-17  1:20 ` Greg Kroah-Hartman
  2016-05-17  1:20 ` [PATCH 4.5 020/101] bpf: fix double-fdput in replace_map_fd_with_map_ptr() Greg Kroah-Hartman
                   ` (75 subsequent siblings)
  92 siblings, 0 replies; 94+ messages in thread
From: Greg Kroah-Hartman @ 2016-05-17  1:20 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Eric Dumazet, Willem de Bruijn,
	Eran Ben Elisha, David S. Miller

4.5-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Eric Dumazet <edumazet@google.com>

[ Upstream commit fc96256c906362e845d848d0f6a6354450059e81 ]

When multiple skb are TX-completed in a row, we might incorrectly keep
a timestamp of a prior skb and cause extra work.

Fixes: ec693d47010e8 ("net/mlx4_en: Add HW timestamping (TS) support")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/net/ethernet/mellanox/mlx4/en_tx.c |    6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

--- a/drivers/net/ethernet/mellanox/mlx4/en_tx.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
@@ -402,7 +402,6 @@ static bool mlx4_en_process_tx_cq(struct
 	u32 packets = 0;
 	u32 bytes = 0;
 	int factor = priv->cqe_factor;
-	u64 timestamp = 0;
 	int done = 0;
 	int budget = priv->tx_work_limit;
 	u32 last_nr_txbb;
@@ -442,9 +441,12 @@ static bool mlx4_en_process_tx_cq(struct
 		new_index = be16_to_cpu(cqe->wqe_index) & size_mask;
 
 		do {
+			u64 timestamp = 0;
+
 			txbbs_skipped += last_nr_txbb;
 			ring_index = (ring_index + last_nr_txbb) & size_mask;
-			if (ring->tx_info[ring_index].ts_requested)
+
+			if (unlikely(ring->tx_info[ring_index].ts_requested))
 				timestamp = mlx4_en_get_cqe_ts(cqe);
 
 			/* free next descriptor */

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH 4.5 020/101] bpf: fix double-fdput in replace_map_fd_with_map_ptr()
  2016-05-17  1:20 [PATCH 4.5 000/101] 4.5.5-stable review Greg Kroah-Hartman
                   ` (16 preceding siblings ...)
  2016-05-17  1:20 ` [PATCH 4.5 019/101] net/mlx4_en: fix spurious timestamping callbacks Greg Kroah-Hartman
@ 2016-05-17  1:20 ` Greg Kroah-Hartman
  2016-05-17  1:20 ` [PATCH 4.5 021/101] bpf: fix refcnt overflow Greg Kroah-Hartman
                   ` (74 subsequent siblings)
  92 siblings, 0 replies; 94+ messages in thread
From: Greg Kroah-Hartman @ 2016-05-17  1:20 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Jann Horn, Linus Torvalds,
	Alexei Starovoitov, Daniel Borkmann, David S. Miller

4.5-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Jann Horn <jannh@google.com>

[ Upstream commit 8358b02bf67d3a5d8a825070e1aa73f25fb2e4c7 ]

When bpf(BPF_PROG_LOAD, ...) was invoked with a BPF program whose bytecode
references a non-map file descriptor as a map file descriptor, the error
handling code called fdput() twice instead of once (in __bpf_map_get() and
in replace_map_fd_with_map_ptr()). If the file descriptor table of the
current task is shared, this causes f_count to be decremented too much,
allowing the struct file to be freed while it is still in use
(use-after-free). This can be exploited to gain root privileges by an
unprivileged user.

This bug was introduced in
commit 0246e64d9a5f ("bpf: handle pseudo BPF_LD_IMM64 insn"), but is only
exploitable since
commit 1be7f75d1668 ("bpf: enable non-root eBPF programs") because
previously, CAP_SYS_ADMIN was required to reach the vulnerable code.

(posted publicly according to request by maintainer)

Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 kernel/bpf/verifier.c |    1 -
 1 file changed, 1 deletion(-)

--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -2004,7 +2004,6 @@ static int replace_map_fd_with_map_ptr(s
 			if (IS_ERR(map)) {
 				verbose("fd %d is not pointing to valid bpf_map\n",
 					insn->imm);
-				fdput(f);
 				return PTR_ERR(map);
 			}
 

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH 4.5 021/101] bpf: fix refcnt overflow
  2016-05-17  1:20 [PATCH 4.5 000/101] 4.5.5-stable review Greg Kroah-Hartman
                   ` (17 preceding siblings ...)
  2016-05-17  1:20 ` [PATCH 4.5 020/101] bpf: fix double-fdput in replace_map_fd_with_map_ptr() Greg Kroah-Hartman
@ 2016-05-17  1:20 ` Greg Kroah-Hartman
  2016-05-17  1:20 ` [PATCH 4.5 022/101] bpf: fix check_map_func_compatibility logic Greg Kroah-Hartman
                   ` (73 subsequent siblings)
  92 siblings, 0 replies; 94+ messages in thread
From: Greg Kroah-Hartman @ 2016-05-17  1:20 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Jann Horn, Alexei Starovoitov,
	Daniel Borkmann, David S. Miller

4.5-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Alexei Starovoitov <ast@fb.com>

[ Upstream commit 92117d8443bc5afacc8d5ba82e541946310f106e ]

On a system with >32Gbyte of phyiscal memory and infinite RLIMIT_MEMLOCK,
the malicious application may overflow 32-bit bpf program refcnt.
It's also possible to overflow map refcnt on 1Tb system.
Impose 32k hard limit which means that the same bpf program or
map cannot be shared by more than 32k processes.

Fixes: 1be7f75d1668 ("bpf: enable non-root eBPF programs")
Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 include/linux/bpf.h   |    3 ++-
 kernel/bpf/inode.c    |    7 ++++---
 kernel/bpf/syscall.c  |   24 ++++++++++++++++++++----
 kernel/bpf/verifier.c |   11 +++++++----
 4 files changed, 33 insertions(+), 12 deletions(-)

--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -165,12 +165,13 @@ void bpf_register_prog_type(struct bpf_p
 void bpf_register_map_type(struct bpf_map_type_list *tl);
 
 struct bpf_prog *bpf_prog_get(u32 ufd);
+struct bpf_prog *bpf_prog_inc(struct bpf_prog *prog);
 void bpf_prog_put(struct bpf_prog *prog);
 void bpf_prog_put_rcu(struct bpf_prog *prog);
 
 struct bpf_map *bpf_map_get_with_uref(u32 ufd);
 struct bpf_map *__bpf_map_get(struct fd f);
-void bpf_map_inc(struct bpf_map *map, bool uref);
+struct bpf_map *bpf_map_inc(struct bpf_map *map, bool uref);
 void bpf_map_put_with_uref(struct bpf_map *map);
 void bpf_map_put(struct bpf_map *map);
 
--- a/kernel/bpf/inode.c
+++ b/kernel/bpf/inode.c
@@ -31,10 +31,10 @@ static void *bpf_any_get(void *raw, enum
 {
 	switch (type) {
 	case BPF_TYPE_PROG:
-		atomic_inc(&((struct bpf_prog *)raw)->aux->refcnt);
+		raw = bpf_prog_inc(raw);
 		break;
 	case BPF_TYPE_MAP:
-		bpf_map_inc(raw, true);
+		raw = bpf_map_inc(raw, true);
 		break;
 	default:
 		WARN_ON_ONCE(1);
@@ -297,7 +297,8 @@ static void *bpf_obj_do_get(const struct
 		goto out;
 
 	raw = bpf_any_get(inode->i_private, *type);
-	touch_atime(&path);
+	if (!IS_ERR(raw))
+		touch_atime(&path);
 
 	path_put(&path);
 	return raw;
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -201,11 +201,18 @@ struct bpf_map *__bpf_map_get(struct fd
 	return f.file->private_data;
 }
 
-void bpf_map_inc(struct bpf_map *map, bool uref)
+/* prog's and map's refcnt limit */
+#define BPF_MAX_REFCNT 32768
+
+struct bpf_map *bpf_map_inc(struct bpf_map *map, bool uref)
 {
-	atomic_inc(&map->refcnt);
+	if (atomic_inc_return(&map->refcnt) > BPF_MAX_REFCNT) {
+		atomic_dec(&map->refcnt);
+		return ERR_PTR(-EBUSY);
+	}
 	if (uref)
 		atomic_inc(&map->usercnt);
+	return map;
 }
 
 struct bpf_map *bpf_map_get_with_uref(u32 ufd)
@@ -217,7 +224,7 @@ struct bpf_map *bpf_map_get_with_uref(u3
 	if (IS_ERR(map))
 		return map;
 
-	bpf_map_inc(map, true);
+	map = bpf_map_inc(map, true);
 	fdput(f);
 
 	return map;
@@ -600,6 +607,15 @@ static struct bpf_prog *__bpf_prog_get(s
 	return f.file->private_data;
 }
 
+struct bpf_prog *bpf_prog_inc(struct bpf_prog *prog)
+{
+	if (atomic_inc_return(&prog->aux->refcnt) > BPF_MAX_REFCNT) {
+		atomic_dec(&prog->aux->refcnt);
+		return ERR_PTR(-EBUSY);
+	}
+	return prog;
+}
+
 /* called by sockets/tracing/seccomp before attaching program to an event
  * pairs with bpf_prog_put()
  */
@@ -612,7 +628,7 @@ struct bpf_prog *bpf_prog_get(u32 ufd)
 	if (IS_ERR(prog))
 		return prog;
 
-	atomic_inc(&prog->aux->refcnt);
+	prog = bpf_prog_inc(prog);
 	fdput(f);
 
 	return prog;
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -2023,15 +2023,18 @@ static int replace_map_fd_with_map_ptr(s
 				return -E2BIG;
 			}
 
-			/* remember this map */
-			env->used_maps[env->used_map_cnt++] = map;
-
 			/* hold the map. If the program is rejected by verifier,
 			 * the map will be released by release_maps() or it
 			 * will be used by the valid program until it's unloaded
 			 * and all maps are released in free_bpf_prog_info()
 			 */
-			bpf_map_inc(map, false);
+			map = bpf_map_inc(map, false);
+			if (IS_ERR(map)) {
+				fdput(f);
+				return PTR_ERR(map);
+			}
+			env->used_maps[env->used_map_cnt++] = map;
+
 			fdput(f);
 next_insn:
 			insn++;

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH 4.5 022/101] bpf: fix check_map_func_compatibility logic
  2016-05-17  1:20 [PATCH 4.5 000/101] 4.5.5-stable review Greg Kroah-Hartman
                   ` (18 preceding siblings ...)
  2016-05-17  1:20 ` [PATCH 4.5 021/101] bpf: fix refcnt overflow Greg Kroah-Hartman
@ 2016-05-17  1:20 ` Greg Kroah-Hartman
  2016-05-17  1:20 ` [PATCH 4.5 023/101] samples/bpf: fix trace_output example Greg Kroah-Hartman
                   ` (72 subsequent siblings)
  92 siblings, 0 replies; 94+ messages in thread
From: Greg Kroah-Hartman @ 2016-05-17  1:20 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Jann Horn, Alexei Starovoitov,
	Daniel Borkmann, David S. Miller

4.5-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Alexei Starovoitov <ast@fb.com>

[ Upstream commit 6aff67c85c9e5a4bc99e5211c1bac547936626ca ]

The commit 35578d798400 ("bpf: Implement function bpf_perf_event_read() that get the selected hardware PMU conuter")
introduced clever way to check bpf_helper<->map_type compatibility.
Later on commit a43eec304259 ("bpf: introduce bpf_perf_event_output() helper") adjusted
the logic and inadvertently broke it.
Get rid of the clever bool compare and go back to two-way check
from map and from helper perspective.

Fixes: a43eec304259 ("bpf: introduce bpf_perf_event_output() helper")
Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 kernel/bpf/verifier.c |   53 ++++++++++++++++++++++++++++++--------------------
 1 file changed, 32 insertions(+), 21 deletions(-)

--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -239,15 +239,6 @@ static const char * const reg_type_str[]
 	[CONST_IMM]		= "imm",
 };
 
-static const struct {
-	int map_type;
-	int func_id;
-} func_limit[] = {
-	{BPF_MAP_TYPE_PROG_ARRAY, BPF_FUNC_tail_call},
-	{BPF_MAP_TYPE_PERF_EVENT_ARRAY, BPF_FUNC_perf_event_read},
-	{BPF_MAP_TYPE_PERF_EVENT_ARRAY, BPF_FUNC_perf_event_output},
-};
-
 static void print_verifier_state(struct verifier_env *env)
 {
 	enum bpf_reg_type t;
@@ -898,24 +889,44 @@ static int check_func_arg(struct verifie
 
 static int check_map_func_compatibility(struct bpf_map *map, int func_id)
 {
-	bool bool_map, bool_func;
-	int i;
-
 	if (!map)
 		return 0;
 
-	for (i = 0; i < ARRAY_SIZE(func_limit); i++) {
-		bool_map = (map->map_type == func_limit[i].map_type);
-		bool_func = (func_id == func_limit[i].func_id);
-		/* only when map & func pair match it can continue.
-		 * don't allow any other map type to be passed into
-		 * the special func;
-		 */
-		if (bool_func && bool_map != bool_func)
-			return -EINVAL;
+	/* We need a two way check, first is from map perspective ... */
+	switch (map->map_type) {
+	case BPF_MAP_TYPE_PROG_ARRAY:
+		if (func_id != BPF_FUNC_tail_call)
+			goto error;
+		break;
+	case BPF_MAP_TYPE_PERF_EVENT_ARRAY:
+		if (func_id != BPF_FUNC_perf_event_read &&
+		    func_id != BPF_FUNC_perf_event_output)
+			goto error;
+		break;
+	default:
+		break;
+	}
+
+	/* ... and second from the function itself. */
+	switch (func_id) {
+	case BPF_FUNC_tail_call:
+		if (map->map_type != BPF_MAP_TYPE_PROG_ARRAY)
+			goto error;
+		break;
+	case BPF_FUNC_perf_event_read:
+	case BPF_FUNC_perf_event_output:
+		if (map->map_type != BPF_MAP_TYPE_PERF_EVENT_ARRAY)
+			goto error;
+		break;
+	default:
+		break;
 	}
 
 	return 0;
+error:
+	verbose("cannot pass map_type %d into func %d\n",
+		map->map_type, func_id);
+	return -EINVAL;
 }
 
 static int check_call(struct verifier_env *env, int func_id)

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH 4.5 023/101] samples/bpf: fix trace_output example
  2016-05-17  1:20 [PATCH 4.5 000/101] 4.5.5-stable review Greg Kroah-Hartman
                   ` (19 preceding siblings ...)
  2016-05-17  1:20 ` [PATCH 4.5 022/101] bpf: fix check_map_func_compatibility logic Greg Kroah-Hartman
@ 2016-05-17  1:20 ` Greg Kroah-Hartman
  2016-05-17  1:20 ` [PATCH 4.5 024/101] net: Implement net_dbg_ratelimited() for CONFIG_DYNAMIC_DEBUG case Greg Kroah-Hartman
                   ` (71 subsequent siblings)
  92 siblings, 0 replies; 94+ messages in thread
From: Greg Kroah-Hartman @ 2016-05-17  1:20 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Alexei Starovoitov, David S. Miller

4.5-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Alexei Starovoitov <ast@fb.com>

[ Upstream commit 569cc39d39385a74b23145496bca2df5ac8b2fb8 ]

llvm cannot always recognize memset as builtin function and optimize
it away, so just delete it. It was a leftover from testing
of bpf_perf_event_output() with large data structures.

Fixes: 39111695b1b8 ("samples: bpf: add bpf_perf_event_output example")
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 samples/bpf/trace_output_kern.c |    1 -
 1 file changed, 1 deletion(-)

--- a/samples/bpf/trace_output_kern.c
+++ b/samples/bpf/trace_output_kern.c
@@ -18,7 +18,6 @@ int bpf_prog1(struct pt_regs *ctx)
 		u64 cookie;
 	} data;
 
-	memset(&data, 0, sizeof(data));
 	data.pid = bpf_get_current_pid_tgid();
 	data.cookie = 0x12345678;
 

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH 4.5 024/101] net: Implement net_dbg_ratelimited() for CONFIG_DYNAMIC_DEBUG case
  2016-05-17  1:20 [PATCH 4.5 000/101] 4.5.5-stable review Greg Kroah-Hartman
                   ` (20 preceding siblings ...)
  2016-05-17  1:20 ` [PATCH 4.5 023/101] samples/bpf: fix trace_output example Greg Kroah-Hartman
@ 2016-05-17  1:20 ` Greg Kroah-Hartman
  2016-05-17  1:20 ` [PATCH 4.5 025/101] gre: do not pull header in ICMP error processing Greg Kroah-Hartman
                   ` (70 subsequent siblings)
  92 siblings, 0 replies; 94+ messages in thread
From: Greg Kroah-Hartman @ 2016-05-17  1:20 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Tim Bingham, David S. Miller

4.5-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Tim Bingham <tbingham@akamai.com>

[ Upstream commit 2c94b53738549d81dc7464a32117d1f5112c64d3 ]

Prior to commit d92cff89a0c8 ("net_dbg_ratelimited: turn into no-op
when !DEBUG") the implementation of net_dbg_ratelimited() was buggy
for both the DEBUG and CONFIG_DYNAMIC_DEBUG cases.

The bug was that net_ratelimit() was being called and, despite
returning true, nothing was being printed to the console. This
resulted in messages like the following -

"net_ratelimit: %d callbacks suppressed"

with no other output nearby.

After commit d92cff89a0c8 ("net_dbg_ratelimited: turn into no-op when
!DEBUG") the bug is fixed for the DEBUG case. However, there's no
output at all for CONFIG_DYNAMIC_DEBUG case.

This patch restores debug output (if enabled) for the
CONFIG_DYNAMIC_DEBUG case.

Add a definition of net_dbg_ratelimited() for the CONFIG_DYNAMIC_DEBUG
case. The implementation takes care to check that dynamic debugging is
enabled before calling net_ratelimit().

Fixes: d92cff89a0c8 ("net_dbg_ratelimited: turn into no-op when !DEBUG")
Signed-off-by: Tim Bingham <tbingham@akamai.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 include/linux/net.h |   10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

--- a/include/linux/net.h
+++ b/include/linux/net.h
@@ -245,7 +245,15 @@ do {								\
 	net_ratelimited_function(pr_warn, fmt, ##__VA_ARGS__)
 #define net_info_ratelimited(fmt, ...)				\
 	net_ratelimited_function(pr_info, fmt, ##__VA_ARGS__)
-#if defined(DEBUG)
+#if defined(CONFIG_DYNAMIC_DEBUG)
+#define net_dbg_ratelimited(fmt, ...)					\
+do {									\
+	DEFINE_DYNAMIC_DEBUG_METADATA(descriptor, fmt);			\
+	if (unlikely(descriptor.flags & _DPRINTK_FLAGS_PRINT) &&	\
+	    net_ratelimit())						\
+		__dynamic_pr_debug(&descriptor, fmt, ##__VA_ARGS__);	\
+} while (0)
+#elif defined(DEBUG)
 #define net_dbg_ratelimited(fmt, ...)				\
 	net_ratelimited_function(pr_debug, fmt, ##__VA_ARGS__)
 #else

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH 4.5 025/101] gre: do not pull header in ICMP error processing
  2016-05-17  1:20 [PATCH 4.5 000/101] 4.5.5-stable review Greg Kroah-Hartman
                   ` (21 preceding siblings ...)
  2016-05-17  1:20 ` [PATCH 4.5 024/101] net: Implement net_dbg_ratelimited() for CONFIG_DYNAMIC_DEBUG case Greg Kroah-Hartman
@ 2016-05-17  1:20 ` Greg Kroah-Hartman
  2016-05-17  1:20 ` [PATCH 4.5 026/101] net_sched: introduce qdisc_replace() helper Greg Kroah-Hartman
                   ` (69 subsequent siblings)
  92 siblings, 0 replies; 94+ messages in thread
From: Greg Kroah-Hartman @ 2016-05-17  1:20 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Jiri Benc, David S. Miller

4.5-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Jiri Benc <jbenc@redhat.com>

[ Upstream commit b7f8fe251e4609e2a437bd2c2dea01e61db6849c ]

iptunnel_pull_header expects that IP header was already pulled; with this
expectation, it pulls the tunnel header. This is not true in gre_err.
Furthermore, ipv4_update_pmtu and ipv4_redirect expect that skb->data points
to the IP header.

We cannot pull the tunnel header in this path. It's just a matter of not
calling iptunnel_pull_header - we don't need any of its effects.

Fixes: bda7bb463436 ("gre: Allow multiple protocol listener for gre protocol.")
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/ipv4/ip_gre.c |   11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

--- a/net/ipv4/ip_gre.c
+++ b/net/ipv4/ip_gre.c
@@ -179,6 +179,7 @@ static __be16 tnl_flags_to_gre_flags(__b
 	return flags;
 }
 
+/* Fills in tpi and returns header length to be pulled. */
 static int parse_gre_header(struct sk_buff *skb, struct tnl_ptk_info *tpi,
 			    bool *csum_err)
 {
@@ -238,7 +239,7 @@ static int parse_gre_header(struct sk_bu
 				return -EINVAL;
 		}
 	}
-	return iptunnel_pull_header(skb, hdr_len, tpi->proto);
+	return hdr_len;
 }
 
 static void ipgre_err(struct sk_buff *skb, u32 info,
@@ -341,7 +342,7 @@ static void gre_err(struct sk_buff *skb,
 	struct tnl_ptk_info tpi;
 	bool csum_err = false;
 
-	if (parse_gre_header(skb, &tpi, &csum_err)) {
+	if (parse_gre_header(skb, &tpi, &csum_err) < 0) {
 		if (!csum_err)		/* ignore csum errors. */
 			return;
 	}
@@ -419,6 +420,7 @@ static int gre_rcv(struct sk_buff *skb)
 {
 	struct tnl_ptk_info tpi;
 	bool csum_err = false;
+	int hdr_len;
 
 #ifdef CONFIG_NET_IPGRE_BROADCAST
 	if (ipv4_is_multicast(ip_hdr(skb)->daddr)) {
@@ -428,7 +430,10 @@ static int gre_rcv(struct sk_buff *skb)
 	}
 #endif
 
-	if (parse_gre_header(skb, &tpi, &csum_err) < 0)
+	hdr_len = parse_gre_header(skb, &tpi, &csum_err);
+	if (hdr_len < 0)
+		goto drop;
+	if (iptunnel_pull_header(skb, hdr_len, tpi.proto) < 0)
 		goto drop;
 
 	if (ipgre_rcv(skb, &tpi) == PACKET_RCVD)

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH 4.5 026/101] net_sched: introduce qdisc_replace() helper
  2016-05-17  1:20 [PATCH 4.5 000/101] 4.5.5-stable review Greg Kroah-Hartman
                   ` (22 preceding siblings ...)
  2016-05-17  1:20 ` [PATCH 4.5 025/101] gre: do not pull header in ICMP error processing Greg Kroah-Hartman
@ 2016-05-17  1:20 ` Greg Kroah-Hartman
  2016-05-17  1:20 ` [PATCH 4.5 027/101] net_sched: update hierarchical backlog too Greg Kroah-Hartman
                   ` (68 subsequent siblings)
  92 siblings, 0 replies; 94+ messages in thread
From: Greg Kroah-Hartman @ 2016-05-17  1:20 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Jamal Hadi Salim, Cong Wang, David S. Miller

4.5-stable review patch.  If anyone has any objections, please let me know.

------------------

From: WANG Cong <xiyou.wangcong@gmail.com>

[ Upstream commit 86a7996cc8a078793670d82ed97d5a99bb4e8496 ]

Remove nearly duplicated code and prepare for the following patch.

Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 include/net/sch_generic.h |   17 +++++++++++++++++
 net/sched/sch_cbq.c       |    7 +------
 net/sched/sch_drr.c       |    6 +-----
 net/sched/sch_dsmark.c    |    8 +-------
 net/sched/sch_hfsc.c      |    6 +-----
 net/sched/sch_htb.c       |    9 +--------
 net/sched/sch_multiq.c    |    8 +-------
 net/sched/sch_netem.c     |   10 +---------
 net/sched/sch_prio.c      |    8 +-------
 net/sched/sch_qfq.c       |    6 +-----
 net/sched/sch_red.c       |    7 +------
 net/sched/sch_sfb.c       |    7 +------
 net/sched/sch_tbf.c       |    8 +-------
 13 files changed, 29 insertions(+), 78 deletions(-)

--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -707,6 +707,23 @@ static inline void qdisc_reset_queue(str
 	sch->qstats.backlog = 0;
 }
 
+static inline struct Qdisc *qdisc_replace(struct Qdisc *sch, struct Qdisc *new,
+					  struct Qdisc **pold)
+{
+	struct Qdisc *old;
+
+	sch_tree_lock(sch);
+	old = *pold;
+	*pold = new;
+	if (old != NULL) {
+		qdisc_tree_decrease_qlen(old, old->q.qlen);
+		qdisc_reset(old);
+	}
+	sch_tree_unlock(sch);
+
+	return old;
+}
+
 static inline unsigned int __qdisc_queue_drop(struct Qdisc *sch,
 					      struct sk_buff_head *list)
 {
--- a/net/sched/sch_cbq.c
+++ b/net/sched/sch_cbq.c
@@ -1624,13 +1624,8 @@ static int cbq_graft(struct Qdisc *sch,
 			new->reshape_fail = cbq_reshape_fail;
 #endif
 	}
-	sch_tree_lock(sch);
-	*old = cl->q;
-	cl->q = new;
-	qdisc_tree_decrease_qlen(*old, (*old)->q.qlen);
-	qdisc_reset(*old);
-	sch_tree_unlock(sch);
 
+	*old = qdisc_replace(sch, new, &cl->q);
 	return 0;
 }
 
--- a/net/sched/sch_drr.c
+++ b/net/sched/sch_drr.c
@@ -226,11 +226,7 @@ static int drr_graft_class(struct Qdisc
 			new = &noop_qdisc;
 	}
 
-	sch_tree_lock(sch);
-	drr_purge_queue(cl);
-	*old = cl->qdisc;
-	cl->qdisc = new;
-	sch_tree_unlock(sch);
+	*old = qdisc_replace(sch, new, &cl->qdisc);
 	return 0;
 }
 
--- a/net/sched/sch_dsmark.c
+++ b/net/sched/sch_dsmark.c
@@ -73,13 +73,7 @@ static int dsmark_graft(struct Qdisc *sc
 			new = &noop_qdisc;
 	}
 
-	sch_tree_lock(sch);
-	*old = p->q;
-	p->q = new;
-	qdisc_tree_decrease_qlen(*old, (*old)->q.qlen);
-	qdisc_reset(*old);
-	sch_tree_unlock(sch);
-
+	*old = qdisc_replace(sch, new, &p->q);
 	return 0;
 }
 
--- a/net/sched/sch_hfsc.c
+++ b/net/sched/sch_hfsc.c
@@ -1215,11 +1215,7 @@ hfsc_graft_class(struct Qdisc *sch, unsi
 			new = &noop_qdisc;
 	}
 
-	sch_tree_lock(sch);
-	hfsc_purge_queue(sch, cl);
-	*old = cl->qdisc;
-	cl->qdisc = new;
-	sch_tree_unlock(sch);
+	*old = qdisc_replace(sch, new, &cl->qdisc);
 	return 0;
 }
 
--- a/net/sched/sch_htb.c
+++ b/net/sched/sch_htb.c
@@ -1163,14 +1163,7 @@ static int htb_graft(struct Qdisc *sch,
 				     cl->common.classid)) == NULL)
 		return -ENOBUFS;
 
-	sch_tree_lock(sch);
-	*old = cl->un.leaf.q;
-	cl->un.leaf.q = new;
-	if (*old != NULL) {
-		qdisc_tree_decrease_qlen(*old, (*old)->q.qlen);
-		qdisc_reset(*old);
-	}
-	sch_tree_unlock(sch);
+	*old = qdisc_replace(sch, new, &cl->un.leaf.q);
 	return 0;
 }
 
--- a/net/sched/sch_multiq.c
+++ b/net/sched/sch_multiq.c
@@ -303,13 +303,7 @@ static int multiq_graft(struct Qdisc *sc
 	if (new == NULL)
 		new = &noop_qdisc;
 
-	sch_tree_lock(sch);
-	*old = q->queues[band];
-	q->queues[band] = new;
-	qdisc_tree_decrease_qlen(*old, (*old)->q.qlen);
-	qdisc_reset(*old);
-	sch_tree_unlock(sch);
-
+	*old = qdisc_replace(sch, new, &q->queues[band]);
 	return 0;
 }
 
--- a/net/sched/sch_netem.c
+++ b/net/sched/sch_netem.c
@@ -1037,15 +1037,7 @@ static int netem_graft(struct Qdisc *sch
 {
 	struct netem_sched_data *q = qdisc_priv(sch);
 
-	sch_tree_lock(sch);
-	*old = q->qdisc;
-	q->qdisc = new;
-	if (*old) {
-		qdisc_tree_decrease_qlen(*old, (*old)->q.qlen);
-		qdisc_reset(*old);
-	}
-	sch_tree_unlock(sch);
-
+	*old = qdisc_replace(sch, new, &q->qdisc);
 	return 0;
 }
 
--- a/net/sched/sch_prio.c
+++ b/net/sched/sch_prio.c
@@ -268,13 +268,7 @@ static int prio_graft(struct Qdisc *sch,
 	if (new == NULL)
 		new = &noop_qdisc;
 
-	sch_tree_lock(sch);
-	*old = q->queues[band];
-	q->queues[band] = new;
-	qdisc_tree_decrease_qlen(*old, (*old)->q.qlen);
-	qdisc_reset(*old);
-	sch_tree_unlock(sch);
-
+	*old = qdisc_replace(sch, new, &q->queues[band]);
 	return 0;
 }
 
--- a/net/sched/sch_qfq.c
+++ b/net/sched/sch_qfq.c
@@ -617,11 +617,7 @@ static int qfq_graft_class(struct Qdisc
 			new = &noop_qdisc;
 	}
 
-	sch_tree_lock(sch);
-	qfq_purge_queue(cl);
-	*old = cl->qdisc;
-	cl->qdisc = new;
-	sch_tree_unlock(sch);
+	*old = qdisc_replace(sch, new, &cl->qdisc);
 	return 0;
 }
 
--- a/net/sched/sch_red.c
+++ b/net/sched/sch_red.c
@@ -313,12 +313,7 @@ static int red_graft(struct Qdisc *sch,
 	if (new == NULL)
 		new = &noop_qdisc;
 
-	sch_tree_lock(sch);
-	*old = q->qdisc;
-	q->qdisc = new;
-	qdisc_tree_decrease_qlen(*old, (*old)->q.qlen);
-	qdisc_reset(*old);
-	sch_tree_unlock(sch);
+	*old = qdisc_replace(sch, new, &q->qdisc);
 	return 0;
 }
 
--- a/net/sched/sch_sfb.c
+++ b/net/sched/sch_sfb.c
@@ -606,12 +606,7 @@ static int sfb_graft(struct Qdisc *sch,
 	if (new == NULL)
 		new = &noop_qdisc;
 
-	sch_tree_lock(sch);
-	*old = q->qdisc;
-	q->qdisc = new;
-	qdisc_tree_decrease_qlen(*old, (*old)->q.qlen);
-	qdisc_reset(*old);
-	sch_tree_unlock(sch);
+	*old = qdisc_replace(sch, new, &q->qdisc);
 	return 0;
 }
 
--- a/net/sched/sch_tbf.c
+++ b/net/sched/sch_tbf.c
@@ -502,13 +502,7 @@ static int tbf_graft(struct Qdisc *sch,
 	if (new == NULL)
 		new = &noop_qdisc;
 
-	sch_tree_lock(sch);
-	*old = q->qdisc;
-	q->qdisc = new;
-	qdisc_tree_decrease_qlen(*old, (*old)->q.qlen);
-	qdisc_reset(*old);
-	sch_tree_unlock(sch);
-
+	*old = qdisc_replace(sch, new, &q->qdisc);
 	return 0;
 }
 

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH 4.5 027/101] net_sched: update hierarchical backlog too
  2016-05-17  1:20 [PATCH 4.5 000/101] 4.5.5-stable review Greg Kroah-Hartman
                   ` (23 preceding siblings ...)
  2016-05-17  1:20 ` [PATCH 4.5 026/101] net_sched: introduce qdisc_replace() helper Greg Kroah-Hartman
@ 2016-05-17  1:20 ` Greg Kroah-Hartman
  2016-05-17  1:20 ` [PATCH 4.5 028/101] sch_htb: update backlog as well Greg Kroah-Hartman
                   ` (67 subsequent siblings)
  92 siblings, 0 replies; 94+ messages in thread
From: Greg Kroah-Hartman @ 2016-05-17  1:20 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Jamal Hadi Salim, Cong Wang, David S. Miller

4.5-stable review patch.  If anyone has any objections, please let me know.

------------------

From: WANG Cong <xiyou.wangcong@gmail.com>

[ Upstream commit 2ccccf5fb43ff62b2b96cc58d95fc0b3596516e4 ]

When the bottom qdisc decides to, for example, drop some packet,
it calls qdisc_tree_decrease_qlen() to update the queue length
for all its ancestors, we need to update the backlog too to
keep the stats on root qdisc accurate.

Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 include/net/codel.h       |    4 ++++
 include/net/sch_generic.h |    5 +++--
 net/sched/sch_api.c       |    8 +++++---
 net/sched/sch_cbq.c       |    5 +++--
 net/sched/sch_choke.c     |    6 ++++--
 net/sched/sch_codel.c     |   10 ++++++----
 net/sched/sch_drr.c       |    3 ++-
 net/sched/sch_fq.c        |    4 +++-
 net/sched/sch_fq_codel.c  |   17 ++++++++++++-----
 net/sched/sch_hfsc.c      |    3 ++-
 net/sched/sch_hhf.c       |   10 +++++++---
 net/sched/sch_htb.c       |   10 ++++++----
 net/sched/sch_multiq.c    |    8 +++++---
 net/sched/sch_netem.c     |    3 ++-
 net/sched/sch_pie.c       |    5 +++--
 net/sched/sch_prio.c      |    7 ++++---
 net/sched/sch_qfq.c       |    3 ++-
 net/sched/sch_red.c       |    3 ++-
 net/sched/sch_sfb.c       |    3 ++-
 net/sched/sch_sfq.c       |   16 +++++++++-------
 net/sched/sch_tbf.c       |    7 +++++--
 21 files changed, 91 insertions(+), 49 deletions(-)

--- a/include/net/codel.h
+++ b/include/net/codel.h
@@ -162,12 +162,14 @@ struct codel_vars {
  * struct codel_stats - contains codel shared variables and stats
  * @maxpacket:	largest packet we've seen so far
  * @drop_count:	temp count of dropped packets in dequeue()
+ * @drop_len:	bytes of dropped packets in dequeue()
  * ecn_mark:	number of packets we ECN marked instead of dropping
  * ce_mark:	number of packets CE marked because sojourn time was above ce_threshold
  */
 struct codel_stats {
 	u32		maxpacket;
 	u32		drop_count;
+	u32		drop_len;
 	u32		ecn_mark;
 	u32		ce_mark;
 };
@@ -308,6 +310,7 @@ static struct sk_buff *codel_dequeue(str
 								  vars->rec_inv_sqrt);
 					goto end;
 				}
+				stats->drop_len += qdisc_pkt_len(skb);
 				qdisc_drop(skb, sch);
 				stats->drop_count++;
 				skb = dequeue_func(vars, sch);
@@ -330,6 +333,7 @@ static struct sk_buff *codel_dequeue(str
 		if (params->ecn && INET_ECN_set_ce(skb)) {
 			stats->ecn_mark++;
 		} else {
+			stats->drop_len += qdisc_pkt_len(skb);
 			qdisc_drop(skb, sch);
 			stats->drop_count++;
 
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -396,7 +396,8 @@ struct Qdisc *dev_graft_qdisc(struct net
 			      struct Qdisc *qdisc);
 void qdisc_reset(struct Qdisc *qdisc);
 void qdisc_destroy(struct Qdisc *qdisc);
-void qdisc_tree_decrease_qlen(struct Qdisc *qdisc, unsigned int n);
+void qdisc_tree_reduce_backlog(struct Qdisc *qdisc, unsigned int n,
+			       unsigned int len);
 struct Qdisc *qdisc_alloc(struct netdev_queue *dev_queue,
 			  const struct Qdisc_ops *ops);
 struct Qdisc *qdisc_create_dflt(struct netdev_queue *dev_queue,
@@ -716,7 +717,7 @@ static inline struct Qdisc *qdisc_replac
 	old = *pold;
 	*pold = new;
 	if (old != NULL) {
-		qdisc_tree_decrease_qlen(old, old->q.qlen);
+		qdisc_tree_reduce_backlog(old, old->q.qlen, old->qstats.backlog);
 		qdisc_reset(old);
 	}
 	sch_tree_unlock(sch);
--- a/net/sched/sch_api.c
+++ b/net/sched/sch_api.c
@@ -744,14 +744,15 @@ static u32 qdisc_alloc_handle(struct net
 	return 0;
 }
 
-void qdisc_tree_decrease_qlen(struct Qdisc *sch, unsigned int n)
+void qdisc_tree_reduce_backlog(struct Qdisc *sch, unsigned int n,
+			       unsigned int len)
 {
 	const struct Qdisc_class_ops *cops;
 	unsigned long cl;
 	u32 parentid;
 	int drops;
 
-	if (n == 0)
+	if (n == 0 && len == 0)
 		return;
 	drops = max_t(int, n, 0);
 	rcu_read_lock();
@@ -774,11 +775,12 @@ void qdisc_tree_decrease_qlen(struct Qdi
 			cops->put(sch, cl);
 		}
 		sch->q.qlen -= n;
+		sch->qstats.backlog -= len;
 		__qdisc_qstats_drop(sch, drops);
 	}
 	rcu_read_unlock();
 }
-EXPORT_SYMBOL(qdisc_tree_decrease_qlen);
+EXPORT_SYMBOL(qdisc_tree_reduce_backlog);
 
 static void notify_and_destroy(struct net *net, struct sk_buff *skb,
 			       struct nlmsghdr *n, u32 clid,
--- a/net/sched/sch_cbq.c
+++ b/net/sched/sch_cbq.c
@@ -1909,7 +1909,7 @@ static int cbq_delete(struct Qdisc *sch,
 {
 	struct cbq_sched_data *q = qdisc_priv(sch);
 	struct cbq_class *cl = (struct cbq_class *)arg;
-	unsigned int qlen;
+	unsigned int qlen, backlog;
 
 	if (cl->filters || cl->children || cl == &q->link)
 		return -EBUSY;
@@ -1917,8 +1917,9 @@ static int cbq_delete(struct Qdisc *sch,
 	sch_tree_lock(sch);
 
 	qlen = cl->q->q.qlen;
+	backlog = cl->q->qstats.backlog;
 	qdisc_reset(cl->q);
-	qdisc_tree_decrease_qlen(cl->q, qlen);
+	qdisc_tree_reduce_backlog(cl->q, qlen, backlog);
 
 	if (cl->next_alive)
 		cbq_deactivate_class(cl);
--- a/net/sched/sch_choke.c
+++ b/net/sched/sch_choke.c
@@ -128,8 +128,8 @@ static void choke_drop_by_idx(struct Qdi
 		choke_zap_tail_holes(q);
 
 	qdisc_qstats_backlog_dec(sch, skb);
+	qdisc_tree_reduce_backlog(sch, 1, qdisc_pkt_len(skb));
 	qdisc_drop(skb, sch);
-	qdisc_tree_decrease_qlen(sch, 1);
 	--sch->q.qlen;
 }
 
@@ -456,6 +456,7 @@ static int choke_change(struct Qdisc *sc
 		old = q->tab;
 		if (old) {
 			unsigned int oqlen = sch->q.qlen, tail = 0;
+			unsigned dropped = 0;
 
 			while (q->head != q->tail) {
 				struct sk_buff *skb = q->tab[q->head];
@@ -467,11 +468,12 @@ static int choke_change(struct Qdisc *sc
 					ntab[tail++] = skb;
 					continue;
 				}
+				dropped += qdisc_pkt_len(skb);
 				qdisc_qstats_backlog_dec(sch, skb);
 				--sch->q.qlen;
 				qdisc_drop(skb, sch);
 			}
-			qdisc_tree_decrease_qlen(sch, oqlen - sch->q.qlen);
+			qdisc_tree_reduce_backlog(sch, oqlen - sch->q.qlen, dropped);
 			q->head = 0;
 			q->tail = tail;
 		}
--- a/net/sched/sch_codel.c
+++ b/net/sched/sch_codel.c
@@ -79,12 +79,13 @@ static struct sk_buff *codel_qdisc_deque
 
 	skb = codel_dequeue(sch, &q->params, &q->vars, &q->stats, dequeue);
 
-	/* We cant call qdisc_tree_decrease_qlen() if our qlen is 0,
+	/* We cant call qdisc_tree_reduce_backlog() if our qlen is 0,
 	 * or HTB crashes. Defer it for next round.
 	 */
 	if (q->stats.drop_count && sch->q.qlen) {
-		qdisc_tree_decrease_qlen(sch, q->stats.drop_count);
+		qdisc_tree_reduce_backlog(sch, q->stats.drop_count, q->stats.drop_len);
 		q->stats.drop_count = 0;
+		q->stats.drop_len = 0;
 	}
 	if (skb)
 		qdisc_bstats_update(sch, skb);
@@ -116,7 +117,7 @@ static int codel_change(struct Qdisc *sc
 {
 	struct codel_sched_data *q = qdisc_priv(sch);
 	struct nlattr *tb[TCA_CODEL_MAX + 1];
-	unsigned int qlen;
+	unsigned int qlen, dropped = 0;
 	int err;
 
 	if (!opt)
@@ -156,10 +157,11 @@ static int codel_change(struct Qdisc *sc
 	while (sch->q.qlen > sch->limit) {
 		struct sk_buff *skb = __skb_dequeue(&sch->q);
 
+		dropped += qdisc_pkt_len(skb);
 		qdisc_qstats_backlog_dec(sch, skb);
 		qdisc_drop(skb, sch);
 	}
-	qdisc_tree_decrease_qlen(sch, qlen - sch->q.qlen);
+	qdisc_tree_reduce_backlog(sch, qlen - sch->q.qlen, dropped);
 
 	sch_tree_unlock(sch);
 	return 0;
--- a/net/sched/sch_drr.c
+++ b/net/sched/sch_drr.c
@@ -53,9 +53,10 @@ static struct drr_class *drr_find_class(
 static void drr_purge_queue(struct drr_class *cl)
 {
 	unsigned int len = cl->qdisc->q.qlen;
+	unsigned int backlog = cl->qdisc->qstats.backlog;
 
 	qdisc_reset(cl->qdisc);
-	qdisc_tree_decrease_qlen(cl->qdisc, len);
+	qdisc_tree_reduce_backlog(cl->qdisc, len, backlog);
 }
 
 static const struct nla_policy drr_policy[TCA_DRR_MAX + 1] = {
--- a/net/sched/sch_fq.c
+++ b/net/sched/sch_fq.c
@@ -662,6 +662,7 @@ static int fq_change(struct Qdisc *sch,
 	struct fq_sched_data *q = qdisc_priv(sch);
 	struct nlattr *tb[TCA_FQ_MAX + 1];
 	int err, drop_count = 0;
+	unsigned drop_len = 0;
 	u32 fq_log;
 
 	if (!opt)
@@ -736,10 +737,11 @@ static int fq_change(struct Qdisc *sch,
 
 		if (!skb)
 			break;
+		drop_len += qdisc_pkt_len(skb);
 		kfree_skb(skb);
 		drop_count++;
 	}
-	qdisc_tree_decrease_qlen(sch, drop_count);
+	qdisc_tree_reduce_backlog(sch, drop_count, drop_len);
 
 	sch_tree_unlock(sch);
 	return err;
--- a/net/sched/sch_fq_codel.c
+++ b/net/sched/sch_fq_codel.c
@@ -175,7 +175,7 @@ static unsigned int fq_codel_qdisc_drop(
 static int fq_codel_enqueue(struct sk_buff *skb, struct Qdisc *sch)
 {
 	struct fq_codel_sched_data *q = qdisc_priv(sch);
-	unsigned int idx;
+	unsigned int idx, prev_backlog;
 	struct fq_codel_flow *flow;
 	int uninitialized_var(ret);
 
@@ -203,6 +203,7 @@ static int fq_codel_enqueue(struct sk_bu
 	if (++sch->q.qlen <= sch->limit)
 		return NET_XMIT_SUCCESS;
 
+	prev_backlog = sch->qstats.backlog;
 	q->drop_overlimit++;
 	/* Return Congestion Notification only if we dropped a packet
 	 * from this flow.
@@ -211,7 +212,7 @@ static int fq_codel_enqueue(struct sk_bu
 		return NET_XMIT_CN;
 
 	/* As we dropped a packet, better let upper stack know this */
-	qdisc_tree_decrease_qlen(sch, 1);
+	qdisc_tree_reduce_backlog(sch, 1, prev_backlog - sch->qstats.backlog);
 	return NET_XMIT_SUCCESS;
 }
 
@@ -241,6 +242,7 @@ static struct sk_buff *fq_codel_dequeue(
 	struct fq_codel_flow *flow;
 	struct list_head *head;
 	u32 prev_drop_count, prev_ecn_mark;
+	unsigned int prev_backlog;
 
 begin:
 	head = &q->new_flows;
@@ -259,6 +261,7 @@ begin:
 
 	prev_drop_count = q->cstats.drop_count;
 	prev_ecn_mark = q->cstats.ecn_mark;
+	prev_backlog = sch->qstats.backlog;
 
 	skb = codel_dequeue(sch, &q->cparams, &flow->cvars, &q->cstats,
 			    dequeue);
@@ -276,12 +279,14 @@ begin:
 	}
 	qdisc_bstats_update(sch, skb);
 	flow->deficit -= qdisc_pkt_len(skb);
-	/* We cant call qdisc_tree_decrease_qlen() if our qlen is 0,
+	/* We cant call qdisc_tree_reduce_backlog() if our qlen is 0,
 	 * or HTB crashes. Defer it for next round.
 	 */
 	if (q->cstats.drop_count && sch->q.qlen) {
-		qdisc_tree_decrease_qlen(sch, q->cstats.drop_count);
+		qdisc_tree_reduce_backlog(sch, q->cstats.drop_count,
+					  q->cstats.drop_len);
 		q->cstats.drop_count = 0;
+		q->cstats.drop_len = 0;
 	}
 	return skb;
 }
@@ -372,11 +377,13 @@ static int fq_codel_change(struct Qdisc
 	while (sch->q.qlen > sch->limit) {
 		struct sk_buff *skb = fq_codel_dequeue(sch);
 
+		q->cstats.drop_len += qdisc_pkt_len(skb);
 		kfree_skb(skb);
 		q->cstats.drop_count++;
 	}
-	qdisc_tree_decrease_qlen(sch, q->cstats.drop_count);
+	qdisc_tree_reduce_backlog(sch, q->cstats.drop_count, q->cstats.drop_len);
 	q->cstats.drop_count = 0;
+	q->cstats.drop_len = 0;
 
 	sch_tree_unlock(sch);
 	return 0;
--- a/net/sched/sch_hfsc.c
+++ b/net/sched/sch_hfsc.c
@@ -895,9 +895,10 @@ static void
 hfsc_purge_queue(struct Qdisc *sch, struct hfsc_class *cl)
 {
 	unsigned int len = cl->qdisc->q.qlen;
+	unsigned int backlog = cl->qdisc->qstats.backlog;
 
 	qdisc_reset(cl->qdisc);
-	qdisc_tree_decrease_qlen(cl->qdisc, len);
+	qdisc_tree_reduce_backlog(cl->qdisc, len, backlog);
 }
 
 static void
--- a/net/sched/sch_hhf.c
+++ b/net/sched/sch_hhf.c
@@ -382,6 +382,7 @@ static int hhf_enqueue(struct sk_buff *s
 	struct hhf_sched_data *q = qdisc_priv(sch);
 	enum wdrr_bucket_idx idx;
 	struct wdrr_bucket *bucket;
+	unsigned int prev_backlog;
 
 	idx = hhf_classify(skb, sch);
 
@@ -409,6 +410,7 @@ static int hhf_enqueue(struct sk_buff *s
 	if (++sch->q.qlen <= sch->limit)
 		return NET_XMIT_SUCCESS;
 
+	prev_backlog = sch->qstats.backlog;
 	q->drop_overlimit++;
 	/* Return Congestion Notification only if we dropped a packet from this
 	 * bucket.
@@ -417,7 +419,7 @@ static int hhf_enqueue(struct sk_buff *s
 		return NET_XMIT_CN;
 
 	/* As we dropped a packet, better let upper stack know this. */
-	qdisc_tree_decrease_qlen(sch, 1);
+	qdisc_tree_reduce_backlog(sch, 1, prev_backlog - sch->qstats.backlog);
 	return NET_XMIT_SUCCESS;
 }
 
@@ -527,7 +529,7 @@ static int hhf_change(struct Qdisc *sch,
 {
 	struct hhf_sched_data *q = qdisc_priv(sch);
 	struct nlattr *tb[TCA_HHF_MAX + 1];
-	unsigned int qlen;
+	unsigned int qlen, prev_backlog;
 	int err;
 	u64 non_hh_quantum;
 	u32 new_quantum = q->quantum;
@@ -577,12 +579,14 @@ static int hhf_change(struct Qdisc *sch,
 	}
 
 	qlen = sch->q.qlen;
+	prev_backlog = sch->qstats.backlog;
 	while (sch->q.qlen > sch->limit) {
 		struct sk_buff *skb = hhf_dequeue(sch);
 
 		kfree_skb(skb);
 	}
-	qdisc_tree_decrease_qlen(sch, qlen - sch->q.qlen);
+	qdisc_tree_reduce_backlog(sch, qlen - sch->q.qlen,
+				  prev_backlog - sch->qstats.backlog);
 
 	sch_tree_unlock(sch);
 	return 0;
--- a/net/sched/sch_htb.c
+++ b/net/sched/sch_htb.c
@@ -1265,7 +1265,6 @@ static int htb_delete(struct Qdisc *sch,
 {
 	struct htb_sched *q = qdisc_priv(sch);
 	struct htb_class *cl = (struct htb_class *)arg;
-	unsigned int qlen;
 	struct Qdisc *new_q = NULL;
 	int last_child = 0;
 
@@ -1285,9 +1284,11 @@ static int htb_delete(struct Qdisc *sch,
 	sch_tree_lock(sch);
 
 	if (!cl->level) {
-		qlen = cl->un.leaf.q->q.qlen;
+		unsigned int qlen = cl->un.leaf.q->q.qlen;
+		unsigned int backlog = cl->un.leaf.q->qstats.backlog;
+
 		qdisc_reset(cl->un.leaf.q);
-		qdisc_tree_decrease_qlen(cl->un.leaf.q, qlen);
+		qdisc_tree_reduce_backlog(cl->un.leaf.q, qlen, backlog);
 	}
 
 	/* delete from hash and active; remainder in destroy_class */
@@ -1421,10 +1422,11 @@ static int htb_change_class(struct Qdisc
 		sch_tree_lock(sch);
 		if (parent && !parent->level) {
 			unsigned int qlen = parent->un.leaf.q->q.qlen;
+			unsigned int backlog = parent->un.leaf.q->qstats.backlog;
 
 			/* turn parent into inner node */
 			qdisc_reset(parent->un.leaf.q);
-			qdisc_tree_decrease_qlen(parent->un.leaf.q, qlen);
+			qdisc_tree_reduce_backlog(parent->un.leaf.q, qlen, backlog);
 			qdisc_destroy(parent->un.leaf.q);
 			if (parent->prio_activity)
 				htb_deactivate(q, parent);
--- a/net/sched/sch_multiq.c
+++ b/net/sched/sch_multiq.c
@@ -218,7 +218,8 @@ static int multiq_tune(struct Qdisc *sch
 		if (q->queues[i] != &noop_qdisc) {
 			struct Qdisc *child = q->queues[i];
 			q->queues[i] = &noop_qdisc;
-			qdisc_tree_decrease_qlen(child, child->q.qlen);
+			qdisc_tree_reduce_backlog(child, child->q.qlen,
+						  child->qstats.backlog);
 			qdisc_destroy(child);
 		}
 	}
@@ -238,8 +239,9 @@ static int multiq_tune(struct Qdisc *sch
 				q->queues[i] = child;
 
 				if (old != &noop_qdisc) {
-					qdisc_tree_decrease_qlen(old,
-								 old->q.qlen);
+					qdisc_tree_reduce_backlog(old,
+								  old->q.qlen,
+								  old->qstats.backlog);
 					qdisc_destroy(old);
 				}
 				sch_tree_unlock(sch);
--- a/net/sched/sch_netem.c
+++ b/net/sched/sch_netem.c
@@ -598,7 +598,8 @@ deliver:
 				if (unlikely(err != NET_XMIT_SUCCESS)) {
 					if (net_xmit_drop_count(err)) {
 						qdisc_qstats_drop(sch);
-						qdisc_tree_decrease_qlen(sch, 1);
+						qdisc_tree_reduce_backlog(sch, 1,
+									  qdisc_pkt_len(skb));
 					}
 				}
 				goto tfifo_dequeue;
--- a/net/sched/sch_pie.c
+++ b/net/sched/sch_pie.c
@@ -183,7 +183,7 @@ static int pie_change(struct Qdisc *sch,
 {
 	struct pie_sched_data *q = qdisc_priv(sch);
 	struct nlattr *tb[TCA_PIE_MAX + 1];
-	unsigned int qlen;
+	unsigned int qlen, dropped = 0;
 	int err;
 
 	if (!opt)
@@ -232,10 +232,11 @@ static int pie_change(struct Qdisc *sch,
 	while (sch->q.qlen > sch->limit) {
 		struct sk_buff *skb = __skb_dequeue(&sch->q);
 
+		dropped += qdisc_pkt_len(skb);
 		qdisc_qstats_backlog_dec(sch, skb);
 		qdisc_drop(skb, sch);
 	}
-	qdisc_tree_decrease_qlen(sch, qlen - sch->q.qlen);
+	qdisc_tree_reduce_backlog(sch, qlen - sch->q.qlen, dropped);
 
 	sch_tree_unlock(sch);
 	return 0;
--- a/net/sched/sch_prio.c
+++ b/net/sched/sch_prio.c
@@ -191,7 +191,7 @@ static int prio_tune(struct Qdisc *sch,
 		struct Qdisc *child = q->queues[i];
 		q->queues[i] = &noop_qdisc;
 		if (child != &noop_qdisc) {
-			qdisc_tree_decrease_qlen(child, child->q.qlen);
+			qdisc_tree_reduce_backlog(child, child->q.qlen, child->qstats.backlog);
 			qdisc_destroy(child);
 		}
 	}
@@ -210,8 +210,9 @@ static int prio_tune(struct Qdisc *sch,
 				q->queues[i] = child;
 
 				if (old != &noop_qdisc) {
-					qdisc_tree_decrease_qlen(old,
-								 old->q.qlen);
+					qdisc_tree_reduce_backlog(old,
+								  old->q.qlen,
+								  old->qstats.backlog);
 					qdisc_destroy(old);
 				}
 				sch_tree_unlock(sch);
--- a/net/sched/sch_qfq.c
+++ b/net/sched/sch_qfq.c
@@ -220,9 +220,10 @@ static struct qfq_class *qfq_find_class(
 static void qfq_purge_queue(struct qfq_class *cl)
 {
 	unsigned int len = cl->qdisc->q.qlen;
+	unsigned int backlog = cl->qdisc->qstats.backlog;
 
 	qdisc_reset(cl->qdisc);
-	qdisc_tree_decrease_qlen(cl->qdisc, len);
+	qdisc_tree_reduce_backlog(cl->qdisc, len, backlog);
 }
 
 static const struct nla_policy qfq_policy[TCA_QFQ_MAX + 1] = {
--- a/net/sched/sch_red.c
+++ b/net/sched/sch_red.c
@@ -210,7 +210,8 @@ static int red_change(struct Qdisc *sch,
 	q->flags = ctl->flags;
 	q->limit = ctl->limit;
 	if (child) {
-		qdisc_tree_decrease_qlen(q->qdisc, q->qdisc->q.qlen);
+		qdisc_tree_reduce_backlog(q->qdisc, q->qdisc->q.qlen,
+					  q->qdisc->qstats.backlog);
 		qdisc_destroy(q->qdisc);
 		q->qdisc = child;
 	}
--- a/net/sched/sch_sfb.c
+++ b/net/sched/sch_sfb.c
@@ -510,7 +510,8 @@ static int sfb_change(struct Qdisc *sch,
 
 	sch_tree_lock(sch);
 
-	qdisc_tree_decrease_qlen(q->qdisc, q->qdisc->q.qlen);
+	qdisc_tree_reduce_backlog(q->qdisc, q->qdisc->q.qlen,
+				  q->qdisc->qstats.backlog);
 	qdisc_destroy(q->qdisc);
 	q->qdisc = child;
 
--- a/net/sched/sch_sfq.c
+++ b/net/sched/sch_sfq.c
@@ -346,7 +346,7 @@ static int
 sfq_enqueue(struct sk_buff *skb, struct Qdisc *sch)
 {
 	struct sfq_sched_data *q = qdisc_priv(sch);
-	unsigned int hash;
+	unsigned int hash, dropped;
 	sfq_index x, qlen;
 	struct sfq_slot *slot;
 	int uninitialized_var(ret);
@@ -461,7 +461,7 @@ enqueue:
 		return NET_XMIT_SUCCESS;
 
 	qlen = slot->qlen;
-	sfq_drop(sch);
+	dropped = sfq_drop(sch);
 	/* Return Congestion Notification only if we dropped a packet
 	 * from this flow.
 	 */
@@ -469,7 +469,7 @@ enqueue:
 		return NET_XMIT_CN;
 
 	/* As we dropped a packet, better let upper stack know this */
-	qdisc_tree_decrease_qlen(sch, 1);
+	qdisc_tree_reduce_backlog(sch, 1, dropped);
 	return NET_XMIT_SUCCESS;
 }
 
@@ -537,6 +537,7 @@ static void sfq_rehash(struct Qdisc *sch
 	struct sfq_slot *slot;
 	struct sk_buff_head list;
 	int dropped = 0;
+	unsigned int drop_len = 0;
 
 	__skb_queue_head_init(&list);
 
@@ -565,6 +566,7 @@ static void sfq_rehash(struct Qdisc *sch
 			if (x >= SFQ_MAX_FLOWS) {
 drop:
 				qdisc_qstats_backlog_dec(sch, skb);
+				drop_len += qdisc_pkt_len(skb);
 				kfree_skb(skb);
 				dropped++;
 				continue;
@@ -594,7 +596,7 @@ drop:
 		}
 	}
 	sch->q.qlen -= dropped;
-	qdisc_tree_decrease_qlen(sch, dropped);
+	qdisc_tree_reduce_backlog(sch, dropped, drop_len);
 }
 
 static void sfq_perturbation(unsigned long arg)
@@ -618,7 +620,7 @@ static int sfq_change(struct Qdisc *sch,
 	struct sfq_sched_data *q = qdisc_priv(sch);
 	struct tc_sfq_qopt *ctl = nla_data(opt);
 	struct tc_sfq_qopt_v1 *ctl_v1 = NULL;
-	unsigned int qlen;
+	unsigned int qlen, dropped = 0;
 	struct red_parms *p = NULL;
 
 	if (opt->nla_len < nla_attr_size(sizeof(*ctl)))
@@ -667,8 +669,8 @@ static int sfq_change(struct Qdisc *sch,
 
 	qlen = sch->q.qlen;
 	while (sch->q.qlen > q->limit)
-		sfq_drop(sch);
-	qdisc_tree_decrease_qlen(sch, qlen - sch->q.qlen);
+		dropped += sfq_drop(sch);
+	qdisc_tree_reduce_backlog(sch, qlen - sch->q.qlen, dropped);
 
 	del_timer(&q->perturb_timer);
 	if (q->perturb_period) {
--- a/net/sched/sch_tbf.c
+++ b/net/sched/sch_tbf.c
@@ -160,6 +160,7 @@ static int tbf_segment(struct sk_buff *s
 	struct tbf_sched_data *q = qdisc_priv(sch);
 	struct sk_buff *segs, *nskb;
 	netdev_features_t features = netif_skb_features(skb);
+	unsigned int len = 0, prev_len = qdisc_pkt_len(skb);
 	int ret, nb;
 
 	segs = skb_gso_segment(skb, features & ~NETIF_F_GSO_MASK);
@@ -172,6 +173,7 @@ static int tbf_segment(struct sk_buff *s
 		nskb = segs->next;
 		segs->next = NULL;
 		qdisc_skb_cb(segs)->pkt_len = segs->len;
+		len += segs->len;
 		ret = qdisc_enqueue(segs, q->qdisc);
 		if (ret != NET_XMIT_SUCCESS) {
 			if (net_xmit_drop_count(ret))
@@ -183,7 +185,7 @@ static int tbf_segment(struct sk_buff *s
 	}
 	sch->q.qlen += nb;
 	if (nb > 1)
-		qdisc_tree_decrease_qlen(sch, 1 - nb);
+		qdisc_tree_reduce_backlog(sch, 1 - nb, prev_len - len);
 	consume_skb(skb);
 	return nb > 0 ? NET_XMIT_SUCCESS : NET_XMIT_DROP;
 }
@@ -399,7 +401,8 @@ static int tbf_change(struct Qdisc *sch,
 
 	sch_tree_lock(sch);
 	if (child) {
-		qdisc_tree_decrease_qlen(q->qdisc, q->qdisc->q.qlen);
+		qdisc_tree_reduce_backlog(q->qdisc, q->qdisc->q.qlen,
+					  q->qdisc->qstats.backlog);
 		qdisc_destroy(q->qdisc);
 		q->qdisc = child;
 	}

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH 4.5 028/101] sch_htb: update backlog as well
  2016-05-17  1:20 [PATCH 4.5 000/101] 4.5.5-stable review Greg Kroah-Hartman
                   ` (24 preceding siblings ...)
  2016-05-17  1:20 ` [PATCH 4.5 027/101] net_sched: update hierarchical backlog too Greg Kroah-Hartman
@ 2016-05-17  1:20 ` Greg Kroah-Hartman
  2016-05-17  1:20 ` [PATCH 4.5 029/101] sch_dsmark: " Greg Kroah-Hartman
                   ` (66 subsequent siblings)
  92 siblings, 0 replies; 94+ messages in thread
From: Greg Kroah-Hartman @ 2016-05-17  1:20 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Jamal Hadi Salim, Cong Wang, David S. Miller

4.5-stable review patch.  If anyone has any objections, please let me know.

------------------

From: WANG Cong <xiyou.wangcong@gmail.com>

[ Upstream commit 431e3a8e36a05a37126f34b41aa3a5a6456af04e ]

We saw qlen!=0 but backlog==0 on our production machine:

qdisc htb 1: dev eth0 root refcnt 2 r2q 10 default 1 direct_packets_stat 0 ver 3.17
 Sent 172680457356 bytes 222469449 pkt (dropped 0, overlimits 123575834 requeues 0)
 backlog 0b 72p requeues 0

The problem is we only count qlen for HTB qdisc but not backlog.
We need to update backlog too when we update qlen, so that we
can at least know the average packet length.

Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/sched/sch_htb.c |    5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

--- a/net/sched/sch_htb.c
+++ b/net/sched/sch_htb.c
@@ -600,6 +600,7 @@ static int htb_enqueue(struct sk_buff *s
 		htb_activate(q, cl);
 	}
 
+	qdisc_qstats_backlog_inc(sch, skb);
 	sch->q.qlen++;
 	return NET_XMIT_SUCCESS;
 }
@@ -889,6 +890,7 @@ static struct sk_buff *htb_dequeue(struc
 ok:
 		qdisc_bstats_update(sch, skb);
 		qdisc_unthrottled(sch);
+		qdisc_qstats_backlog_dec(sch, skb);
 		sch->q.qlen--;
 		return skb;
 	}
@@ -955,6 +957,7 @@ static unsigned int htb_drop(struct Qdis
 			unsigned int len;
 			if (cl->un.leaf.q->ops->drop &&
 			    (len = cl->un.leaf.q->ops->drop(cl->un.leaf.q))) {
+				sch->qstats.backlog -= len;
 				sch->q.qlen--;
 				if (!cl->un.leaf.q->q.qlen)
 					htb_deactivate(q, cl);
@@ -984,12 +987,12 @@ static void htb_reset(struct Qdisc *sch)
 			}
 			cl->prio_activity = 0;
 			cl->cmode = HTB_CAN_SEND;
-
 		}
 	}
 	qdisc_watchdog_cancel(&q->watchdog);
 	__skb_queue_purge(&q->direct_queue);
 	sch->q.qlen = 0;
+	sch->qstats.backlog = 0;
 	memset(q->hlevel, 0, sizeof(q->hlevel));
 	memset(q->row_mask, 0, sizeof(q->row_mask));
 	for (i = 0; i < TC_HTB_NUMPRIO; i++)

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH 4.5 029/101] sch_dsmark: update backlog as well
  2016-05-17  1:20 [PATCH 4.5 000/101] 4.5.5-stable review Greg Kroah-Hartman
                   ` (25 preceding siblings ...)
  2016-05-17  1:20 ` [PATCH 4.5 028/101] sch_htb: update backlog as well Greg Kroah-Hartman
@ 2016-05-17  1:20 ` Greg Kroah-Hartman
  2016-05-17  1:20 ` [PATCH 4.5 030/101] netem: Segment GSO packets on enqueue Greg Kroah-Hartman
                   ` (65 subsequent siblings)
  92 siblings, 0 replies; 94+ messages in thread
From: Greg Kroah-Hartman @ 2016-05-17  1:20 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Jamal Hadi Salim, Cong Wang, David S. Miller

4.5-stable review patch.  If anyone has any objections, please let me know.

------------------

From: WANG Cong <xiyou.wangcong@gmail.com>

[ Upstream commit bdf17661f63a79c3cb4209b970b1cc39e34f7543 ]

Similarly, we need to update backlog too when we update qlen.

Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/sched/sch_dsmark.c |    3 +++
 1 file changed, 3 insertions(+)

--- a/net/sched/sch_dsmark.c
+++ b/net/sched/sch_dsmark.c
@@ -258,6 +258,7 @@ static int dsmark_enqueue(struct sk_buff
 		return err;
 	}
 
+	qdisc_qstats_backlog_inc(sch, skb);
 	sch->q.qlen++;
 
 	return NET_XMIT_SUCCESS;
@@ -280,6 +281,7 @@ static struct sk_buff *dsmark_dequeue(st
 		return NULL;
 
 	qdisc_bstats_update(sch, skb);
+	qdisc_qstats_backlog_dec(sch, skb);
 	sch->q.qlen--;
 
 	index = skb->tc_index & (p->indices - 1);
@@ -395,6 +397,7 @@ static void dsmark_reset(struct Qdisc *s
 
 	pr_debug("%s(sch %p,[qdisc %p])\n", __func__, sch, p);
 	qdisc_reset(p->q);
+	sch->qstats.backlog = 0;
 	sch->q.qlen = 0;
 }
 

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH 4.5 030/101] netem: Segment GSO packets on enqueue
  2016-05-17  1:20 [PATCH 4.5 000/101] 4.5.5-stable review Greg Kroah-Hartman
                   ` (26 preceding siblings ...)
  2016-05-17  1:20 ` [PATCH 4.5 029/101] sch_dsmark: " Greg Kroah-Hartman
@ 2016-05-17  1:20 ` Greg Kroah-Hartman
  2016-05-17  1:20 ` [PATCH 4.5 031/101] ipv6/ila: fix nlsize calculation for lwtunnel Greg Kroah-Hartman
                   ` (64 subsequent siblings)
  92 siblings, 0 replies; 94+ messages in thread
From: Greg Kroah-Hartman @ 2016-05-17  1:20 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Neil Horman, Jamal Hadi Salim,
	David S. Miller, netem, eric.dumazet, stephen, Eric Dumazet

4.5-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Neil Horman <nhorman@tuxdriver.com>

[ Upstream commit 6071bd1aa13ed9e41824bafad845b7b7f4df5cfd ]

This was recently reported to me, and reproduced on the latest net kernel,
when attempting to run netperf from a host that had a netem qdisc attached
to the egress interface:

[  788.073771] ---------------------[ cut here ]---------------------------
[  788.096716] WARNING: at net/core/dev.c:2253 skb_warn_bad_offload+0xcd/0xda()
[  788.129521] bnx2: caps=(0x00000001801949b3, 0x0000000000000000) len=2962
data_len=0 gso_size=1448 gso_type=1 ip_summed=3
[  788.182150] Modules linked in: sch_netem kvm_amd kvm crc32_pclmul ipmi_ssif
ghash_clmulni_intel sp5100_tco amd64_edac_mod aesni_intel lrw gf128mul
glue_helper ablk_helper edac_mce_amd cryptd pcspkr sg edac_core hpilo ipmi_si
i2c_piix4 k10temp fam15h_power hpwdt ipmi_msghandler shpchp acpi_power_meter
pcc_cpufreq nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c
sd_mod crc_t10dif crct10dif_generic mgag200 syscopyarea sysfillrect sysimgblt
i2c_algo_bit drm_kms_helper ahci ata_generic pata_acpi ttm libahci
crct10dif_pclmul pata_atiixp tg3 libata crct10dif_common drm crc32c_intel ptp
serio_raw bnx2 r8169 hpsa pps_core i2c_core mii dm_mirror dm_region_hash dm_log
dm_mod
[  788.465294] CPU: 16 PID: 0 Comm: swapper/16 Tainted: G        W
------------   3.10.0-327.el7.x86_64 #1
[  788.511521] Hardware name: HP ProLiant DL385p Gen8, BIOS A28 12/17/2012
[  788.542260]  ffff880437c036b8 f7afc56532a53db9 ffff880437c03670
ffffffff816351f1
[  788.576332]  ffff880437c036a8 ffffffff8107b200 ffff880633e74200
ffff880231674000
[  788.611943]  0000000000000001 0000000000000003 0000000000000000
ffff880437c03710
[  788.647241] Call Trace:
[  788.658817]  <IRQ>  [<ffffffff816351f1>] dump_stack+0x19/0x1b
[  788.686193]  [<ffffffff8107b200>] warn_slowpath_common+0x70/0xb0
[  788.713803]  [<ffffffff8107b29c>] warn_slowpath_fmt+0x5c/0x80
[  788.741314]  [<ffffffff812f92f3>] ? ___ratelimit+0x93/0x100
[  788.767018]  [<ffffffff81637f49>] skb_warn_bad_offload+0xcd/0xda
[  788.796117]  [<ffffffff8152950c>] skb_checksum_help+0x17c/0x190
[  788.823392]  [<ffffffffa01463a1>] netem_enqueue+0x741/0x7c0 [sch_netem]
[  788.854487]  [<ffffffff8152cb58>] dev_queue_xmit+0x2a8/0x570
[  788.880870]  [<ffffffff8156ae1d>] ip_finish_output+0x53d/0x7d0
...

The problem occurs because netem is not prepared to handle GSO packets (as it
uses skb_checksum_help in its enqueue path, which cannot manipulate these
frames).

The solution I think is to simply segment the skb in a simmilar fashion to the
way we do in __dev_queue_xmit (via validate_xmit_skb), with some minor changes.
When we decide to corrupt an skb, if the frame is GSO, we segment it, corrupt
the first segment, and enqueue the remaining ones.

tested successfully by myself on the latest net kernel, to which this applies

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
CC: Jamal Hadi Salim <jhs@mojatatu.com>
CC: "David S. Miller" <davem@davemloft.net>
CC: netem@lists.linux-foundation.org
CC: eric.dumazet@gmail.com
CC: stephen@networkplumber.org
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/sched/sch_netem.c |   61 ++++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 59 insertions(+), 2 deletions(-)

--- a/net/sched/sch_netem.c
+++ b/net/sched/sch_netem.c
@@ -395,6 +395,25 @@ static void tfifo_enqueue(struct sk_buff
 	sch->q.qlen++;
 }
 
+/* netem can't properly corrupt a megapacket (like we get from GSO), so instead
+ * when we statistically choose to corrupt one, we instead segment it, returning
+ * the first packet to be corrupted, and re-enqueue the remaining frames
+ */
+static struct sk_buff *netem_segment(struct sk_buff *skb, struct Qdisc *sch)
+{
+	struct sk_buff *segs;
+	netdev_features_t features = netif_skb_features(skb);
+
+	segs = skb_gso_segment(skb, features & ~NETIF_F_GSO_MASK);
+
+	if (IS_ERR_OR_NULL(segs)) {
+		qdisc_reshape_fail(skb, sch);
+		return NULL;
+	}
+	consume_skb(skb);
+	return segs;
+}
+
 /*
  * Insert one skb into qdisc.
  * Note: parent depends on return value to account for queue length.
@@ -407,7 +426,11 @@ static int netem_enqueue(struct sk_buff
 	/* We don't fill cb now as skb_unshare() may invalidate it */
 	struct netem_skb_cb *cb;
 	struct sk_buff *skb2;
+	struct sk_buff *segs = NULL;
+	unsigned int len = 0, last_len, prev_len = qdisc_pkt_len(skb);
+	int nb = 0;
 	int count = 1;
+	int rc = NET_XMIT_SUCCESS;
 
 	/* Random duplication */
 	if (q->duplicate && q->duplicate >= get_crandom(&q->dup_cor))
@@ -453,10 +476,23 @@ static int netem_enqueue(struct sk_buff
 	 * do it now in software before we mangle it.
 	 */
 	if (q->corrupt && q->corrupt >= get_crandom(&q->corrupt_cor)) {
+		if (skb_is_gso(skb)) {
+			segs = netem_segment(skb, sch);
+			if (!segs)
+				return NET_XMIT_DROP;
+		} else {
+			segs = skb;
+		}
+
+		skb = segs;
+		segs = segs->next;
+
 		if (!(skb = skb_unshare(skb, GFP_ATOMIC)) ||
 		    (skb->ip_summed == CHECKSUM_PARTIAL &&
-		     skb_checksum_help(skb)))
-			return qdisc_drop(skb, sch);
+		     skb_checksum_help(skb))) {
+			rc = qdisc_drop(skb, sch);
+			goto finish_segs;
+		}
 
 		skb->data[prandom_u32() % skb_headlen(skb)] ^=
 			1<<(prandom_u32() % 8);
@@ -516,6 +552,27 @@ static int netem_enqueue(struct sk_buff
 		sch->qstats.requeues++;
 	}
 
+finish_segs:
+	if (segs) {
+		while (segs) {
+			skb2 = segs->next;
+			segs->next = NULL;
+			qdisc_skb_cb(segs)->pkt_len = segs->len;
+			last_len = segs->len;
+			rc = qdisc_enqueue(segs, sch);
+			if (rc != NET_XMIT_SUCCESS) {
+				if (net_xmit_drop_count(rc))
+					qdisc_qstats_drop(sch);
+			} else {
+				nb++;
+				len += last_len;
+			}
+			segs = skb2;
+		}
+		sch->q.qlen += nb;
+		if (nb > 1)
+			qdisc_tree_reduce_backlog(sch, 1 - nb, prev_len - len);
+	}
 	return NET_XMIT_SUCCESS;
 }
 

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH 4.5 031/101] ipv6/ila: fix nlsize calculation for lwtunnel
  2016-05-17  1:20 [PATCH 4.5 000/101] 4.5.5-stable review Greg Kroah-Hartman
                   ` (27 preceding siblings ...)
  2016-05-17  1:20 ` [PATCH 4.5 030/101] netem: Segment GSO packets on enqueue Greg Kroah-Hartman
@ 2016-05-17  1:20 ` Greg Kroah-Hartman
  2016-05-17  1:20 ` [PATCH 4.5 035/101] net/mlx4_en: Fix endianness bug in IPV6 csum calculation Greg Kroah-Hartman
                   ` (63 subsequent siblings)
  92 siblings, 0 replies; 94+ messages in thread
From: Greg Kroah-Hartman @ 2016-05-17  1:20 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Tom Herbert, Nicolas Dichtel,
	David S. Miller

4.5-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Nicolas Dichtel <nicolas.dichtel@6wind.com>

[ Upstream commit 79e8dc8b80bff0bc5bbb90ca5e73044bf207c8ac ]

The handler 'ila_fill_encap_info' adds one attribute: ILA_ATTR_LOCATOR.

Fixes: 65d7ab8de582 ("net: Identifier Locator Addressing module")
CC: Tom Herbert <tom@herbertland.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/ipv6/ila/ila_lwt.c |    3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

--- a/net/ipv6/ila/ila_lwt.c
+++ b/net/ipv6/ila/ila_lwt.c
@@ -120,8 +120,7 @@ nla_put_failure:
 
 static int ila_encap_nlsize(struct lwtunnel_state *lwtstate)
 {
-	/* No encapsulation overhead */
-	return 0;
+	return nla_total_size(sizeof(u64)); /* ILA_ATTR_LOCATOR */
 }
 
 static int ila_encap_cmp(struct lwtunnel_state *a, struct lwtunnel_state *b)

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH 4.5 035/101] net/mlx4_en: Fix endianness bug in IPV6 csum calculation
  2016-05-17  1:20 [PATCH 4.5 000/101] 4.5.5-stable review Greg Kroah-Hartman
                   ` (28 preceding siblings ...)
  2016-05-17  1:20 ` [PATCH 4.5 031/101] ipv6/ila: fix nlsize calculation for lwtunnel Greg Kroah-Hartman
@ 2016-05-17  1:20 ` Greg Kroah-Hartman
  2016-05-17  1:20 ` [PATCH 4.5 036/101] VSOCK: do not disconnect socket when peer has shutdown SEND only Greg Kroah-Hartman
                   ` (62 subsequent siblings)
  92 siblings, 0 replies; 94+ messages in thread
From: Greg Kroah-Hartman @ 2016-05-17  1:20 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Daniel Jurgens, Carol Soto,
	Tariq Toukan, David S. Miller

4.5-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Daniel Jurgens <danielj@mellanox.com>

[ Upstream commit 82d69203df634b4dfa765c94f60ce9482bcc44d6 ]

Use htons instead of unconditionally byte swapping nexthdr.  On a little
endian systems shifting the byte is correct behavior, but it results in
incorrect csums on big endian architectures.

Fixes: f8c6455bb04b ('net/mlx4_en: Extend checksum offloading by CHECKSUM COMPLETE')
Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Reviewed-by: Carol Soto <clsoto@us.ibm.com>
Tested-by: Carol Soto <clsoto@us.ibm.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/net/ethernet/mellanox/mlx4/en_rx.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/net/ethernet/mellanox/mlx4/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
@@ -704,7 +704,7 @@ static int get_fixed_ipv6_csum(__wsum hw
 
 	if (ipv6h->nexthdr == IPPROTO_FRAGMENT || ipv6h->nexthdr == IPPROTO_HOPOPTS)
 		return -1;
-	hw_checksum = csum_add(hw_checksum, (__force __wsum)(ipv6h->nexthdr << 8));
+	hw_checksum = csum_add(hw_checksum, (__force __wsum)htons(ipv6h->nexthdr));
 
 	csum_pseudo_hdr = csum_partial(&ipv6h->saddr,
 				       sizeof(ipv6h->saddr) + sizeof(ipv6h->daddr), 0);

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH 4.5 036/101] VSOCK: do not disconnect socket when peer has shutdown SEND only
  2016-05-17  1:20 [PATCH 4.5 000/101] 4.5.5-stable review Greg Kroah-Hartman
                   ` (29 preceding siblings ...)
  2016-05-17  1:20 ` [PATCH 4.5 035/101] net/mlx4_en: Fix endianness bug in IPV6 csum calculation Greg Kroah-Hartman
@ 2016-05-17  1:20 ` Greg Kroah-Hartman
  2016-05-17  1:20 ` [PATCH 4.5 037/101] net: bridge: fix old ioctl unlocked net device walk Greg Kroah-Hartman
                   ` (61 subsequent siblings)
  92 siblings, 0 replies; 94+ messages in thread
From: Greg Kroah-Hartman @ 2016-05-17  1:20 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Ian Campbell, David S. Miller,
	Stefan Hajnoczi, Claudio Imbrenda, Andy King, Dmitry Torokhov,
	Jorgen Hansen, Adit Ranadive, netdev

4.5-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Ian Campbell <ian.campbell@docker.com>

[ Upstream commit dedc58e067d8c379a15a8a183c5db318201295bb ]

The peer may be expecting a reply having sent a request and then done a
shutdown(SHUT_WR), so tearing down the whole socket at this point seems
wrong and breaks for me with a client which does a SHUT_WR.

Looking at other socket family's stream_recvmsg callbacks doing a shutdown
here does not seem to be the norm and removing it does not seem to have
had any adverse effects that I can see.

I'm using Stefan's RFC virtio transport patches, I'm unsure of the impact
on the vmci transport.

Signed-off-by: Ian Campbell <ian.campbell@docker.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Claudio Imbrenda <imbrenda@linux.vnet.ibm.com>
Cc: Andy King <acking@vmware.com>
Cc: Dmitry Torokhov <dtor@vmware.com>
Cc: Jorgen Hansen <jhansen@vmware.com>
Cc: Adit Ranadive <aditr@vmware.com>
Cc: netdev@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/vmw_vsock/af_vsock.c |   21 +--------------------
 1 file changed, 1 insertion(+), 20 deletions(-)

--- a/net/vmw_vsock/af_vsock.c
+++ b/net/vmw_vsock/af_vsock.c
@@ -1789,27 +1789,8 @@ vsock_stream_recvmsg(struct socket *sock
 	else if (sk->sk_shutdown & RCV_SHUTDOWN)
 		err = 0;
 
-	if (copied > 0) {
-		/* We only do these additional bookkeeping/notification steps
-		 * if we actually copied something out of the queue pair
-		 * instead of just peeking ahead.
-		 */
-
-		if (!(flags & MSG_PEEK)) {
-			/* If the other side has shutdown for sending and there
-			 * is nothing more to read, then modify the socket
-			 * state.
-			 */
-			if (vsk->peer_shutdown & SEND_SHUTDOWN) {
-				if (vsock_stream_has_data(vsk) <= 0) {
-					sk->sk_state = SS_UNCONNECTED;
-					sock_set_flag(sk, SOCK_DONE);
-					sk->sk_state_change(sk);
-				}
-			}
-		}
+	if (copied > 0)
 		err = copied;
-	}
 
 out:
 	release_sock(sk);

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH 4.5 037/101] net: bridge: fix old ioctl unlocked net device walk
  2016-05-17  1:20 [PATCH 4.5 000/101] 4.5.5-stable review Greg Kroah-Hartman
                   ` (30 preceding siblings ...)
  2016-05-17  1:20 ` [PATCH 4.5 036/101] VSOCK: do not disconnect socket when peer has shutdown SEND only Greg Kroah-Hartman
@ 2016-05-17  1:20 ` Greg Kroah-Hartman
  2016-05-17  1:20 ` [PATCH 4.5 040/101] net: fix a kernel infoleak in x25 module Greg Kroah-Hartman
                   ` (60 subsequent siblings)
  92 siblings, 0 replies; 94+ messages in thread
From: Greg Kroah-Hartman @ 2016-05-17  1:20 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Nikolay Aleksandrov, David S. Miller

4.5-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>

[ Upstream commit 31ca0458a61a502adb7ed192bf9716c6d05791a5 ]

get_bridge_ifindices() is used from the old "deviceless" bridge ioctl
calls which aren't called with rtnl held. The comment above says that it is
called with rtnl but that is not really the case.
Here's a sample output from a test ASSERT_RTNL() which I put in
get_bridge_ifindices and executed "brctl show":
[  957.422726] RTNL: assertion failed at net/bridge//br_ioctl.c (30)
[  957.422925] CPU: 0 PID: 1862 Comm: brctl Tainted: G        W  O
4.6.0-rc4+ #157
[  957.423009] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
BIOS 1.8.1-20150318_183358- 04/01/2014
[  957.423009]  0000000000000000 ffff880058adfdf0 ffffffff8138dec5
0000000000000400
[  957.423009]  ffffffff81ce8380 ffff880058adfe58 ffffffffa05ead32
0000000000000001
[  957.423009]  00007ffec1a444b0 0000000000000400 ffff880053c19130
0000000000008940
[  957.423009] Call Trace:
[  957.423009]  [<ffffffff8138dec5>] dump_stack+0x85/0xc0
[  957.423009]  [<ffffffffa05ead32>]
br_ioctl_deviceless_stub+0x212/0x2e0 [bridge]
[  957.423009]  [<ffffffff81515beb>] sock_ioctl+0x22b/0x290
[  957.423009]  [<ffffffff8126ba75>] do_vfs_ioctl+0x95/0x700
[  957.423009]  [<ffffffff8126c159>] SyS_ioctl+0x79/0x90
[  957.423009]  [<ffffffff8163a4c0>] entry_SYSCALL_64_fastpath+0x23/0xc1

Since it only reads bridge ifindices, we can use rcu to safely walk the net
device list. Also remove the wrong rtnl comment above.

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/bridge/br_ioctl.c |    5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

--- a/net/bridge/br_ioctl.c
+++ b/net/bridge/br_ioctl.c
@@ -21,18 +21,19 @@
 #include <asm/uaccess.h>
 #include "br_private.h"
 
-/* called with RTNL */
 static int get_bridge_ifindices(struct net *net, int *indices, int num)
 {
 	struct net_device *dev;
 	int i = 0;
 
-	for_each_netdev(net, dev) {
+	rcu_read_lock();
+	for_each_netdev_rcu(net, dev) {
 		if (i >= num)
 			break;
 		if (dev->priv_flags & IFF_EBRIDGE)
 			indices[i++] = dev->ifindex;
 	}
+	rcu_read_unlock();
 
 	return i;
 }

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH 4.5 040/101] net: fix a kernel infoleak in x25 module
  2016-05-17  1:20 [PATCH 4.5 000/101] 4.5.5-stable review Greg Kroah-Hartman
                   ` (31 preceding siblings ...)
  2016-05-17  1:20 ` [PATCH 4.5 037/101] net: bridge: fix old ioctl unlocked net device walk Greg Kroah-Hartman
@ 2016-05-17  1:20 ` Greg Kroah-Hartman
  2016-05-17  1:20 ` [PATCH 4.5 041/101] net: thunderx: avoid exposing kernel stack Greg Kroah-Hartman
                   ` (59 subsequent siblings)
  92 siblings, 0 replies; 94+ messages in thread
From: Greg Kroah-Hartman @ 2016-05-17  1:20 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Kangjie Lu, David S. Miller

4.5-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Kangjie Lu <kangjielu@gmail.com>

[ Upstream commit 79e48650320e6fba48369fccf13fd045315b19b8 ]

Stack object "dte_facilities" is allocated in x25_rx_call_request(),
which is supposed to be initialized in x25_negotiate_facilities.
However, 5 fields (8 bytes in total) are not initialized. This
object is then copied to userland via copy_to_user, thus infoleak
occurs.

Signed-off-by: Kangjie Lu <kjlu@gatech.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/x25/x25_facilities.c |    1 +
 1 file changed, 1 insertion(+)

--- a/net/x25/x25_facilities.c
+++ b/net/x25/x25_facilities.c
@@ -277,6 +277,7 @@ int x25_negotiate_facilities(struct sk_b
 
 	memset(&theirs, 0, sizeof(theirs));
 	memcpy(new, ours, sizeof(*new));
+	memset(dte, 0, sizeof(*dte));
 
 	len = x25_parse_facilities(skb, &theirs, dte, &x25->vc_facil_mask);
 	if (len < 0)

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH 4.5 041/101] net: thunderx: avoid exposing kernel stack
  2016-05-17  1:20 [PATCH 4.5 000/101] 4.5.5-stable review Greg Kroah-Hartman
                   ` (32 preceding siblings ...)
  2016-05-17  1:20 ` [PATCH 4.5 040/101] net: fix a kernel infoleak in x25 module Greg Kroah-Hartman
@ 2016-05-17  1:20 ` Greg Kroah-Hartman
  2016-05-17  1:20 ` [PATCH 4.5 042/101] tcp: refresh skb timestamp at retransmit time Greg Kroah-Hartman
                   ` (58 subsequent siblings)
  92 siblings, 0 replies; 94+ messages in thread
From: Greg Kroah-Hartman @ 2016-05-17  1:20 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Heinrich Schuchardt, David S. Miller

4.5-stable review patch.  If anyone has any objections, please let me know.

------------------

From: "xypron.glpk@gmx.de" <xypron.glpk@gmx.de>

[ Upstream commit 161de2caf68c549c266e571ffba8e2163886fb10 ]

Reserved fields should be set to zero to avoid exposing
bits from the kernel stack.

Signed-off-by: Heinrich Schuchardt <xypron.glpk@gmx.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/net/ethernet/cavium/thunder/nicvf_queues.c |    4 ++++
 1 file changed, 4 insertions(+)

--- a/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
@@ -519,6 +519,7 @@ static void nicvf_rcv_queue_config(struc
 		nicvf_config_vlan_stripping(nic, nic->netdev->features);
 
 	/* Enable Receive queue */
+	memset(&rq_cfg, 0, sizeof(struct rq_cfg));
 	rq_cfg.ena = 1;
 	rq_cfg.tcp_ena = 0;
 	nicvf_queue_reg_write(nic, NIC_QSET_RQ_0_7_CFG, qidx, *(u64 *)&rq_cfg);
@@ -551,6 +552,7 @@ void nicvf_cmp_queue_config(struct nicvf
 			      qidx, (u64)(cq->dmem.phys_base));
 
 	/* Enable Completion queue */
+	memset(&cq_cfg, 0, sizeof(struct cq_cfg));
 	cq_cfg.ena = 1;
 	cq_cfg.reset = 0;
 	cq_cfg.caching = 0;
@@ -599,6 +601,7 @@ static void nicvf_snd_queue_config(struc
 			      qidx, (u64)(sq->dmem.phys_base));
 
 	/* Enable send queue  & set queue size */
+	memset(&sq_cfg, 0, sizeof(struct sq_cfg));
 	sq_cfg.ena = 1;
 	sq_cfg.reset = 0;
 	sq_cfg.ldwb = 0;
@@ -635,6 +638,7 @@ static void nicvf_rbdr_config(struct nic
 
 	/* Enable RBDR  & set queue size */
 	/* Buffer size should be in multiples of 128 bytes */
+	memset(&rbdr_cfg, 0, sizeof(struct rbdr_cfg));
 	rbdr_cfg.ena = 1;
 	rbdr_cfg.reset = 0;
 	rbdr_cfg.ldwb = 0;

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH 4.5 042/101] tcp: refresh skb timestamp at retransmit time
  2016-05-17  1:20 [PATCH 4.5 000/101] 4.5.5-stable review Greg Kroah-Hartman
                   ` (33 preceding siblings ...)
  2016-05-17  1:20 ` [PATCH 4.5 041/101] net: thunderx: avoid exposing kernel stack Greg Kroah-Hartman
@ 2016-05-17  1:20 ` Greg Kroah-Hartman
  2016-05-17  1:20 ` [PATCH 4.5 043/101] net/route: enforce hoplimit max value Greg Kroah-Hartman
                   ` (57 subsequent siblings)
  92 siblings, 0 replies; 94+ messages in thread
From: Greg Kroah-Hartman @ 2016-05-17  1:20 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Eric Dumazet, Yuchung Cheng, David S. Miller

4.5-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Eric Dumazet <edumazet@google.com>

[ Upstream commit 10a81980fc47e64ffac26a073139813d3f697b64 ]

In the very unlikely case __tcp_retransmit_skb() can not use the cloning
done in tcp_transmit_skb(), we need to refresh skb_mstamp before doing
the copy and transmit, otherwise TCP TS val will be an exact copy of
original transmit.

Fixes: 7faee5c0d514 ("tcp: remove TCP_SKB_CB(skb)->when")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/ipv4/tcp_output.c |    6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -2625,8 +2625,10 @@ int __tcp_retransmit_skb(struct sock *sk
 	 */
 	if (unlikely((NET_IP_ALIGN && ((unsigned long)skb->data & 3)) ||
 		     skb_headroom(skb) >= 0xFFFF)) {
-		struct sk_buff *nskb = __pskb_copy(skb, MAX_TCP_HEADER,
-						   GFP_ATOMIC);
+		struct sk_buff *nskb;
+
+		skb_mstamp_get(&skb->skb_mstamp);
+		nskb = __pskb_copy(skb, MAX_TCP_HEADER, GFP_ATOMIC);
 		err = nskb ? tcp_transmit_skb(sk, nskb, 0, GFP_ATOMIC) :
 			     -ENOBUFS;
 	} else {

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH 4.5 043/101] net/route: enforce hoplimit max value
  2016-05-17  1:20 [PATCH 4.5 000/101] 4.5.5-stable review Greg Kroah-Hartman
                   ` (34 preceding siblings ...)
  2016-05-17  1:20 ` [PATCH 4.5 042/101] tcp: refresh skb timestamp at retransmit time Greg Kroah-Hartman
@ 2016-05-17  1:20 ` Greg Kroah-Hartman
  2016-05-17  1:20 ` [PATCH 4.5 044/101] ocfs2: revert using ocfs2_acl_chmod to avoid inode cluster lock hang Greg Kroah-Hartman
                   ` (56 subsequent siblings)
  92 siblings, 0 replies; 94+ messages in thread
From: Greg Kroah-Hartman @ 2016-05-17  1:20 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Paolo Abeni, David S. Miller

4.5-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Paolo Abeni <pabeni@redhat.com>

[ Upstream commit 626abd59e51d4d8c6367e03aae252a8aa759ac78 ]

Currently, when creating or updating a route, no check is performed
in both ipv4 and ipv6 code to the hoplimit value.

The caller can i.e. set hoplimit to 256, and when such route will
 be used, packets will be sent with hoplimit/ttl equal to 0.

This commit adds checks for the RTAX_HOPLIMIT value, in both ipv4
ipv6 route code, substituting any value greater than 255 with 255.

This is consistent with what is currently done for ADVMSS and MTU
in the ipv4 code.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/ipv4/fib_semantics.c |    2 ++
 net/ipv6/route.c         |    2 ++
 2 files changed, 4 insertions(+)

--- a/net/ipv4/fib_semantics.c
+++ b/net/ipv4/fib_semantics.c
@@ -975,6 +975,8 @@ fib_convert_metrics(struct fib_info *fi,
 			val = 65535 - 40;
 		if (type == RTAX_MTU && val > 65535 - 15)
 			val = 65535 - 15;
+		if (type == RTAX_HOPLIMIT && val > 255)
+			val = 255;
 		if (type == RTAX_FEATURES && (val & ~RTAX_FEATURE_MASK))
 			return -EINVAL;
 		fi->fib_metrics[type - 1] = val;
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -1737,6 +1737,8 @@ static int ip6_convert_metrics(struct mx
 		} else {
 			val = nla_get_u32(nla);
 		}
+		if (type == RTAX_HOPLIMIT && val > 255)
+			val = 255;
 		if (type == RTAX_FEATURES && (val & ~RTAX_FEATURE_MASK))
 			goto err;
 

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH 4.5 044/101] ocfs2: revert using ocfs2_acl_chmod to avoid inode cluster lock hang
  2016-05-17  1:20 [PATCH 4.5 000/101] 4.5.5-stable review Greg Kroah-Hartman
                   ` (35 preceding siblings ...)
  2016-05-17  1:20 ` [PATCH 4.5 043/101] net/route: enforce hoplimit max value Greg Kroah-Hartman
@ 2016-05-17  1:20 ` Greg Kroah-Hartman
  2016-05-17  1:20 ` [PATCH 4.5 045/101] ocfs2: fix posix_acl_create deadlock Greg Kroah-Hartman
                   ` (55 subsequent siblings)
  92 siblings, 0 replies; 94+ messages in thread
From: Greg Kroah-Hartman @ 2016-05-17  1:20 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Tariq Saeed, Junxiao Bi, Mark Fasheh,
	Joel Becker, Joseph Qi, Andrew Morton, Linus Torvalds

4.5-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Junxiao Bi <junxiao.bi@oracle.com>

commit 5ee0fbd50fdf1c1329de8bee35ea9d7c6a81a2e0 upstream.

Commit 743b5f1434f5 ("ocfs2: take inode lock in ocfs2_iop_set/get_acl()")
introduced this issue.  ocfs2_setattr called by chmod command holds
cluster wide inode lock when calling posix_acl_chmod.  This latter
function in turn calls ocfs2_iop_get_acl and ocfs2_iop_set_acl.  These
two are also called directly from vfs layer for getfacl/setfacl commands
and therefore acquire the cluster wide inode lock.  If a remote
conversion request comes after the first inode lock in ocfs2_setattr,
OCFS2_LOCK_BLOCKED will be set.  And this will cause the second call to
inode lock from the ocfs2_iop_get_acl() to block indefinetly.

The deleted version of ocfs2_acl_chmod() calls __posix_acl_chmod() which
does not call back into the filesystem.  Therefore, we restore
ocfs2_acl_chmod(), modify it slightly for locking as needed, and use that
instead.

Fixes: 743b5f1434f5 ("ocfs2: take inode lock in ocfs2_iop_set/get_acl()")
Signed-off-by: Tariq Saeed <tariq.x.saeed@oracle.com>
Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Joseph Qi <joseph.qi@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 fs/ocfs2/acl.c  |   24 ++++++++++++++++++++++++
 fs/ocfs2/acl.h  |    1 +
 fs/ocfs2/file.c |    4 ++--
 3 files changed, 27 insertions(+), 2 deletions(-)

--- a/fs/ocfs2/acl.c
+++ b/fs/ocfs2/acl.c
@@ -322,3 +322,27 @@ struct posix_acl *ocfs2_iop_get_acl(stru
 	brelse(di_bh);
 	return acl;
 }
+
+int ocfs2_acl_chmod(struct inode *inode, struct buffer_head *bh)
+{
+	struct ocfs2_super *osb = OCFS2_SB(inode->i_sb);
+	struct posix_acl *acl;
+	int ret;
+
+	if (S_ISLNK(inode->i_mode))
+		return -EOPNOTSUPP;
+
+	if (!(osb->s_mount_opt & OCFS2_MOUNT_POSIX_ACL))
+		return 0;
+
+	acl = ocfs2_get_acl_nolock(inode, ACL_TYPE_ACCESS, bh);
+	if (IS_ERR(acl) || !acl)
+		return PTR_ERR(acl);
+	ret = __posix_acl_chmod(&acl, GFP_KERNEL, inode->i_mode);
+	if (ret)
+		return ret;
+	ret = ocfs2_set_acl(NULL, inode, NULL, ACL_TYPE_ACCESS,
+			    acl, NULL, NULL);
+	posix_acl_release(acl);
+	return ret;
+}
--- a/fs/ocfs2/acl.h
+++ b/fs/ocfs2/acl.h
@@ -35,5 +35,6 @@ int ocfs2_set_acl(handle_t *handle,
 			 struct posix_acl *acl,
 			 struct ocfs2_alloc_context *meta_ac,
 			 struct ocfs2_alloc_context *data_ac);
+extern int ocfs2_acl_chmod(struct inode *, struct buffer_head *);
 
 #endif /* OCFS2_ACL_H */
--- a/fs/ocfs2/file.c
+++ b/fs/ocfs2/file.c
@@ -1268,20 +1268,20 @@ bail_unlock_rw:
 	if (size_change)
 		ocfs2_rw_unlock(inode, 1);
 bail:
-	brelse(bh);
 
 	/* Release quota pointers in case we acquired them */
 	for (qtype = 0; qtype < OCFS2_MAXQUOTAS; qtype++)
 		dqput(transfer_to[qtype]);
 
 	if (!status && attr->ia_valid & ATTR_MODE) {
-		status = posix_acl_chmod(inode, inode->i_mode);
+		status = ocfs2_acl_chmod(inode, bh);
 		if (status < 0)
 			mlog_errno(status);
 	}
 	if (inode_locked)
 		ocfs2_inode_unlock(inode, 1);
 
+	brelse(bh);
 	return status;
 }
 

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH 4.5 045/101] ocfs2: fix posix_acl_create deadlock
  2016-05-17  1:20 [PATCH 4.5 000/101] 4.5.5-stable review Greg Kroah-Hartman
                   ` (36 preceding siblings ...)
  2016-05-17  1:20 ` [PATCH 4.5 044/101] ocfs2: revert using ocfs2_acl_chmod to avoid inode cluster lock hang Greg Kroah-Hartman
@ 2016-05-17  1:20 ` Greg Kroah-Hartman
  2016-05-17  1:20 ` [PATCH 4.5 046/101] zsmalloc: fix zs_can_compact() integer overflow Greg Kroah-Hartman
                   ` (54 subsequent siblings)
  92 siblings, 0 replies; 94+ messages in thread
From: Greg Kroah-Hartman @ 2016-05-17  1:20 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Tariq Saeed, Junxiao Bi, Mark Fasheh,
	Joel Becker, Joseph Qi, Andrew Morton, Linus Torvalds

4.5-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Junxiao Bi <junxiao.bi@oracle.com>

commit c25a1e0671fbca7b2c0d0757d533bd2650d6dc0c upstream.

Commit 702e5bc68ad2 ("ocfs2: use generic posix ACL infrastructure")
refactored code to use posix_acl_create.  The problem with this function
is that it is not mindful of the cluster wide inode lock making it
unsuitable for use with ocfs2 inode creation with ACLs.  For example,
when used in ocfs2_mknod, this function can cause deadlock as follows.
The parent dir inode lock is taken when calling posix_acl_create ->
get_acl -> ocfs2_iop_get_acl which takes the inode lock again.  This can
cause deadlock if there is a blocked remote lock request waiting for the
lock to be downconverted.  And same deadlock happened in ocfs2_reflink.
This fix is to revert back using ocfs2_init_acl.

Fixes: 702e5bc68ad2 ("ocfs2: use generic posix ACL infrastructure")
Signed-off-by: Tariq Saeed <tariq.x.saeed@oracle.com>
Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Joseph Qi <joseph.qi@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 fs/ocfs2/acl.c          |   63 ++++++++++++++++++++++++++++++++++++++++++++++++
 fs/ocfs2/acl.h          |    4 +++
 fs/ocfs2/namei.c        |   23 +----------------
 fs/ocfs2/refcounttree.c |   17 +-----------
 fs/ocfs2/xattr.c        |   14 +++-------
 fs/ocfs2/xattr.h        |    4 ---
 6 files changed, 77 insertions(+), 48 deletions(-)

--- a/fs/ocfs2/acl.c
+++ b/fs/ocfs2/acl.c
@@ -346,3 +346,66 @@ int ocfs2_acl_chmod(struct inode *inode,
 	posix_acl_release(acl);
 	return ret;
 }
+
+/*
+ * Initialize the ACLs of a new inode. If parent directory has default ACL,
+ * then clone to new inode. Called from ocfs2_mknod.
+ */
+int ocfs2_init_acl(handle_t *handle,
+		   struct inode *inode,
+		   struct inode *dir,
+		   struct buffer_head *di_bh,
+		   struct buffer_head *dir_bh,
+		   struct ocfs2_alloc_context *meta_ac,
+		   struct ocfs2_alloc_context *data_ac)
+{
+	struct ocfs2_super *osb = OCFS2_SB(inode->i_sb);
+	struct posix_acl *acl = NULL;
+	int ret = 0, ret2;
+	umode_t mode;
+
+	if (!S_ISLNK(inode->i_mode)) {
+		if (osb->s_mount_opt & OCFS2_MOUNT_POSIX_ACL) {
+			acl = ocfs2_get_acl_nolock(dir, ACL_TYPE_DEFAULT,
+						   dir_bh);
+			if (IS_ERR(acl))
+				return PTR_ERR(acl);
+		}
+		if (!acl) {
+			mode = inode->i_mode & ~current_umask();
+			ret = ocfs2_acl_set_mode(inode, di_bh, handle, mode);
+			if (ret) {
+				mlog_errno(ret);
+				goto cleanup;
+			}
+		}
+	}
+	if ((osb->s_mount_opt & OCFS2_MOUNT_POSIX_ACL) && acl) {
+		if (S_ISDIR(inode->i_mode)) {
+			ret = ocfs2_set_acl(handle, inode, di_bh,
+					    ACL_TYPE_DEFAULT, acl,
+					    meta_ac, data_ac);
+			if (ret)
+				goto cleanup;
+		}
+		mode = inode->i_mode;
+		ret = __posix_acl_create(&acl, GFP_NOFS, &mode);
+		if (ret < 0)
+			return ret;
+
+		ret2 = ocfs2_acl_set_mode(inode, di_bh, handle, mode);
+		if (ret2) {
+			mlog_errno(ret2);
+			ret = ret2;
+			goto cleanup;
+		}
+		if (ret > 0) {
+			ret = ocfs2_set_acl(handle, inode,
+					    di_bh, ACL_TYPE_ACCESS,
+					    acl, meta_ac, data_ac);
+		}
+	}
+cleanup:
+	posix_acl_release(acl);
+	return ret;
+}
--- a/fs/ocfs2/acl.h
+++ b/fs/ocfs2/acl.h
@@ -36,5 +36,9 @@ int ocfs2_set_acl(handle_t *handle,
 			 struct ocfs2_alloc_context *meta_ac,
 			 struct ocfs2_alloc_context *data_ac);
 extern int ocfs2_acl_chmod(struct inode *, struct buffer_head *);
+extern int ocfs2_init_acl(handle_t *, struct inode *, struct inode *,
+			  struct buffer_head *, struct buffer_head *,
+			  struct ocfs2_alloc_context *,
+			  struct ocfs2_alloc_context *);
 
 #endif /* OCFS2_ACL_H */
--- a/fs/ocfs2/namei.c
+++ b/fs/ocfs2/namei.c
@@ -259,7 +259,6 @@ static int ocfs2_mknod(struct inode *dir
 	struct ocfs2_dir_lookup_result lookup = { NULL, };
 	sigset_t oldset;
 	int did_block_signals = 0;
-	struct posix_acl *default_acl = NULL, *acl = NULL;
 	struct ocfs2_dentry_lock *dl = NULL;
 
 	trace_ocfs2_mknod(dir, dentry, dentry->d_name.len, dentry->d_name.name,
@@ -367,12 +366,6 @@ static int ocfs2_mknod(struct inode *dir
 		goto leave;
 	}
 
-	status = posix_acl_create(dir, &inode->i_mode, &default_acl, &acl);
-	if (status) {
-		mlog_errno(status);
-		goto leave;
-	}
-
 	handle = ocfs2_start_trans(osb, ocfs2_mknod_credits(osb->sb,
 							    S_ISDIR(mode),
 							    xattr_credits));
@@ -421,16 +414,8 @@ static int ocfs2_mknod(struct inode *dir
 		inc_nlink(dir);
 	}
 
-	if (default_acl) {
-		status = ocfs2_set_acl(handle, inode, new_fe_bh,
-				       ACL_TYPE_DEFAULT, default_acl,
-				       meta_ac, data_ac);
-	}
-	if (!status && acl) {
-		status = ocfs2_set_acl(handle, inode, new_fe_bh,
-				       ACL_TYPE_ACCESS, acl,
-				       meta_ac, data_ac);
-	}
+	status = ocfs2_init_acl(handle, inode, dir, new_fe_bh, parent_fe_bh,
+			 meta_ac, data_ac);
 
 	if (status < 0) {
 		mlog_errno(status);
@@ -472,10 +457,6 @@ static int ocfs2_mknod(struct inode *dir
 	d_instantiate(dentry, inode);
 	status = 0;
 leave:
-	if (default_acl)
-		posix_acl_release(default_acl);
-	if (acl)
-		posix_acl_release(acl);
 	if (status < 0 && did_quota_inode)
 		dquot_free_inode(inode);
 	if (handle)
--- a/fs/ocfs2/refcounttree.c
+++ b/fs/ocfs2/refcounttree.c
@@ -4248,20 +4248,12 @@ static int ocfs2_reflink(struct dentry *
 	struct inode *inode = d_inode(old_dentry);
 	struct buffer_head *old_bh = NULL;
 	struct inode *new_orphan_inode = NULL;
-	struct posix_acl *default_acl, *acl;
-	umode_t mode;
 
 	if (!ocfs2_refcount_tree(OCFS2_SB(inode->i_sb)))
 		return -EOPNOTSUPP;
 
-	mode = inode->i_mode;
-	error = posix_acl_create(dir, &mode, &default_acl, &acl);
-	if (error) {
-		mlog_errno(error);
-		return error;
-	}
 
-	error = ocfs2_create_inode_in_orphan(dir, mode,
+	error = ocfs2_create_inode_in_orphan(dir, inode->i_mode,
 					     &new_orphan_inode);
 	if (error) {
 		mlog_errno(error);
@@ -4300,16 +4292,11 @@ static int ocfs2_reflink(struct dentry *
 	/* If the security isn't preserved, we need to re-initialize them. */
 	if (!preserve) {
 		error = ocfs2_init_security_and_acl(dir, new_orphan_inode,
-						    &new_dentry->d_name,
-						    default_acl, acl);
+						    &new_dentry->d_name);
 		if (error)
 			mlog_errno(error);
 	}
 out:
-	if (default_acl)
-		posix_acl_release(default_acl);
-	if (acl)
-		posix_acl_release(acl);
 	if (!error) {
 		error = ocfs2_mv_orphaned_inode_to_new(dir, new_orphan_inode,
 						       new_dentry);
--- a/fs/ocfs2/xattr.c
+++ b/fs/ocfs2/xattr.c
@@ -7216,12 +7216,10 @@ out:
  */
 int ocfs2_init_security_and_acl(struct inode *dir,
 				struct inode *inode,
-				const struct qstr *qstr,
-				struct posix_acl *default_acl,
-				struct posix_acl *acl)
+				const struct qstr *qstr)
 {
-	struct buffer_head *dir_bh = NULL;
 	int ret = 0;
+	struct buffer_head *dir_bh = NULL;
 
 	ret = ocfs2_init_security_get(inode, dir, qstr, NULL);
 	if (ret) {
@@ -7234,11 +7232,9 @@ int ocfs2_init_security_and_acl(struct i
 		mlog_errno(ret);
 		goto leave;
 	}
-
-	if (!ret && default_acl)
-		ret = ocfs2_iop_set_acl(inode, default_acl, ACL_TYPE_DEFAULT);
-	if (!ret && acl)
-		ret = ocfs2_iop_set_acl(inode, acl, ACL_TYPE_ACCESS);
+	ret = ocfs2_init_acl(NULL, inode, dir, NULL, dir_bh, NULL, NULL);
+	if (ret)
+		mlog_errno(ret);
 
 	ocfs2_inode_unlock(dir, 0);
 	brelse(dir_bh);
--- a/fs/ocfs2/xattr.h
+++ b/fs/ocfs2/xattr.h
@@ -94,7 +94,5 @@ int ocfs2_reflink_xattrs(struct inode *o
 			 bool preserve_security);
 int ocfs2_init_security_and_acl(struct inode *dir,
 				struct inode *inode,
-				const struct qstr *qstr,
-				struct posix_acl *default_acl,
-				struct posix_acl *acl);
+				const struct qstr *qstr);
 #endif /* OCFS2_XATTR_H */

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH 4.5 046/101] zsmalloc: fix zs_can_compact() integer overflow
  2016-05-17  1:20 [PATCH 4.5 000/101] 4.5.5-stable review Greg Kroah-Hartman
                   ` (37 preceding siblings ...)
  2016-05-17  1:20 ` [PATCH 4.5 045/101] ocfs2: fix posix_acl_create deadlock Greg Kroah-Hartman
@ 2016-05-17  1:20 ` Greg Kroah-Hartman
  2016-05-17  1:20 ` [PATCH 4.5 047/101] mm: thp: calculate the mapcount correctly for THP pages during WP faults Greg Kroah-Hartman
                   ` (53 subsequent siblings)
  92 siblings, 0 replies; 94+ messages in thread
From: Greg Kroah-Hartman @ 2016-05-17  1:20 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Sergey Senozhatsky, Minchan Kim,
	Andrew Morton, Linus Torvalds

4.5-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>

commit 44f43e99fe70833058482d183e99fdfd11220996 upstream.

zs_can_compact() has two race conditions in its core calculation:

unsigned long obj_wasted = zs_stat_get(class, OBJ_ALLOCATED) -
				zs_stat_get(class, OBJ_USED);

1) classes are not locked, so the numbers of allocated and used
   objects can change by the concurrent ops happening on other CPUs
2) shrinker invokes it from preemptible context

Depending on the circumstances, thus, OBJ_ALLOCATED can become
less than OBJ_USED, which can result in either very high or
negative `total_scan' value calculated later in do_shrink_slab().

do_shrink_slab() has some logic to prevent those cases:

 vmscan: shrink_slab: zs_shrinker_scan+0x0/0x28 [zsmalloc] negative objects to delete nr=-62
 vmscan: shrink_slab: zs_shrinker_scan+0x0/0x28 [zsmalloc] negative objects to delete nr=-62
 vmscan: shrink_slab: zs_shrinker_scan+0x0/0x28 [zsmalloc] negative objects to delete nr=-64
 vmscan: shrink_slab: zs_shrinker_scan+0x0/0x28 [zsmalloc] negative objects to delete nr=-62
 vmscan: shrink_slab: zs_shrinker_scan+0x0/0x28 [zsmalloc] negative objects to delete nr=-62
 vmscan: shrink_slab: zs_shrinker_scan+0x0/0x28 [zsmalloc] negative objects to delete nr=-62

However, due to the way `total_scan' is calculated, not every
shrinker->count_objects() overflow can be spotted and handled.
To demonstrate the latter, I added some debugging code to do_shrink_slab()
(x86_64) and the results were:

 vmscan: OVERFLOW: shrinker->count_objects() == -1 [18446744073709551615]
 vmscan: but total_scan > 0: 92679974445502
 vmscan: resulting total_scan: 92679974445502
[..]
 vmscan: OVERFLOW: shrinker->count_objects() == -1 [18446744073709551615]
 vmscan: but total_scan > 0: 22634041808232578
 vmscan: resulting total_scan: 22634041808232578

Even though shrinker->count_objects() has returned an overflowed value,
the resulting `total_scan' is positive, and, what is more worrisome, it
is insanely huge. This value is getting used later on in
shrinker->scan_objects() loop:

        while (total_scan >= batch_size ||
               total_scan >= freeable) {
                unsigned long ret;
                unsigned long nr_to_scan = min(batch_size, total_scan);

                shrinkctl->nr_to_scan = nr_to_scan;
                ret = shrinker->scan_objects(shrinker, shrinkctl);
                if (ret == SHRINK_STOP)
                        break;
                freed += ret;

                count_vm_events(SLABS_SCANNED, nr_to_scan);
                total_scan -= nr_to_scan;

                cond_resched();
        }

`total_scan >= batch_size' is true for a very-very long time and
'total_scan >= freeable' is also true for quite some time, because
`freeable < 0' and `total_scan' is large enough, for example,
22634041808232578. The only break condition, in the given scheme of
things, is shrinker->scan_objects() == SHRINK_STOP test, which is a
bit too weak to rely on, especially in heavy zsmalloc-usage scenarios.

To fix the issue, take a pool stat snapshot and use it instead of
racy zs_stat_get() calls.

Link: http://lkml.kernel.org/r/20160509140052.3389-1-sergey.senozhatsky@gmail.com
Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 mm/zsmalloc.c |    7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -1732,10 +1732,13 @@ static struct page *isolate_source_page(
 static unsigned long zs_can_compact(struct size_class *class)
 {
 	unsigned long obj_wasted;
+	unsigned long obj_allocated = zs_stat_get(class, OBJ_ALLOCATED);
+	unsigned long obj_used = zs_stat_get(class, OBJ_USED);
 
-	obj_wasted = zs_stat_get(class, OBJ_ALLOCATED) -
-		zs_stat_get(class, OBJ_USED);
+	if (obj_allocated <= obj_used)
+		return 0;
 
+	obj_wasted = obj_allocated - obj_used;
 	obj_wasted /= get_maxobj_per_zspage(class->size,
 			class->pages_per_zspage);
 

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH 4.5 047/101] mm: thp: calculate the mapcount correctly for THP pages during WP faults
  2016-05-17  1:20 [PATCH 4.5 000/101] 4.5.5-stable review Greg Kroah-Hartman
                   ` (38 preceding siblings ...)
  2016-05-17  1:20 ` [PATCH 4.5 046/101] zsmalloc: fix zs_can_compact() integer overflow Greg Kroah-Hartman
@ 2016-05-17  1:20 ` Greg Kroah-Hartman
  2016-05-17  1:20 ` [PATCH 4.5 048/101] crypto: qat - fix invalid pf2vf_resp_wq logic Greg Kroah-Hartman
                   ` (52 subsequent siblings)
  92 siblings, 0 replies; 94+ messages in thread
From: Greg Kroah-Hartman @ 2016-05-17  1:20 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Andrea Arcangeli, Kirill A. Shutemov,
	Dean Luick, Alex Williamson, Mike Marciniszyn, Josh Collier,
	Marc Haber, Andrew Morton, Linus Torvalds

4.5-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Andrea Arcangeli <aarcange@redhat.com>

commit 6d0a07edd17cfc12fdc1f36de8072fa17cc3666f upstream.

This will provide fully accuracy to the mapcount calculation in the
write protect faults, so page pinning will not get broken by false
positive copy-on-writes.

total_mapcount() isn't the right calculation needed in
reuse_swap_page(), so this introduces a page_trans_huge_mapcount()
that is effectively the full accurate return value for page_mapcount()
if dealing with Transparent Hugepages, however we only use the
page_trans_huge_mapcount() during COW faults where it strictly needed,
due to its higher runtime cost.

This also provide at practical zero cost the total_mapcount
information which is needed to know if we can still relocate the page
anon_vma to the local vma. If page_trans_huge_mapcount() returns 1 we
can reuse the page no matter if it's a pte or a pmd_trans_huge
triggering the fault, but we can only relocate the page anon_vma to
the local vma->anon_vma if we're sure it's only this "vma" mapping the
whole THP physical range.

Kirill A. Shutemov discovered the problem with moving the page
anon_vma to the local vma->anon_vma in a previous version of this
patch and another problem in the way page_move_anon_rmap() was called.

Andrew Morton discovered that CONFIG_SWAP=n wouldn't build in a
previous version, because reuse_swap_page must be a macro to call
page_trans_huge_mapcount from swap.h, so this uses a macro again
instead of an inline function. With this change at least it's a less
dangerous usage than it was before, because "page" is used only once
now, while with the previous code reuse_swap_page(page++) would have
called page_mapcount on page+1 and it would have increased page twice
instead of just once.

Dean Luick noticed an uninitialized variable that could result in a
rmap inefficiency for the non-THP case in a previous version.

Mike Marciniszyn said:

: Our RDMA tests are seeing an issue with memory locking that bisects to
: commit 61f5d698cc97 ("mm: re-enable THP")
:
: The test program registers two rather large MRs (512M) and RDMA
: writes data to a passive peer using the first and RDMA reads it back
: into the second MR and compares that data.  The sizes are chosen randomly
: between 0 and 1024 bytes.
:
: The test will get through a few (<= 4 iterations) and then gets a
: compare error.
:
: Tracing indicates the kernel logical addresses associated with the individual
: pages at registration ARE correct , the data in the "RDMA read response only"
: packets ARE correct.
:
: The "corruption" occurs when the packet crosse two pages that are not physically
: contiguous.   The second page reads back as zero in the program.
:
: It looks like the user VA at the point of the compare error no longer points to
: the same physical address as was registered.
:
: This patch totally resolves the issue!

Link: http://lkml.kernel.org/r/1462547040-1737-2-git-send-email-aarcange@redhat.com
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Reviewed-by: "Kirill A. Shutemov" <kirill@shutemov.name>
Reviewed-by: Dean Luick <dean.luick@intel.com>
Tested-by: Alex Williamson <alex.williamson@redhat.com>
Tested-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Tested-by: Josh Collier <josh.d.collier@intel.com>
Cc: Marc Haber <mh+linux-kernel@zugschlus.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 include/linux/mm.h   |    9 ++++++
 include/linux/swap.h |    6 ++--
 mm/huge_memory.c     |   71 ++++++++++++++++++++++++++++++++++++++++++++-------
 mm/memory.c          |   22 ++++++++++-----
 mm/swapfile.c        |   13 +++++----
 5 files changed, 95 insertions(+), 26 deletions(-)

--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -456,11 +456,20 @@ static inline int page_mapcount(struct p
 
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
 int total_mapcount(struct page *page);
+int page_trans_huge_mapcount(struct page *page, int *total_mapcount);
 #else
 static inline int total_mapcount(struct page *page)
 {
 	return page_mapcount(page);
 }
+static inline int page_trans_huge_mapcount(struct page *page,
+					   int *total_mapcount)
+{
+	int mapcount = page_mapcount(page);
+	if (total_mapcount)
+		*total_mapcount = mapcount;
+	return mapcount;
+}
 #endif
 
 static inline int page_count(struct page *page)
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -418,7 +418,7 @@ extern sector_t swapdev_block(int, pgoff
 extern int page_swapcount(struct page *);
 extern int swp_swapcount(swp_entry_t entry);
 extern struct swap_info_struct *page_swap_info(struct page *);
-extern int reuse_swap_page(struct page *);
+extern bool reuse_swap_page(struct page *, int *);
 extern int try_to_free_swap(struct page *);
 struct backing_dev_info;
 
@@ -513,8 +513,8 @@ static inline int swp_swapcount(swp_entr
 	return 0;
 }
 
-#define reuse_swap_page(page) \
-	(!PageTransCompound(page) && page_mapcount(page) == 1)
+#define reuse_swap_page(page, total_mapcount) \
+	(page_trans_huge_mapcount(page, total_mapcount) == 1)
 
 static inline int try_to_free_swap(struct page *page)
 {
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1257,15 +1257,9 @@ int do_huge_pmd_wp_page(struct mm_struct
 	VM_BUG_ON_PAGE(!PageCompound(page) || !PageHead(page), page);
 	/*
 	 * We can only reuse the page if nobody else maps the huge page or it's
-	 * part. We can do it by checking page_mapcount() on each sub-page, but
-	 * it's expensive.
-	 * The cheaper way is to check page_count() to be equal 1: every
-	 * mapcount takes page reference reference, so this way we can
-	 * guarantee, that the PMD is the only mapping.
-	 * This can give false negative if somebody pinned the page, but that's
-	 * fine.
+	 * part.
 	 */
-	if (page_mapcount(page) == 1 && page_count(page) == 1) {
+	if (page_trans_huge_mapcount(page, NULL) == 1) {
 		pmd_t entry;
 		entry = pmd_mkyoung(orig_pmd);
 		entry = maybe_pmd_mkwrite(pmd_mkdirty(entry), vma);
@@ -2038,7 +2032,8 @@ static int __collapse_huge_page_isolate(
 		if (pte_write(pteval)) {
 			writable = true;
 		} else {
-			if (PageSwapCache(page) && !reuse_swap_page(page)) {
+			if (PageSwapCache(page) &&
+			    !reuse_swap_page(page, NULL)) {
 				unlock_page(page);
 				result = SCAN_SWAP_CACHE_PAGE;
 				goto out;
@@ -3337,6 +3332,64 @@ int total_mapcount(struct page *page)
 	return ret;
 }
 
+/*
+ * This calculates accurately how many mappings a transparent hugepage
+ * has (unlike page_mapcount() which isn't fully accurate). This full
+ * accuracy is primarily needed to know if copy-on-write faults can
+ * reuse the page and change the mapping to read-write instead of
+ * copying them. At the same time this returns the total_mapcount too.
+ *
+ * The function returns the highest mapcount any one of the subpages
+ * has. If the return value is one, even if different processes are
+ * mapping different subpages of the transparent hugepage, they can
+ * all reuse it, because each process is reusing a different subpage.
+ *
+ * The total_mapcount is instead counting all virtual mappings of the
+ * subpages. If the total_mapcount is equal to "one", it tells the
+ * caller all mappings belong to the same "mm" and in turn the
+ * anon_vma of the transparent hugepage can become the vma->anon_vma
+ * local one as no other process may be mapping any of the subpages.
+ *
+ * It would be more accurate to replace page_mapcount() with
+ * page_trans_huge_mapcount(), however we only use
+ * page_trans_huge_mapcount() in the copy-on-write faults where we
+ * need full accuracy to avoid breaking page pinning, because
+ * page_trans_huge_mapcount() is slower than page_mapcount().
+ */
+int page_trans_huge_mapcount(struct page *page, int *total_mapcount)
+{
+	int i, ret, _total_mapcount, mapcount;
+
+	/* hugetlbfs shouldn't call it */
+	VM_BUG_ON_PAGE(PageHuge(page), page);
+
+	if (likely(!PageTransCompound(page))) {
+		mapcount = atomic_read(&page->_mapcount) + 1;
+		if (total_mapcount)
+			*total_mapcount = mapcount;
+		return mapcount;
+	}
+
+	page = compound_head(page);
+
+	_total_mapcount = ret = 0;
+	for (i = 0; i < HPAGE_PMD_NR; i++) {
+		mapcount = atomic_read(&page[i]._mapcount) + 1;
+		ret = max(ret, mapcount);
+		_total_mapcount += mapcount;
+	}
+	if (PageDoubleMap(page)) {
+		ret -= 1;
+		_total_mapcount -= HPAGE_PMD_NR;
+	}
+	mapcount = compound_mapcount(page);
+	ret += mapcount;
+	_total_mapcount += mapcount;
+	if (total_mapcount)
+		*total_mapcount = _total_mapcount;
+	return ret;
+}
+
 /*
  * This function splits huge page into normal pages. @page can point to any
  * subpage of huge page to split. Split doesn't change the position of @page.
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2357,6 +2357,7 @@ static int do_wp_page(struct mm_struct *
 	 * not dirty accountable.
 	 */
 	if (PageAnon(old_page) && !PageKsm(old_page)) {
+		int total_mapcount;
 		if (!trylock_page(old_page)) {
 			page_cache_get(old_page);
 			pte_unmap_unlock(page_table, ptl);
@@ -2371,13 +2372,18 @@ static int do_wp_page(struct mm_struct *
 			}
 			page_cache_release(old_page);
 		}
-		if (reuse_swap_page(old_page)) {
-			/*
-			 * The page is all ours.  Move it to our anon_vma so
-			 * the rmap code will not search our parent or siblings.
-			 * Protected against the rmap code by the page lock.
-			 */
-			page_move_anon_rmap(old_page, vma, address);
+		if (reuse_swap_page(old_page, &total_mapcount)) {
+			if (total_mapcount == 1) {
+				/*
+				 * The page is all ours. Move it to
+				 * our anon_vma so the rmap code will
+				 * not search our parent or siblings.
+				 * Protected against the rmap code by
+				 * the page lock.
+				 */
+				page_move_anon_rmap(compound_head(old_page),
+						    vma, address);
+			}
 			unlock_page(old_page);
 			return wp_page_reuse(mm, vma, address, page_table, ptl,
 					     orig_pte, old_page, 0, 0);
@@ -2602,7 +2608,7 @@ static int do_swap_page(struct mm_struct
 	inc_mm_counter_fast(mm, MM_ANONPAGES);
 	dec_mm_counter_fast(mm, MM_SWAPENTS);
 	pte = mk_pte(page, vma->vm_page_prot);
-	if ((flags & FAULT_FLAG_WRITE) && reuse_swap_page(page)) {
+	if ((flags & FAULT_FLAG_WRITE) && reuse_swap_page(page, NULL)) {
 		pte = maybe_mkwrite(pte_mkdirty(pte), vma);
 		flags &= ~FAULT_FLAG_WRITE;
 		ret |= VM_FAULT_WRITE;
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -916,18 +916,19 @@ out:
  * to it.  And as a side-effect, free up its swap: because the old content
  * on disk will never be read, and seeking back there to write new content
  * later would only waste time away from clustering.
+ *
+ * NOTE: total_mapcount should not be relied upon by the caller if
+ * reuse_swap_page() returns false, but it may be always overwritten
+ * (see the other implementation for CONFIG_SWAP=n).
  */
-int reuse_swap_page(struct page *page)
+bool reuse_swap_page(struct page *page, int *total_mapcount)
 {
 	int count;
 
 	VM_BUG_ON_PAGE(!PageLocked(page), page);
 	if (unlikely(PageKsm(page)))
-		return 0;
-	/* The page is part of THP and cannot be reused */
-	if (PageTransCompound(page))
-		return 0;
-	count = page_mapcount(page);
+		return false;
+	count = page_trans_huge_mapcount(page, total_mapcount);
 	if (count <= 1 && PageSwapCache(page)) {
 		count += page_swapcount(page);
 		if (count == 1 && !PageWriteback(page)) {

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH 4.5 048/101] crypto: qat - fix invalid pf2vf_resp_wq logic
  2016-05-17  1:20 [PATCH 4.5 000/101] 4.5.5-stable review Greg Kroah-Hartman
                   ` (39 preceding siblings ...)
  2016-05-17  1:20 ` [PATCH 4.5 047/101] mm: thp: calculate the mapcount correctly for THP pages during WP faults Greg Kroah-Hartman
@ 2016-05-17  1:20 ` Greg Kroah-Hartman
  2016-05-17  1:20 ` [PATCH 4.5 049/101] crypto: qat - fix adf_ctl_drv.c:undefined reference to adf_init_pf_wq Greg Kroah-Hartman
                   ` (51 subsequent siblings)
  92 siblings, 0 replies; 94+ messages in thread
From: Greg Kroah-Hartman @ 2016-05-17  1:20 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Suresh Marikkannu, Tadeusz Struk, Herbert Xu

4.5-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Tadeusz Struk <tadeusz.struk@intel.com>

commit 9e209fcfb804da262e38e5cd2e680c47a41f0f95 upstream.

The pf2vf_resp_wq is a global so it has to be created at init
and destroyed at exit, instead of per device.

Tested-by: Suresh Marikkannu <sureshx.marikkannu@intel.com>
Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/crypto/qat/qat_common/adf_common_drv.h |    2 +
 drivers/crypto/qat/qat_common/adf_ctl_drv.c    |    6 +++++
 drivers/crypto/qat/qat_common/adf_sriov.c      |   26 +++++++++++++++----------
 3 files changed, 24 insertions(+), 10 deletions(-)

--- a/drivers/crypto/qat/qat_common/adf_common_drv.h
+++ b/drivers/crypto/qat/qat_common/adf_common_drv.h
@@ -144,6 +144,8 @@ void adf_disable_aer(struct adf_accel_de
 void adf_dev_restore(struct adf_accel_dev *accel_dev);
 int adf_init_aer(void);
 void adf_exit_aer(void);
+int adf_init_pf_wq(void);
+void adf_exit_pf_wq(void);
 int adf_init_admin_comms(struct adf_accel_dev *accel_dev);
 void adf_exit_admin_comms(struct adf_accel_dev *accel_dev);
 int adf_send_admin_init(struct adf_accel_dev *accel_dev);
--- a/drivers/crypto/qat/qat_common/adf_ctl_drv.c
+++ b/drivers/crypto/qat/qat_common/adf_ctl_drv.c
@@ -462,12 +462,17 @@ static int __init adf_register_ctl_devic
 	if (adf_init_aer())
 		goto err_aer;
 
+	if (adf_init_pf_wq())
+		goto err_pf_wq;
+
 	if (qat_crypto_register())
 		goto err_crypto_register;
 
 	return 0;
 
 err_crypto_register:
+	adf_exit_pf_wq();
+err_pf_wq:
 	adf_exit_aer();
 err_aer:
 	adf_chr_drv_destroy();
@@ -480,6 +485,7 @@ static void __exit adf_unregister_ctl_de
 {
 	adf_chr_drv_destroy();
 	adf_exit_aer();
+	adf_exit_pf_wq();
 	qat_crypto_unregister();
 	adf_clean_vf_map(false);
 	mutex_destroy(&adf_ctl_lock);
--- a/drivers/crypto/qat/qat_common/adf_sriov.c
+++ b/drivers/crypto/qat/qat_common/adf_sriov.c
@@ -119,11 +119,6 @@ static int adf_enable_sriov(struct adf_a
 	int i;
 	u32 reg;
 
-	/* Workqueue for PF2VF responses */
-	pf2vf_resp_wq = create_workqueue("qat_pf2vf_resp_wq");
-	if (!pf2vf_resp_wq)
-		return -ENOMEM;
-
 	for (i = 0, vf_info = accel_dev->pf.vf_info; i < totalvfs;
 	     i++, vf_info++) {
 		/* This ptr will be populated when VFs will be created */
@@ -216,11 +211,6 @@ void adf_disable_sriov(struct adf_accel_
 
 	kfree(accel_dev->pf.vf_info);
 	accel_dev->pf.vf_info = NULL;
-
-	if (pf2vf_resp_wq) {
-		destroy_workqueue(pf2vf_resp_wq);
-		pf2vf_resp_wq = NULL;
-	}
 }
 EXPORT_SYMBOL_GPL(adf_disable_sriov);
 
@@ -304,3 +294,19 @@ int adf_sriov_configure(struct pci_dev *
 	return numvfs;
 }
 EXPORT_SYMBOL_GPL(adf_sriov_configure);
+
+int __init adf_init_pf_wq(void)
+{
+	/* Workqueue for PF2VF responses */
+	pf2vf_resp_wq = create_workqueue("qat_pf2vf_resp_wq");
+
+	return !pf2vf_resp_wq ? -ENOMEM : 0;
+}
+
+void adf_exit_pf_wq(void)
+{
+	if (pf2vf_resp_wq) {
+		destroy_workqueue(pf2vf_resp_wq);
+		pf2vf_resp_wq = NULL;
+	}
+}

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH 4.5 049/101] crypto: qat - fix adf_ctl_drv.c:undefined reference to adf_init_pf_wq
  2016-05-17  1:20 [PATCH 4.5 000/101] 4.5.5-stable review Greg Kroah-Hartman
                   ` (40 preceding siblings ...)
  2016-05-17  1:20 ` [PATCH 4.5 048/101] crypto: qat - fix invalid pf2vf_resp_wq logic Greg Kroah-Hartman
@ 2016-05-17  1:20 ` Greg Kroah-Hartman
  2016-05-17  1:20 ` [PATCH 4.5 050/101] crypto: hash - Fix page length clamping in hash walk Greg Kroah-Hartman
                   ` (50 subsequent siblings)
  92 siblings, 0 replies; 94+ messages in thread
From: Greg Kroah-Hartman @ 2016-05-17  1:20 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, kbuild test robot, Tadeusz Struk, Herbert Xu

4.5-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Tadeusz Struk <tadeusz.struk@intel.com>

commit 6dc5df71ee5c8b44607928bfe27be50314dcf848 upstream.

Fix undefined reference issue reported by kbuild test robot.

Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/crypto/qat/qat_common/adf_common_drv.h |   13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

--- a/drivers/crypto/qat/qat_common/adf_common_drv.h
+++ b/drivers/crypto/qat/qat_common/adf_common_drv.h
@@ -144,8 +144,6 @@ void adf_disable_aer(struct adf_accel_de
 void adf_dev_restore(struct adf_accel_dev *accel_dev);
 int adf_init_aer(void);
 void adf_exit_aer(void);
-int adf_init_pf_wq(void);
-void adf_exit_pf_wq(void);
 int adf_init_admin_comms(struct adf_accel_dev *accel_dev);
 void adf_exit_admin_comms(struct adf_accel_dev *accel_dev);
 int adf_send_admin_init(struct adf_accel_dev *accel_dev);
@@ -238,6 +236,8 @@ void adf_enable_vf2pf_interrupts(struct
 				 uint32_t vf_mask);
 void adf_enable_pf2vf_interrupts(struct adf_accel_dev *accel_dev);
 void adf_disable_pf2vf_interrupts(struct adf_accel_dev *accel_dev);
+int adf_init_pf_wq(void);
+void adf_exit_pf_wq(void);
 #else
 static inline int adf_sriov_configure(struct pci_dev *pdev, int numvfs)
 {
@@ -255,5 +255,14 @@ static inline void adf_enable_pf2vf_inte
 static inline void adf_disable_pf2vf_interrupts(struct adf_accel_dev *accel_dev)
 {
 }
+
+static inline int adf_init_pf_wq(void)
+{
+	return 0;
+}
+
+static inline void adf_exit_pf_wq(void)
+{
+}
 #endif
 #endif

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH 4.5 050/101] crypto: hash - Fix page length clamping in hash walk
  2016-05-17  1:20 [PATCH 4.5 000/101] 4.5.5-stable review Greg Kroah-Hartman
                   ` (41 preceding siblings ...)
  2016-05-17  1:20 ` [PATCH 4.5 049/101] crypto: qat - fix adf_ctl_drv.c:undefined reference to adf_init_pf_wq Greg Kroah-Hartman
@ 2016-05-17  1:20 ` Greg Kroah-Hartman
  2016-05-17  1:20 ` [PATCH 4.5 051/101] crypto: testmgr - Use kmalloc memory for RSA input Greg Kroah-Hartman
                   ` (49 subsequent siblings)
  92 siblings, 0 replies; 94+ messages in thread
From: Greg Kroah-Hartman @ 2016-05-17  1:20 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Steffen Klassert, Herbert Xu

4.5-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Herbert Xu <herbert@gondor.apana.org.au>

commit 13f4bb78cf6a312bbdec367ba3da044b09bf0e29 upstream.

The crypto hash walk code is broken when supplied with an offset
greater than or equal to PAGE_SIZE.  This patch fixes it by adjusting
walk->pg and walk->offset when this happens.

Reported-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 crypto/ahash.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- a/crypto/ahash.c
+++ b/crypto/ahash.c
@@ -69,8 +69,9 @@ static int hash_walk_new_entry(struct cr
 	struct scatterlist *sg;
 
 	sg = walk->sg;
-	walk->pg = sg_page(sg);
 	walk->offset = sg->offset;
+	walk->pg = sg_page(walk->sg) + (walk->offset >> PAGE_SHIFT);
+	walk->offset = offset_in_page(walk->offset);
 	walk->entrylen = sg->length;
 
 	if (walk->entrylen > walk->total)

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH 4.5 051/101] crypto: testmgr - Use kmalloc memory for RSA input
  2016-05-17  1:20 [PATCH 4.5 000/101] 4.5.5-stable review Greg Kroah-Hartman
                   ` (42 preceding siblings ...)
  2016-05-17  1:20 ` [PATCH 4.5 050/101] crypto: hash - Fix page length clamping in hash walk Greg Kroah-Hartman
@ 2016-05-17  1:20 ` Greg Kroah-Hartman
  2016-05-17  1:20 ` [PATCH 4.5 052/101] ALSA: usb-audio: Quirk for yet another Phoenix Audio devices (v2) Greg Kroah-Hartman
                   ` (48 subsequent siblings)
  92 siblings, 0 replies; 94+ messages in thread
From: Greg Kroah-Hartman @ 2016-05-17  1:20 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Anatoly Pugachev, Herbert Xu

4.5-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Herbert Xu <herbert@gondor.apana.org.au>

commit df27b26f04ed388ff4cc2b5d8cfdb5d97678816f upstream.

As akcipher uses an SG interface, you must not use vmalloc memory
as input for it.  This patch fixes testmgr to copy the vmalloc
test vectors to kmalloc memory before running the test.

This patch also removes a superfluous sg_virt call in do_test_rsa.

Reported-by: Anatoly Pugachev <matorola@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 crypto/testmgr.c |   27 ++++++++++++++++++++++-----
 1 file changed, 22 insertions(+), 5 deletions(-)

--- a/crypto/testmgr.c
+++ b/crypto/testmgr.c
@@ -1849,6 +1849,7 @@ static int alg_test_drbg(const struct al
 static int do_test_rsa(struct crypto_akcipher *tfm,
 		       struct akcipher_testvec *vecs)
 {
+	char *xbuf[XBUFSIZE];
 	struct akcipher_request *req;
 	void *outbuf_enc = NULL;
 	void *outbuf_dec = NULL;
@@ -1857,9 +1858,12 @@ static int do_test_rsa(struct crypto_akc
 	int err = -ENOMEM;
 	struct scatterlist src, dst, src_tab[2];
 
+	if (testmgr_alloc_buf(xbuf))
+		return err;
+
 	req = akcipher_request_alloc(tfm, GFP_KERNEL);
 	if (!req)
-		return err;
+		goto free_xbuf;
 
 	init_completion(&result.completion);
 
@@ -1877,9 +1881,14 @@ static int do_test_rsa(struct crypto_akc
 	if (!outbuf_enc)
 		goto free_req;
 
+	if (WARN_ON(vecs->m_size > PAGE_SIZE))
+		goto free_all;
+
+	memcpy(xbuf[0], vecs->m, vecs->m_size);
+
 	sg_init_table(src_tab, 2);
-	sg_set_buf(&src_tab[0], vecs->m, 8);
-	sg_set_buf(&src_tab[1], vecs->m + 8, vecs->m_size - 8);
+	sg_set_buf(&src_tab[0], xbuf[0], 8);
+	sg_set_buf(&src_tab[1], xbuf[0] + 8, vecs->m_size - 8);
 	sg_init_one(&dst, outbuf_enc, out_len_max);
 	akcipher_request_set_crypt(req, src_tab, &dst, vecs->m_size,
 				   out_len_max);
@@ -1898,7 +1907,7 @@ static int do_test_rsa(struct crypto_akc
 		goto free_all;
 	}
 	/* verify that encrypted message is equal to expected */
-	if (memcmp(vecs->c, sg_virt(req->dst), vecs->c_size)) {
+	if (memcmp(vecs->c, outbuf_enc, vecs->c_size)) {
 		pr_err("alg: rsa: encrypt test failed. Invalid output\n");
 		err = -EINVAL;
 		goto free_all;
@@ -1913,7 +1922,13 @@ static int do_test_rsa(struct crypto_akc
 		err = -ENOMEM;
 		goto free_all;
 	}
-	sg_init_one(&src, vecs->c, vecs->c_size);
+
+	if (WARN_ON(vecs->c_size > PAGE_SIZE))
+		goto free_all;
+
+	memcpy(xbuf[0], vecs->c, vecs->c_size);
+
+	sg_init_one(&src, xbuf[0], vecs->c_size);
 	sg_init_one(&dst, outbuf_dec, out_len_max);
 	init_completion(&result.completion);
 	akcipher_request_set_crypt(req, &src, &dst, vecs->c_size, out_len_max);
@@ -1940,6 +1955,8 @@ free_all:
 	kfree(outbuf_enc);
 free_req:
 	akcipher_request_free(req);
+free_xbuf:
+	testmgr_free_buf(xbuf);
 	return err;
 }
 

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH 4.5 052/101] ALSA: usb-audio: Quirk for yet another Phoenix Audio devices (v2)
  2016-05-17  1:20 [PATCH 4.5 000/101] 4.5.5-stable review Greg Kroah-Hartman
                   ` (43 preceding siblings ...)
  2016-05-17  1:20 ` [PATCH 4.5 051/101] crypto: testmgr - Use kmalloc memory for RSA input Greg Kroah-Hartman
@ 2016-05-17  1:20 ` Greg Kroah-Hartman
  2016-05-17  1:20 ` [PATCH 4.5 053/101] ALSA: usb-audio: Yet another Phoneix Audio device quirk Greg Kroah-Hartman
                   ` (47 subsequent siblings)
  92 siblings, 0 replies; 94+ messages in thread
From: Greg Kroah-Hartman @ 2016-05-17  1:20 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Takashi Iwai

4.5-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Takashi Iwai <tiwai@suse.de>

commit 2d2c038a9999f423e820d89db2b5d7774b67ba49 upstream.

Phoenix Audio MT202pcs (1de7:0114) and MT202exe (1de7:0013) need the
same workaround as TMX320 for avoiding the firmware bug.  It fixes the
frequent error about the sample rate inquiries and the slow device
probe as consequence.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=117321
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 sound/usb/quirks.c |    2 ++
 1 file changed, 2 insertions(+)

--- a/sound/usb/quirks.c
+++ b/sound/usb/quirks.c
@@ -1139,7 +1139,9 @@ bool snd_usb_get_sample_rate_quirk(struc
 	case USB_ID(0x047F, 0xAA05): /* Plantronics DA45 */
 	case USB_ID(0x04D8, 0xFEEA): /* Benchmark DAC1 Pre */
 	case USB_ID(0x074D, 0x3553): /* Outlaw RR2150 (Micronas UAC3553B) */
+	case USB_ID(0x1de7, 0x0013): /* Phoenix Audio MT202exe */
 	case USB_ID(0x1de7, 0x0014): /* Phoenix Audio TMX320 */
+	case USB_ID(0x1de7, 0x0114): /* Phoenix Audio MT202pcs */
 	case USB_ID(0x21B4, 0x0081): /* AudioQuest DragonFly */
 		return true;
 	}

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH 4.5 053/101] ALSA: usb-audio: Yet another Phoneix Audio device quirk
  2016-05-17  1:20 [PATCH 4.5 000/101] 4.5.5-stable review Greg Kroah-Hartman
                   ` (44 preceding siblings ...)
  2016-05-17  1:20 ` [PATCH 4.5 052/101] ALSA: usb-audio: Quirk for yet another Phoenix Audio devices (v2) Greg Kroah-Hartman
@ 2016-05-17  1:20 ` Greg Kroah-Hartman
  2016-05-17  1:20 ` [PATCH 4.5 054/101] ALSA: hda - Fix subwoofer pin on ASUS N751 and N551 Greg Kroah-Hartman
                   ` (46 subsequent siblings)
  92 siblings, 0 replies; 94+ messages in thread
From: Greg Kroah-Hartman @ 2016-05-17  1:20 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Takashi Iwai

4.5-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Takashi Iwai <tiwai@suse.de>

commit 84add303ef950b8d85f54bc2248c2bc73467c329 upstream.

Phoenix Audio has yet another device with another id (even a different
vendor id, 0556:0014) that requires the same quirk for the sample
rate.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=110221
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 sound/usb/quirks.c |    1 +
 1 file changed, 1 insertion(+)

--- a/sound/usb/quirks.c
+++ b/sound/usb/quirks.c
@@ -1138,6 +1138,7 @@ bool snd_usb_get_sample_rate_quirk(struc
 	case USB_ID(0x047F, 0x0415): /* Plantronics BT-300 */
 	case USB_ID(0x047F, 0xAA05): /* Plantronics DA45 */
 	case USB_ID(0x04D8, 0xFEEA): /* Benchmark DAC1 Pre */
+	case USB_ID(0x0556, 0x0014): /* Phoenix Audio TMX320VC */
 	case USB_ID(0x074D, 0x3553): /* Outlaw RR2150 (Micronas UAC3553B) */
 	case USB_ID(0x1de7, 0x0013): /* Phoenix Audio MT202exe */
 	case USB_ID(0x1de7, 0x0014): /* Phoenix Audio TMX320 */

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH 4.5 054/101] ALSA: hda - Fix subwoofer pin on ASUS N751 and N551
  2016-05-17  1:20 [PATCH 4.5 000/101] 4.5.5-stable review Greg Kroah-Hartman
                   ` (45 preceding siblings ...)
  2016-05-17  1:20 ` [PATCH 4.5 053/101] ALSA: usb-audio: Yet another Phoneix Audio device quirk Greg Kroah-Hartman
@ 2016-05-17  1:20 ` Greg Kroah-Hartman
  2016-05-17  1:21 ` [PATCH 4.5 055/101] ALSA: hda - Fix white noise on Asus UX501VW headset Greg Kroah-Hartman
                   ` (45 subsequent siblings)
  92 siblings, 0 replies; 94+ messages in thread
From: Greg Kroah-Hartman @ 2016-05-17  1:20 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Yura Pakhuchiy, Takashi Iwai

4.5-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Yura Pakhuchiy <pakhuchiy@gmail.com>

commit 3231e2053eaeee70bdfb216a78a30f11e88e2243 upstream.

Subwoofer does not work out of the box on ASUS N751/N551 laptops. This
patch fixes it. Patch tested on N751 laptop. N551 part is not tested,
but according to [1] and [2] this laptop requires similar changes, so I
included them in the patch.

1. https://github.com/honsiorovskyi/asus-n551-hda-fix
2. https://bugs.launchpad.net/ubuntu/+source/alsa-tools/+bug/1405691

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=117781
Signed-off-by: Yura Pakhuchiy <pakhuchiy@gmail.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 sound/pci/hda/patch_realtek.c |   12 ++++++++++++
 1 file changed, 12 insertions(+)

--- a/sound/pci/hda/patch_realtek.c
+++ b/sound/pci/hda/patch_realtek.c
@@ -6426,6 +6426,7 @@ enum {
 	ALC668_FIXUP_DELL_DISABLE_AAMIX,
 	ALC668_FIXUP_DELL_XPS13,
 	ALC662_FIXUP_ASUS_Nx50,
+	ALC668_FIXUP_ASUS_Nx51,
 };
 
 static const struct hda_fixup alc662_fixups[] = {
@@ -6672,6 +6673,15 @@ static const struct hda_fixup alc662_fix
 		.chained = true,
 		.chain_id = ALC662_FIXUP_BASS_1A
 	},
+	[ALC668_FIXUP_ASUS_Nx51] = {
+		.type = HDA_FIXUP_PINS,
+		.v.pins = (const struct hda_pintbl[]) {
+			{0x1a, 0x90170151}, /* bass speaker */
+			{}
+		},
+		.chained = true,
+		.chain_id = ALC662_FIXUP_BASS_CHMAP,
+	},
 };
 
 static const struct snd_pci_quirk alc662_fixup_tbl[] = {
@@ -6699,6 +6709,8 @@ static const struct snd_pci_quirk alc662
 	SND_PCI_QUIRK(0x1043, 0x129d, "Asus N750", ALC662_FIXUP_ASUS_Nx50),
 	SND_PCI_QUIRK(0x1043, 0x1477, "ASUS N56VZ", ALC662_FIXUP_BASS_MODE4_CHMAP),
 	SND_PCI_QUIRK(0x1043, 0x15a7, "ASUS UX51VZH", ALC662_FIXUP_BASS_16),
+	SND_PCI_QUIRK(0x1043, 0x177d, "ASUS N551", ALC668_FIXUP_ASUS_Nx51),
+	SND_PCI_QUIRK(0x1043, 0x17bd, "ASUS N751", ALC668_FIXUP_ASUS_Nx51),
 	SND_PCI_QUIRK(0x1043, 0x1b73, "ASUS N55SF", ALC662_FIXUP_BASS_16),
 	SND_PCI_QUIRK(0x1043, 0x1bf3, "ASUS N76VZ", ALC662_FIXUP_BASS_MODE4_CHMAP),
 	SND_PCI_QUIRK(0x1043, 0x8469, "ASUS mobo", ALC662_FIXUP_NO_JACK_DETECT),

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH 4.5 055/101] ALSA: hda - Fix white noise on Asus UX501VW headset
  2016-05-17  1:20 [PATCH 4.5 000/101] 4.5.5-stable review Greg Kroah-Hartman
                   ` (46 preceding siblings ...)
  2016-05-17  1:20 ` [PATCH 4.5 054/101] ALSA: hda - Fix subwoofer pin on ASUS N751 and N551 Greg Kroah-Hartman
@ 2016-05-17  1:21 ` Greg Kroah-Hartman
  2016-05-17  1:21 ` [PATCH 4.5 056/101] ALSA: hda - Fix broken reconfig Greg Kroah-Hartman
                   ` (44 subsequent siblings)
  92 siblings, 0 replies; 94+ messages in thread
From: Greg Kroah-Hartman @ 2016-05-17  1:21 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Kaho Ng, Takashi Iwai

4.5-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Kaho Ng <ngkaho1234@gmail.com>

commit 2da2dc9ead232f25601404335cca13c0f722d41b upstream.

For reducing the noise from the headset output on ASUS UX501VW,
call the existing fixup, alc_fixup_headset_mode_alc668(), additionally.

Thread: https://bbs.archlinux.org/viewtopic.php?id=209554

Signed-off-by: Kaho Ng <ngkaho1234@gmail.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 sound/pci/hda/patch_realtek.c |    1 +
 1 file changed, 1 insertion(+)

--- a/sound/pci/hda/patch_realtek.c
+++ b/sound/pci/hda/patch_realtek.c
@@ -6704,6 +6704,7 @@ static const struct snd_pci_quirk alc662
 	SND_PCI_QUIRK(0x1028, 0x0698, "Dell", ALC668_FIXUP_DELL_MIC_NO_PRESENCE),
 	SND_PCI_QUIRK(0x1028, 0x069f, "Dell", ALC668_FIXUP_DELL_MIC_NO_PRESENCE),
 	SND_PCI_QUIRK(0x103c, 0x1632, "HP RP5800", ALC662_FIXUP_HP_RP5800),
+	SND_PCI_QUIRK(0x1043, 0x1080, "Asus UX501VW", ALC668_FIXUP_HEADSET_MODE),
 	SND_PCI_QUIRK(0x1043, 0x11cd, "Asus N550", ALC662_FIXUP_ASUS_Nx50),
 	SND_PCI_QUIRK(0x1043, 0x13df, "Asus N550JX", ALC662_FIXUP_BASS_1A),
 	SND_PCI_QUIRK(0x1043, 0x129d, "Asus N750", ALC662_FIXUP_ASUS_Nx50),

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH 4.5 056/101] ALSA: hda - Fix broken reconfig
  2016-05-17  1:20 [PATCH 4.5 000/101] 4.5.5-stable review Greg Kroah-Hartman
                   ` (47 preceding siblings ...)
  2016-05-17  1:21 ` [PATCH 4.5 055/101] ALSA: hda - Fix white noise on Asus UX501VW headset Greg Kroah-Hartman
@ 2016-05-17  1:21 ` Greg Kroah-Hartman
  2016-05-17  1:21 ` [PATCH 4.5 057/101] spi: pxa2xx: Do not detect number of enabled chip selects on Intel SPT Greg Kroah-Hartman
                   ` (43 subsequent siblings)
  92 siblings, 0 replies; 94+ messages in thread
From: Greg Kroah-Hartman @ 2016-05-17  1:21 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Jochen Henneberg, Takashi Iwai

4.5-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Takashi Iwai <tiwai@suse.de>

commit addacd801e1638f41d659cb53b9b73fc14322cb1 upstream.

The HD-audio reconfig function got broken in the recent kernels,
typically resulting in a failure like:
  snd_hda_intel 0000:00:1b.0: control 3:0:0:Playback Channel Map:0 is already present

This is because of the code restructuring to move the PCM and control
instantiation into the codec drive probe, by the commit [bcd96557bd0a:
ALSA: hda - Build PCMs and controls at codec driver probe].  Although
the commit above removed the calls of snd_hda_codec_build_pcms() and
*_build_controls() at the controller driver probe, the similar calls
in the reconfig were still left forgotten.  This caused the
conflicting and duplicated PCMs and controls.

The fix is trivial: just remove these superfluous calls from
reconfig_codec().

Fixes: bcd96557bd0a ('ALSA: hda - Build PCMs and controls at codec driver probe')
Reported-by: Jochen Henneberg <jh@henneberg-systemdesign.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 sound/pci/hda/hda_sysfs.c |    8 --------
 1 file changed, 8 deletions(-)

--- a/sound/pci/hda/hda_sysfs.c
+++ b/sound/pci/hda/hda_sysfs.c
@@ -141,14 +141,6 @@ static int reconfig_codec(struct hda_cod
 	err = snd_hda_codec_configure(codec);
 	if (err < 0)
 		goto error;
-	/* rebuild PCMs */
-	err = snd_hda_codec_build_pcms(codec);
-	if (err < 0)
-		goto error;
-	/* rebuild mixers */
-	err = snd_hda_codec_build_controls(codec);
-	if (err < 0)
-		goto error;
 	err = snd_card_register(codec->card);
  error:
 	snd_hda_power_down(codec);

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH 4.5 057/101] spi: pxa2xx: Do not detect number of enabled chip selects on Intel SPT
  2016-05-17  1:20 [PATCH 4.5 000/101] 4.5.5-stable review Greg Kroah-Hartman
                   ` (48 preceding siblings ...)
  2016-05-17  1:21 ` [PATCH 4.5 056/101] ALSA: hda - Fix broken reconfig Greg Kroah-Hartman
@ 2016-05-17  1:21 ` Greg Kroah-Hartman
  2016-05-17  1:21 ` [PATCH 4.5 058/101] spi: spi-ti-qspi: Fix FLEN and WLEN settings if bits_per_word is overridden Greg Kroah-Hartman
                   ` (42 subsequent siblings)
  92 siblings, 0 replies; 94+ messages in thread
From: Greg Kroah-Hartman @ 2016-05-17  1:21 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Jarkko Nikula, Mark Brown

4.5-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Jarkko Nikula <jarkko.nikula@linux.intel.com>

commit 66ec246eb9982e7eb8e15e1fc55f543230310dd0 upstream.

Certain Intel Sunrisepoint PCH variants report zero chip selects in SPI
capabilities register even they have one per port. Detection in
pxa2xx_spi_probe() sets master->num_chipselect to 0 leading to -EINVAL
from spi_register_master() where chip select count is validated.

Fix this by not using SPI capabilities register on Sunrisepoint. They don't
have more than one chip select so use the default value 1 instead of
detection.

Fixes: 8b136baa5892 ("spi: pxa2xx: Detect number of enabled Intel LPSS SPI chip select signals")
Signed-off-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/spi/spi-pxa2xx.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/spi/spi-pxa2xx.c
+++ b/drivers/spi/spi-pxa2xx.c
@@ -111,7 +111,7 @@ static const struct lpss_config lpss_pla
 		.reg_general = -1,
 		.reg_ssp = 0x20,
 		.reg_cs_ctrl = 0x24,
-		.reg_capabilities = 0xfc,
+		.reg_capabilities = -1,
 		.rx_threshold = 1,
 		.tx_threshold_lo = 32,
 		.tx_threshold_hi = 56,

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH 4.5 058/101] spi: spi-ti-qspi: Fix FLEN and WLEN settings if bits_per_word is overridden
  2016-05-17  1:20 [PATCH 4.5 000/101] 4.5.5-stable review Greg Kroah-Hartman
                   ` (49 preceding siblings ...)
  2016-05-17  1:21 ` [PATCH 4.5 057/101] spi: pxa2xx: Do not detect number of enabled chip selects on Intel SPT Greg Kroah-Hartman
@ 2016-05-17  1:21 ` Greg Kroah-Hartman
  2016-05-17  1:21 ` [PATCH 4.5 059/101] spi: spi-ti-qspi: Handle truncated frames properly Greg Kroah-Hartman
                   ` (41 subsequent siblings)
  92 siblings, 0 replies; 94+ messages in thread
From: Greg Kroah-Hartman @ 2016-05-17  1:21 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Ben Hutchings, Mark Brown

4.5-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Ben Hutchings <ben.hutchings@codethink.co.uk>

commit ea1b60fb085839a9544cb3a0069992991beabb7f upstream.

Each transfer can specify 8, 16 or 32 bits per word independently of
the default for the device being addressed.  However, currently we
calculate the number of words in the frame assuming that the word size
is the device default.

If multiple transfers in the same message have differing
bits_per_word, we bitwise-or the different values in the WLEN register
field.

Fix both of these.  Also rename 'frame_length' to 'frame_len_words' to
make clear that it's not a byte count like spi_message::frame_length.

Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/spi/spi-ti-qspi.c |   15 +++++++++------
 1 file changed, 9 insertions(+), 6 deletions(-)

--- a/drivers/spi/spi-ti-qspi.c
+++ b/drivers/spi/spi-ti-qspi.c
@@ -94,6 +94,7 @@ struct ti_qspi {
 #define QSPI_FLEN(n)			((n - 1) << 0)
 #define QSPI_WLEN_MAX_BITS		128
 #define QSPI_WLEN_MAX_BYTES		16
+#define QSPI_WLEN_MASK			QSPI_WLEN(QSPI_WLEN_MAX_BITS)
 
 /* STATUS REGISTER */
 #define BUSY				0x01
@@ -373,7 +374,7 @@ static int ti_qspi_start_transfer_one(st
 	struct spi_device *spi = m->spi;
 	struct spi_transfer *t;
 	int status = 0, ret;
-	int frame_length;
+	unsigned int frame_len_words;
 
 	/* setup device control reg */
 	qspi->dc = 0;
@@ -385,21 +386,23 @@ static int ti_qspi_start_transfer_one(st
 	if (spi->mode & SPI_CS_HIGH)
 		qspi->dc |= QSPI_CSPOL(spi->chip_select);
 
-	frame_length = (m->frame_length << 3) / spi->bits_per_word;
-
-	frame_length = clamp(frame_length, 0, QSPI_FRAME);
+	frame_len_words = 0;
+	list_for_each_entry(t, &m->transfers, transfer_list)
+		frame_len_words += t->len / (t->bits_per_word >> 3);
+	frame_len_words = min_t(unsigned int, frame_len_words, QSPI_FRAME);
 
 	/* setup command reg */
 	qspi->cmd = 0;
 	qspi->cmd |= QSPI_EN_CS(spi->chip_select);
-	qspi->cmd |= QSPI_FLEN(frame_length);
+	qspi->cmd |= QSPI_FLEN(frame_len_words);
 
 	ti_qspi_write(qspi, qspi->dc, QSPI_SPI_DC_REG);
 
 	mutex_lock(&qspi->list_lock);
 
 	list_for_each_entry(t, &m->transfers, transfer_list) {
-		qspi->cmd |= QSPI_WLEN(t->bits_per_word);
+		qspi->cmd = ((qspi->cmd & ~QSPI_WLEN_MASK) |
+			     QSPI_WLEN(t->bits_per_word));
 
 		ret = qspi_transfer_msg(qspi, t);
 		if (ret) {

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH 4.5 059/101] spi: spi-ti-qspi: Handle truncated frames properly
  2016-05-17  1:20 [PATCH 4.5 000/101] 4.5.5-stable review Greg Kroah-Hartman
                   ` (50 preceding siblings ...)
  2016-05-17  1:21 ` [PATCH 4.5 058/101] spi: spi-ti-qspi: Fix FLEN and WLEN settings if bits_per_word is overridden Greg Kroah-Hartman
@ 2016-05-17  1:21 ` Greg Kroah-Hartman
  2016-05-17  1:21 ` [PATCH 4.5 060/101] pinctrl: at91-pio4: fix pull-up/down logic Greg Kroah-Hartman
                   ` (40 subsequent siblings)
  92 siblings, 0 replies; 94+ messages in thread
From: Greg Kroah-Hartman @ 2016-05-17  1:21 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Ben Hutchings, Mark Brown

4.5-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Ben Hutchings <ben.hutchings@codethink.co.uk>

commit 1ff7760ff66b98ef244bf0e5e2bd5310651205ad upstream.

We clamp frame_len_words to a maximum of 4096, but do not actually
limit the number of words written or read through the DATA registers
or the length added to spi_message::actual_length.  This results in
silent data corruption for commands longer than this maximum.

Recalculate the length of each transfer, taking frame_len_words into
account.  Use this length in qspi_{read,write}_msg(), and to increment
spi_message::actual_length.

Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/spi/spi-ti-qspi.c |   32 ++++++++++++++++++++------------
 1 file changed, 20 insertions(+), 12 deletions(-)

--- a/drivers/spi/spi-ti-qspi.c
+++ b/drivers/spi/spi-ti-qspi.c
@@ -225,16 +225,16 @@ static inline int ti_qspi_poll_wc(struct
 	return  -ETIMEDOUT;
 }
 
-static int qspi_write_msg(struct ti_qspi *qspi, struct spi_transfer *t)
+static int qspi_write_msg(struct ti_qspi *qspi, struct spi_transfer *t,
+			  int count)
 {
-	int wlen, count, xfer_len;
+	int wlen, xfer_len;
 	unsigned int cmd;
 	const u8 *txbuf;
 	u32 data;
 
 	txbuf = t->tx_buf;
 	cmd = qspi->cmd | QSPI_WR_SNGL;
-	count = t->len;
 	wlen = t->bits_per_word >> 3;	/* in bytes */
 	xfer_len = wlen;
 
@@ -294,9 +294,10 @@ static int qspi_write_msg(struct ti_qspi
 	return 0;
 }
 
-static int qspi_read_msg(struct ti_qspi *qspi, struct spi_transfer *t)
+static int qspi_read_msg(struct ti_qspi *qspi, struct spi_transfer *t,
+			 int count)
 {
-	int wlen, count;
+	int wlen;
 	unsigned int cmd;
 	u8 *rxbuf;
 
@@ -313,7 +314,6 @@ static int qspi_read_msg(struct ti_qspi
 		cmd |= QSPI_RD_SNGL;
 		break;
 	}
-	count = t->len;
 	wlen = t->bits_per_word >> 3;	/* in bytes */
 
 	while (count) {
@@ -344,12 +344,13 @@ static int qspi_read_msg(struct ti_qspi
 	return 0;
 }
 
-static int qspi_transfer_msg(struct ti_qspi *qspi, struct spi_transfer *t)
+static int qspi_transfer_msg(struct ti_qspi *qspi, struct spi_transfer *t,
+			     int count)
 {
 	int ret;
 
 	if (t->tx_buf) {
-		ret = qspi_write_msg(qspi, t);
+		ret = qspi_write_msg(qspi, t, count);
 		if (ret) {
 			dev_dbg(qspi->dev, "Error while writing\n");
 			return ret;
@@ -357,7 +358,7 @@ static int qspi_transfer_msg(struct ti_q
 	}
 
 	if (t->rx_buf) {
-		ret = qspi_read_msg(qspi, t);
+		ret = qspi_read_msg(qspi, t, count);
 		if (ret) {
 			dev_dbg(qspi->dev, "Error while reading\n");
 			return ret;
@@ -374,7 +375,8 @@ static int ti_qspi_start_transfer_one(st
 	struct spi_device *spi = m->spi;
 	struct spi_transfer *t;
 	int status = 0, ret;
-	unsigned int frame_len_words;
+	unsigned int frame_len_words, transfer_len_words;
+	int wlen;
 
 	/* setup device control reg */
 	qspi->dc = 0;
@@ -404,14 +406,20 @@ static int ti_qspi_start_transfer_one(st
 		qspi->cmd = ((qspi->cmd & ~QSPI_WLEN_MASK) |
 			     QSPI_WLEN(t->bits_per_word));
 
-		ret = qspi_transfer_msg(qspi, t);
+		wlen = t->bits_per_word >> 3;
+		transfer_len_words = min(t->len / wlen, frame_len_words);
+
+		ret = qspi_transfer_msg(qspi, t, transfer_len_words * wlen);
 		if (ret) {
 			dev_dbg(qspi->dev, "transfer message failed\n");
 			mutex_unlock(&qspi->list_lock);
 			return -EINVAL;
 		}
 
-		m->actual_length += t->len;
+		m->actual_length += transfer_len_words * wlen;
+		frame_len_words -= transfer_len_words;
+		if (frame_len_words == 0)
+			break;
 	}
 
 	mutex_unlock(&qspi->list_lock);

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH 4.5 060/101] pinctrl: at91-pio4: fix pull-up/down logic
  2016-05-17  1:20 [PATCH 4.5 000/101] 4.5.5-stable review Greg Kroah-Hartman
                   ` (51 preceding siblings ...)
  2016-05-17  1:21 ` [PATCH 4.5 059/101] spi: spi-ti-qspi: Handle truncated frames properly Greg Kroah-Hartman
@ 2016-05-17  1:21 ` Greg Kroah-Hartman
  2016-05-17  1:21 ` [PATCH 4.5 061/101] regmap: spmi: Fix regmap_spmi_ext_read in multi-byte case Greg Kroah-Hartman
                   ` (39 subsequent siblings)
  92 siblings, 0 replies; 94+ messages in thread
From: Greg Kroah-Hartman @ 2016-05-17  1:21 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Ludovic Desroches, Alexandre Belloni,
	Nicolas Ferre, Linus Walleij

4.5-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Ludovic Desroches <ludovic.desroches@atmel.com>

commit 5305a7b7e860bb40ab226bc7d58019416073948a upstream.

The default configuration of a pin is often with a value in the
pull-up/down field at chip reset. So, even if the internal logic of the
controller prevents writing a configuration with pull-up and pull-down at
the same time, we must ensure explicitly this condition before writing the
register.

This was leading to a pull-down condition not taken into account for
instance.

Signed-off-by: Ludovic Desroches <ludovic.desroches@atmel.com>
Fixes: 776180848b57 ("pinctrl: introduce driver for Atmel PIO4 controller")
Acked-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/pinctrl/pinctrl-at91-pio4.c |    2 ++
 1 file changed, 2 insertions(+)

--- a/drivers/pinctrl/pinctrl-at91-pio4.c
+++ b/drivers/pinctrl/pinctrl-at91-pio4.c
@@ -722,9 +722,11 @@ static int atmel_conf_pin_config_group_s
 			break;
 		case PIN_CONFIG_BIAS_PULL_UP:
 			conf |= ATMEL_PIO_PUEN_MASK;
+			conf &= (~ATMEL_PIO_PDEN_MASK);
 			break;
 		case PIN_CONFIG_BIAS_PULL_DOWN:
 			conf |= ATMEL_PIO_PDEN_MASK;
+			conf &= (~ATMEL_PIO_PUEN_MASK);
 			break;
 		case PIN_CONFIG_DRIVE_OPEN_DRAIN:
 			if (arg == 0)

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH 4.5 061/101] regmap: spmi: Fix regmap_spmi_ext_read in multi-byte case
  2016-05-17  1:20 [PATCH 4.5 000/101] 4.5.5-stable review Greg Kroah-Hartman
                   ` (52 preceding siblings ...)
  2016-05-17  1:21 ` [PATCH 4.5 060/101] pinctrl: at91-pio4: fix pull-up/down logic Greg Kroah-Hartman
@ 2016-05-17  1:21 ` Greg Kroah-Hartman
  2016-05-17  1:21 ` [PATCH 4.5 062/101] perf diff: Fix duplicated output column Greg Kroah-Hartman
                   ` (38 subsequent siblings)
  92 siblings, 0 replies; 94+ messages in thread
From: Greg Kroah-Hartman @ 2016-05-17  1:21 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Jack Pham, Mark Brown

4.5-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Jack Pham <jackp@codeaurora.org>

commit dec8e8f6e6504aa3496c0f7cc10c756bb0e10f44 upstream.

Specifically for the case of reads that use the Extended Register
Read Long command, a multi-byte read operation is broken up into
8-byte chunks.  However the call to spmi_ext_register_readl() is
incorrectly passing 'val_size', which if greater than 8 will
always fail.  The argument should instead be 'len'.

Fixes: c9afbb05a9ff ("regmap: spmi: support base and extended register spaces")
Signed-off-by: Jack Pham <jackp@codeaurora.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/base/regmap/regmap-spmi.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/base/regmap/regmap-spmi.c
+++ b/drivers/base/regmap/regmap-spmi.c
@@ -142,7 +142,7 @@ static int regmap_spmi_ext_read(void *co
 	while (val_size) {
 		len = min_t(size_t, val_size, 8);
 
-		err = spmi_ext_register_readl(context, addr, val, val_size);
+		err = spmi_ext_register_readl(context, addr, val, len);
 		if (err)
 			goto err_out;
 

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH 4.5 062/101] perf diff: Fix duplicated output column
  2016-05-17  1:20 [PATCH 4.5 000/101] 4.5.5-stable review Greg Kroah-Hartman
                   ` (53 preceding siblings ...)
  2016-05-17  1:21 ` [PATCH 4.5 061/101] regmap: spmi: Fix regmap_spmi_ext_read in multi-byte case Greg Kroah-Hartman
@ 2016-05-17  1:21 ` Greg Kroah-Hartman
  2016-05-17  1:21 ` [PATCH 4.5 063/101] perf/core: Disable the event on a truncated AUX record Greg Kroah-Hartman
                   ` (37 subsequent siblings)
  92 siblings, 0 replies; 94+ messages in thread
From: Greg Kroah-Hartman @ 2016-05-17  1:21 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Namhyung Kim,
	Arnaldo Carvalho de Melo, Jiri Olsa, Linus Torvalds,
	Peter Zijlstra, Thomas Gleixner, Ingo Molnar

4.5-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Namhyung Kim <namhyung@kernel.org>

commit e9d848cb65d5f6f7731d12bd1b6d994bfdbcc94f upstream.

The commit b97511c5bc94 ("perf tools: Add overhead/overhead_children
keys defaults via string") moved initialization of column headers but it
missed to check the sort__mode.  As 'perf diff' doesn't call
perf_hpp__init(), the setup_overhead() also should not be called.

Before:

  # Baseline    Delta  Children  Overhead  Shared Object        Symbol
  # ........  .......  ........  ........  ...................  .......................
  #
      28.48%  -28.47%    28.48%    28.48%  [kernel.vmlinux ]    [k] intel_idle
      11.51%  -11.47%    11.51%    11.51%  libxul.so            [.] 0x0000000001a360f7
       3.49%   -3.49%     3.49%     3.49%  [kernel.vmlinux]     [k] generic_exec_single
       2.91%   -2.89%     2.91%     2.91%  libdbus-1.so.3.8.11  [.] 0x000000000000cdc2
       2.86%   -2.85%     2.86%     2.86%  libxcb.so.1.1.0      [.] 0x000000000000c890
       2.44%   -2.39%     2.44%     2.44%  [kernel.vmlinux]     [k] perf_event_aux_ctx

After:

  # Baseline    Delta  Shared Object        Symbol
  # ........  .......  ...................  .......................
  #
      28.48%  -28.47%  [kernel.vmlinux]     [k] intel_idle
      11.51%  -11.47%  libxul.so            [.] 0x0000000001a360f7
       3.49%   -3.49%  [kernel.vmlinux]     [k] generic_exec_single
       2.91%   -2.89%  libdbus-1.so.3.8.11  [.] 0x000000000000cdc2
       2.86%   -2.85%  libxcb.so.1.1.0      [.] 0x000000000000c890
       2.44%   -2.39%  [kernel.vmlinux]     [k] perf_event_aux_ctx

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: b97511c5bc94 ("perf tools: Add overhead/overhead_children keys defaults via string")
Link: http://lkml.kernel.org/r/1462890384-12486-2-git-send-email-acme@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 tools/perf/util/sort.c |    3 +++
 1 file changed, 3 insertions(+)

--- a/tools/perf/util/sort.c
+++ b/tools/perf/util/sort.c
@@ -2272,6 +2272,9 @@ static char *prefix_if_not_in(const char
 
 static char *setup_overhead(char *keys)
 {
+	if (sort__mode == SORT_MODE__DIFF)
+		return keys;
+
 	keys = prefix_if_not_in("overhead", keys);
 
 	if (symbol_conf.cumulate_callchain)

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH 4.5 063/101] perf/core: Disable the event on a truncated AUX record
  2016-05-17  1:20 [PATCH 4.5 000/101] 4.5.5-stable review Greg Kroah-Hartman
                   ` (54 preceding siblings ...)
  2016-05-17  1:21 ` [PATCH 4.5 062/101] perf diff: Fix duplicated output column Greg Kroah-Hartman
@ 2016-05-17  1:21 ` Greg Kroah-Hartman
  2016-05-17  1:21 ` [PATCH 4.5 064/101] vfs: add vfs_select_inode() helper Greg Kroah-Hartman
                   ` (36 subsequent siblings)
  92 siblings, 0 replies; 94+ messages in thread
From: Greg Kroah-Hartman @ 2016-05-17  1:21 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Markus Metzger, Alexander Shishkin,
	Peter Zijlstra (Intel),
	Arnaldo Carvalho de Melo, Arnaldo Carvalho de Melo,
	Borislav Petkov, Jiri Olsa, Linus Torvalds, Stephane Eranian,
	Thomas Gleixner, Vince Weaver, vince, Ingo Molnar

4.5-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Alexander Shishkin <alexander.shishkin@linux.intel.com>

commit 9f448cd3cbcec8995935e60b27802ae56aac8cc0 upstream.

When the PMU driver reports a truncated AUX record, it effectively means
that there is no more usable room in the event's AUX buffer (even though
there may still be some room, so that perf_aux_output_begin() doesn't take
action). At this point the consumer still has to be woken up and the event
has to be disabled, otherwise the event will just keep spinning between
perf_aux_output_begin() and perf_aux_output_end() until its context gets
unscheduled.

Again, for cpu-wide events this means never, so once in this condition,
they will be forever losing data.

Fix this by disabling the event and waking up the consumer in case of a
truncated AUX record.

Reported-by: Markus Metzger <markus.t.metzger@intel.com>
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: vince@deater.net
Link: http://lkml.kernel.org/r/1462886313-13660-3-git-send-email-alexander.shishkin@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 kernel/events/ring_buffer.c |   10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

--- a/kernel/events/ring_buffer.c
+++ b/kernel/events/ring_buffer.c
@@ -347,6 +347,7 @@ void perf_aux_output_end(struct perf_out
 			 bool truncated)
 {
 	struct ring_buffer *rb = handle->rb;
+	bool wakeup = truncated;
 	unsigned long aux_head;
 	u64 flags = 0;
 
@@ -375,9 +376,16 @@ void perf_aux_output_end(struct perf_out
 	aux_head = rb->user_page->aux_head = local_read(&rb->aux_head);
 
 	if (aux_head - local_read(&rb->aux_wakeup) >= rb->aux_watermark) {
-		perf_output_wakeup(handle);
+		wakeup = true;
 		local_add(rb->aux_watermark, &rb->aux_wakeup);
 	}
+
+	if (wakeup) {
+		if (truncated)
+			handle->event->pending_disable = 1;
+		perf_output_wakeup(handle);
+	}
+
 	handle->event = NULL;
 
 	local_set(&rb->aux_nest, 0);

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH 4.5 064/101] vfs: add vfs_select_inode() helper
  2016-05-17  1:20 [PATCH 4.5 000/101] 4.5.5-stable review Greg Kroah-Hartman
                   ` (55 preceding siblings ...)
  2016-05-17  1:21 ` [PATCH 4.5 063/101] perf/core: Disable the event on a truncated AUX record Greg Kroah-Hartman
@ 2016-05-17  1:21 ` Greg Kroah-Hartman
  2016-05-17  1:21 ` [PATCH 4.5 065/101] vfs: rename: check backing inode being equal Greg Kroah-Hartman
                   ` (35 subsequent siblings)
  92 siblings, 0 replies; 94+ messages in thread
From: Greg Kroah-Hartman @ 2016-05-17  1:21 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Miklos Szeredi

4.5-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Miklos Szeredi <mszeredi@redhat.com>

commit 54d5ca871e72f2bb172ec9323497f01cd5091ec7 upstream.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 fs/open.c              |   12 ++++--------
 include/linux/dcache.h |   12 ++++++++++++
 2 files changed, 16 insertions(+), 8 deletions(-)

--- a/fs/open.c
+++ b/fs/open.c
@@ -840,16 +840,12 @@ EXPORT_SYMBOL(file_path);
 int vfs_open(const struct path *path, struct file *file,
 	     const struct cred *cred)
 {
-	struct dentry *dentry = path->dentry;
-	struct inode *inode = dentry->d_inode;
+	struct inode *inode = vfs_select_inode(path->dentry, file->f_flags);
 
-	file->f_path = *path;
-	if (dentry->d_flags & DCACHE_OP_SELECT_INODE) {
-		inode = dentry->d_op->d_select_inode(dentry, file->f_flags);
-		if (IS_ERR(inode))
-			return PTR_ERR(inode);
-	}
+	if (IS_ERR(inode))
+		return PTR_ERR(inode);
 
+	file->f_path = *path;
 	return do_dentry_open(file, inode, NULL, cred);
 }
 
--- a/include/linux/dcache.h
+++ b/include/linux/dcache.h
@@ -592,4 +592,16 @@ static inline struct dentry *d_real(stru
 		return dentry;
 }
 
+static inline struct inode *vfs_select_inode(struct dentry *dentry,
+					     unsigned open_flags)
+{
+	struct inode *inode = d_inode(dentry);
+
+	if (inode && unlikely(dentry->d_flags & DCACHE_OP_SELECT_INODE))
+		inode = dentry->d_op->d_select_inode(dentry, open_flags);
+
+	return inode;
+}
+
+
 #endif	/* __LINUX_DCACHE_H */

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH 4.5 065/101] vfs: rename: check backing inode being equal
  2016-05-17  1:20 [PATCH 4.5 000/101] 4.5.5-stable review Greg Kroah-Hartman
                   ` (56 preceding siblings ...)
  2016-05-17  1:21 ` [PATCH 4.5 064/101] vfs: add vfs_select_inode() helper Greg Kroah-Hartman
@ 2016-05-17  1:21 ` Greg Kroah-Hartman
  2016-05-17  1:21 ` [PATCH 4.5 066/101] ARM: dts: at91: sam9x5: Fix the memory range assigned to the PMC Greg Kroah-Hartman
                   ` (34 subsequent siblings)
  92 siblings, 0 replies; 94+ messages in thread
From: Greg Kroah-Hartman @ 2016-05-17  1:21 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Miklos Szeredi

4.5-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Miklos Szeredi <mszeredi@redhat.com>

commit 9409e22acdfc9153f88d9b1ed2bd2a5b34d2d3ca upstream.

If a file is renamed to a hardlink of itself POSIX specifies that rename(2)
should do nothing and return success.

This condition is checked in vfs_rename().  However it won't detect hard
links on overlayfs where these are given separate inodes on the overlayfs
layer.

Overlayfs itself detects this condition and returns success without doing
anything, but then vfs_rename() will proceed as if this was a successful
rename (detach_mounts(), d_move()).

The correct thing to do is to detect this condition before even calling
into overlayfs.  This patch does this by calling vfs_select_inode() to get
the underlying inodes.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 fs/namei.c |    6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

--- a/fs/namei.c
+++ b/fs/namei.c
@@ -4258,7 +4258,11 @@ int vfs_rename(struct inode *old_dir, st
 	bool new_is_dir = false;
 	unsigned max_links = new_dir->i_sb->s_max_links;
 
-	if (source == target)
+	/*
+	 * Check source == target.
+	 * On overlayfs need to look at underlying inodes.
+	 */
+	if (vfs_select_inode(old_dentry, 0) == vfs_select_inode(new_dentry, 0))
 		return 0;
 
 	error = may_delete(old_dir, old_dentry, is_dir);

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH 4.5 066/101] ARM: dts: at91: sam9x5: Fix the memory range assigned to the PMC
  2016-05-17  1:20 [PATCH 4.5 000/101] 4.5.5-stable review Greg Kroah-Hartman
                   ` (57 preceding siblings ...)
  2016-05-17  1:21 ` [PATCH 4.5 065/101] vfs: rename: check backing inode being equal Greg Kroah-Hartman
@ 2016-05-17  1:21 ` Greg Kroah-Hartman
  2016-05-17  1:21 ` [PATCH 4.5 068/101] regulator: s2mps11: Fix invalid selector mask and voltages for buck9 Greg Kroah-Hartman
                   ` (33 subsequent siblings)
  92 siblings, 0 replies; 94+ messages in thread
From: Greg Kroah-Hartman @ 2016-05-17  1:21 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Boris Brezillon, Richard Genoud,
	Nicolas Ferre

4.5-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Boris Brezillon <boris.brezillon@free-electrons.com>

commit aab0a4c83ceb344d2327194bf354820e50607af6 upstream.

The memory range assigned to the PMC (Power Management Controller) was
not including the PMC_PCR register which are used to control peripheral
clocks.

This was working fine thanks to the page granularity of ioremap(), but
started to fail when we switched to syscon/regmap, because regmap is
making sure that all accesses are falling into the reserved range.

Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Reported-by: Richard Genoud <richard.genoud@gmail.com>
Tested-by: Richard Genoud <richard.genoud@gmail.com>
Fixes: 863a81c3be1d ("clk: at91: make use of syscon to share PMC registers in several drivers")
Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/arm/boot/dts/at91sam9x5.dtsi |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/arch/arm/boot/dts/at91sam9x5.dtsi
+++ b/arch/arm/boot/dts/at91sam9x5.dtsi
@@ -106,7 +106,7 @@
 
 			pmc: pmc@fffffc00 {
 				compatible = "atmel,at91sam9x5-pmc", "syscon";
-				reg = <0xfffffc00 0x100>;
+				reg = <0xfffffc00 0x200>;
 				interrupts = <1 IRQ_TYPE_LEVEL_HIGH 7>;
 				interrupt-controller;
 				#address-cells = <1>;

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH 4.5 068/101] regulator: s2mps11: Fix invalid selector mask and voltages for buck9
  2016-05-17  1:20 [PATCH 4.5 000/101] 4.5.5-stable review Greg Kroah-Hartman
                   ` (58 preceding siblings ...)
  2016-05-17  1:21 ` [PATCH 4.5 066/101] ARM: dts: at91: sam9x5: Fix the memory range assigned to the PMC Greg Kroah-Hartman
@ 2016-05-17  1:21 ` Greg Kroah-Hartman
  2016-05-17  1:21 ` [PATCH 4.5 069/101] regulator: axp20x: Fix axp22x ldo_io voltage ranges Greg Kroah-Hartman
                   ` (32 subsequent siblings)
  92 siblings, 0 replies; 94+ messages in thread
From: Greg Kroah-Hartman @ 2016-05-17  1:21 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Krzysztof Kozlowski,
	Javier Martinez Canillas, Mark Brown

4.5-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Krzysztof Kozlowski <k.kozlowski@samsung.com>

commit 3b672623079bb3e5685b8549e514f2dfaa564406 upstream.

The buck9 regulator of S2MPS11 PMIC had incorrect vsel_mask (0xff
instead of 0x1f) thus reading entire register as buck9's voltage. This
effectively caused regulator core to interpret values as higher voltages
than they were and then to set real voltage much lower than intended.

The buck9 provides power to other regulators, including LDO13
and LDO19 which supply the MMC2 (SD card). On Odroid XU3/XU4 the lower
voltage caused SD card detection errors on Odroid XU3/XU4:
	mmc1: card never left busy state
	mmc1: error -110 whilst initialising SD card

During driver probe the regulator core was checking whether initial
voltage matches the constraints. With incorrect vsel_mask of 0xff and
default value of 0x50, the core interpreted this as 5 V which is outside
of constraints (3-3.775 V). Then the regulator core was adjusting the
voltage to match the constraints. With incorrect vsel_mask this new
voltage mapped to a vere low voltage in the driver.

Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Reviewed-by: Javier Martinez Canillas <javier@osg.samsung.com>
Tested-by: Javier Martinez Canillas <javier@osg.samsung.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/regulator/s2mps11.c         |   28 ++++++++++++++++++++++------
 include/linux/mfd/samsung/s2mps11.h |    2 ++
 2 files changed, 24 insertions(+), 6 deletions(-)

--- a/drivers/regulator/s2mps11.c
+++ b/drivers/regulator/s2mps11.c
@@ -306,7 +306,7 @@ static struct regulator_ops s2mps11_buck
 	.enable_mask	= S2MPS11_ENABLE_MASK			\
 }
 
-#define regulator_desc_s2mps11_buck6_10(num, min, step) {	\
+#define regulator_desc_s2mps11_buck67810(num, min, step) {	\
 	.name		= "BUCK"#num,				\
 	.id		= S2MPS11_BUCK##num,			\
 	.ops		= &s2mps11_buck_ops,			\
@@ -322,6 +322,22 @@ static struct regulator_ops s2mps11_buck
 	.enable_mask	= S2MPS11_ENABLE_MASK			\
 }
 
+#define regulator_desc_s2mps11_buck9 {				\
+	.name		= "BUCK9",				\
+	.id		= S2MPS11_BUCK9,			\
+	.ops		= &s2mps11_buck_ops,			\
+	.type		= REGULATOR_VOLTAGE,			\
+	.owner		= THIS_MODULE,				\
+	.min_uV		= MIN_3000_MV,				\
+	.uV_step	= STEP_25_MV,				\
+	.n_voltages	= S2MPS11_BUCK9_N_VOLTAGES,		\
+	.ramp_delay	= S2MPS11_RAMP_DELAY,			\
+	.vsel_reg	= S2MPS11_REG_B9CTRL2,			\
+	.vsel_mask	= S2MPS11_BUCK9_VSEL_MASK,		\
+	.enable_reg	= S2MPS11_REG_B9CTRL1,			\
+	.enable_mask	= S2MPS11_ENABLE_MASK			\
+}
+
 static const struct regulator_desc s2mps11_regulators[] = {
 	regulator_desc_s2mps11_ldo(1, STEP_25_MV),
 	regulator_desc_s2mps11_ldo(2, STEP_50_MV),
@@ -366,11 +382,11 @@ static const struct regulator_desc s2mps
 	regulator_desc_s2mps11_buck1_4(3),
 	regulator_desc_s2mps11_buck1_4(4),
 	regulator_desc_s2mps11_buck5,
-	regulator_desc_s2mps11_buck6_10(6, MIN_600_MV, STEP_6_25_MV),
-	regulator_desc_s2mps11_buck6_10(7, MIN_600_MV, STEP_6_25_MV),
-	regulator_desc_s2mps11_buck6_10(8, MIN_600_MV, STEP_6_25_MV),
-	regulator_desc_s2mps11_buck6_10(9, MIN_3000_MV, STEP_25_MV),
-	regulator_desc_s2mps11_buck6_10(10, MIN_750_MV, STEP_12_5_MV),
+	regulator_desc_s2mps11_buck67810(6, MIN_600_MV, STEP_6_25_MV),
+	regulator_desc_s2mps11_buck67810(7, MIN_600_MV, STEP_6_25_MV),
+	regulator_desc_s2mps11_buck67810(8, MIN_600_MV, STEP_6_25_MV),
+	regulator_desc_s2mps11_buck9,
+	regulator_desc_s2mps11_buck67810(10, MIN_750_MV, STEP_12_5_MV),
 };
 
 static struct regulator_ops s2mps14_reg_ops;
--- a/include/linux/mfd/samsung/s2mps11.h
+++ b/include/linux/mfd/samsung/s2mps11.h
@@ -173,10 +173,12 @@ enum s2mps11_regulators {
 
 #define S2MPS11_LDO_VSEL_MASK	0x3F
 #define S2MPS11_BUCK_VSEL_MASK	0xFF
+#define S2MPS11_BUCK9_VSEL_MASK	0x1F
 #define S2MPS11_ENABLE_MASK	(0x03 << S2MPS11_ENABLE_SHIFT)
 #define S2MPS11_ENABLE_SHIFT	0x06
 #define S2MPS11_LDO_N_VOLTAGES	(S2MPS11_LDO_VSEL_MASK + 1)
 #define S2MPS11_BUCK_N_VOLTAGES (S2MPS11_BUCK_VSEL_MASK + 1)
+#define S2MPS11_BUCK9_N_VOLTAGES (S2MPS11_BUCK9_VSEL_MASK + 1)
 #define S2MPS11_RAMP_DELAY	25000		/* uV/us */
 
 #define S2MPS11_CTRL1_PWRHOLD_MASK	BIT(4)

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH 4.5 069/101] regulator: axp20x: Fix axp22x ldo_io voltage ranges
  2016-05-17  1:20 [PATCH 4.5 000/101] 4.5.5-stable review Greg Kroah-Hartman
                   ` (59 preceding siblings ...)
  2016-05-17  1:21 ` [PATCH 4.5 068/101] regulator: s2mps11: Fix invalid selector mask and voltages for buck9 Greg Kroah-Hartman
@ 2016-05-17  1:21 ` Greg Kroah-Hartman
  2016-05-17  1:21 ` [PATCH 4.5 070/101] atomic_open(): fix the handling of create_error Greg Kroah-Hartman
                   ` (31 subsequent siblings)
  92 siblings, 0 replies; 94+ messages in thread
From: Greg Kroah-Hartman @ 2016-05-17  1:21 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Hans de Goede, Chen-Yu Tsai, Mark Brown

4.5-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Hans de Goede <hdegoede@redhat.com>

commit a2262e5a12e05389ab4c7fc5cf60016b041dd8dc upstream.

The minium voltage of 1800mV is a copy and paste error from the axp20x
regulator info. The correct minimum voltage for the ldo_io regulators
on the axp22x is 700mV.

Fixes: 1b82b4e4f954 ("regulator: axp20x: Add support for AXP22X regulators")
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Acked-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/regulator/axp20x-regulator.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/drivers/regulator/axp20x-regulator.c
+++ b/drivers/regulator/axp20x-regulator.c
@@ -221,10 +221,10 @@ static const struct regulator_desc axp22
 		 AXP22X_ELDO2_V_OUT, 0x1f, AXP22X_PWR_OUT_CTRL2, BIT(1)),
 	AXP_DESC(AXP22X, ELDO3, "eldo3", "eldoin", 700, 3300, 100,
 		 AXP22X_ELDO3_V_OUT, 0x1f, AXP22X_PWR_OUT_CTRL2, BIT(2)),
-	AXP_DESC_IO(AXP22X, LDO_IO0, "ldo_io0", "ips", 1800, 3300, 100,
+	AXP_DESC_IO(AXP22X, LDO_IO0, "ldo_io0", "ips", 700, 3300, 100,
 		    AXP22X_LDO_IO0_V_OUT, 0x1f, AXP20X_GPIO0_CTRL, 0x07,
 		    AXP22X_IO_ENABLED, AXP22X_IO_DISABLED),
-	AXP_DESC_IO(AXP22X, LDO_IO1, "ldo_io1", "ips", 1800, 3300, 100,
+	AXP_DESC_IO(AXP22X, LDO_IO1, "ldo_io1", "ips", 700, 3300, 100,
 		    AXP22X_LDO_IO1_V_OUT, 0x1f, AXP20X_GPIO1_CTRL, 0x07,
 		    AXP22X_IO_ENABLED, AXP22X_IO_DISABLED),
 	AXP_DESC_FIXED(AXP22X, RTC_LDO, "rtc_ldo", "ips", 3000),

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH 4.5 070/101] atomic_open(): fix the handling of create_error
  2016-05-17  1:20 [PATCH 4.5 000/101] 4.5.5-stable review Greg Kroah-Hartman
                   ` (60 preceding siblings ...)
  2016-05-17  1:21 ` [PATCH 4.5 069/101] regulator: axp20x: Fix axp22x ldo_io voltage ranges Greg Kroah-Hartman
@ 2016-05-17  1:21 ` Greg Kroah-Hartman
  2016-05-17  1:21 ` [PATCH 4.5 071/101] qla1280: Dont allocate 512kb of host tags Greg Kroah-Hartman
                   ` (30 subsequent siblings)
  92 siblings, 0 replies; 94+ messages in thread
From: Greg Kroah-Hartman @ 2016-05-17  1:21 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Miklos Szeredi, Al Viro

4.5-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Al Viro <viro@zeniv.linux.org.uk>

commit 10c64cea04d3c75c306b3f990586ffb343b63287 upstream.

* if we have a hashed negative dentry and either CREAT|EXCL on
r/o filesystem, or CREAT|TRUNC on r/o filesystem, or CREAT|EXCL
with failing may_o_create(), we should fail with EROFS or the
error may_o_create() has returned, but not ENOENT.  Which is what
the current code ends up returning.

* if we have CREAT|TRUNC hitting a regular file on a read-only
filesystem, we can't fail with EROFS here.  At the very least,
not until we'd done follow_managed() - we might have a writable
file (or a device, for that matter) bound on top of that one.
Moreover, the code downstream will see that O_TRUNC and attempt
to grab the write access (*after* following possible mount), so
if we really should fail with EROFS, it will happen.  No need
to do that inside atomic_open().

The real logics is much simpler than what the current code is
trying to do - if we decided to go for simple lookup, ended
up with a negative dentry *and* had create_error set, fail with
create_error.  No matter whether we'd got that negative dentry
from lookup_real() or had found it in dcache.

Acked-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 fs/namei.c |   20 ++++----------------
 1 file changed, 4 insertions(+), 16 deletions(-)

--- a/fs/namei.c
+++ b/fs/namei.c
@@ -2968,22 +2968,10 @@ no_open:
 		dentry = lookup_real(dir, dentry, nd->flags);
 		if (IS_ERR(dentry))
 			return PTR_ERR(dentry);
-
-		if (create_error) {
-			int open_flag = op->open_flag;
-
-			error = create_error;
-			if ((open_flag & O_EXCL)) {
-				if (!dentry->d_inode)
-					goto out;
-			} else if (!dentry->d_inode) {
-				goto out;
-			} else if ((open_flag & O_TRUNC) &&
-				   d_is_reg(dentry)) {
-				goto out;
-			}
-			/* will fail later, go on to get the right error */
-		}
+	}
+	if (create_error && !dentry->d_inode) {
+		error = create_error;
+		goto out;
 	}
 looked_up:
 	path->dentry = dentry;

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH 4.5 071/101] qla1280: Dont allocate 512kb of host tags
  2016-05-17  1:20 [PATCH 4.5 000/101] 4.5.5-stable review Greg Kroah-Hartman
                   ` (61 preceding siblings ...)
  2016-05-17  1:21 ` [PATCH 4.5 070/101] atomic_open(): fix the handling of create_error Greg Kroah-Hartman
@ 2016-05-17  1:21 ` Greg Kroah-Hartman
  2016-05-17  1:21 ` [PATCH 4.5 072/101] tools lib traceevent: Do not reassign parg after collapse_tree() Greg Kroah-Hartman
                   ` (29 subsequent siblings)
  92 siblings, 0 replies; 94+ messages in thread
From: Greg Kroah-Hartman @ 2016-05-17  1:21 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Johannes Thumshirn, Laura Abbott,
	Michael Reed, Laurence Oberman, Lee Duncan, Martin K. Petersen,
	James Bottomley

4.5-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Johannes Thumshirn <jthumshirn@suse.de>

commit 2bcbc81421c511ef117cadcf0bee9c4340e68db0 upstream.

The qla1280 driver sets the scsi_host_template's can_queue field to 0xfffff
which results in an allocation failure when allocating the block layer tags
for the driver's queues. This was introduced with the change for host wide
tags in commit 64d513ac31b - "scsi: use host wide tags by default".

Reduce can_queue to MAX_OUTSTANDING_COMMANDS (512) to solve the allocation
error.

Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Fixes: 64d513ac31b - "scsi: use host wide tags by default"
Cc: Laura Abbott <labbott@redhat.com>
Cc: Michael Reed <mdr@sgi.com>
Reviewed-by: Laurence Oberman <loberman@redhat.com>
Reviewed-by: Lee Duncan <lduncan@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: James Bottomley <jejb@linux.vnet.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/scsi/qla1280.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/scsi/qla1280.c
+++ b/drivers/scsi/qla1280.c
@@ -4214,7 +4214,7 @@ static struct scsi_host_template qla1280
 	.eh_bus_reset_handler	= qla1280_eh_bus_reset,
 	.eh_host_reset_handler	= qla1280_eh_adapter_reset,
 	.bios_param		= qla1280_biosparam,
-	.can_queue		= 0xfffff,
+	.can_queue		= MAX_OUTSTANDING_COMMANDS,
 	.this_id		= -1,
 	.sg_tablesize		= SG_ALL,
 	.use_clustering		= ENABLE_CLUSTERING,

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH 4.5 072/101] tools lib traceevent: Do not reassign parg after collapse_tree()
  2016-05-17  1:20 [PATCH 4.5 000/101] 4.5.5-stable review Greg Kroah-Hartman
                   ` (62 preceding siblings ...)
  2016-05-17  1:21 ` [PATCH 4.5 071/101] qla1280: Dont allocate 512kb of host tags Greg Kroah-Hartman
@ 2016-05-17  1:21 ` Greg Kroah-Hartman
  2016-05-17  1:21 ` [PATCH 4.5 073/101] get_rock_ridge_filename(): handle malformed NM entries Greg Kroah-Hartman
                   ` (28 subsequent siblings)
  92 siblings, 0 replies; 94+ messages in thread
From: Greg Kroah-Hartman @ 2016-05-17  1:21 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Steven Rostedt, Namhyung Kim,
	Arnaldo Carvalho de Melo

4.5-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Steven Rostedt <rostedt@goodmis.org>

commit 106b816cb46ebd87408b4ed99a2e16203114daa6 upstream.

At the end of process_filter(), collapse_tree() was changed to update
the parg parameter, but the reassignment after the call wasn't removed.

What happens is that the "current_op" gets modified and freed and parg
is assigned to the new allocated argument. But after the call to
collapse_tree(), parg is assigned again to the just freed "current_op",
and this causes the tool to crash.

The current_op variable must also be assigned to NULL in case of error,
otherwise it will cause it to be free()ed twice.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Fixes: 42d6194d133c ("tools lib traceevent: Refactor process_filter()")
Link: http://lkml.kernel.org/r/20160511150936.678c18a1@gandalf.local.home
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 tools/lib/traceevent/parse-filter.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/tools/lib/traceevent/parse-filter.c
+++ b/tools/lib/traceevent/parse-filter.c
@@ -1164,11 +1164,11 @@ process_filter(struct event_format *even
 		current_op = current_exp;
 
 	ret = collapse_tree(current_op, parg, error_str);
+	/* collapse_tree() may free current_op, and updates parg accordingly */
+	current_op = NULL;
 	if (ret < 0)
 		goto fail;
 
-	*parg = current_op;
-
 	free(token);
 	return 0;
 

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH 4.5 073/101] get_rock_ridge_filename(): handle malformed NM entries
  2016-05-17  1:20 [PATCH 4.5 000/101] 4.5.5-stable review Greg Kroah-Hartman
                   ` (63 preceding siblings ...)
  2016-05-17  1:21 ` [PATCH 4.5 072/101] tools lib traceevent: Do not reassign parg after collapse_tree() Greg Kroah-Hartman
@ 2016-05-17  1:21 ` Greg Kroah-Hartman
  2016-05-17  1:21 ` [PATCH 4.5 074/101] Input: max8997-haptic - fix NULL pointer dereference Greg Kroah-Hartman
                   ` (27 subsequent siblings)
  92 siblings, 0 replies; 94+ messages in thread
From: Greg Kroah-Hartman @ 2016-05-17  1:21 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Al Viro

4.5-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Al Viro <viro@zeniv.linux.org.uk>

commit 99d825822eade8d827a1817357cbf3f889a552d6 upstream.

Payloads of NM entries are not supposed to contain NUL.  When we run
into such, only the part prior to the first NUL goes into the
concatenation (i.e. the directory entry name being encoded by a bunch
of NM entries).  We do stop when the amount collected so far + the
claimed amount in the current NM entry exceed 254.  So far, so good,
but what we return as the total length is the sum of *claimed*
sizes, not the actual amount collected.  And that can grow pretty
large - not unlimited, since you'd need to put CE entries in
between to be able to get more than the maximum that could be
contained in one isofs directory entry / continuation chunk and
we are stop once we'd encountered 32 CEs, but you can get about 8Kb
easily.  And that's what will be passed to readdir callback as the
name length.  8Kb __copy_to_user() from a buffer allocated by
__get_free_page()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 fs/isofs/rock.c |   13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

--- a/fs/isofs/rock.c
+++ b/fs/isofs/rock.c
@@ -203,6 +203,8 @@ int get_rock_ridge_filename(struct iso_d
 	int retnamlen = 0;
 	int truncate = 0;
 	int ret = 0;
+	char *p;
+	int len;
 
 	if (!ISOFS_SB(inode->i_sb)->s_rock)
 		return 0;
@@ -267,12 +269,17 @@ repeat:
 					rr->u.NM.flags);
 				break;
 			}
-			if ((strlen(retname) + rr->len - 5) >= 254) {
+			len = rr->len - 5;
+			if (retnamlen + len >= 254) {
 				truncate = 1;
 				break;
 			}
-			strncat(retname, rr->u.NM.name, rr->len - 5);
-			retnamlen += rr->len - 5;
+			p = memchr(rr->u.NM.name, '\0', len);
+			if (unlikely(p))
+				len = p - rr->u.NM.name;
+			memcpy(retname + retnamlen, rr->u.NM.name, len);
+			retnamlen += len;
+			retname[retnamlen] = '\0';
 			break;
 		case SIG('R', 'E'):
 			kfree(rs.buffer);

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH 4.5 074/101] Input: max8997-haptic - fix NULL pointer dereference
  2016-05-17  1:20 [PATCH 4.5 000/101] 4.5.5-stable review Greg Kroah-Hartman
                   ` (64 preceding siblings ...)
  2016-05-17  1:21 ` [PATCH 4.5 073/101] get_rock_ridge_filename(): handle malformed NM entries Greg Kroah-Hartman
@ 2016-05-17  1:21 ` Greg Kroah-Hartman
  2016-05-17  1:21 ` [PATCH 4.5 075/101] Revert "[media] videobuf2-v4l2: Verify planes array in buffer dequeueing" Greg Kroah-Hartman
                   ` (26 subsequent siblings)
  92 siblings, 0 replies; 94+ messages in thread
From: Greg Kroah-Hartman @ 2016-05-17  1:21 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Marek Szyprowski,
	Krzysztof Kozlowski, Dmitry Torokhov

4.5-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Marek Szyprowski <m.szyprowski@samsung.com>

commit 6ae645d5fa385f3787bf1723639cd907fe5865e7 upstream.

NULL pointer derefence happens when booting with DTB because the
platform data for haptic device is not set in supplied data from parent
MFD device.

The MFD device creates only platform data (from Device Tree) for itself,
not for haptic child.

Unable to handle kernel NULL pointer dereference at virtual address 0000009c
pgd = c0004000
	[0000009c] *pgd=00000000
	Internal error: Oops: 5 [#1] PREEMPT SMP ARM
	(max8997_haptic_probe) from [<c03f9cec>] (platform_drv_probe+0x4c/0xb0)
	(platform_drv_probe) from [<c03f8440>] (driver_probe_device+0x214/0x2c0)
	(driver_probe_device) from [<c03f8598>] (__driver_attach+0xac/0xb0)
	(__driver_attach) from [<c03f67ac>] (bus_for_each_dev+0x68/0x9c)
	(bus_for_each_dev) from [<c03f7a38>] (bus_add_driver+0x1a0/0x218)
	(bus_add_driver) from [<c03f8db0>] (driver_register+0x78/0xf8)
	(driver_register) from [<c0101774>] (do_one_initcall+0x90/0x1d8)
	(do_one_initcall) from [<c0a00dbc>] (kernel_init_freeable+0x15c/0x1fc)
	(kernel_init_freeable) from [<c06bb5b4>] (kernel_init+0x8/0x114)
	(kernel_init) from [<c0107938>] (ret_from_fork+0x14/0x3c)

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Fixes: 104594b01ce7 ("Input: add driver support for MAX8997-haptic")
[k.kozlowski: Write commit message, add CC-stable]
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/input/misc/max8997_haptic.c |    6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

--- a/drivers/input/misc/max8997_haptic.c
+++ b/drivers/input/misc/max8997_haptic.c
@@ -255,12 +255,14 @@ static int max8997_haptic_probe(struct p
 	struct max8997_dev *iodev = dev_get_drvdata(pdev->dev.parent);
 	const struct max8997_platform_data *pdata =
 					dev_get_platdata(iodev->dev);
-	const struct max8997_haptic_platform_data *haptic_pdata =
-					pdata->haptic_pdata;
+	const struct max8997_haptic_platform_data *haptic_pdata = NULL;
 	struct max8997_haptic *chip;
 	struct input_dev *input_dev;
 	int error;
 
+	if (pdata)
+		haptic_pdata = pdata->haptic_pdata;
+
 	if (!haptic_pdata) {
 		dev_err(&pdev->dev, "no haptic platform data\n");
 		return -EINVAL;

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH 4.5 075/101] Revert "[media] videobuf2-v4l2: Verify planes array in buffer dequeueing"
  2016-05-17  1:20 [PATCH 4.5 000/101] 4.5.5-stable review Greg Kroah-Hartman
                   ` (65 preceding siblings ...)
  2016-05-17  1:21 ` [PATCH 4.5 074/101] Input: max8997-haptic - fix NULL pointer dereference Greg Kroah-Hartman
@ 2016-05-17  1:21 ` Greg Kroah-Hartman
  2016-05-17  1:21 ` [PATCH 4.5 077/101] drm/radeon: fix PLL sharing on DCE6.1 (v2) Greg Kroah-Hartman
                   ` (25 subsequent siblings)
  92 siblings, 0 replies; 94+ messages in thread
From: Greg Kroah-Hartman @ 2016-05-17  1:21 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Sakari Ailus, Hans Verkuil,
	Mauro Carvalho Chehab

4.5-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Mauro Carvalho Chehab <mchehab@osg.samsung.com>

commit 93f0750dcdaed083d6209b01e952e98ca730db66 upstream.

This patch causes a Kernel panic when called on a DVB driver.

This was also reported by David R <david@unsolicited.net>:

May  7 14:47:35 server kernel: [  501.247123] BUG: unable to handle kernel NULL pointer dereference at 0000000000000004
May  7 14:47:35 server kernel: [  501.247239] IP: [<ffffffffa0222c71>] __verify_planes_array.isra.3+0x1/0x80 [videobuf2_v4l2]
May  7 14:47:35 server kernel: [  501.247354] PGD cae6f067 PUD ca99c067 PMD 0
May  7 14:47:35 server kernel: [  501.247426] Oops: 0000 [#1] SMP
May  7 14:47:35 server kernel: [  501.247482] Modules linked in: xfs tun
xt_connmark xt_TCPMSS xt_tcpmss xt_owner xt_REDIRECT nf_nat_redirect
xt_nat ipt_MASQUERADE nf_nat_masquerade_ipv4 ts_kmp ts_bm xt_string
ipt_REJECT nf_reject_ipv4 xt_recent xt_conntrack xt_multiport xt_pkttype
xt_tcpudp xt_mark nf_log_ipv4 nf_log_common xt_LOG xt_limit
iptable_mangle iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4
nf_nat nf_conntrack iptable_filter ip_tables ip6table_filter ip6_tables
x_tables pppoe pppox dm_crypt ts2020 regmap_i2c ds3000 cx88_dvb dvb_pll
cx88_vp3054_i2c mt352 videobuf2_dvb cx8800 cx8802 cx88xx pl2303 tveeprom
videobuf2_dma_sg ppdev videobuf2_memops videobuf2_v4l2 videobuf2_core
dvb_usb_digitv snd_hda_codec_via snd_hda_codec_hdmi
snd_hda_codec_generic radeon dvb_usb snd_hda_intel amd64_edac_mod
serio_raw snd_hda_codec edac_core fbcon k10temp bitblit softcursor
snd_hda_core font snd_pcm_oss i2c_piix4 snd_mixer_oss tileblit
drm_kms_helper syscopyarea snd_pcm snd_seq_dummy sysfillrect snd_seq_oss
sysimgblt fb_sys_fops ttm snd_seq_midi r8169 snd_rawmidi drm
snd_seq_midi_event e1000e snd_seq snd_seq_device snd_timer snd ptp
pps_core i2c_algo_bit soundcore parport_pc ohci_pci shpchp tpm_tis tpm
nfsd auth_rpcgss oid_registry hwmon_vid exportfs nfs_acl mii nfs bonding
lockd grace lp sunrpc parport May  7 14:47:35 server kernel: [
501.249564] CPU: 1 PID: 6889 Comm: vb2-cx88[0] Not tainted 4.5.3 #3
May  7 14:47:35 server kernel: [  501.249644] Hardware name: System manufacturer System Product Name/M4A785TD-V EVO, BIOS 0211    07/08/2009
May  7 14:47:35 server kernel: [  501.249767] task: ffff8800aebf3600 ti: ffff8801e07a0000 task.ti: ffff8801e07a0000
May  7 14:47:35 server kernel: [  501.249861] RIP: 0010:[<ffffffffa0222c71>]  [<ffffffffa0222c71>] __verify_planes_array.isra.3+0x1/0x80 [videobuf2_v4l2]
May  7 14:47:35 server kernel: [  501.250002] RSP: 0018:ffff8801e07a3de8  EFLAGS: 00010086
May  7 14:47:35 server kernel: [  501.250071] RAX: 0000000000000283 RBX: ffff880210dc5000 RCX: 0000000000000283
May  7 14:47:35 server kernel: [  501.250161] RDX: ffffffffa0222cf0 RSI: 0000000000000000 RDI: ffff880210dc5014
May  7 14:47:35 server kernel: [  501.250251] RBP: ffff8801e07a3df8 R08: ffff8801e07a0000 R09: 0000000000000000
May  7 14:47:35 server kernel: [  501.250348] R10: 0000000000000000 R11: 0000000000000001 R12: ffff8800cda2a9d8
May  7 14:47:35 server kernel: [  501.250438] R13: ffff880210dc51b8 R14: 0000000000000000 R15: ffff8800cda2a828
May  7 14:47:35 server kernel: [  501.250528] FS:  00007f5b77fff700(0000) GS:ffff88021fc40000(0000) knlGS:00000000adaffb40
May  7 14:47:35 server kernel: [  501.250631] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
May  7 14:47:35 server kernel: [  501.250704] CR2: 0000000000000004 CR3: 00000000ca19d000 CR4: 00000000000006e0
May  7 14:47:35 server kernel: [  501.250794] Stack:
May  7 14:47:35 server kernel: [  501.250822]  ffff8801e07a3df8 ffffffffa0222cfd ffff8801e07a3e70 ffffffffa0236beb
May  7 14:47:35 server kernel: [  501.250937]  0000000000000283 ffff8801e07a3e94 0000000000000000 0000000000000000
May  7 14:47:35 server kernel: [  501.251051]  ffff8800aebf3600 ffffffff8108d8e0 ffff8801e07a3e38 ffff8801e07a3e38
May  7 14:47:35 server kernel: [  501.251165] Call Trace:
May  7 14:47:35 server kernel: [  501.251200]  [<ffffffffa0222cfd>] ? __verify_planes_array_core+0xd/0x10 [videobuf2_v4l2]
May  7 14:47:35 server kernel: [  501.251306]  [<ffffffffa0236beb>] vb2_core_dqbuf+0x2eb/0x4c0 [videobuf2_core]
May  7 14:47:35 server kernel: [  501.251398]  [<ffffffff8108d8e0>] ? prepare_to_wait_event+0x100/0x100
May  7 14:47:35 server kernel: [  501.251482]  [<ffffffffa023855b>] vb2_thread+0x1cb/0x220 [videobuf2_core]
May  7 14:47:35 server kernel: [  501.251569]  [<ffffffffa0238390>] ? vb2_core_qbuf+0x230/0x230 [videobuf2_core]
May  7 14:47:35 server kernel: [  501.251662]  [<ffffffffa0238390>] ? vb2_core_qbuf+0x230/0x230 [videobuf2_core]
May  7 14:47:35 server kernel: [  501.255982]  [<ffffffff8106f984>] kthread+0xc4/0xe0
May  7 14:47:35 server kernel: [  501.260292]  [<ffffffff8106f8c0>] ? kthread_park+0x50/0x50
May  7 14:47:35 server kernel: [  501.264615]  [<ffffffff81697a5f>] ret_from_fork+0x3f/0x70
May  7 14:47:35 server kernel: [  501.268962]  [<ffffffff8106f8c0>] ? kthread_park+0x50/0x50
May  7 14:47:35 server kernel: [  501.273216] Code: 0d 01 74 16 48 8b 46 28 48 8b 56 30 48 89 87 d0 01 00 00 48 89 97 d8 01 00 00 5d c3 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 55 <8b> 46 04 48 89 e5 8d 50 f7 31 c0 83 fa 01 76 02 5d c3 48 83 7e
May  7 14:47:35 server kernel: [  501.282146] RIP  [<ffffffffa0222c71>] __verify_planes_array.isra.3+0x1/0x80 [videobuf2_v4l2]
May  7 14:47:35 server kernel: [  501.286391]  RSP <ffff8801e07a3de8>
May  7 14:47:35 server kernel: [  501.290619] CR2: 0000000000000004
May  7 14:47:35 server kernel: [  501.294786] ---[ end trace b2b354153ccad110 ]---

This reverts commit 2c1f6951a8a82e6de0d82b1158b5e493fc6c54ab.

Cc: Sakari Ailus <sakari.ailus@linux.intel.com>
Cc: Hans Verkuil <hans.verkuil@cisco.com>
Fixes: 2c1f6951a8a8 ("[media] videobuf2-v4l2: Verify planes array in buffer dequeueing")
Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/media/v4l2-core/videobuf2-v4l2.c |    6 ------
 1 file changed, 6 deletions(-)

--- a/drivers/media/v4l2-core/videobuf2-v4l2.c
+++ b/drivers/media/v4l2-core/videobuf2-v4l2.c
@@ -74,11 +74,6 @@ static int __verify_planes_array(struct
 	return 0;
 }
 
-static int __verify_planes_array_core(struct vb2_buffer *vb, const void *pb)
-{
-	return __verify_planes_array(vb, pb);
-}
-
 /**
  * __verify_length() - Verify that the bytesused value for each plane fits in
  * the plane length and that the data offset doesn't exceed the bytesused value.
@@ -442,7 +437,6 @@ static int __fill_vb2_buffer(struct vb2_
 }
 
 static const struct vb2_buf_ops v4l2_buf_ops = {
-	.verify_planes_array	= __verify_planes_array_core,
 	.fill_user_buffer	= __fill_v4l2_buffer,
 	.fill_vb2_buffer	= __fill_vb2_buffer,
 	.copy_timestamp		= __copy_timestamp,

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH 4.5 077/101] drm/radeon: fix PLL sharing on DCE6.1 (v2)
  2016-05-17  1:20 [PATCH 4.5 000/101] 4.5.5-stable review Greg Kroah-Hartman
                   ` (66 preceding siblings ...)
  2016-05-17  1:21 ` [PATCH 4.5 075/101] Revert "[media] videobuf2-v4l2: Verify planes array in buffer dequeueing" Greg Kroah-Hartman
@ 2016-05-17  1:21 ` Greg Kroah-Hartman
  2016-05-17  1:21 ` [PATCH 4.5 078/101] drm/i915: Bail out of pipe config compute loop on LPT Greg Kroah-Hartman
                   ` (24 subsequent siblings)
  92 siblings, 0 replies; 94+ messages in thread
From: Greg Kroah-Hartman @ 2016-05-17  1:21 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Lucas Stach, Alex Deucher

4.5-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Lucas Stach <dev@lynxeye.de>

commit e3c00d87845ab375f90fa6e10a5e72a3a5778cd3 upstream.

On DCE6.1 PPLL2 is exclusively available to UNIPHYA, so it should not
be taken into consideration when looking for an already enabled PLL
to be shared with other outputs.

This fixes the broken VGA port (TRAVIS DP->VGA bridge) on my Richland
based laptop, where the internal display is connected to UNIPHYA through
a TRAVIS DP->LVDS bridge.

Bug:
https://bugs.freedesktop.org/show_bug.cgi?id=78987

v2: agd: add check in radeon_get_shared_nondp_ppll as well, drop
    extra parameter.

Signed-off-by: Lucas Stach <dev@lynxeye.de>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/gpu/drm/radeon/atombios_crtc.c |   10 ++++++++++
 1 file changed, 10 insertions(+)

--- a/drivers/gpu/drm/radeon/atombios_crtc.c
+++ b/drivers/gpu/drm/radeon/atombios_crtc.c
@@ -1740,6 +1740,7 @@ static u32 radeon_get_pll_use_mask(struc
 static int radeon_get_shared_dp_ppll(struct drm_crtc *crtc)
 {
 	struct drm_device *dev = crtc->dev;
+	struct radeon_device *rdev = dev->dev_private;
 	struct drm_crtc *test_crtc;
 	struct radeon_crtc *test_radeon_crtc;
 
@@ -1749,6 +1750,10 @@ static int radeon_get_shared_dp_ppll(str
 		test_radeon_crtc = to_radeon_crtc(test_crtc);
 		if (test_radeon_crtc->encoder &&
 		    ENCODER_MODE_IS_DP(atombios_get_encoder_mode(test_radeon_crtc->encoder))) {
+			/* PPLL2 is exclusive to UNIPHYA on DCE61 */
+			if (ASIC_IS_DCE61(rdev) && !ASIC_IS_DCE8(rdev) &&
+			    test_radeon_crtc->pll_id == ATOM_PPLL2)
+				continue;
 			/* for DP use the same PLL for all */
 			if (test_radeon_crtc->pll_id != ATOM_PPLL_INVALID)
 				return test_radeon_crtc->pll_id;
@@ -1770,6 +1775,7 @@ static int radeon_get_shared_nondp_ppll(
 {
 	struct radeon_crtc *radeon_crtc = to_radeon_crtc(crtc);
 	struct drm_device *dev = crtc->dev;
+	struct radeon_device *rdev = dev->dev_private;
 	struct drm_crtc *test_crtc;
 	struct radeon_crtc *test_radeon_crtc;
 	u32 adjusted_clock, test_adjusted_clock;
@@ -1785,6 +1791,10 @@ static int radeon_get_shared_nondp_ppll(
 		test_radeon_crtc = to_radeon_crtc(test_crtc);
 		if (test_radeon_crtc->encoder &&
 		    !ENCODER_MODE_IS_DP(atombios_get_encoder_mode(test_radeon_crtc->encoder))) {
+			/* PPLL2 is exclusive to UNIPHYA on DCE61 */
+			if (ASIC_IS_DCE61(rdev) && !ASIC_IS_DCE8(rdev) &&
+			    test_radeon_crtc->pll_id == ATOM_PPLL2)
+				continue;
 			/* check if we are already driving this connector with another crtc */
 			if (test_radeon_crtc->connector == radeon_crtc->connector) {
 				/* if we are, return that pll */

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH 4.5 078/101] drm/i915: Bail out of pipe config compute loop on LPT
  2016-05-17  1:20 [PATCH 4.5 000/101] 4.5.5-stable review Greg Kroah-Hartman
                   ` (67 preceding siblings ...)
  2016-05-17  1:21 ` [PATCH 4.5 077/101] drm/radeon: fix PLL sharing on DCE6.1 (v2) Greg Kroah-Hartman
@ 2016-05-17  1:21 ` Greg Kroah-Hartman
  2016-05-17  1:21 ` [PATCH 4.5 079/101] Revert "drm/i915: start adding dp mst audio" Greg Kroah-Hartman
                   ` (23 subsequent siblings)
  92 siblings, 0 replies; 94+ messages in thread
From: Greg Kroah-Hartman @ 2016-05-17  1:21 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Chris Wilson, Maarten Lankhorst,
	Daniel Vetter, Daniel Vetter, Jani Nikula

4.5-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Daniel Vetter <daniel.vetter@ffwll.ch>

commit 2700818ac9f935d8590715eecd7e8cadbca552b6 upstream.

LPT is pch, so might run into the fdi bandwidth constraint (especially
since it has only 2 lanes). But right now we just force pipe_bpp back
to 24, resulting in a nice loop (which we bail out with a loud
WARN_ON). Fix this.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
References: https://bugs.freedesktop.org/show_bug.cgi?id=93477
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Tested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1462264381-7573-1-git-send-email-daniel.vetter@ffwll.ch
(cherry picked from commit f58a1acc7e4a1f37d26124ce4c875c647fbcc61f)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/gpu/drm/i915/intel_crt.c |    8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

--- a/drivers/gpu/drm/i915/intel_crt.c
+++ b/drivers/gpu/drm/i915/intel_crt.c
@@ -255,8 +255,14 @@ static bool intel_crt_compute_config(str
 		pipe_config->has_pch_encoder = true;
 
 	/* LPT FDI RX only supports 8bpc. */
-	if (HAS_PCH_LPT(dev))
+	if (HAS_PCH_LPT(dev)) {
+		if (pipe_config->bw_constrained && pipe_config->pipe_bpp < 24) {
+			DRM_DEBUG_KMS("LPT only supports 24bpp\n");
+			return false;
+		}
+
 		pipe_config->pipe_bpp = 24;
+	}
 
 	/* FDI must always be 2.7 GHz */
 	if (HAS_DDI(dev)) {

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH 4.5 079/101] Revert "drm/i915: start adding dp mst audio"
  2016-05-17  1:20 [PATCH 4.5 000/101] 4.5.5-stable review Greg Kroah-Hartman
                   ` (68 preceding siblings ...)
  2016-05-17  1:21 ` [PATCH 4.5 078/101] drm/i915: Bail out of pipe config compute loop on LPT Greg Kroah-Hartman
@ 2016-05-17  1:21 ` Greg Kroah-Hartman
  2016-05-17  1:21 ` [PATCH 4.5 081/101] drm/radeon: fix DP link training issue with second 4K monitor Greg Kroah-Hartman
                   ` (22 subsequent siblings)
  92 siblings, 0 replies; 94+ messages in thread
From: Greg Kroah-Hartman @ 2016-05-17  1:21 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Lyude, Daniel Vetter, Jani Nikula

4.5-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Lyude <cpaul@redhat.com>

commit 26792526cc3e29e3ccbc15c996beb61fa64be5af upstream.

Right now MST audio is causing too many kernel panics to really keep
around in the kernel. On top of that, even after fixing said panics it's
still basically non-functional (at least on all the setups I've tested
it on). Revert until we have a proper solution for this.

This reverts commit 3d52ccf52f2c51f613e42e65be0f06e4e6788093.

Signed-off-by: Lyude <cpaul@redhat.com>
Fixes: 3d52ccf52f2c ("drm/i915: start adding dp mst audio")
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1462287692-28570-1-git-send-email-cpaul@redhat.com
(cherry picked from commit 5a8f97ea04c98201deeb973c3f711c3c156115e9)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/gpu/drm/i915/i915_debugfs.c |   16 ----------------
 drivers/gpu/drm/i915/intel_audio.c  |    9 +++------
 drivers/gpu/drm/i915/intel_ddi.c    |   24 +++++-------------------
 drivers/gpu/drm/i915/intel_dp_mst.c |   22 ----------------------
 drivers/gpu/drm/i915/intel_drv.h    |    2 --
 5 files changed, 8 insertions(+), 65 deletions(-)

--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -2860,20 +2860,6 @@ static void intel_dp_info(struct seq_fil
 		intel_panel_info(m, &intel_connector->panel);
 }
 
-static void intel_dp_mst_info(struct seq_file *m,
-			  struct intel_connector *intel_connector)
-{
-	struct intel_encoder *intel_encoder = intel_connector->encoder;
-	struct intel_dp_mst_encoder *intel_mst =
-		enc_to_mst(&intel_encoder->base);
-	struct intel_digital_port *intel_dig_port = intel_mst->primary;
-	struct intel_dp *intel_dp = &intel_dig_port->dp;
-	bool has_audio = drm_dp_mst_port_has_audio(&intel_dp->mst_mgr,
-					intel_connector->port);
-
-	seq_printf(m, "\taudio support: %s\n", yesno(has_audio));
-}
-
 static void intel_hdmi_info(struct seq_file *m,
 			    struct intel_connector *intel_connector)
 {
@@ -2917,8 +2903,6 @@ static void intel_connector_info(struct
 			intel_hdmi_info(m, intel_connector);
 		else if (intel_encoder->type == INTEL_OUTPUT_LVDS)
 			intel_lvds_info(m, intel_connector);
-		else if (intel_encoder->type == INTEL_OUTPUT_DP_MST)
-			intel_dp_mst_info(m, intel_connector);
 	}
 
 	seq_printf(m, "\tmodes:\n");
--- a/drivers/gpu/drm/i915/intel_audio.c
+++ b/drivers/gpu/drm/i915/intel_audio.c
@@ -262,8 +262,7 @@ static void hsw_audio_codec_disable(stru
 	tmp |= AUD_CONFIG_N_PROG_ENABLE;
 	tmp &= ~AUD_CONFIG_UPPER_N_MASK;
 	tmp &= ~AUD_CONFIG_LOWER_N_MASK;
-	if (intel_pipe_has_type(intel_crtc, INTEL_OUTPUT_DISPLAYPORT) ||
-	    intel_pipe_has_type(intel_crtc, INTEL_OUTPUT_DP_MST))
+	if (intel_pipe_has_type(intel_crtc, INTEL_OUTPUT_DISPLAYPORT))
 		tmp |= AUD_CONFIG_N_VALUE_INDEX;
 	I915_WRITE(HSW_AUD_CFG(pipe), tmp);
 
@@ -476,8 +475,7 @@ static void ilk_audio_codec_enable(struc
 	tmp &= ~AUD_CONFIG_N_VALUE_INDEX;
 	tmp &= ~AUD_CONFIG_N_PROG_ENABLE;
 	tmp &= ~AUD_CONFIG_PIXEL_CLOCK_HDMI_MASK;
-	if (intel_pipe_has_type(intel_crtc, INTEL_OUTPUT_DISPLAYPORT) ||
-	    intel_pipe_has_type(intel_crtc, INTEL_OUTPUT_DP_MST))
+	if (intel_pipe_has_type(intel_crtc, INTEL_OUTPUT_DISPLAYPORT))
 		tmp |= AUD_CONFIG_N_VALUE_INDEX;
 	else
 		tmp |= audio_config_hdmi_pixel_clock(adjusted_mode);
@@ -515,8 +513,7 @@ void intel_audio_codec_enable(struct int
 
 	/* ELD Conn_Type */
 	connector->eld[5] &= ~(3 << 2);
-	if (intel_pipe_has_type(crtc, INTEL_OUTPUT_DISPLAYPORT) ||
-	    intel_pipe_has_type(crtc, INTEL_OUTPUT_DP_MST))
+	if (intel_pipe_has_type(crtc, INTEL_OUTPUT_DISPLAYPORT))
 		connector->eld[5] |= (1 << 2);
 
 	connector->eld[6] = drm_av_sync_delay(connector, adjusted_mode) / 2;
--- a/drivers/gpu/drm/i915/intel_ddi.c
+++ b/drivers/gpu/drm/i915/intel_ddi.c
@@ -3165,23 +3165,6 @@ void intel_ddi_fdi_disable(struct drm_cr
 	I915_WRITE(FDI_RX_CTL(PIPE_A), val);
 }
 
-bool intel_ddi_is_audio_enabled(struct drm_i915_private *dev_priv,
-				 struct intel_crtc *intel_crtc)
-{
-	u32 temp;
-
-	if (intel_display_power_get_if_enabled(dev_priv, POWER_DOMAIN_AUDIO)) {
-		temp = I915_READ(HSW_AUD_PIN_ELD_CP_VLD);
-
-		intel_display_power_put(dev_priv, POWER_DOMAIN_AUDIO);
-
-		if (temp & AUDIO_OUTPUT_ENABLE(intel_crtc->pipe))
-			return true;
-	}
-
-	return false;
-}
-
 void intel_ddi_get_config(struct intel_encoder *encoder,
 			  struct intel_crtc_state *pipe_config)
 {
@@ -3242,8 +3225,11 @@ void intel_ddi_get_config(struct intel_e
 		break;
 	}
 
-	pipe_config->has_audio =
-		intel_ddi_is_audio_enabled(dev_priv, intel_crtc);
+	if (intel_display_power_is_enabled(dev_priv, POWER_DOMAIN_AUDIO)) {
+		temp = I915_READ(HSW_AUD_PIN_ELD_CP_VLD);
+		if (temp & AUDIO_OUTPUT_ENABLE(intel_crtc->pipe))
+			pipe_config->has_audio = true;
+	}
 
 	if (encoder->type == INTEL_OUTPUT_EDP && dev_priv->vbt.edp_bpp &&
 	    pipe_config->pipe_bpp > dev_priv->vbt.edp_bpp) {
--- a/drivers/gpu/drm/i915/intel_dp_mst.c
+++ b/drivers/gpu/drm/i915/intel_dp_mst.c
@@ -78,8 +78,6 @@ static bool intel_dp_mst_compute_config(
 		return false;
 	}
 
-	if (drm_dp_mst_port_has_audio(&intel_dp->mst_mgr, found->port))
-		pipe_config->has_audio = true;
 	mst_pbn = drm_dp_calc_pbn_mode(adjusted_mode->crtc_clock, bpp);
 
 	pipe_config->pbn = mst_pbn;
@@ -104,11 +102,6 @@ static void intel_mst_disable_dp(struct
 	struct intel_dp_mst_encoder *intel_mst = enc_to_mst(&encoder->base);
 	struct intel_digital_port *intel_dig_port = intel_mst->primary;
 	struct intel_dp *intel_dp = &intel_dig_port->dp;
-	struct drm_device *dev = encoder->base.dev;
-	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct drm_crtc *crtc = encoder->base.crtc;
-	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
-
 	int ret;
 
 	DRM_DEBUG_KMS("%d\n", intel_dp->active_mst_links);
@@ -119,10 +112,6 @@ static void intel_mst_disable_dp(struct
 	if (ret) {
 		DRM_ERROR("failed to update payload %d\n", ret);
 	}
-	if (intel_crtc->config->has_audio) {
-		intel_audio_codec_disable(encoder);
-		intel_display_power_put(dev_priv, POWER_DOMAIN_AUDIO);
-	}
 }
 
 static void intel_mst_post_disable_dp(struct intel_encoder *encoder)
@@ -219,7 +208,6 @@ static void intel_mst_enable_dp(struct i
 	struct intel_dp *intel_dp = &intel_dig_port->dp;
 	struct drm_device *dev = intel_dig_port->base.base.dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct intel_crtc *crtc = to_intel_crtc(encoder->base.crtc);
 	enum port port = intel_dig_port->port;
 	int ret;
 
@@ -232,13 +220,6 @@ static void intel_mst_enable_dp(struct i
 	ret = drm_dp_check_act_status(&intel_dp->mst_mgr);
 
 	ret = drm_dp_update_payload_part2(&intel_dp->mst_mgr);
-
-	if (crtc->config->has_audio) {
-		DRM_DEBUG_DRIVER("Enabling DP audio on pipe %c\n",
-				 pipe_name(crtc->pipe));
-		intel_display_power_get(dev_priv, POWER_DOMAIN_AUDIO);
-		intel_audio_codec_enable(encoder);
-	}
 }
 
 static bool intel_dp_mst_enc_get_hw_state(struct intel_encoder *encoder,
@@ -264,9 +245,6 @@ static void intel_dp_mst_enc_get_config(
 
 	pipe_config->has_dp_encoder = true;
 
-	pipe_config->has_audio =
-		intel_ddi_is_audio_enabled(dev_priv, crtc);
-
 	temp = I915_READ(TRANS_DDI_FUNC_CTL(cpu_transcoder));
 	if (temp & TRANS_DDI_PHSYNC)
 		flags |= DRM_MODE_FLAG_PHSYNC;
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -1013,8 +1013,6 @@ void intel_ddi_set_pipe_settings(struct
 void intel_ddi_prepare_link_retrain(struct intel_dp *intel_dp);
 bool intel_ddi_connector_get_hw_state(struct intel_connector *intel_connector);
 void intel_ddi_fdi_disable(struct drm_crtc *crtc);
-bool intel_ddi_is_audio_enabled(struct drm_i915_private *dev_priv,
-				 struct intel_crtc *intel_crtc);
 void intel_ddi_get_config(struct intel_encoder *encoder,
 			  struct intel_crtc_state *pipe_config);
 struct intel_encoder *

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH 4.5 081/101] drm/radeon: fix DP link training issue with second 4K monitor
  2016-05-17  1:20 [PATCH 4.5 000/101] 4.5.5-stable review Greg Kroah-Hartman
                   ` (69 preceding siblings ...)
  2016-05-17  1:21 ` [PATCH 4.5 079/101] Revert "drm/i915: start adding dp mst audio" Greg Kroah-Hartman
@ 2016-05-17  1:21 ` Greg Kroah-Hartman
  2016-05-17  1:21 ` [PATCH 4.5 082/101] drm/radeon: fix DP mode validation Greg Kroah-Hartman
                   ` (21 subsequent siblings)
  92 siblings, 0 replies; 94+ messages in thread
From: Greg Kroah-Hartman @ 2016-05-17  1:21 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Arindam Nath, Alex Deucher

4.5-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Arindam Nath <arindam.nath@amd.com>

commit 1a738347df2ee4977459a8776fe2c62196bdcb1b upstream.

There is an issue observed when we hotplug a second DP
4K monitor to the system. Sometimes, the link training
fails for the second monitor after HPD interrupt
generation.

The issue happens when some queued or deferred transactions
are already present on the AUX channel when we initiate
a new transcation to (say) get DPCD or during link training.

We set AUX_IGNORE_HPD_DISCON bit in the AUX_CONTROL
register so that we can ignore any such deferred
transactions when a new AUX transaction is initiated.

Signed-off-by: Arindam Nath <arindam.nath@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/gpu/drm/radeon/radeon_dp_auxch.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/gpu/drm/radeon/radeon_dp_auxch.c
+++ b/drivers/gpu/drm/radeon/radeon_dp_auxch.c
@@ -105,7 +105,7 @@ radeon_dp_aux_transfer_native(struct drm
 
 	tmp &= AUX_HPD_SEL(0x7);
 	tmp |= AUX_HPD_SEL(chan->rec.hpd);
-	tmp |= AUX_EN | AUX_LS_READ_EN;
+	tmp |= AUX_EN | AUX_LS_READ_EN | AUX_HPD_DISCON(0x1);
 
 	WREG32(AUX_CONTROL + aux_offset[instance], tmp);
 

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH 4.5 082/101] drm/radeon: fix DP mode validation
  2016-05-17  1:20 [PATCH 4.5 000/101] 4.5.5-stable review Greg Kroah-Hartman
                   ` (70 preceding siblings ...)
  2016-05-17  1:21 ` [PATCH 4.5 081/101] drm/radeon: fix DP link training issue with second 4K monitor Greg Kroah-Hartman
@ 2016-05-17  1:21 ` Greg Kroah-Hartman
  2016-05-17  1:21 ` [PATCH 4.5 083/101] drm/amdgpu: " Greg Kroah-Hartman
                   ` (20 subsequent siblings)
  92 siblings, 0 replies; 94+ messages in thread
From: Greg Kroah-Hartman @ 2016-05-17  1:21 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Alex Deucher

4.5-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Alex Deucher <alexander.deucher@amd.com>

commit ff0bd441bdfbfa09d05fdba9829a0401a46635c1 upstream.

Switch the order of the loops to walk the rates on the top
so we exhaust all DP 1.1 rate/lane combinations before trying
DP 1.2 rate/lane combos.

This avoids selecting rates that are supported by the monitor,
but not the connector leading to valid modes getting rejected.

bug:
https://bugs.freedesktop.org/show_bug.cgi?id=95206

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/gpu/drm/radeon/atombios_dp.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/drivers/gpu/drm/radeon/atombios_dp.c
+++ b/drivers/gpu/drm/radeon/atombios_dp.c
@@ -326,8 +326,8 @@ int radeon_dp_get_dp_link_config(struct
 			}
 		}
 	} else {
-		for (lane_num = 1; lane_num <= max_lane_num; lane_num <<= 1) {
-			for (i = 0; i < ARRAY_SIZE(link_rates) && link_rates[i] <= max_link_rate; i++) {
+		for (i = 0; i < ARRAY_SIZE(link_rates) && link_rates[i] <= max_link_rate; i++) {
+			for (lane_num = 1; lane_num <= max_lane_num; lane_num <<= 1) {
 				max_pix_clock = (lane_num * link_rates[i] * 8) / bpp;
 				if (max_pix_clock >= pix_clock) {
 					*dp_lanes = lane_num;

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH 4.5 083/101] drm/amdgpu: fix DP mode validation
  2016-05-17  1:20 [PATCH 4.5 000/101] 4.5.5-stable review Greg Kroah-Hartman
                   ` (71 preceding siblings ...)
  2016-05-17  1:21 ` [PATCH 4.5 082/101] drm/radeon: fix DP mode validation Greg Kroah-Hartman
@ 2016-05-17  1:21 ` Greg Kroah-Hartman
  2016-05-17  1:21 ` [PATCH 4.5 084/101] btrfs: reada: Fix in-segment calculation for reada Greg Kroah-Hartman
                   ` (19 subsequent siblings)
  92 siblings, 0 replies; 94+ messages in thread
From: Greg Kroah-Hartman @ 2016-05-17  1:21 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Alex Deucher

4.5-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Alex Deucher <alexander.deucher@amd.com>

commit c47b9e0944e483309d66c807d650ac8b8ceafb57 upstream.

Switch the order of the loops to walk the rates on the top
so we exhaust all DP 1.1 rate/lane combinations before trying
DP 1.2 rate/lane combos.

This avoids selecting rates that are supported by the monitor,
but not the connector leading to valid modes getting rejected.

bug:
https://bugs.freedesktop.org/show_bug.cgi?id=95206

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/gpu/drm/amd/amdgpu/atombios_dp.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/drivers/gpu/drm/amd/amdgpu/atombios_dp.c
+++ b/drivers/gpu/drm/amd/amdgpu/atombios_dp.c
@@ -276,8 +276,8 @@ static int amdgpu_atombios_dp_get_dp_lin
 			}
 		}
 	} else {
-		for (lane_num = 1; lane_num <= max_lane_num; lane_num <<= 1) {
-			for (i = 0; i < ARRAY_SIZE(link_rates) && link_rates[i] <= max_link_rate; i++) {
+		for (i = 0; i < ARRAY_SIZE(link_rates) && link_rates[i] <= max_link_rate; i++) {
+			for (lane_num = 1; lane_num <= max_lane_num; lane_num <<= 1) {
 				max_pix_clock = (lane_num * link_rates[i] * 8) / bpp;
 				if (max_pix_clock >= pix_clock) {
 					*dp_lanes = lane_num;

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH 4.5 084/101] btrfs: reada: Fix in-segment calculation for reada
  2016-05-17  1:20 [PATCH 4.5 000/101] 4.5.5-stable review Greg Kroah-Hartman
                   ` (72 preceding siblings ...)
  2016-05-17  1:21 ` [PATCH 4.5 083/101] drm/amdgpu: " Greg Kroah-Hartman
@ 2016-05-17  1:21 ` Greg Kroah-Hartman
  2016-05-17  1:21 ` [PATCH 4.5 085/101] Btrfs: fix truncate_space_check Greg Kroah-Hartman
                   ` (18 subsequent siblings)
  92 siblings, 0 replies; 94+ messages in thread
From: Greg Kroah-Hartman @ 2016-05-17  1:21 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Zhao Lei, David Sterba

4.5-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Zhao Lei <zhaolei@cn.fujitsu.com>

commit 503785306d182ab624a2232856ef8ab503ee85f9 upstream.

reada_zone->end is end pos of segment:
 end = start + cache->key.offset - 1;

So we need to use "<=" in condition to judge is a pos in the
segment.

The problem happened rearly, because logical pos rarely pointed
to last 4k of a blockgroup, but we need to fix it to make code
right in logic.

Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 fs/btrfs/reada.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/fs/btrfs/reada.c
+++ b/fs/btrfs/reada.c
@@ -265,7 +265,7 @@ static struct reada_zone *reada_find_zon
 	spin_unlock(&fs_info->reada_lock);
 
 	if (ret == 1) {
-		if (logical >= zone->start && logical < zone->end)
+		if (logical >= zone->start && logical <= zone->end)
 			return zone;
 		spin_lock(&fs_info->reada_lock);
 		kref_put(&zone->refcnt, reada_zone_release);
@@ -679,7 +679,7 @@ static int reada_start_machine_dev(struc
 	 */
 	ret = radix_tree_gang_lookup(&dev->reada_extents, (void **)&re,
 				     dev->reada_next >> PAGE_CACHE_SHIFT, 1);
-	if (ret == 0 || re->logical >= dev->reada_curr_zone->end) {
+	if (ret == 0 || re->logical > dev->reada_curr_zone->end) {
 		ret = reada_pick_zone(dev);
 		if (!ret) {
 			spin_unlock(&fs_info->reada_lock);

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH 4.5 085/101] Btrfs: fix truncate_space_check
  2016-05-17  1:20 [PATCH 4.5 000/101] 4.5.5-stable review Greg Kroah-Hartman
                   ` (73 preceding siblings ...)
  2016-05-17  1:21 ` [PATCH 4.5 084/101] btrfs: reada: Fix in-segment calculation for reada Greg Kroah-Hartman
@ 2016-05-17  1:21 ` Greg Kroah-Hartman
  2016-05-17  1:21 ` [PATCH 4.5 086/101] btrfs: remove error message from search ioctl for nonexistent tree Greg Kroah-Hartman
                   ` (17 subsequent siblings)
  92 siblings, 0 replies; 94+ messages in thread
From: Greg Kroah-Hartman @ 2016-05-17  1:21 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Josef Bacik, David Sterba

4.5-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Josef Bacik <jbacik@fb.com>

commit dc95f7bfc57fa4b75a77d0da84d5db249d74aa3f upstream.

truncate_space_check is using btrfs_csum_bytes_to_leaves() but forgetting to
multiply by nodesize so we get an actual byte count.  We need a tracepoint here
so that we have the matching reserve for the release that will come later.  Also
add a comment to make clear what the intent of truncate_space_check is.

Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 fs/btrfs/inode.c |   11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -4211,11 +4211,20 @@ static int truncate_space_check(struct b
 {
 	int ret;
 
+	/*
+	 * This is only used to apply pressure to the enospc system, we don't
+	 * intend to use this reservation at all.
+	 */
 	bytes_deleted = btrfs_csum_bytes_to_leaves(root, bytes_deleted);
+	bytes_deleted *= root->nodesize;
 	ret = btrfs_block_rsv_add(root, &root->fs_info->trans_block_rsv,
 				  bytes_deleted, BTRFS_RESERVE_NO_FLUSH);
-	if (!ret)
+	if (!ret) {
+		trace_btrfs_space_reservation(root->fs_info, "transaction",
+					      trans->transid,
+					      bytes_deleted, 1);
 		trans->bytes_reserved += bytes_deleted;
+	}
 	return ret;
 
 }

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH 4.5 086/101] btrfs: remove error message from search ioctl for nonexistent tree
  2016-05-17  1:20 [PATCH 4.5 000/101] 4.5.5-stable review Greg Kroah-Hartman
                   ` (74 preceding siblings ...)
  2016-05-17  1:21 ` [PATCH 4.5 085/101] Btrfs: fix truncate_space_check Greg Kroah-Hartman
@ 2016-05-17  1:21 ` Greg Kroah-Hartman
  2016-05-17  1:21 ` [PATCH 4.5 087/101] btrfs: change max_inline default to 2048 Greg Kroah-Hartman
                   ` (16 subsequent siblings)
  92 siblings, 0 replies; 94+ messages in thread
From: Greg Kroah-Hartman @ 2016-05-17  1:21 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Vlastimil Babka, David Sterba

4.5-stable review patch.  If anyone has any objections, please let me know.

------------------

From: David Sterba <dsterba@suse.com>

commit 11ea474f74709fc764fb7e80306e0776f94ce8b8 upstream.

Let's remove the error message that appears when the tree_id is not
present. This can happen with the quota tree and has been observed in
practice. The applications are supposed to handle -ENOENT and we don't
need to report that in the system log as it's not a fatal error.

Reported-by: Vlastimil Babka <vbabka@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 fs/btrfs/ioctl.c |    2 --
 1 file changed, 2 deletions(-)

--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -2097,8 +2097,6 @@ static noinline int search_ioctl(struct
 		key.offset = (u64)-1;
 		root = btrfs_read_fs_root_no_name(info, &key);
 		if (IS_ERR(root)) {
-			btrfs_err(info, "could not find root %llu",
-			       sk->tree_id);
 			btrfs_free_path(path);
 			return -ENOENT;
 		}

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH 4.5 087/101] btrfs: change max_inline default to 2048
  2016-05-17  1:20 [PATCH 4.5 000/101] 4.5.5-stable review Greg Kroah-Hartman
                   ` (75 preceding siblings ...)
  2016-05-17  1:21 ` [PATCH 4.5 086/101] btrfs: remove error message from search ioctl for nonexistent tree Greg Kroah-Hartman
@ 2016-05-17  1:21 ` Greg Kroah-Hartman
  2016-05-17  1:21 ` [PATCH 4.5 088/101] Btrfs: fix unreplayable log after snapshot delete + parent dir fsync Greg Kroah-Hartman
                   ` (15 subsequent siblings)
  92 siblings, 0 replies; 94+ messages in thread
From: Greg Kroah-Hartman @ 2016-05-17  1:21 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, David Sterba

4.5-stable review patch.  If anyone has any objections, please let me know.

------------------

From: David Sterba <dsterba@suse.com>

commit f7e98a7fff8634ae655c666dc2c9fc55a48d0a73 upstream.

The current practical default is ~4k on x86_64 (the logic is more complex,
simplified for brevity), the inlined files land in the metadata group and
thus consume space that could be needed for the real metadata.

The inlining brings some usability surprises:

1) total space consumption measured on various filesystems and btrfs
   with DUP metadata was quite visible because of the duplicated data
   within metadata

2) inlined data may exhaust the metadata, which are more precious in case
   the entire device space is allocated to chunks (ie. balance cannot
   make the space more compact)

3) performance suffers a bit as the inlined blocks are duplicate and
   stored far away on the device.

Proposed fix: set the default to 2048

This fixes namely 1), the total filesysystem space consumption will be on
par with other filesystems.

Partially fixes 2), more data are pushed to the data block groups.

The characteristics of 3) are based on actual small file size
distribution.

The change is independent of the metadata blockgroup type (though it's
most visible with DUP) or system page size as these parameters are not
trival to find out, compared to file size.

Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 fs/btrfs/ctree.h |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -2252,7 +2252,7 @@ struct btrfs_ioctl_defrag_range_args {
 #define BTRFS_MOUNT_FREE_SPACE_TREE	(1 << 26)
 
 #define BTRFS_DEFAULT_COMMIT_INTERVAL	(30)
-#define BTRFS_DEFAULT_MAX_INLINE	(8192)
+#define BTRFS_DEFAULT_MAX_INLINE	(2048)
 
 #define btrfs_clear_opt(o, opt)		((o) &= ~BTRFS_MOUNT_##opt)
 #define btrfs_set_opt(o, opt)		((o) |= BTRFS_MOUNT_##opt)

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH 4.5 088/101] Btrfs: fix unreplayable log after snapshot delete + parent dir fsync
  2016-05-17  1:20 [PATCH 4.5 000/101] 4.5.5-stable review Greg Kroah-Hartman
                   ` (76 preceding siblings ...)
  2016-05-17  1:21 ` [PATCH 4.5 087/101] btrfs: change max_inline default to 2048 Greg Kroah-Hartman
@ 2016-05-17  1:21 ` Greg Kroah-Hartman
  2016-05-17  1:21 ` [PATCH 4.5 089/101] Btrfs: fix file loss on log replay after renaming a file and fsync Greg Kroah-Hartman
                   ` (14 subsequent siblings)
  92 siblings, 0 replies; 94+ messages in thread
From: Greg Kroah-Hartman @ 2016-05-17  1:21 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Filipe Manana, Liu Bo, Chris Mason

4.5-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Filipe Manana <fdmanana@suse.com>

commit 1ec9a1ae1e30c733077c0b288c4301b66b7a81f2 upstream.

If we delete a snapshot, fsync its parent directory and crash/power fail
before the next transaction commit, on the next mount when we attempt to
replay the log tree of the root containing the parent directory we will
fail and prevent the filesystem from mounting, which is solvable by wiping
out the log trees with the btrfs-zero-log tool but very inconvenient as
we will lose any data and metadata fsynced before the parent directory
was fsynced.

For example:

  $ mkfs.btrfs -f /dev/sdc
  $ mount /dev/sdc /mnt
  $ mkdir /mnt/testdir
  $ btrfs subvolume snapshot /mnt /mnt/testdir/snap
  $ btrfs subvolume delete /mnt/testdir/snap
  $ xfs_io -c "fsync" /mnt/testdir
  < crash / power failure and reboot >
  $ mount /dev/sdc /mnt
  mount: mount(2) failed: No such file or directory

And in dmesg/syslog we get the following message and trace:

[192066.361162] BTRFS info (device dm-0): failed to delete reference to snap, inode 257 parent 257
[192066.363010] ------------[ cut here ]------------
[192066.365268] WARNING: CPU: 4 PID: 5130 at fs/btrfs/inode.c:3986 __btrfs_unlink_inode+0x17a/0x354 [btrfs]()
[192066.367250] BTRFS: Transaction aborted (error -2)
[192066.368401] Modules linked in: btrfs dm_flakey dm_mod ppdev sha256_generic xor raid6_pq hmac drbg ansi_cprng aesni_intel acpi_cpufreq tpm_tis aes_x86_64 tpm ablk_helper evdev cryptd sg parport_pc i2c_piix4 psmouse lrw parport i2c_core pcspkr gf128mul processor serio_raw glue_helper button loop autofs4 ext4 crc16 mbcache jbd2 sd_mod sr_mod cdrom ata_generic virtio_scsi ata_piix libata virtio_pci virtio_ring crc32c_intel scsi_mod e1000 virtio floppy [last unloaded: btrfs]
[192066.377154] CPU: 4 PID: 5130 Comm: mount Tainted: G        W       4.4.0-rc6-btrfs-next-20+ #1
[192066.378875] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS by qemu-project.org 04/01/2014
[192066.380889]  0000000000000000 ffff880143923670 ffffffff81257570 ffff8801439236b8
[192066.382561]  ffff8801439236a8 ffffffff8104ec07 ffffffffa039dc2c 00000000fffffffe
[192066.384191]  ffff8801ed31d000 ffff8801b9fc9c88 ffff8801086875e0 ffff880143923710
[192066.385827] Call Trace:
[192066.386373]  [<ffffffff81257570>] dump_stack+0x4e/0x79
[192066.387387]  [<ffffffff8104ec07>] warn_slowpath_common+0x99/0xb2
[192066.388429]  [<ffffffffa039dc2c>] ? __btrfs_unlink_inode+0x17a/0x354 [btrfs]
[192066.389236]  [<ffffffff8104ec68>] warn_slowpath_fmt+0x48/0x50
[192066.389884]  [<ffffffffa039dc2c>] __btrfs_unlink_inode+0x17a/0x354 [btrfs]
[192066.390621]  [<ffffffff81184b55>] ? iput+0xb0/0x266
[192066.391200]  [<ffffffffa039ea25>] btrfs_unlink_inode+0x1c/0x3d [btrfs]
[192066.391930]  [<ffffffffa03ca623>] check_item_in_log+0x1fe/0x29b [btrfs]
[192066.392715]  [<ffffffffa03ca827>] replay_dir_deletes+0x167/0x1cf [btrfs]
[192066.393510]  [<ffffffffa03cccc7>] replay_one_buffer+0x417/0x570 [btrfs]
[192066.394241]  [<ffffffffa03ca164>] walk_up_log_tree+0x10e/0x1dc [btrfs]
[192066.394958]  [<ffffffffa03cac72>] walk_log_tree+0xa5/0x190 [btrfs]
[192066.395628]  [<ffffffffa03ce8b8>] btrfs_recover_log_trees+0x239/0x32c [btrfs]
[192066.396790]  [<ffffffffa03cc8b0>] ? replay_one_extent+0x50a/0x50a [btrfs]
[192066.397891]  [<ffffffffa0394041>] open_ctree+0x1d8b/0x2167 [btrfs]
[192066.398897]  [<ffffffffa03706e1>] btrfs_mount+0x5ef/0x729 [btrfs]
[192066.399823]  [<ffffffff8108ad98>] ? trace_hardirqs_on+0xd/0xf
[192066.400739]  [<ffffffff8108959b>] ? lockdep_init_map+0xb9/0x1b3
[192066.401700]  [<ffffffff811714b9>] mount_fs+0x67/0x131
[192066.402482]  [<ffffffff81188560>] vfs_kern_mount+0x6c/0xde
[192066.403930]  [<ffffffffa03702bd>] btrfs_mount+0x1cb/0x729 [btrfs]
[192066.404831]  [<ffffffff8108ad98>] ? trace_hardirqs_on+0xd/0xf
[192066.405726]  [<ffffffff8108959b>] ? lockdep_init_map+0xb9/0x1b3
[192066.406621]  [<ffffffff811714b9>] mount_fs+0x67/0x131
[192066.407401]  [<ffffffff81188560>] vfs_kern_mount+0x6c/0xde
[192066.408247]  [<ffffffff8118ae36>] do_mount+0x893/0x9d2
[192066.409047]  [<ffffffff8113009b>] ? strndup_user+0x3f/0x8c
[192066.409842]  [<ffffffff8118b187>] SyS_mount+0x75/0xa1
[192066.410621]  [<ffffffff8147e517>] entry_SYSCALL_64_fastpath+0x12/0x6b
[192066.411572] ---[ end trace 2de42126c1e0a0f0 ]---
[192066.412344] BTRFS: error (device dm-0) in __btrfs_unlink_inode:3986: errno=-2 No such entry
[192066.413748] BTRFS: error (device dm-0) in btrfs_replay_log:2464: errno=-2 No such entry (Failed to recover log tree)
[192066.415458] BTRFS error (device dm-0): cleaner transaction attach returned -30
[192066.444613] BTRFS: open_ctree failed

This happens because when we are replaying the log and processing the
directory entry pointing to the snapshot in the subvolume tree, we treat
its btrfs_dir_item item as having a location with a key type matching
BTRFS_INODE_ITEM_KEY, which is wrong because the type matches
BTRFS_ROOT_ITEM_KEY and therefore must be processed differently, as the
object id refers to a root number and not to an inode in the root
containing the parent directory.

So fix this by triggering a transaction commit if an fsync against the
parent directory is requested after deleting a snapshot. This is the
simplest approach for a rare use case. Some alternative that avoids the
transaction commit would require more code to explicitly delete the
snapshot at log replay time (factoring out common code from ioctl.c:
btrfs_ioctl_snap_destroy()), special care at fsync time to remove the
log tree of the snapshot's root from the log root of the root of tree
roots, amongst other steps.

A test case for xfstests that triggers the issue follows.

  seq=`basename $0`
  seqres=$RESULT_DIR/$seq
  echo "QA output created by $seq"
  tmp=/tmp/$$
  status=1	# failure is the default!
  trap "_cleanup; exit \$status" 0 1 2 3 15

  _cleanup()
  {
      _cleanup_flakey
      cd /
      rm -f $tmp.*
  }

  # get standard environment, filters and checks
  . ./common/rc
  . ./common/filter
  . ./common/dmflakey

  # real QA test starts here
  _need_to_be_root
  _supported_fs btrfs
  _supported_os Linux
  _require_scratch
  _require_dm_target flakey
  _require_metadata_journaling $SCRATCH_DEV

  rm -f $seqres.full

  _scratch_mkfs >>$seqres.full 2>&1
  _init_flakey
  _mount_flakey

  # Create a snapshot at the root of our filesystem (mount point path), delete it,
  # fsync the mount point path, crash and mount to replay the log. This should
  # succeed and after the filesystem is mounted the snapshot should not be visible
  # anymore.
  _run_btrfs_util_prog subvolume snapshot $SCRATCH_MNT $SCRATCH_MNT/snap1
  _run_btrfs_util_prog subvolume delete $SCRATCH_MNT/snap1
  $XFS_IO_PROG -c "fsync" $SCRATCH_MNT
  _flakey_drop_and_remount
  [ -e $SCRATCH_MNT/snap1 ] && \
      echo "Snapshot snap1 still exists after log replay"

  # Similar scenario as above, but this time the snapshot is created inside a
  # directory and not directly under the root (mount point path).
  mkdir $SCRATCH_MNT/testdir
  _run_btrfs_util_prog subvolume snapshot $SCRATCH_MNT $SCRATCH_MNT/testdir/snap2
  _run_btrfs_util_prog subvolume delete $SCRATCH_MNT/testdir/snap2
  $XFS_IO_PROG -c "fsync" $SCRATCH_MNT/testdir
  _flakey_drop_and_remount
  [ -e $SCRATCH_MNT/testdir/snap2 ] && \
      echo "Snapshot snap2 still exists after log replay"

  _unmount_flakey

  echo "Silence is golden"
  status=0
  exit

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Tested-by: Liu Bo <bo.li.liu@oracle.com>
Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 fs/btrfs/ioctl.c    |    3 +++
 fs/btrfs/tree-log.c |   15 +++++++++++++++
 fs/btrfs/tree-log.h |    2 ++
 3 files changed, 20 insertions(+)

--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -59,6 +59,7 @@
 #include "props.h"
 #include "sysfs.h"
 #include "qgroup.h"
+#include "tree-log.h"
 
 #ifdef CONFIG_64BIT
 /* If we have a 32-bit userspace and 64-bit kernel, then the UAPI
@@ -2525,6 +2526,8 @@ static noinline int btrfs_ioctl_snap_des
 out_end_trans:
 	trans->block_rsv = NULL;
 	trans->bytes_reserved = 0;
+	if (!err)
+		btrfs_record_snapshot_destroy(trans, dir);
 	ret = btrfs_end_transaction(trans, root);
 	if (ret && !err)
 		err = ret;
--- a/fs/btrfs/tree-log.c
+++ b/fs/btrfs/tree-log.c
@@ -5635,6 +5635,21 @@ record:
 }
 
 /*
+ * Make sure that if someone attempts to fsync the parent directory of a deleted
+ * snapshot, it ends up triggering a transaction commit. This is to guarantee
+ * that after replaying the log tree of the parent directory's root we will not
+ * see the snapshot anymore and at log replay time we will not see any log tree
+ * corresponding to the deleted snapshot's root, which could lead to replaying
+ * it after replaying the log tree of the parent directory (which would replay
+ * the snapshot delete operation).
+ */
+void btrfs_record_snapshot_destroy(struct btrfs_trans_handle *trans,
+				   struct inode *dir)
+{
+	BTRFS_I(dir)->last_unlink_trans = trans->transid;
+}
+
+/*
  * Call this after adding a new name for a file and it will properly
  * update the log to reflect the new name.
  *
--- a/fs/btrfs/tree-log.h
+++ b/fs/btrfs/tree-log.h
@@ -79,6 +79,8 @@ int btrfs_pin_log_trans(struct btrfs_roo
 void btrfs_record_unlink_dir(struct btrfs_trans_handle *trans,
 			     struct inode *dir, struct inode *inode,
 			     int for_rename);
+void btrfs_record_snapshot_destroy(struct btrfs_trans_handle *trans,
+				   struct inode *dir);
 int btrfs_log_new_name(struct btrfs_trans_handle *trans,
 			struct inode *inode, struct inode *old_dir,
 			struct dentry *parent);

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH 4.5 089/101] Btrfs: fix file loss on log replay after renaming a file and fsync
  2016-05-17  1:20 [PATCH 4.5 000/101] 4.5.5-stable review Greg Kroah-Hartman
                   ` (77 preceding siblings ...)
  2016-05-17  1:21 ` [PATCH 4.5 088/101] Btrfs: fix unreplayable log after snapshot delete + parent dir fsync Greg Kroah-Hartman
@ 2016-05-17  1:21 ` Greg Kroah-Hartman
  2016-05-17  1:21 ` [PATCH 4.5 090/101] Btrfs: fix extent_same allowing destination offset beyond i_size Greg Kroah-Hartman
                   ` (13 subsequent siblings)
  92 siblings, 0 replies; 94+ messages in thread
From: Greg Kroah-Hartman @ 2016-05-17  1:21 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Filipe Manana, Chris Mason

4.5-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Filipe Manana <fdmanana@suse.com>

commit 2be63d5ce929603d4e7cedabd9e992eb34a0ff95 upstream.

We have two cases where we end up deleting a file at log replay time
when we should not. For this to happen the file must have been renamed
and a directory inode must have been fsynced/logged.

Two examples that exercise these two cases are listed below.

  Case 1)

  $ mkfs.btrfs -f /dev/sdb
  $ mount /dev/sdb /mnt
  $ mkdir -p /mnt/a/b
  $ mkdir /mnt/c
  $ touch /mnt/a/b/foo
  $ sync
  $ mv /mnt/a/b/foo /mnt/c/
  # Create file bar just to make sure the fsync on directory a/ does
  # something and it's not a no-op.
  $ touch /mnt/a/bar
  $ xfs_io -c "fsync" /mnt/a
  < power fail / crash >

  The next time the filesystem is mounted, the log replay procedure
  deletes file foo.

  Case 2)

  $ mkfs.btrfs -f /dev/sdb
  $ mount /dev/sdb /mnt
  $ mkdir /mnt/a
  $ mkdir /mnt/b
  $ mkdir /mnt/c
  $ touch /mnt/a/foo
  $ ln /mnt/a/foo /mnt/b/foo_link
  $ touch /mnt/b/bar
  $ sync
  $ unlink /mnt/b/foo_link
  $ mv /mnt/b/bar /mnt/c/
  $ xfs_io -c "fsync" /mnt/a/foo
  < power fail / crash >

  The next time the filesystem is mounted, the log replay procedure
  deletes file bar.

The reason why the files are deleted is because when we log inodes
other then the fsync target inode, we ignore their last_unlink_trans
value and leave the log without enough information to later replay the
rename operations. So we need to look at the last_unlink_trans values
and fallback to a transaction commit if they are greater than the
id of the last committed transaction.

So fix this by looking at the last_unlink_trans values and fallback to
transaction commits when needed. Also, when logging other inodes (for
case 1 we logged descendants of the fsync target inode while for case 2
we logged ascendants) we need to care about concurrent tasks updating
the last_unlink_trans of inodes we are logging (which was already an
existing problem in check_parent_dirs_for_sync()). Since we can not
acquire their inode mutex (vfs' struct inode ->i_mutex), as that causes
deadlocks with other concurrent operations that acquire the i_mutex of
2 inodes (other fsyncs or renames for example), we need to serialize on
the log_mutex of the inode we are logging. A task setting a new value for
an inode's last_unlink_trans must acquire the inode's log_mutex and it
must do this update before doing the actual unlink operation (which is
already the case except when deleting a snapshot). Conversely the task
logging the inode must first log the inode and then check the inode's
last_unlink_trans value while holding its log_mutex, as if its value is
not greater then the id of the last committed transaction it means it
logged a safe state of the inode's items, while if its value is not
smaller then the id of the last committed transaction it means the inode
state it has logged might not be safe (the concurrent task might have
just updated last_unlink_trans but hasn't done yet the unlink operation)
and therefore a transaction commit must be done.

Test cases for xfstests follow in separate patches.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 fs/btrfs/ioctl.c    |    4 +--
 fs/btrfs/tree-log.c |   67 ++++++++++++++++++++++++++++++++++++++++++++--------
 2 files changed, 59 insertions(+), 12 deletions(-)

--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -2475,6 +2475,8 @@ static noinline int btrfs_ioctl_snap_des
 	trans->block_rsv = &block_rsv;
 	trans->bytes_reserved = block_rsv.size;
 
+	btrfs_record_snapshot_destroy(trans, dir);
+
 	ret = btrfs_unlink_subvol(trans, root, dir,
 				dest->root_key.objectid,
 				dentry->d_name.name,
@@ -2526,8 +2528,6 @@ static noinline int btrfs_ioctl_snap_des
 out_end_trans:
 	trans->block_rsv = NULL;
 	trans->bytes_reserved = 0;
-	if (!err)
-		btrfs_record_snapshot_destroy(trans, dir);
 	ret = btrfs_end_transaction(trans, root);
 	if (ret && !err)
 		err = ret;
--- a/fs/btrfs/tree-log.c
+++ b/fs/btrfs/tree-log.c
@@ -4909,6 +4909,42 @@ out_unlock:
 }
 
 /*
+ * Check if we must fallback to a transaction commit when logging an inode.
+ * This must be called after logging the inode and is used only in the context
+ * when fsyncing an inode requires the need to log some other inode - in which
+ * case we can't lock the i_mutex of each other inode we need to log as that
+ * can lead to deadlocks with concurrent fsync against other inodes (as we can
+ * log inodes up or down in the hierarchy) or rename operations for example. So
+ * we take the log_mutex of the inode after we have logged it and then check for
+ * its last_unlink_trans value - this is safe because any task setting
+ * last_unlink_trans must take the log_mutex and it must do this before it does
+ * the actual unlink operation, so if we do this check before a concurrent task
+ * sets last_unlink_trans it means we've logged a consistent version/state of
+ * all the inode items, otherwise we are not sure and must do a transaction
+ * commit (the concurrent task migth have only updated last_unlink_trans before
+ * we logged the inode or it might have also done the unlink).
+ */
+static bool btrfs_must_commit_transaction(struct btrfs_trans_handle *trans,
+					  struct inode *inode)
+{
+	struct btrfs_fs_info *fs_info = BTRFS_I(inode)->root->fs_info;
+	bool ret = false;
+
+	mutex_lock(&BTRFS_I(inode)->log_mutex);
+	if (BTRFS_I(inode)->last_unlink_trans > fs_info->last_trans_committed) {
+		/*
+		 * Make sure any commits to the log are forced to be full
+		 * commits.
+		 */
+		btrfs_set_log_full_commit(fs_info, trans);
+		ret = true;
+	}
+	mutex_unlock(&BTRFS_I(inode)->log_mutex);
+
+	return ret;
+}
+
+/*
  * follow the dentry parent pointers up the chain and see if any
  * of the directories in it require a full commit before they can
  * be logged.  Returns zero if nothing special needs to be done or 1 if
@@ -4921,7 +4957,6 @@ static noinline int check_parent_dirs_fo
 					       u64 last_committed)
 {
 	int ret = 0;
-	struct btrfs_root *root;
 	struct dentry *old_parent = NULL;
 	struct inode *orig_inode = inode;
 
@@ -4953,14 +4988,7 @@ static noinline int check_parent_dirs_fo
 			BTRFS_I(inode)->logged_trans = trans->transid;
 		smp_mb();
 
-		if (BTRFS_I(inode)->last_unlink_trans > last_committed) {
-			root = BTRFS_I(inode)->root;
-
-			/*
-			 * make sure any commits to the log are forced
-			 * to be full commits
-			 */
-			btrfs_set_log_full_commit(root->fs_info, trans);
+		if (btrfs_must_commit_transaction(trans, inode)) {
 			ret = 1;
 			break;
 		}
@@ -5119,6 +5147,9 @@ process_leaf:
 			btrfs_release_path(path);
 			ret = btrfs_log_inode(trans, root, di_inode,
 					      log_mode, 0, LLONG_MAX, ctx);
+			if (!ret &&
+			    btrfs_must_commit_transaction(trans, di_inode))
+				ret = 1;
 			iput(di_inode);
 			if (ret)
 				goto next_dir_inode;
@@ -5233,6 +5264,9 @@ static int btrfs_log_all_parents(struct
 
 			ret = btrfs_log_inode(trans, root, dir_inode,
 					      LOG_INODE_ALL, 0, LLONG_MAX, ctx);
+			if (!ret &&
+			    btrfs_must_commit_transaction(trans, dir_inode))
+				ret = 1;
 			iput(dir_inode);
 			if (ret)
 				goto out;
@@ -5584,6 +5618,9 @@ error:
  * They revolve around files there were unlinked from the directory, and
  * this function updates the parent directory so that a full commit is
  * properly done if it is fsync'd later after the unlinks are done.
+ *
+ * Must be called before the unlink operations (updates to the subvolume tree,
+ * inodes, etc) are done.
  */
 void btrfs_record_unlink_dir(struct btrfs_trans_handle *trans,
 			     struct inode *dir, struct inode *inode,
@@ -5599,8 +5636,11 @@ void btrfs_record_unlink_dir(struct btrf
 	 * into the file.  When the file is logged we check it and
 	 * don't log the parents if the file is fully on disk.
 	 */
-	if (S_ISREG(inode->i_mode))
+	if (S_ISREG(inode->i_mode)) {
+		mutex_lock(&BTRFS_I(inode)->log_mutex);
 		BTRFS_I(inode)->last_unlink_trans = trans->transid;
+		mutex_unlock(&BTRFS_I(inode)->log_mutex);
+	}
 
 	/*
 	 * if this directory was already logged any new
@@ -5631,7 +5671,9 @@ void btrfs_record_unlink_dir(struct btrf
 	return;
 
 record:
+	mutex_lock(&BTRFS_I(dir)->log_mutex);
 	BTRFS_I(dir)->last_unlink_trans = trans->transid;
+	mutex_unlock(&BTRFS_I(dir)->log_mutex);
 }
 
 /*
@@ -5642,11 +5684,16 @@ record:
  * corresponding to the deleted snapshot's root, which could lead to replaying
  * it after replaying the log tree of the parent directory (which would replay
  * the snapshot delete operation).
+ *
+ * Must be called before the actual snapshot destroy operation (updates to the
+ * parent root and tree of tree roots trees, etc) are done.
  */
 void btrfs_record_snapshot_destroy(struct btrfs_trans_handle *trans,
 				   struct inode *dir)
 {
+	mutex_lock(&BTRFS_I(dir)->log_mutex);
 	BTRFS_I(dir)->last_unlink_trans = trans->transid;
+	mutex_unlock(&BTRFS_I(dir)->log_mutex);
 }
 
 /*

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH 4.5 090/101] Btrfs: fix extent_same allowing destination offset beyond i_size
  2016-05-17  1:20 [PATCH 4.5 000/101] 4.5.5-stable review Greg Kroah-Hartman
                   ` (78 preceding siblings ...)
  2016-05-17  1:21 ` [PATCH 4.5 089/101] Btrfs: fix file loss on log replay after renaming a file and fsync Greg Kroah-Hartman
@ 2016-05-17  1:21 ` Greg Kroah-Hartman
  2016-05-17  1:21 ` [PATCH 4.5 091/101] Btrfs: fix deadlock between direct IO reads and buffered writes Greg Kroah-Hartman
                   ` (12 subsequent siblings)
  92 siblings, 0 replies; 94+ messages in thread
From: Greg Kroah-Hartman @ 2016-05-17  1:21 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Filipe Manana, Chris Mason

4.5-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Filipe Manana <fdmanana@suse.com>

commit f4dfe6871006c62abdccc77b2818b11f376e98e2 upstream.

When using the same file as the source and destination for a dedup
(extent_same ioctl) operation we were allowing it to dedup to a
destination offset beyond the file's size, which doesn't make sense and
it's not allowed for the case where the source and destination files are
not the same file. This made de deduplication operation successful only
when the source range corresponded to a hole, a prealloc extent or an
extent with all bytes having a value of 0x00. This was also leaving a
file hole (between i_size and destination offset) without the
corresponding file extent items, which can be reproduced with the
following steps for example:

  $ mkfs.btrfs -f /dev/sdi
  $ mount /dev/sdi /mnt/sdi

  $ xfs_io -f -c "pwrite -S 0xab 304457 404990" /mnt/sdi/foobar
  wrote 404990/404990 bytes at offset 304457
  395 KiB, 99 ops; 0.0000 sec (31.150 MiB/sec and 7984.5149 ops/sec)

  $ /git/hub/duperemove/btrfs-extent-same 24576 /mnt/sdi/foobar 28672 /mnt/sdi/foobar 929792
  Deduping 2 total files
  (28672, 24576): /mnt/sdi/foobar
  (929792, 24576): /mnt/sdi/foobar
  1 files asked to be deduped
  i: 0, status: 0, bytes_deduped: 24576
  24576 total bytes deduped in this operation

  $ umount /mnt/sdi
  $ btrfsck /dev/sdi
  Checking filesystem on /dev/sdi
  UUID: 98c528aa-0833-427d-9403-b98032ffbf9d
  checking extents
  checking free space cache
  checking fs roots
  root 5 inode 257 errors 100, file extent discount
  Found file extent holes:
          start: 712704, len: 217088
  found 540673 bytes used err is 1
  total csum bytes: 400
  total tree bytes: 131072
  total fs tree bytes: 32768
  total extent tree bytes: 16384
  btree space waste bytes: 123675
  file data blocks allocated: 671744
    referenced 671744
  btrfs-progs v4.2.3

So fix this by not allowing the destination to go beyond the file's size,
just as we do for the same where the source and destination files are not
the same.

A test for xfstests follows.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 fs/btrfs/ioctl.c |    3 +++
 1 file changed, 3 insertions(+)

--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -3069,6 +3069,9 @@ static int btrfs_extent_same(struct inod
 		ret = extent_same_check_offsets(src, loff, &len, olen);
 		if (ret)
 			goto out_unlock;
+		ret = extent_same_check_offsets(src, dst_loff, &len, olen);
+		if (ret)
+			goto out_unlock;
 
 		/*
 		 * Single inode case wants the same checks, except we

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH 4.5 091/101] Btrfs: fix deadlock between direct IO reads and buffered writes
  2016-05-17  1:20 [PATCH 4.5 000/101] 4.5.5-stable review Greg Kroah-Hartman
                   ` (79 preceding siblings ...)
  2016-05-17  1:21 ` [PATCH 4.5 090/101] Btrfs: fix extent_same allowing destination offset beyond i_size Greg Kroah-Hartman
@ 2016-05-17  1:21 ` Greg Kroah-Hartman
  2016-05-17  1:21 ` [PATCH 4.5 092/101] Btrfs: fix race when checking if we can skip fsyncing an inode Greg Kroah-Hartman
                   ` (11 subsequent siblings)
  92 siblings, 0 replies; 94+ messages in thread
From: Greg Kroah-Hartman @ 2016-05-17  1:21 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Filipe Manana, Liu Bo, Chris Mason

4.5-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Filipe Manana <fdmanana@suse.com>

commit ade770294df29e08f913e5d733a756893128f45e upstream.

While running a test with a mix of buffered IO and direct IO against
the same files I hit a deadlock reported by the following trace:

[11642.140352] INFO: task kworker/u32:3:15282 blocked for more than 120 seconds.
[11642.142452]       Not tainted 4.4.0-rc6-btrfs-next-21+ #1
[11642.143982] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[11642.146332] kworker/u32:3   D ffff880230ef7988 [11642.147737] systemd-journald[571]: Sent WATCHDOG=1 notification.
[11642.149771]     0 15282      2 0x00000000
[11642.151205] Workqueue: btrfs-flush_delalloc btrfs_flush_delalloc_helper [btrfs]
[11642.154074]  ffff880230ef7988 0000000000000246 0000000000014ec0 ffff88023ec94ec0
[11642.156722]  ffff880233fe8f80 ffff880230ef8000 ffff88023ec94ec0 7fffffffffffffff
[11642.159205]  0000000000000002 ffffffff8147b7f9 ffff880230ef79a0 ffffffff8147b541
[11642.161403] Call Trace:
[11642.162129]  [<ffffffff8147b7f9>] ? bit_wait+0x2f/0x2f
[11642.163396]  [<ffffffff8147b541>] schedule+0x82/0x9a
[11642.164871]  [<ffffffff8147e7fe>] schedule_timeout+0x43/0x109
[11642.167020]  [<ffffffff8147b7f9>] ? bit_wait+0x2f/0x2f
[11642.167931]  [<ffffffff8108afd1>] ? trace_hardirqs_on_caller+0x17b/0x197
[11642.182320]  [<ffffffff8108affa>] ? trace_hardirqs_on+0xd/0xf
[11642.183762]  [<ffffffff810b079b>] ? timekeeping_get_ns+0xe/0x33
[11642.185308]  [<ffffffff810b0f61>] ? ktime_get+0x41/0x52
[11642.186782]  [<ffffffff8147ac08>] io_schedule_timeout+0xa0/0x102
[11642.188217]  [<ffffffff8147ac08>] ? io_schedule_timeout+0xa0/0x102
[11642.189626]  [<ffffffff8147b814>] bit_wait_io+0x1b/0x39
[11642.190803]  [<ffffffff8147bb21>] __wait_on_bit_lock+0x4c/0x90
[11642.192158]  [<ffffffff8111829f>] __lock_page+0x66/0x68
[11642.193379]  [<ffffffff81082f29>] ? autoremove_wake_function+0x3a/0x3a
[11642.194831]  [<ffffffffa0450ddd>] lock_page+0x31/0x34 [btrfs]
[11642.197068]  [<ffffffffa0454e3b>] extent_write_cache_pages.isra.19.constprop.35+0x1af/0x2f4 [btrfs]
[11642.199188]  [<ffffffffa0455373>] extent_writepages+0x4b/0x5c [btrfs]
[11642.200723]  [<ffffffffa043c913>] ? btrfs_writepage_start_hook+0xce/0xce [btrfs]
[11642.202465]  [<ffffffffa043aa82>] btrfs_writepages+0x28/0x2a [btrfs]
[11642.203836]  [<ffffffff811236bc>] do_writepages+0x23/0x2c
[11642.205624]  [<ffffffff811198c9>] __filemap_fdatawrite_range+0x5a/0x61
[11642.207057]  [<ffffffff81119946>] filemap_fdatawrite_range+0x13/0x15
[11642.208529]  [<ffffffffa044f87e>] btrfs_start_ordered_extent+0xd0/0x1a1 [btrfs]
[11642.210375]  [<ffffffffa0462613>] ? btrfs_scrubparity_helper+0x140/0x33a [btrfs]
[11642.212132]  [<ffffffffa044f974>] btrfs_run_ordered_extent_work+0x25/0x34 [btrfs]
[11642.213837]  [<ffffffffa046262f>] btrfs_scrubparity_helper+0x15c/0x33a [btrfs]
[11642.215457]  [<ffffffffa046293b>] btrfs_flush_delalloc_helper+0xe/0x10 [btrfs]
[11642.217095]  [<ffffffff8106483e>] process_one_work+0x256/0x48b
[11642.218324]  [<ffffffff81064f20>] worker_thread+0x1f5/0x2a7
[11642.219466]  [<ffffffff81064d2b>] ? rescuer_thread+0x289/0x289
[11642.220801]  [<ffffffff8106a500>] kthread+0xd4/0xdc
[11642.222032]  [<ffffffff8106a42c>] ? kthread_parkme+0x24/0x24
[11642.223190]  [<ffffffff8147fdef>] ret_from_fork+0x3f/0x70
[11642.224394]  [<ffffffff8106a42c>] ? kthread_parkme+0x24/0x24
[11642.226295] 2 locks held by kworker/u32:3/15282:
[11642.227273]  #0:  ("%s-%s""btrfs", name){++++.+}, at: [<ffffffff8106474d>] process_one_work+0x165/0x48b
[11642.229412]  #1:  ((&work->normal_work)){+.+.+.}, at: [<ffffffff8106474d>] process_one_work+0x165/0x48b
[11642.231414] INFO: task kworker/u32:8:15289 blocked for more than 120 seconds.
[11642.232872]       Not tainted 4.4.0-rc6-btrfs-next-21+ #1
[11642.234109] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[11642.235776] kworker/u32:8   D ffff88020de5f848     0 15289      2 0x00000000
[11642.237412] Workqueue: writeback wb_workfn (flush-btrfs-481)
[11642.238670]  ffff88020de5f848 0000000000000246 0000000000014ec0 ffff88023ed54ec0
[11642.240475]  ffff88021b1ece40 ffff88020de60000 ffff88023ed54ec0 7fffffffffffffff
[11642.242154]  0000000000000002 ffffffff8147b7f9 ffff88020de5f860 ffffffff8147b541
[11642.243715] Call Trace:
[11642.244390]  [<ffffffff8147b7f9>] ? bit_wait+0x2f/0x2f
[11642.245432]  [<ffffffff8147b541>] schedule+0x82/0x9a
[11642.246392]  [<ffffffff8147e7fe>] schedule_timeout+0x43/0x109
[11642.247479]  [<ffffffff8147b7f9>] ? bit_wait+0x2f/0x2f
[11642.248551]  [<ffffffff8108afd1>] ? trace_hardirqs_on_caller+0x17b/0x197
[11642.249968]  [<ffffffff8108affa>] ? trace_hardirqs_on+0xd/0xf
[11642.251043]  [<ffffffff810b079b>] ? timekeeping_get_ns+0xe/0x33
[11642.252202]  [<ffffffff810b0f61>] ? ktime_get+0x41/0x52
[11642.253210]  [<ffffffff8147ac08>] io_schedule_timeout+0xa0/0x102
[11642.254307]  [<ffffffff8147ac08>] ? io_schedule_timeout+0xa0/0x102
[11642.256118]  [<ffffffff8147b814>] bit_wait_io+0x1b/0x39
[11642.257131]  [<ffffffff8147bb21>] __wait_on_bit_lock+0x4c/0x90
[11642.258200]  [<ffffffff8111829f>] __lock_page+0x66/0x68
[11642.259168]  [<ffffffff81082f29>] ? autoremove_wake_function+0x3a/0x3a
[11642.260516]  [<ffffffffa0450ddd>] lock_page+0x31/0x34 [btrfs]
[11642.261841]  [<ffffffffa0454e3b>] extent_write_cache_pages.isra.19.constprop.35+0x1af/0x2f4 [btrfs]
[11642.263531]  [<ffffffffa0455373>] extent_writepages+0x4b/0x5c [btrfs]
[11642.264747]  [<ffffffffa043c913>] ? btrfs_writepage_start_hook+0xce/0xce [btrfs]
[11642.266148]  [<ffffffffa043aa82>] btrfs_writepages+0x28/0x2a [btrfs]
[11642.267264]  [<ffffffff811236bc>] do_writepages+0x23/0x2c
[11642.268280]  [<ffffffff81192a2b>] __writeback_single_inode+0xda/0x5ba
[11642.269407]  [<ffffffff811939f0>] writeback_sb_inodes+0x27b/0x43d
[11642.270476]  [<ffffffff81193c28>] __writeback_inodes_wb+0x76/0xae
[11642.271547]  [<ffffffff81193ea6>] wb_writeback+0x19e/0x41c
[11642.272588]  [<ffffffff81194821>] wb_workfn+0x201/0x341
[11642.273523]  [<ffffffff81194821>] ? wb_workfn+0x201/0x341
[11642.274479]  [<ffffffff8106483e>] process_one_work+0x256/0x48b
[11642.275497]  [<ffffffff81064f20>] worker_thread+0x1f5/0x2a7
[11642.276518]  [<ffffffff81064d2b>] ? rescuer_thread+0x289/0x289
[11642.277520]  [<ffffffff81064d2b>] ? rescuer_thread+0x289/0x289
[11642.278517]  [<ffffffff8106a500>] kthread+0xd4/0xdc
[11642.279371]  [<ffffffff8106a42c>] ? kthread_parkme+0x24/0x24
[11642.280468]  [<ffffffff8147fdef>] ret_from_fork+0x3f/0x70
[11642.281607]  [<ffffffff8106a42c>] ? kthread_parkme+0x24/0x24
[11642.282604] 3 locks held by kworker/u32:8/15289:
[11642.283423]  #0:  ("writeback"){++++.+}, at: [<ffffffff8106474d>] process_one_work+0x165/0x48b
[11642.285629]  #1:  ((&(&wb->dwork)->work)){+.+.+.}, at: [<ffffffff8106474d>] process_one_work+0x165/0x48b
[11642.287538]  #2:  (&type->s_umount_key#37){+++++.}, at: [<ffffffff81171217>] trylock_super+0x1b/0x4b
[11642.289423] INFO: task fdm-stress:26848 blocked for more than 120 seconds.
[11642.290547]       Not tainted 4.4.0-rc6-btrfs-next-21+ #1
[11642.291453] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[11642.292864] fdm-stress      D ffff88022c107c20     0 26848  26591 0x00000000
[11642.294118]  ffff88022c107c20 000000038108affa 0000000000014ec0 ffff88023ed54ec0
[11642.295602]  ffff88013ab1ca40 ffff88022c108000 ffff8800b2fc19d0 00000000000e0fff
[11642.297098]  ffff8800b2fc19b0 ffff88022c107c88 ffff88022c107c38 ffffffff8147b541
[11642.298433] Call Trace:
[11642.298896]  [<ffffffff8147b541>] schedule+0x82/0x9a
[11642.299738]  [<ffffffffa045225d>] lock_extent_bits+0xfe/0x1a3 [btrfs]
[11642.300833]  [<ffffffff81082eef>] ? add_wait_queue_exclusive+0x44/0x44
[11642.301943]  [<ffffffffa0447516>] lock_and_cleanup_extent_if_need+0x68/0x18e [btrfs]
[11642.303270]  [<ffffffffa04485ba>] __btrfs_buffered_write+0x238/0x4c1 [btrfs]
[11642.304552]  [<ffffffffa044b50a>] ? btrfs_file_write_iter+0x17c/0x408 [btrfs]
[11642.305782]  [<ffffffffa044b682>] btrfs_file_write_iter+0x2f4/0x408 [btrfs]
[11642.306878]  [<ffffffff8116e298>] __vfs_write+0x7c/0xa5
[11642.307729]  [<ffffffff8116e7d1>] vfs_write+0x9d/0xe8
[11642.308602]  [<ffffffff8116efbb>] SyS_write+0x50/0x7e
[11642.309410]  [<ffffffff8147fa97>] entry_SYSCALL_64_fastpath+0x12/0x6b
[11642.310403] 3 locks held by fdm-stress/26848:
[11642.311108]  #0:  (&f->f_pos_lock){+.+.+.}, at: [<ffffffff811877e8>] __fdget_pos+0x3a/0x40
[11642.312578]  #1:  (sb_writers#11){.+.+.+}, at: [<ffffffff811706ee>] __sb_start_write+0x5f/0xb0
[11642.314170]  #2:  (&sb->s_type->i_mutex_key#15){+.+.+.}, at: [<ffffffffa044b401>] btrfs_file_write_iter+0x73/0x408 [btrfs]
[11642.316796] INFO: task fdm-stress:26849 blocked for more than 120 seconds.
[11642.317842]       Not tainted 4.4.0-rc6-btrfs-next-21+ #1
[11642.318691] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[11642.319959] fdm-stress      D ffff8801964ffa68     0 26849  26591 0x00000000
[11642.321312]  ffff8801964ffa68 00ff8801e9975f80 0000000000014ec0 ffff88023ed94ec0
[11642.322555]  ffff8800b00b4840 ffff880196500000 ffff8801e9975f20 0000000000000002
[11642.323715]  ffff8801e9975f18 ffff8800b00b4840 ffff8801964ffa80 ffffffff8147b541
[11642.325096] Call Trace:
[11642.325532]  [<ffffffff8147b541>] schedule+0x82/0x9a
[11642.326303]  [<ffffffff8147e7fe>] schedule_timeout+0x43/0x109
[11642.327180]  [<ffffffff8108ae40>] ? mark_held_locks+0x5e/0x74
[11642.328114]  [<ffffffff8147f30e>] ? _raw_spin_unlock_irq+0x2c/0x4a
[11642.329051]  [<ffffffff8108afd1>] ? trace_hardirqs_on_caller+0x17b/0x197
[11642.330053]  [<ffffffff8147bceb>] __wait_for_common+0x109/0x147
[11642.330952]  [<ffffffff8147bceb>] ? __wait_for_common+0x109/0x147
[11642.331869]  [<ffffffff8147e7bb>] ? usleep_range+0x4a/0x4a
[11642.332925]  [<ffffffff81074075>] ? wake_up_q+0x47/0x47
[11642.333736]  [<ffffffff8147bd4d>] wait_for_completion+0x24/0x26
[11642.334672]  [<ffffffffa044f5ce>] btrfs_wait_ordered_extents+0x1c8/0x217 [btrfs]
[11642.335858]  [<ffffffffa0465b5a>] btrfs_mksubvol+0x224/0x45d [btrfs]
[11642.336854]  [<ffffffff81082eef>] ? add_wait_queue_exclusive+0x44/0x44
[11642.337820]  [<ffffffffa0465edb>] btrfs_ioctl_snap_create_transid+0x148/0x17a [btrfs]
[11642.339026]  [<ffffffffa046603b>] btrfs_ioctl_snap_create_v2+0xc7/0x110 [btrfs]
[11642.340214]  [<ffffffffa0468582>] btrfs_ioctl+0x590/0x27bd [btrfs]
[11642.341123]  [<ffffffff8147dc00>] ? mutex_unlock+0xe/0x10
[11642.341934]  [<ffffffffa00fa6e9>] ? ext4_file_write_iter+0x2a3/0x36f [ext4]
[11642.342936]  [<ffffffff8108895d>] ? __lock_is_held+0x3c/0x57
[11642.343772]  [<ffffffff81186a1d>] ? rcu_read_unlock+0x3e/0x5d
[11642.344673]  [<ffffffff8117dc95>] do_vfs_ioctl+0x458/0x4dc
[11642.346024]  [<ffffffff81186bbe>] ? __fget_light+0x62/0x71
[11642.346873]  [<ffffffff8117dd70>] SyS_ioctl+0x57/0x79
[11642.347720]  [<ffffffff8147fa97>] entry_SYSCALL_64_fastpath+0x12/0x6b
[11642.350222] 4 locks held by fdm-stress/26849:
[11642.350898]  #0:  (sb_writers#11){.+.+.+}, at: [<ffffffff811706ee>] __sb_start_write+0x5f/0xb0
[11642.352375]  #1:  (&type->i_mutex_dir_key#4/1){+.+.+.}, at: [<ffffffffa0465981>] btrfs_mksubvol+0x4b/0x45d [btrfs]
[11642.354072]  #2:  (&fs_info->subvol_sem){++++..}, at: [<ffffffffa0465a2a>] btrfs_mksubvol+0xf4/0x45d [btrfs]
[11642.355647]  #3:  (&root->ordered_extent_mutex){+.+...}, at: [<ffffffffa044f456>] btrfs_wait_ordered_extents+0x50/0x217 [btrfs]
[11642.357516] INFO: task fdm-stress:26850 blocked for more than 120 seconds.
[11642.358508]       Not tainted 4.4.0-rc6-btrfs-next-21+ #1
[11642.359376] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[11642.368625] fdm-stress      D ffff88021f167688     0 26850  26591 0x00000000
[11642.369716]  ffff88021f167688 0000000000000001 0000000000014ec0 ffff88023edd4ec0
[11642.370950]  ffff880128a98680 ffff88021f168000 ffff88023edd4ec0 7fffffffffffffff
[11642.372210]  0000000000000002 ffffffff8147b7f9 ffff88021f1676a0 ffffffff8147b541
[11642.373430] Call Trace:
[11642.373853]  [<ffffffff8147b7f9>] ? bit_wait+0x2f/0x2f
[11642.374623]  [<ffffffff8147b541>] schedule+0x82/0x9a
[11642.375948]  [<ffffffff8147e7fe>] schedule_timeout+0x43/0x109
[11642.376862]  [<ffffffff8147b7f9>] ? bit_wait+0x2f/0x2f
[11642.377637]  [<ffffffff8108afd1>] ? trace_hardirqs_on_caller+0x17b/0x197
[11642.378610]  [<ffffffff8108affa>] ? trace_hardirqs_on+0xd/0xf
[11642.379457]  [<ffffffff810b079b>] ? timekeeping_get_ns+0xe/0x33
[11642.380366]  [<ffffffff810b0f61>] ? ktime_get+0x41/0x52
[11642.381353]  [<ffffffff8147ac08>] io_schedule_timeout+0xa0/0x102
[11642.382255]  [<ffffffff8147ac08>] ? io_schedule_timeout+0xa0/0x102
[11642.383162]  [<ffffffff8147b814>] bit_wait_io+0x1b/0x39
[11642.383945]  [<ffffffff8147bb21>] __wait_on_bit_lock+0x4c/0x90
[11642.384875]  [<ffffffff8111829f>] __lock_page+0x66/0x68
[11642.385749]  [<ffffffff81082f29>] ? autoremove_wake_function+0x3a/0x3a
[11642.386721]  [<ffffffffa0450ddd>] lock_page+0x31/0x34 [btrfs]
[11642.387596]  [<ffffffffa0454e3b>] extent_write_cache_pages.isra.19.constprop.35+0x1af/0x2f4 [btrfs]
[11642.389030]  [<ffffffffa0455373>] extent_writepages+0x4b/0x5c [btrfs]
[11642.389973]  [<ffffffff810a25ad>] ? rcu_read_lock_sched_held+0x61/0x69
[11642.390939]  [<ffffffffa043c913>] ? btrfs_writepage_start_hook+0xce/0xce [btrfs]
[11642.392271]  [<ffffffffa0451c32>] ? __clear_extent_bit+0x26e/0x2c0 [btrfs]
[11642.393305]  [<ffffffffa043aa82>] btrfs_writepages+0x28/0x2a [btrfs]
[11642.394239]  [<ffffffff811236bc>] do_writepages+0x23/0x2c
[11642.395045]  [<ffffffff811198c9>] __filemap_fdatawrite_range+0x5a/0x61
[11642.395991]  [<ffffffff81119946>] filemap_fdatawrite_range+0x13/0x15
[11642.397144]  [<ffffffffa044f87e>] btrfs_start_ordered_extent+0xd0/0x1a1 [btrfs]
[11642.398392]  [<ffffffffa0452094>] ? clear_extent_bit+0x17/0x19 [btrfs]
[11642.399363]  [<ffffffffa0445945>] btrfs_get_blocks_direct+0x12b/0x61c [btrfs]
[11642.400445]  [<ffffffff8119f7a1>] ? dio_bio_add_page+0x3d/0x54
[11642.401309]  [<ffffffff8119fa93>] ? submit_page_section+0x7b/0x111
[11642.402213]  [<ffffffff811a0258>] do_blockdev_direct_IO+0x685/0xc24
[11642.403139]  [<ffffffffa044581a>] ? btrfs_page_exists_in_range+0x1a1/0x1a1 [btrfs]
[11642.404360]  [<ffffffffa043d267>] ? btrfs_get_extent_fiemap+0x1c0/0x1c0 [btrfs]
[11642.406187]  [<ffffffff811a0828>] __blockdev_direct_IO+0x31/0x33
[11642.407070]  [<ffffffff811a0828>] ? __blockdev_direct_IO+0x31/0x33
[11642.407990]  [<ffffffffa043d267>] ? btrfs_get_extent_fiemap+0x1c0/0x1c0 [btrfs]
[11642.409192]  [<ffffffffa043b4ca>] btrfs_direct_IO+0x1c7/0x27e [btrfs]
[11642.410146]  [<ffffffffa043d267>] ? btrfs_get_extent_fiemap+0x1c0/0x1c0 [btrfs]
[11642.411291]  [<ffffffff81119a2c>] generic_file_read_iter+0x89/0x4e1
[11642.412263]  [<ffffffff8108ac05>] ? mark_lock+0x24/0x201
[11642.413057]  [<ffffffff8116e1f8>] __vfs_read+0x79/0x9d
[11642.413897]  [<ffffffff8116e6f1>] vfs_read+0x8f/0xd2
[11642.414708]  [<ffffffff8116ef3d>] SyS_read+0x50/0x7e
[11642.415573]  [<ffffffff8147fa97>] entry_SYSCALL_64_fastpath+0x12/0x6b
[11642.416572] 1 lock held by fdm-stress/26850:
[11642.417345]  #0:  (&f->f_pos_lock){+.+.+.}, at: [<ffffffff811877e8>] __fdget_pos+0x3a/0x40
[11642.418703] INFO: task fdm-stress:26851 blocked for more than 120 seconds.
[11642.419698]       Not tainted 4.4.0-rc6-btrfs-next-21+ #1
[11642.420612] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[11642.421807] fdm-stress      D ffff880196483d28     0 26851  26591 0x00000000
[11642.422878]  ffff880196483d28 00ff8801c8f60740 0000000000014ec0 ffff88023ed94ec0
[11642.424149]  ffff8801c8f60740 ffff880196484000 0000000000000246 ffff8801c8f60740
[11642.425374]  ffff8801bb711840 ffff8801bb711878 ffff880196483d40 ffffffff8147b541
[11642.426591] Call Trace:
[11642.427013]  [<ffffffff8147b541>] schedule+0x82/0x9a
[11642.427856]  [<ffffffff8147b6d5>] schedule_preempt_disabled+0x18/0x24
[11642.428852]  [<ffffffff8147c23a>] mutex_lock_nested+0x1d7/0x3b4
[11642.429743]  [<ffffffffa044f456>] ? btrfs_wait_ordered_extents+0x50/0x217 [btrfs]
[11642.430911]  [<ffffffffa044f456>] btrfs_wait_ordered_extents+0x50/0x217 [btrfs]
[11642.432102]  [<ffffffffa044f674>] ? btrfs_wait_ordered_roots+0x57/0x191 [btrfs]
[11642.433259]  [<ffffffffa044f456>] ? btrfs_wait_ordered_extents+0x50/0x217 [btrfs]
[11642.434431]  [<ffffffffa044f6ea>] btrfs_wait_ordered_roots+0xcd/0x191 [btrfs]
[11642.436079]  [<ffffffffa0410cab>] btrfs_sync_fs+0xe0/0x1ad [btrfs]
[11642.437009]  [<ffffffff81197900>] ? SyS_tee+0x23c/0x23c
[11642.437860]  [<ffffffff81197920>] sync_fs_one_sb+0x20/0x22
[11642.438723]  [<ffffffff81171435>] iterate_supers+0x75/0xc2
[11642.439597]  [<ffffffff81197d00>] sys_sync+0x52/0x80
[11642.440454]  [<ffffffff8147fa97>] entry_SYSCALL_64_fastpath+0x12/0x6b
[11642.441533] 3 locks held by fdm-stress/26851:
[11642.442370]  #0:  (&type->s_umount_key#37){+++++.}, at: [<ffffffff8117141f>] iterate_supers+0x5f/0xc2
[11642.444043]  #1:  (&fs_info->ordered_operations_mutex){+.+...}, at: [<ffffffffa044f661>] btrfs_wait_ordered_roots+0x44/0x191 [btrfs]
[11642.446010]  #2:  (&root->ordered_extent_mutex){+.+...}, at: [<ffffffffa044f456>] btrfs_wait_ordered_extents+0x50/0x217 [btrfs]

This happened because under specific timings the path for direct IO reads
can deadlock with concurrent buffered writes. The diagram below shows how
this happens for an example file that has the following layout:

     [  extent A  ]  [  extent B  ]  [ ....
     0K              4K              8K

     CPU 1                                               CPU 2                             CPU 3

DIO read against range
 [0K, 8K[ starts

btrfs_direct_IO()
  --> calls btrfs_get_blocks_direct()
      which finds the extent map for the
      extent A and leaves the range
      [0K, 4K[ locked in the inode's
      io tree

                                                   buffered write against
                                                   range [4K, 8K[ starts

                                                   __btrfs_buffered_write()
                                                     --> dirties page at 4K

                                                                                     a user space
                                                                                     task calls sync
                                                                                     for e.g or
                                                                                     writepages() is
                                                                                     invoked by mm

                                                                                     writepages()
                                                                                       run_delalloc_range()
                                                                                         cow_file_range()
                                                                                           --> ordered extent X
                                                                                               for the buffered
                                                                                               write is created
                                                                                               and
                                                                                               writeback starts

  --> calls btrfs_get_blocks_direct()
      again, without submitting first
      a bio for reading extent A, and
      finds the extent map for extent B

  --> calls lock_extent_direct()

      --> locks range [4K, 8K[
      --> finds ordered extent X
          covering range [4K, 8K[
      --> unlocks range [4K, 8K[

                                                  buffered write against
                                                  range [0K, 8K[ starts

                                                  __btrfs_buffered_write()
                                                    prepare_pages()
                                                      --> locks pages with
                                                          offsets 0 and 4K
                                                    lock_and_cleanup_extent_if_need()
                                                      --> blocks attempting to
                                                          lock range [0K, 8K[ in
                                                          the inode's io tree,
                                                          because the range [0, 4K[
                                                          is already locked by the
                                                          direct IO task at CPU 1

      --> calls
          btrfs_start_ordered_extent(oe X)

          btrfs_start_ordered_extent(oe X)

            --> At this point writeback for ordered
                extent X has not finished yet

            filemap_fdatawrite_range()
              btrfs_writepages()
                extent_writepages()
                  extent_write_cache_pages()
                    --> finds page with offset 0
                        with the writeback tag
                        (and not dirty)
                    --> tries to lock it
                         --> deadlock, task at CPU 2
                             has the page locked and
                             is blocked on the io range
                             [0, 4K[ that was locked
                             earlier by this task

So fix this by falling back to a buffered read in the direct IO read path
when an ordered extent for a buffered write is found.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 fs/btrfs/inode.c |   25 +++++++++++++++++++++++--
 1 file changed, 23 insertions(+), 2 deletions(-)

--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -7423,7 +7423,26 @@ static int lock_extent_direct(struct ino
 				     cached_state, GFP_NOFS);
 
 		if (ordered) {
-			btrfs_start_ordered_extent(inode, ordered, 1);
+			/*
+			 * If we are doing a DIO read and the ordered extent we
+			 * found is for a buffered write, we can not wait for it
+			 * to complete and retry, because if we do so we can
+			 * deadlock with concurrent buffered writes on page
+			 * locks. This happens only if our DIO read covers more
+			 * than one extent map, if at this point has already
+			 * created an ordered extent for a previous extent map
+			 * and locked its range in the inode's io tree, and a
+			 * concurrent write against that previous extent map's
+			 * range and this range started (we unlock the ranges
+			 * in the io tree only when the bios complete and
+			 * buffered writes always lock pages before attempting
+			 * to lock range in the io tree).
+			 */
+			if (writing ||
+			    test_bit(BTRFS_ORDERED_DIRECT, &ordered->flags))
+				btrfs_start_ordered_extent(inode, ordered, 1);
+			else
+				ret = -ENOTBLK;
 			btrfs_put_ordered_extent(ordered);
 		} else {
 			/*
@@ -7440,9 +7459,11 @@ static int lock_extent_direct(struct ino
 			 * that page.
 			 */
 			ret = -ENOTBLK;
-			break;
 		}
 
+		if (ret)
+			break;
+
 		cond_resched();
 	}
 

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH 4.5 092/101] Btrfs: fix race when checking if we can skip fsyncing an inode
  2016-05-17  1:20 [PATCH 4.5 000/101] 4.5.5-stable review Greg Kroah-Hartman
                   ` (80 preceding siblings ...)
  2016-05-17  1:21 ` [PATCH 4.5 091/101] Btrfs: fix deadlock between direct IO reads and buffered writes Greg Kroah-Hartman
@ 2016-05-17  1:21 ` Greg Kroah-Hartman
  2016-05-17  1:21 ` [PATCH 4.5 093/101] Btrfs: do not collect ordered extents when logging that inode exists Greg Kroah-Hartman
                   ` (10 subsequent siblings)
  92 siblings, 0 replies; 94+ messages in thread
From: Greg Kroah-Hartman @ 2016-05-17  1:21 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Filipe Manana, Chris Mason

4.5-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Filipe Manana <fdmanana@suse.com>

commit affc0ff902d539ebe9bba405d330410314f46e9f upstream.

If we're about to do a fast fsync for an inode and btrfs_inode_in_log()
returns false, it's possible that we had an ordered extent in progress
(btrfs_finish_ordered_io() not run yet) when we noticed that the inode's
last_trans field was not greater than the id of the last committed
transaction, but shortly after, before we checked if there were any
ongoing ordered extents, the ordered extent had just completed and
removed itself from the inode's ordered tree, in which case we end up not
logging the inode, losing some data if a power failure or crash happens
after the fsync handler returns and before the transaction is committed.

Fix this by checking first if there are any ongoing ordered extents
before comparing the inode's last_trans with the id of the last committed
transaction - when it completes, an ordered extent always updates the
inode's last_trans before it removes itself from the inode's ordered
tree (at btrfs_finish_ordered_io()).

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 fs/btrfs/file.c |    9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -1996,10 +1996,11 @@ int btrfs_sync_file(struct file *file, l
 	 */
 	smp_mb();
 	if (btrfs_inode_in_log(inode, root->fs_info->generation) ||
-	    (BTRFS_I(inode)->last_trans <=
-	     root->fs_info->last_trans_committed &&
-	     (full_sync ||
-	      !btrfs_have_ordered_extents_in_range(inode, start, len)))) {
+	    (full_sync && BTRFS_I(inode)->last_trans <=
+	     root->fs_info->last_trans_committed) ||
+	    (!btrfs_have_ordered_extents_in_range(inode, start, len) &&
+	     BTRFS_I(inode)->last_trans
+	     <= root->fs_info->last_trans_committed)) {
 		/*
 		 * We'v had everything committed since the last time we were
 		 * modified so clear this flag in case it was set for whatever

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH 4.5 093/101] Btrfs: do not collect ordered extents when logging that inode exists
  2016-05-17  1:20 [PATCH 4.5 000/101] 4.5.5-stable review Greg Kroah-Hartman
                   ` (81 preceding siblings ...)
  2016-05-17  1:21 ` [PATCH 4.5 092/101] Btrfs: fix race when checking if we can skip fsyncing an inode Greg Kroah-Hartman
@ 2016-05-17  1:21 ` Greg Kroah-Hartman
  2016-05-17  1:21 ` [PATCH 4.5 094/101] btrfs: csum_tree_block: return proper errno value Greg Kroah-Hartman
                   ` (9 subsequent siblings)
  92 siblings, 0 replies; 94+ messages in thread
From: Greg Kroah-Hartman @ 2016-05-17  1:21 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Filipe Manana, Chris Mason

4.5-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Filipe Manana <fdmanana@suse.com>

commit 5e33a2bd7ca7fa687fb0965869196eea6815d1f3 upstream.

When logging that an inode exists, for example as part of a directory
fsync operation, we were collecting any ordered extents for the inode but
we ended up doing nothing with them except tagging them as processed, by
setting the flag BTRFS_ORDERED_LOGGED on them, which prevented a
subsequent fsync of that inode (using the LOG_INODE_ALL mode) from
collecting and processing them. This created a time window where a second
fsync against the inode, using the fast path, ended up not logging the
checksums for the new extents but it logged the extents since they were
part of the list of modified extents. This happened because the ordered
extents were not collected and checksums were not yet added to the csum
tree - the ordered extents have not gone through btrfs_finish_ordered_io()
yet (which is where we add them to the csum tree by calling
inode.c:add_pending_csums()).

So fix this by not collecting an inode's ordered extents if we are logging
it with the LOG_INODE_EXISTS mode.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 fs/btrfs/tree-log.c |   17 ++++++++++++++++-
 1 file changed, 16 insertions(+), 1 deletion(-)

--- a/fs/btrfs/tree-log.c
+++ b/fs/btrfs/tree-log.c
@@ -4621,7 +4621,22 @@ static int btrfs_log_inode(struct btrfs_
 
 	mutex_lock(&BTRFS_I(inode)->log_mutex);
 
-	btrfs_get_logged_extents(inode, &logged_list, start, end);
+	/*
+	 * Collect ordered extents only if we are logging data. This is to
+	 * ensure a subsequent request to log this inode in LOG_INODE_ALL mode
+	 * will process the ordered extents if they still exists at the time,
+	 * because when we collect them we test and set for the flag
+	 * BTRFS_ORDERED_LOGGED to prevent multiple log requests to process the
+	 * same ordered extents. The consequence for the LOG_INODE_ALL log mode
+	 * not processing the ordered extents is that we end up logging the
+	 * corresponding file extent items, based on the extent maps in the
+	 * inode's extent_map_tree's modified_list, without logging the
+	 * respective checksums (since the may still be only attached to the
+	 * ordered extents and have not been inserted in the csum tree by
+	 * btrfs_finish_ordered_io() yet).
+	 */
+	if (inode_only == LOG_INODE_ALL)
+		btrfs_get_logged_extents(inode, &logged_list, start, end);
 
 	/*
 	 * a brute force approach to making sure we get the most uptodate

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH 4.5 094/101] btrfs: csum_tree_block: return proper errno value
  2016-05-17  1:20 [PATCH 4.5 000/101] 4.5.5-stable review Greg Kroah-Hartman
                   ` (82 preceding siblings ...)
  2016-05-17  1:21 ` [PATCH 4.5 093/101] Btrfs: do not collect ordered extents when logging that inode exists Greg Kroah-Hartman
@ 2016-05-17  1:21 ` Greg Kroah-Hartman
  2016-05-17  1:21 ` [PATCH 4.5 095/101] btrfs: do not write corrupted metadata blocks to disk Greg Kroah-Hartman
                   ` (8 subsequent siblings)
  92 siblings, 0 replies; 94+ messages in thread
From: Greg Kroah-Hartman @ 2016-05-17  1:21 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Alex Lyakas, Filipe Manana, David Sterba

4.5-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Alex Lyakas <alex.bolshoy@gmail.com>

commit 8bd98f0e6bf792e8fa7c3fed709321ad42ba8d2e upstream.

Signed-off-by: Alex Lyakas <alex@zadarastorage.com>
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 fs/btrfs/disk-io.c |   13 +++++--------
 1 file changed, 5 insertions(+), 8 deletions(-)

--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -303,7 +303,7 @@ static int csum_tree_block(struct btrfs_
 		err = map_private_extent_buffer(buf, offset, 32,
 					&kaddr, &map_start, &map_len);
 		if (err)
-			return 1;
+			return err;
 		cur_len = min(len, map_len - (offset - map_start));
 		crc = btrfs_csum_data(kaddr + offset - map_start,
 				      crc, cur_len);
@@ -313,7 +313,7 @@ static int csum_tree_block(struct btrfs_
 	if (csum_size > sizeof(inline_result)) {
 		result = kzalloc(csum_size, GFP_NOFS);
 		if (!result)
-			return 1;
+			return -ENOMEM;
 	} else {
 		result = (char *)&inline_result;
 	}
@@ -334,7 +334,7 @@ static int csum_tree_block(struct btrfs_
 				val, found, btrfs_header_level(buf));
 			if (result != (char *)&inline_result)
 				kfree(result);
-			return 1;
+			return -EUCLEAN;
 		}
 	} else {
 		write_extent_buffer(buf, result, 0, csum_size);
@@ -516,8 +516,7 @@ static int csum_dirty_buffer(struct btrf
 	found_start = btrfs_header_bytenr(eb);
 	if (WARN_ON(found_start != start || !PageUptodate(page)))
 		return 0;
-	csum_tree_block(fs_info, eb, 0);
-	return 0;
+	return csum_tree_block(fs_info, eb, 0);
 }
 
 static int check_tree_block_fsid(struct btrfs_fs_info *fs_info,
@@ -660,10 +659,8 @@ static int btree_readpage_end_io_hook(st
 				       eb, found_level);
 
 	ret = csum_tree_block(root->fs_info, eb, 1);
-	if (ret) {
-		ret = -EIO;
+	if (ret)
 		goto err;
-	}
 
 	/*
 	 * If this is a leaf block and it is corrupt, set the corrupt bit so

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH 4.5 095/101] btrfs: do not write corrupted metadata blocks to disk
  2016-05-17  1:20 [PATCH 4.5 000/101] 4.5.5-stable review Greg Kroah-Hartman
                   ` (83 preceding siblings ...)
  2016-05-17  1:21 ` [PATCH 4.5 094/101] btrfs: csum_tree_block: return proper errno value Greg Kroah-Hartman
@ 2016-05-17  1:21 ` Greg Kroah-Hartman
  2016-05-17  1:21 ` [PATCH 4.5 096/101] Btrfs: fix invalid reference in replace_path Greg Kroah-Hartman
                   ` (7 subsequent siblings)
  92 siblings, 0 replies; 94+ messages in thread
From: Greg Kroah-Hartman @ 2016-05-17  1:21 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Alex Lyakas, Filipe Manana, David Sterba

4.5-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Alex Lyakas <alex.bolshoy@gmail.com>

commit 0f805531daa2ebfb5706422dc2ead1cff9e53e65 upstream.

csum_dirty_buffer was issuing a warning in case the extent buffer
did not look alright, but was still returning success.
Let's return error in this case, and also add an additional sanity
check on the extent buffer header.
The caller up the chain may BUG_ON on this, for example flush_epd_write_bio will,
but it is better than to have a silent metadata corruption on disk.

Signed-off-by: Alex Lyakas <alex@zadarastorage.com>
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 fs/btrfs/disk-io.c |   15 +++++++++++++--
 1 file changed, 13 insertions(+), 2 deletions(-)

--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -513,9 +513,20 @@ static int csum_dirty_buffer(struct btrf
 	eb = (struct extent_buffer *)page->private;
 	if (page != eb->pages[0])
 		return 0;
+
 	found_start = btrfs_header_bytenr(eb);
-	if (WARN_ON(found_start != start || !PageUptodate(page)))
-		return 0;
+	/*
+	 * Please do not consolidate these warnings into a single if.
+	 * It is useful to know what went wrong.
+	 */
+	if (WARN_ON(found_start != start))
+		return -EUCLEAN;
+	if (WARN_ON(!PageUptodate(page)))
+		return -EUCLEAN;
+
+	ASSERT(memcmp_extent_buffer(eb, fs_info->fsid,
+			btrfs_header_fsid(), BTRFS_FSID_SIZE) == 0);
+
 	return csum_tree_block(fs_info, eb, 0);
 }
 

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH 4.5 096/101] Btrfs: fix invalid reference in replace_path
  2016-05-17  1:20 [PATCH 4.5 000/101] 4.5.5-stable review Greg Kroah-Hartman
                   ` (84 preceding siblings ...)
  2016-05-17  1:21 ` [PATCH 4.5 095/101] btrfs: do not write corrupted metadata blocks to disk Greg Kroah-Hartman
@ 2016-05-17  1:21 ` Greg Kroah-Hartman
  2016-05-17  1:21 ` [PATCH 4.5 097/101] btrfs: handle non-fatal errors in btrfs_qgroup_inherit() Greg Kroah-Hartman
                   ` (6 subsequent siblings)
  92 siblings, 0 replies; 94+ messages in thread
From: Greg Kroah-Hartman @ 2016-05-17  1:21 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Dan Carpenter, Liu Bo, David Sterba

4.5-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Liu Bo <bo.li.liu@oracle.com>

commit 264813acb1c756aebc337b16b832604a0c9aadaf upstream.

Dan Carpenter's static checker has found this error, it's introduced by
commit 64c043de466d
("Btrfs: fix up read_tree_block to return proper error")

It's really supposed to 'break' the loop on error like others.

Cc: Dan Carpenter <dan.carpenter@oracle.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 fs/btrfs/relocation.c |    1 +
 1 file changed, 1 insertion(+)

--- a/fs/btrfs/relocation.c
+++ b/fs/btrfs/relocation.c
@@ -1850,6 +1850,7 @@ again:
 			eb = read_tree_block(dest, old_bytenr, old_ptr_gen);
 			if (IS_ERR(eb)) {
 				ret = PTR_ERR(eb);
+				break;
 			} else if (!extent_buffer_uptodate(eb)) {
 				ret = -EIO;
 				free_extent_buffer(eb);

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH 4.5 097/101] btrfs: handle non-fatal errors in btrfs_qgroup_inherit()
  2016-05-17  1:20 [PATCH 4.5 000/101] 4.5.5-stable review Greg Kroah-Hartman
                   ` (85 preceding siblings ...)
  2016-05-17  1:21 ` [PATCH 4.5 096/101] Btrfs: fix invalid reference in replace_path Greg Kroah-Hartman
@ 2016-05-17  1:21 ` Greg Kroah-Hartman
  2016-05-17  1:21 ` [PATCH 4.5 098/101] btrfs: fallback to vmalloc in btrfs_compare_tree Greg Kroah-Hartman
                   ` (5 subsequent siblings)
  92 siblings, 0 replies; 94+ messages in thread
From: Greg Kroah-Hartman @ 2016-05-17  1:21 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Mark Fasheh, Qu Wenruo, David Sterba

4.5-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Mark Fasheh <mfasheh@suse.de>

commit 918c2ee103cf9956f1c61d3f848dbb49fd2d104a upstream.

create_pending_snapshot() will go readonly on _any_ error return from
btrfs_qgroup_inherit(). If qgroups are enabled, a user can crash their fs by
just making a snapshot and asking it to inherit from an invalid qgroup. For
example:

$ btrfs sub snap -i 1/10 /btrfs/ /btrfs/foo

Will cause a transaction abort.

Fix this by only throwing errors in btrfs_qgroup_inherit() when we know
going readonly is acceptable.

The following xfstests test case reproduces this bug:

  seq=`basename $0`
  seqres=$RESULT_DIR/$seq
  echo "QA output created by $seq"

  here=`pwd`
  tmp=/tmp/$$
  status=1	# failure is the default!
  trap "_cleanup; exit \$status" 0 1 2 3 15

  _cleanup()
  {
  	cd /
  	rm -f $tmp.*
  }

  # get standard environment, filters and checks
  . ./common/rc
  . ./common/filter

  # remove previous $seqres.full before test
  rm -f $seqres.full

  # real QA test starts here
  _supported_fs btrfs
  _supported_os Linux
  _require_scratch

  rm -f $seqres.full

  _scratch_mkfs
  _scratch_mount
  _run_btrfs_util_prog quota enable $SCRATCH_MNT
  # The qgroup '1/10' does not exist and should be silently ignored
  _run_btrfs_util_prog subvolume snapshot -i 1/10 $SCRATCH_MNT $SCRATCH_MNT/snap1

  _scratch_unmount

  echo "Silence is golden"

  status=0
  exit

Signed-off-by: Mark Fasheh <mfasheh@suse.de>
Reviewed-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 fs/btrfs/qgroup.c |   54 ++++++++++++++++++++++++++++++++----------------------
 1 file changed, 32 insertions(+), 22 deletions(-)

--- a/fs/btrfs/qgroup.c
+++ b/fs/btrfs/qgroup.c
@@ -1842,8 +1842,10 @@ out:
 }
 
 /*
- * copy the acounting information between qgroups. This is necessary when a
- * snapshot or a subvolume is created
+ * Copy the acounting information between qgroups. This is necessary
+ * when a snapshot or a subvolume is created. Throwing an error will
+ * cause a transaction abort so we take extra care here to only error
+ * when a readonly fs is a reasonable outcome.
  */
 int btrfs_qgroup_inherit(struct btrfs_trans_handle *trans,
 			 struct btrfs_fs_info *fs_info, u64 srcid, u64 objectid,
@@ -1873,15 +1875,15 @@ int btrfs_qgroup_inherit(struct btrfs_tr
 		       2 * inherit->num_excl_copies;
 		for (i = 0; i < nums; ++i) {
 			srcgroup = find_qgroup_rb(fs_info, *i_qgroups);
-			if (!srcgroup) {
-				ret = -EINVAL;
-				goto out;
-			}
 
-			if ((srcgroup->qgroupid >> 48) <= (objectid >> 48)) {
-				ret = -EINVAL;
-				goto out;
-			}
+			/*
+			 * Zero out invalid groups so we can ignore
+			 * them later.
+			 */
+			if (!srcgroup ||
+			    ((srcgroup->qgroupid >> 48) <= (objectid >> 48)))
+				*i_qgroups = 0ULL;
+
 			++i_qgroups;
 		}
 	}
@@ -1916,17 +1918,19 @@ int btrfs_qgroup_inherit(struct btrfs_tr
 	 */
 	if (inherit) {
 		i_qgroups = (u64 *)(inherit + 1);
-		for (i = 0; i < inherit->num_qgroups; ++i) {
+		for (i = 0; i < inherit->num_qgroups; ++i, ++i_qgroups) {
+			if (*i_qgroups == 0)
+				continue;
 			ret = add_qgroup_relation_item(trans, quota_root,
 						       objectid, *i_qgroups);
-			if (ret)
+			if (ret && ret != -EEXIST)
 				goto out;
 			ret = add_qgroup_relation_item(trans, quota_root,
 						       *i_qgroups, objectid);
-			if (ret)
+			if (ret && ret != -EEXIST)
 				goto out;
-			++i_qgroups;
 		}
+		ret = 0;
 	}
 
 
@@ -1987,17 +1991,22 @@ int btrfs_qgroup_inherit(struct btrfs_tr
 
 	i_qgroups = (u64 *)(inherit + 1);
 	for (i = 0; i < inherit->num_qgroups; ++i) {
-		ret = add_relation_rb(quota_root->fs_info, objectid,
-				      *i_qgroups);
-		if (ret)
-			goto unlock;
+		if (*i_qgroups) {
+			ret = add_relation_rb(quota_root->fs_info, objectid,
+					      *i_qgroups);
+			if (ret)
+				goto unlock;
+		}
 		++i_qgroups;
 	}
 
-	for (i = 0; i <  inherit->num_ref_copies; ++i) {
+	for (i = 0; i <  inherit->num_ref_copies; ++i, i_qgroups += 2) {
 		struct btrfs_qgroup *src;
 		struct btrfs_qgroup *dst;
 
+		if (!i_qgroups[0] || !i_qgroups[1])
+			continue;
+
 		src = find_qgroup_rb(fs_info, i_qgroups[0]);
 		dst = find_qgroup_rb(fs_info, i_qgroups[1]);
 
@@ -2008,12 +2017,14 @@ int btrfs_qgroup_inherit(struct btrfs_tr
 
 		dst->rfer = src->rfer - level_size;
 		dst->rfer_cmpr = src->rfer_cmpr - level_size;
-		i_qgroups += 2;
 	}
-	for (i = 0; i <  inherit->num_excl_copies; ++i) {
+	for (i = 0; i <  inherit->num_excl_copies; ++i, i_qgroups += 2) {
 		struct btrfs_qgroup *src;
 		struct btrfs_qgroup *dst;
 
+		if (!i_qgroups[0] || !i_qgroups[1])
+			continue;
+
 		src = find_qgroup_rb(fs_info, i_qgroups[0]);
 		dst = find_qgroup_rb(fs_info, i_qgroups[1]);
 
@@ -2024,7 +2035,6 @@ int btrfs_qgroup_inherit(struct btrfs_tr
 
 		dst->excl = src->excl + level_size;
 		dst->excl_cmpr = src->excl_cmpr + level_size;
-		i_qgroups += 2;
 	}
 
 unlock:

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH 4.5 098/101] btrfs: fallback to vmalloc in btrfs_compare_tree
  2016-05-17  1:20 [PATCH 4.5 000/101] 4.5.5-stable review Greg Kroah-Hartman
                   ` (86 preceding siblings ...)
  2016-05-17  1:21 ` [PATCH 4.5 097/101] btrfs: handle non-fatal errors in btrfs_qgroup_inherit() Greg Kroah-Hartman
@ 2016-05-17  1:21 ` Greg Kroah-Hartman
  2016-05-17  1:21 ` [PATCH 4.5 099/101] Btrfs: dont use src fd for printk Greg Kroah-Hartman
                   ` (4 subsequent siblings)
  92 siblings, 0 replies; 94+ messages in thread
From: Greg Kroah-Hartman @ 2016-05-17  1:21 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, David Sterba

4.5-stable review patch.  If anyone has any objections, please let me know.

------------------

From: David Sterba <dsterba@suse.com>

commit 8f282f71eaee7ac979cdbe525f76daa0722798a8 upstream.

The allocation of node could fail if the memory is too fragmented for a
given node size, practically observed with 64k.

http://article.gmane.org/gmane.comp.file-systems.btrfs/54689

Reported-and-tested-by: Jean-Denis Girard <jd.girard@sysnux.pf>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 fs/btrfs/ctree.c |   12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

--- a/fs/btrfs/ctree.c
+++ b/fs/btrfs/ctree.c
@@ -19,6 +19,7 @@
 #include <linux/sched.h>
 #include <linux/slab.h>
 #include <linux/rbtree.h>
+#include <linux/vmalloc.h>
 #include "ctree.h"
 #include "disk-io.h"
 #include "transaction.h"
@@ -5361,10 +5362,13 @@ int btrfs_compare_trees(struct btrfs_roo
 		goto out;
 	}
 
-	tmp_buf = kmalloc(left_root->nodesize, GFP_NOFS);
+	tmp_buf = kmalloc(left_root->nodesize, GFP_KERNEL | __GFP_NOWARN);
 	if (!tmp_buf) {
-		ret = -ENOMEM;
-		goto out;
+		tmp_buf = vmalloc(left_root->nodesize);
+		if (!tmp_buf) {
+			ret = -ENOMEM;
+			goto out;
+		}
 	}
 
 	left_path->search_commit_root = 1;
@@ -5565,7 +5569,7 @@ int btrfs_compare_trees(struct btrfs_roo
 out:
 	btrfs_free_path(left_path);
 	btrfs_free_path(right_path);
-	kfree(tmp_buf);
+	kvfree(tmp_buf);
 	return ret;
 }
 

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH 4.5 099/101] Btrfs: dont use src fd for printk
  2016-05-17  1:20 [PATCH 4.5 000/101] 4.5.5-stable review Greg Kroah-Hartman
                   ` (87 preceding siblings ...)
  2016-05-17  1:21 ` [PATCH 4.5 098/101] btrfs: fallback to vmalloc in btrfs_compare_tree Greg Kroah-Hartman
@ 2016-05-17  1:21 ` Greg Kroah-Hartman
  2016-05-17  1:21 ` [PATCH 4.5 100/101] btrfs: Reset IO error counters before start of device replacing Greg Kroah-Hartman
                   ` (3 subsequent siblings)
  92 siblings, 0 replies; 94+ messages in thread
From: Greg Kroah-Hartman @ 2016-05-17  1:21 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Josef Bacik, David Sterba

4.5-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Josef Bacik <jbacik@fb.com>

commit c79b4713304f812d3d6c95826fc3e5fc2c0b0c14 upstream.

The fd we pass in may not be on a btrfs file system, so don't try to do
BTRFS_I() on it.  Thanks,

Signed-off-by: Josef Bacik <jbacik@fb.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 fs/btrfs/ioctl.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -1657,7 +1657,7 @@ static noinline int btrfs_ioctl_snap_cre
 
 		src_inode = file_inode(src.file);
 		if (src_inode->i_sb != file_inode(file)->i_sb) {
-			btrfs_info(BTRFS_I(src_inode)->root->fs_info,
+			btrfs_info(BTRFS_I(file_inode(file))->root->fs_info,
 				   "Snapshot src from another FS");
 			ret = -EXDEV;
 		} else if (!inode_owner_or_capable(src_inode)) {

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH 4.5 100/101] btrfs: Reset IO error counters before start of device replacing
  2016-05-17  1:20 [PATCH 4.5 000/101] 4.5.5-stable review Greg Kroah-Hartman
                   ` (88 preceding siblings ...)
  2016-05-17  1:21 ` [PATCH 4.5 099/101] Btrfs: dont use src fd for printk Greg Kroah-Hartman
@ 2016-05-17  1:21 ` Greg Kroah-Hartman
  2016-05-17  1:21 ` [PATCH 4.5 101/101] nf_conntrack: avoid kernel pointer value leak in slab name Greg Kroah-Hartman
                   ` (2 subsequent siblings)
  92 siblings, 0 replies; 94+ messages in thread
From: Greg Kroah-Hartman @ 2016-05-17  1:21 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Yauhen Kharuzhy, David Sterba

4.5-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Yauhen Kharuzhy <yauhen.kharuzhy@zavadatar.com>

commit 7ccefb98ce3e5c4493cd213cd03714b7149cf0cb upstream.

If device replace entry was found on disk at mounting and its num_write_errors
stats counter has non-NULL value, then replace operation will never be
finished and -EIO error will be reported by btrfs_scrub_dev() because
this counter is never reset.

 # mount -o degraded /media/a4fb5c0a-21c5-4fe7-8d0e-fdd87d5f71ee/
 # btrfs replace status /media/a4fb5c0a-21c5-4fe7-8d0e-fdd87d5f71ee/
 Started on 25.Mar 07:28:00, canceled on 25.Mar 07:28:01 at 0.0%, 40 write errs, 0 uncorr. read errs
 # btrfs replace start -B 4 /dev/sdg /media/a4fb5c0a-21c5-4fe7-8d0e-fdd87d5f71ee/
 ERROR: ioctl(DEV_REPLACE_START) failed on "/media/a4fb5c0a-21c5-4fe7-8d0e-fdd87d5f71ee/": Input/output error, no error

Reset num_write_errors and num_uncorrectable_read_errors counters in the
dev_replace structure before start of replacing.

Signed-off-by: Yauhen Kharuzhy <yauhen.kharuzhy@zavadatar.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 fs/btrfs/dev-replace.c |    2 ++
 1 file changed, 2 insertions(+)

--- a/fs/btrfs/dev-replace.c
+++ b/fs/btrfs/dev-replace.c
@@ -394,6 +394,8 @@ int btrfs_dev_replace_start(struct btrfs
 	dev_replace->cursor_right = 0;
 	dev_replace->is_valid = 1;
 	dev_replace->item_needs_writeback = 1;
+	atomic64_set(&dev_replace->num_write_errors, 0);
+	atomic64_set(&dev_replace->num_uncorrectable_read_errors, 0);
 	args->result = BTRFS_IOCTL_DEV_REPLACE_RESULT_NO_ERROR;
 	btrfs_dev_replace_unlock(dev_replace);
 

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH 4.5 101/101] nf_conntrack: avoid kernel pointer value leak in slab name
  2016-05-17  1:20 [PATCH 4.5 000/101] 4.5.5-stable review Greg Kroah-Hartman
                   ` (89 preceding siblings ...)
  2016-05-17  1:21 ` [PATCH 4.5 100/101] btrfs: Reset IO error counters before start of device replacing Greg Kroah-Hartman
@ 2016-05-17  1:21 ` Greg Kroah-Hartman
  2016-05-17 17:28 ` [PATCH 4.5 000/101] 4.5.5-stable review Guenter Roeck
  2016-05-17 17:28 ` Shuah Khan
  92 siblings, 0 replies; 94+ messages in thread
From: Greg Kroah-Hartman @ 2016-05-17  1:21 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Linus Torvalds, Eric Dumazet,
	David S. Miller

4.5-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Linus Torvalds <torvalds@linux-foundation.org>

commit 31b0b385f69d8d5491a4bca288e25e63f1d945d0 upstream.

The slab name ends up being visible in the directory structure under
/sys, and even if you don't have access rights to the file you can see
the filenames.

Just use a 64-bit counter instead of the pointer to the 'net' structure
to generate a unique name.

This code will go away in 4.7 when the conntrack code moves to a single
kmemcache, but this is the backportable simple solution to avoiding
leaking kernel pointers to user space.

Fixes: 5b3501faa874 ("netfilter: nf_conntrack: per netns nf_conntrack_cachep")
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 net/netfilter/nf_conntrack_core.c |    4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

--- a/net/netfilter/nf_conntrack_core.c
+++ b/net/netfilter/nf_conntrack_core.c
@@ -1780,6 +1780,7 @@ void nf_conntrack_init_end(void)
 
 int nf_conntrack_init_net(struct net *net)
 {
+	static atomic64_t unique_id;
 	int ret = -ENOMEM;
 	int cpu;
 
@@ -1802,7 +1803,8 @@ int nf_conntrack_init_net(struct net *ne
 	if (!net->ct.stat)
 		goto err_pcpu_lists;
 
-	net->ct.slabname = kasprintf(GFP_KERNEL, "nf_conntrack_%p", net);
+	net->ct.slabname = kasprintf(GFP_KERNEL, "nf_conntrack_%llu",
+				(u64)atomic64_inc_return(&unique_id));
 	if (!net->ct.slabname)
 		goto err_slabname;
 

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH 4.5 000/101] 4.5.5-stable review
  2016-05-17  1:20 [PATCH 4.5 000/101] 4.5.5-stable review Greg Kroah-Hartman
                   ` (90 preceding siblings ...)
  2016-05-17  1:21 ` [PATCH 4.5 101/101] nf_conntrack: avoid kernel pointer value leak in slab name Greg Kroah-Hartman
@ 2016-05-17 17:28 ` Guenter Roeck
  2016-05-17 17:28 ` Shuah Khan
  92 siblings, 0 replies; 94+ messages in thread
From: Guenter Roeck @ 2016-05-17 17:28 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: linux-kernel, torvalds, akpm, shuah.kh, patches, stable

On Mon, May 16, 2016 at 06:20:05PM -0700, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 4.5.5 release.
> There are 101 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
> 
> Responses should be made by Thu May 19 01:14:44 UTC 2016.
> Anything received after that time might be too late.
> 

Build results:
	total: 149 pass: 149 fail: 0
Qemu test results:
	total: 108 pass: 108 fail: 0

Details are available at http://kerneltests.org/builders.

Guenter

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH 4.5 000/101] 4.5.5-stable review
  2016-05-17  1:20 [PATCH 4.5 000/101] 4.5.5-stable review Greg Kroah-Hartman
                   ` (91 preceding siblings ...)
  2016-05-17 17:28 ` [PATCH 4.5 000/101] 4.5.5-stable review Guenter Roeck
@ 2016-05-17 17:28 ` Shuah Khan
  92 siblings, 0 replies; 94+ messages in thread
From: Shuah Khan @ 2016-05-17 17:28 UTC (permalink / raw)
  To: Greg Kroah-Hartman, linux-kernel
  Cc: torvalds, akpm, linux, shuah.kh, patches, stable

On 05/16/2016 07:20 PM, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 4.5.5 release.
> There are 101 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
> 
> Responses should be made by Thu May 19 01:14:44 UTC 2016.
> Anything received after that time might be too late.
> 
> The whole patch series can be found in one patch at:
> 	kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.5.5-rc1.gz
> and the diffstat can be found below.
> 
> thanks,
> 
> greg k-h
> 

Compiled and booted on my test system. No dmesg regressions.

thanks,
-- Shuah

^ permalink raw reply	[flat|nested] 94+ messages in thread

end of thread, other threads:[~2016-05-17 17:28 UTC | newest]

Thread overview: 94+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-05-17  1:20 [PATCH 4.5 000/101] 4.5.5-stable review Greg Kroah-Hartman
2016-05-17  1:20 ` [PATCH 4.5 001/101] staging: wilc1000: remove extraneous variable Greg Kroah-Hartman
2016-05-17  1:20 ` [PATCH 4.5 002/101] decnet: Do not build routes to devices without decnet private data Greg Kroah-Hartman
2016-05-17  1:20 ` [PATCH 4.5 003/101] route: do not cache fib route info on local routes with oif Greg Kroah-Hartman
2016-05-17  1:20 ` [PATCH 4.5 004/101] packet: fix heap info leak in PACKET_DIAG_MCLIST sock_diag interface Greg Kroah-Hartman
2016-05-17  1:20 ` [PATCH 4.5 005/101] net: sched: do not requeue a NULL skb Greg Kroah-Hartman
2016-05-17  1:20 ` [PATCH 4.5 006/101] bpf/verifier: reject invalid LD_ABS | BPF_DW instruction Greg Kroah-Hartman
2016-05-17  1:20 ` [PATCH 4.5 009/101] net: use skb_postpush_rcsum instead of own implementations Greg Kroah-Hartman
2016-05-17  1:20 ` [PATCH 4.5 010/101] vlan: pull on __vlan_insert_tag error path and fix csum correction Greg Kroah-Hartman
2016-05-17  1:20 ` [PATCH 4.5 011/101] openvswitch: Orphan skbs before IPv6 defrag Greg Kroah-Hartman
2016-05-17  1:20 ` [PATCH 4.5 012/101] atl2: Disable unimplemented scatter/gather feature Greg Kroah-Hartman
2016-05-17  1:20 ` [PATCH 4.5 013/101] openvswitch: use flow protocol when recalculating ipv6 checksums Greg Kroah-Hartman
2016-05-17  1:20 ` [PATCH 4.5 014/101] net/mlx5_core: Fix soft lockup in steering error flow Greg Kroah-Hartman
2016-05-17  1:20 ` [PATCH 4.5 015/101] net/mlx5e: Devices mtu field is u16 and not int Greg Kroah-Hartman
2016-05-17  1:20 ` [PATCH 4.5 016/101] net/mlx5e: Fix minimum MTU Greg Kroah-Hartman
2016-05-17  1:20 ` [PATCH 4.5 017/101] net/mlx5e: Use vport MTU rather than physical port MTU Greg Kroah-Hartman
2016-05-17  1:20 ` [PATCH 4.5 018/101] ipv4/fib: dont warn when primary address is missing if in_dev is dead Greg Kroah-Hartman
2016-05-17  1:20 ` [PATCH 4.5 019/101] net/mlx4_en: fix spurious timestamping callbacks Greg Kroah-Hartman
2016-05-17  1:20 ` [PATCH 4.5 020/101] bpf: fix double-fdput in replace_map_fd_with_map_ptr() Greg Kroah-Hartman
2016-05-17  1:20 ` [PATCH 4.5 021/101] bpf: fix refcnt overflow Greg Kroah-Hartman
2016-05-17  1:20 ` [PATCH 4.5 022/101] bpf: fix check_map_func_compatibility logic Greg Kroah-Hartman
2016-05-17  1:20 ` [PATCH 4.5 023/101] samples/bpf: fix trace_output example Greg Kroah-Hartman
2016-05-17  1:20 ` [PATCH 4.5 024/101] net: Implement net_dbg_ratelimited() for CONFIG_DYNAMIC_DEBUG case Greg Kroah-Hartman
2016-05-17  1:20 ` [PATCH 4.5 025/101] gre: do not pull header in ICMP error processing Greg Kroah-Hartman
2016-05-17  1:20 ` [PATCH 4.5 026/101] net_sched: introduce qdisc_replace() helper Greg Kroah-Hartman
2016-05-17  1:20 ` [PATCH 4.5 027/101] net_sched: update hierarchical backlog too Greg Kroah-Hartman
2016-05-17  1:20 ` [PATCH 4.5 028/101] sch_htb: update backlog as well Greg Kroah-Hartman
2016-05-17  1:20 ` [PATCH 4.5 029/101] sch_dsmark: " Greg Kroah-Hartman
2016-05-17  1:20 ` [PATCH 4.5 030/101] netem: Segment GSO packets on enqueue Greg Kroah-Hartman
2016-05-17  1:20 ` [PATCH 4.5 031/101] ipv6/ila: fix nlsize calculation for lwtunnel Greg Kroah-Hartman
2016-05-17  1:20 ` [PATCH 4.5 035/101] net/mlx4_en: Fix endianness bug in IPV6 csum calculation Greg Kroah-Hartman
2016-05-17  1:20 ` [PATCH 4.5 036/101] VSOCK: do not disconnect socket when peer has shutdown SEND only Greg Kroah-Hartman
2016-05-17  1:20 ` [PATCH 4.5 037/101] net: bridge: fix old ioctl unlocked net device walk Greg Kroah-Hartman
2016-05-17  1:20 ` [PATCH 4.5 040/101] net: fix a kernel infoleak in x25 module Greg Kroah-Hartman
2016-05-17  1:20 ` [PATCH 4.5 041/101] net: thunderx: avoid exposing kernel stack Greg Kroah-Hartman
2016-05-17  1:20 ` [PATCH 4.5 042/101] tcp: refresh skb timestamp at retransmit time Greg Kroah-Hartman
2016-05-17  1:20 ` [PATCH 4.5 043/101] net/route: enforce hoplimit max value Greg Kroah-Hartman
2016-05-17  1:20 ` [PATCH 4.5 044/101] ocfs2: revert using ocfs2_acl_chmod to avoid inode cluster lock hang Greg Kroah-Hartman
2016-05-17  1:20 ` [PATCH 4.5 045/101] ocfs2: fix posix_acl_create deadlock Greg Kroah-Hartman
2016-05-17  1:20 ` [PATCH 4.5 046/101] zsmalloc: fix zs_can_compact() integer overflow Greg Kroah-Hartman
2016-05-17  1:20 ` [PATCH 4.5 047/101] mm: thp: calculate the mapcount correctly for THP pages during WP faults Greg Kroah-Hartman
2016-05-17  1:20 ` [PATCH 4.5 048/101] crypto: qat - fix invalid pf2vf_resp_wq logic Greg Kroah-Hartman
2016-05-17  1:20 ` [PATCH 4.5 049/101] crypto: qat - fix adf_ctl_drv.c:undefined reference to adf_init_pf_wq Greg Kroah-Hartman
2016-05-17  1:20 ` [PATCH 4.5 050/101] crypto: hash - Fix page length clamping in hash walk Greg Kroah-Hartman
2016-05-17  1:20 ` [PATCH 4.5 051/101] crypto: testmgr - Use kmalloc memory for RSA input Greg Kroah-Hartman
2016-05-17  1:20 ` [PATCH 4.5 052/101] ALSA: usb-audio: Quirk for yet another Phoenix Audio devices (v2) Greg Kroah-Hartman
2016-05-17  1:20 ` [PATCH 4.5 053/101] ALSA: usb-audio: Yet another Phoneix Audio device quirk Greg Kroah-Hartman
2016-05-17  1:20 ` [PATCH 4.5 054/101] ALSA: hda - Fix subwoofer pin on ASUS N751 and N551 Greg Kroah-Hartman
2016-05-17  1:21 ` [PATCH 4.5 055/101] ALSA: hda - Fix white noise on Asus UX501VW headset Greg Kroah-Hartman
2016-05-17  1:21 ` [PATCH 4.5 056/101] ALSA: hda - Fix broken reconfig Greg Kroah-Hartman
2016-05-17  1:21 ` [PATCH 4.5 057/101] spi: pxa2xx: Do not detect number of enabled chip selects on Intel SPT Greg Kroah-Hartman
2016-05-17  1:21 ` [PATCH 4.5 058/101] spi: spi-ti-qspi: Fix FLEN and WLEN settings if bits_per_word is overridden Greg Kroah-Hartman
2016-05-17  1:21 ` [PATCH 4.5 059/101] spi: spi-ti-qspi: Handle truncated frames properly Greg Kroah-Hartman
2016-05-17  1:21 ` [PATCH 4.5 060/101] pinctrl: at91-pio4: fix pull-up/down logic Greg Kroah-Hartman
2016-05-17  1:21 ` [PATCH 4.5 061/101] regmap: spmi: Fix regmap_spmi_ext_read in multi-byte case Greg Kroah-Hartman
2016-05-17  1:21 ` [PATCH 4.5 062/101] perf diff: Fix duplicated output column Greg Kroah-Hartman
2016-05-17  1:21 ` [PATCH 4.5 063/101] perf/core: Disable the event on a truncated AUX record Greg Kroah-Hartman
2016-05-17  1:21 ` [PATCH 4.5 064/101] vfs: add vfs_select_inode() helper Greg Kroah-Hartman
2016-05-17  1:21 ` [PATCH 4.5 065/101] vfs: rename: check backing inode being equal Greg Kroah-Hartman
2016-05-17  1:21 ` [PATCH 4.5 066/101] ARM: dts: at91: sam9x5: Fix the memory range assigned to the PMC Greg Kroah-Hartman
2016-05-17  1:21 ` [PATCH 4.5 068/101] regulator: s2mps11: Fix invalid selector mask and voltages for buck9 Greg Kroah-Hartman
2016-05-17  1:21 ` [PATCH 4.5 069/101] regulator: axp20x: Fix axp22x ldo_io voltage ranges Greg Kroah-Hartman
2016-05-17  1:21 ` [PATCH 4.5 070/101] atomic_open(): fix the handling of create_error Greg Kroah-Hartman
2016-05-17  1:21 ` [PATCH 4.5 071/101] qla1280: Dont allocate 512kb of host tags Greg Kroah-Hartman
2016-05-17  1:21 ` [PATCH 4.5 072/101] tools lib traceevent: Do not reassign parg after collapse_tree() Greg Kroah-Hartman
2016-05-17  1:21 ` [PATCH 4.5 073/101] get_rock_ridge_filename(): handle malformed NM entries Greg Kroah-Hartman
2016-05-17  1:21 ` [PATCH 4.5 074/101] Input: max8997-haptic - fix NULL pointer dereference Greg Kroah-Hartman
2016-05-17  1:21 ` [PATCH 4.5 075/101] Revert "[media] videobuf2-v4l2: Verify planes array in buffer dequeueing" Greg Kroah-Hartman
2016-05-17  1:21 ` [PATCH 4.5 077/101] drm/radeon: fix PLL sharing on DCE6.1 (v2) Greg Kroah-Hartman
2016-05-17  1:21 ` [PATCH 4.5 078/101] drm/i915: Bail out of pipe config compute loop on LPT Greg Kroah-Hartman
2016-05-17  1:21 ` [PATCH 4.5 079/101] Revert "drm/i915: start adding dp mst audio" Greg Kroah-Hartman
2016-05-17  1:21 ` [PATCH 4.5 081/101] drm/radeon: fix DP link training issue with second 4K monitor Greg Kroah-Hartman
2016-05-17  1:21 ` [PATCH 4.5 082/101] drm/radeon: fix DP mode validation Greg Kroah-Hartman
2016-05-17  1:21 ` [PATCH 4.5 083/101] drm/amdgpu: " Greg Kroah-Hartman
2016-05-17  1:21 ` [PATCH 4.5 084/101] btrfs: reada: Fix in-segment calculation for reada Greg Kroah-Hartman
2016-05-17  1:21 ` [PATCH 4.5 085/101] Btrfs: fix truncate_space_check Greg Kroah-Hartman
2016-05-17  1:21 ` [PATCH 4.5 086/101] btrfs: remove error message from search ioctl for nonexistent tree Greg Kroah-Hartman
2016-05-17  1:21 ` [PATCH 4.5 087/101] btrfs: change max_inline default to 2048 Greg Kroah-Hartman
2016-05-17  1:21 ` [PATCH 4.5 088/101] Btrfs: fix unreplayable log after snapshot delete + parent dir fsync Greg Kroah-Hartman
2016-05-17  1:21 ` [PATCH 4.5 089/101] Btrfs: fix file loss on log replay after renaming a file and fsync Greg Kroah-Hartman
2016-05-17  1:21 ` [PATCH 4.5 090/101] Btrfs: fix extent_same allowing destination offset beyond i_size Greg Kroah-Hartman
2016-05-17  1:21 ` [PATCH 4.5 091/101] Btrfs: fix deadlock between direct IO reads and buffered writes Greg Kroah-Hartman
2016-05-17  1:21 ` [PATCH 4.5 092/101] Btrfs: fix race when checking if we can skip fsyncing an inode Greg Kroah-Hartman
2016-05-17  1:21 ` [PATCH 4.5 093/101] Btrfs: do not collect ordered extents when logging that inode exists Greg Kroah-Hartman
2016-05-17  1:21 ` [PATCH 4.5 094/101] btrfs: csum_tree_block: return proper errno value Greg Kroah-Hartman
2016-05-17  1:21 ` [PATCH 4.5 095/101] btrfs: do not write corrupted metadata blocks to disk Greg Kroah-Hartman
2016-05-17  1:21 ` [PATCH 4.5 096/101] Btrfs: fix invalid reference in replace_path Greg Kroah-Hartman
2016-05-17  1:21 ` [PATCH 4.5 097/101] btrfs: handle non-fatal errors in btrfs_qgroup_inherit() Greg Kroah-Hartman
2016-05-17  1:21 ` [PATCH 4.5 098/101] btrfs: fallback to vmalloc in btrfs_compare_tree Greg Kroah-Hartman
2016-05-17  1:21 ` [PATCH 4.5 099/101] Btrfs: dont use src fd for printk Greg Kroah-Hartman
2016-05-17  1:21 ` [PATCH 4.5 100/101] btrfs: Reset IO error counters before start of device replacing Greg Kroah-Hartman
2016-05-17  1:21 ` [PATCH 4.5 101/101] nf_conntrack: avoid kernel pointer value leak in slab name Greg Kroah-Hartman
2016-05-17 17:28 ` [PATCH 4.5 000/101] 4.5.5-stable review Guenter Roeck
2016-05-17 17:28 ` Shuah Khan

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).